Can anyone tell me why my pipeline is wrong?

I am trying to build a pipeline so that I can run GridSearchCV to find the best parameters. I have already split the data into training and validation sets and have the following code:

column_transformer = make_pipeline(
    OneHotEncoder(categories=cols),
    OrdinalEncoder(categories=X[grade]),
    passthrough)

imputer = SimpleImputer(strategy='median')
scaler = StandardScaler()
model = SGDClassifier(loss='log', random_state=42, n_jobs=-1, warm_start=True)
pipeline_sgdlogreg = make_pipeline(imputer, column_transformer, scaler, model)

When I run GridSearchCV I get the following error:

cannot use median strategy with non-numeric data (...)

I do not understand why I am getting this error. None of the categorical variables have missing values.

I am performing the steps in the following order: imputation, encoding, scaling, modeling.

Can anyone shed some light?

Topic pipelines missing-data encoding python

Category Data Science
