Retrieving the ordinal encoding of a variable after it's placed in a Pipeline/ColumnTransformer
I am applying ordinal encoding to a dataset through a `ColumnTransformer` inside a `Pipeline`. After fitting, how can I retrieve the encoding (the mapping from each category to its integer code) of a feature such as `Area`?
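```python
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeRegressor

df = fetch_openml(data_id=41214, as_frame=True).frame
# Assumption: the question does not show where 'ClaimFreq' comes from; here it
# is computed as claims per unit of exposure.
df['ClaimFreq'] = df['ClaimNb'] / df['Exposure']

df_train, df_test = train_test_split(df, test_size=0.33, random_state=0)

# Ordinal-encode the categorical columns; pass the numeric ones through unchanged.
dt_preprocessor = ColumnTransformer(
    [
        (
            'categorical',
            OrdinalEncoder(),
            ['VehBrand', 'VehPower', 'VehGas', 'Area', 'Region'],
        ),
        ('numeric', 'passthrough', ['VehAge', 'DrivAge', 'BonusMalus', 'Density']),
    ],
    remainder='drop',
)
# Feature names in the order the ColumnTransformer emits them.
f_names = ['VehBrand', 'VehPower', 'VehGas', 'Area', 'Region', 'VehAge', 'DrivAge', 'BonusMalus', 'Density']
dt = Pipeline(
    [
        ('preprocessor', dt_preprocessor),
        (
            'regressor',
            DecisionTreeRegressor(criterion='squared_error', max_depth=3, ccp_alpha=1e-5, min_samples_leaf=2000),
        ),
    ]
)
# Fit on the training split, weighting each policy by its exposure.
dt.fit(
    df_train, df_train['ClaimFreq'], regressor__sample_weight=df_train['Exposure']
)
fig, ax = plt.subplots(figsize=(75, 50))
tree.plot_tree(dt['regressor'], feature_names=f_names, ax=ax, fontsize=30)
plt.show()
```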
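For reference, one possible way to get at the mapping is sketched below. It assumes the step and transformer names used in the question (`preprocessor`, `categorical`) and runs on a small made-up DataFrame (`toy`, hypothetical values) rather than the real dataset, so it needs no download. The fitted `OrdinalEncoder` exposes a `categories_` attribute with one array per encoded column, and a category's position in that array should be its integer code.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data standing in for the real dataset (values are made up).
toy = pd.DataFrame(
    {
        "Area": ["C", "A", "B", "A", "C", "B"],
        "VehGas": ["Diesel", "Regular", "Regular", "Diesel", "Regular", "Diesel"],
        "VehAge": [1, 5, 3, 10, 2, 7],
        "ClaimFreq": [0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    }
)
cat_cols = ["Area", "VehGas"]

pipe = Pipeline(
    [
        (
            "preprocessor",
            ColumnTransformer(
                [
                    ("categorical", OrdinalEncoder(), cat_cols),
                    ("numeric", "passthrough", ["VehAge"]),
                ],
                remainder="drop",
            ),
        ),
        ("regressor", DecisionTreeRegressor(max_depth=2, random_state=0)),
    ]
)
pipe.fit(toy, toy["ClaimFreq"])

# Walk down to the fitted encoder: pipeline step, then transformer name.
encoder = pipe.named_steps["preprocessor"].named_transformers_["categorical"]

# categories_ has one array per encoded column, in the order the columns were
# passed to the transformer; the index of a category is its ordinal code.
mappings = {
    col: {category: code for code, category in enumerate(cats)}
    for col, cats in zip(cat_cols, encoder.categories_)
}
print(mappings["Area"])
```

With the pipeline from the question, the same lookup would start from `dt.named_steps['preprocessor']`, and `cat_cols` would be the five categorical column names in the order they were given to the `ColumnTransformer`.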
Topic pipelines encoding python
Category Data Science