How to use a prediction model after one-hot encoding?
I have created a prediction model for this dataset:
df.head()
  Service  Tasks  Difficulty     Hours
0     ABC     24           1  0.833333
1     CDE     77           1  1.750000
2     SDE     90           3  3.166667
3     QWE     47           1  1.083333
4     ASD     26           3  1.000000
df.shape
(998, 4)
X = df.iloc[:,:-1]
y = df.iloc[:,-1].values
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# one-hot encode column 0 (Service); pass Tasks and Difficulty through unchanged
ct = ColumnTransformer([('cat', OneHotEncoder(), [0])], remainder='passthrough')
X = ct.fit_transform(X)
x = X.toarray()
x = x[:,1:]  # drop the first dummy column
x.shape
(998,339)
from sklearn.ensemble import RandomForestRegressor
rf_model = RandomForestRegressor(random_state = 1)
rf_model.fit(x,y)
How can I use this model to predict Hours for user input in this format: [['SDE', 90, 3]]?
I tried:
test_input = [['SDE', 90, 3]]
test_input = ct.fit_transform(test_input)
test_input = test_input[:,1:]
test_input[0]
array([24, 1], dtype=object)
predict_hours = rf_model.predict(test_input)
ValueError
Since my dataset has many categorical values, it's not possible to enter the encoded value of SDE as input by hand. I need to convert SDE to one-hot encoded format after receiving the input [['SDE', 90, 3]].
I don't know how to do it. Can anyone help?
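For reference, here is a minimal sketch of one way this might work. It assumes the likely cause of the error is that calling ct.fit_transform on the new input refits the encoder on a single row, so the output no longer has the columns the model was trained on. The sketch fits ct once on the training data and then only calls ct.transform on new input. The five-row frame is just the df.head() output from above standing in for the full dataset, and to_features is a hypothetical helper name, not part of scikit-learn.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor

# Stand-in data: the five rows shown by df.head() (the real frame has 998 rows).
df = pd.DataFrame({
    "Service": ["ABC", "CDE", "SDE", "QWE", "ASD"],
    "Tasks": [24, 77, 90, 47, 26],
    "Difficulty": [1, 1, 3, 1, 3],
    "Hours": [0.833333, 1.75, 3.166667, 1.083333, 1.0],
})
X = df.iloc[:, :-1]
y = df.iloc[:, -1].values

ct = ColumnTransformer(
    [("cat", OneHotEncoder(), [0])],
    remainder="passthrough",
)

def to_features(encoded):
    """Hypothetical helper: densify if needed, then drop the first dummy column."""
    dense = encoded.toarray() if hasattr(encoded, "toarray") else encoded
    return dense[:, 1:]

# Fit the encoder once, on the training data only.
x = to_features(ct.fit_transform(X))

rf_model = RandomForestRegressor(random_state=1)
rf_model.fit(x, y)

# New input: same column names and order as the training features.
test_input = pd.DataFrame([["SDE", 90, 3]], columns=X.columns)

# transform (not fit_transform) reuses the categories learned during fit,
# so the encoded row has the same width as the training matrix.
test_features = to_features(ct.transform(test_input))
predict_hours = rf_model.predict(test_features)
print(predict_hours)
```

With the default OneHotEncoder settings, a Service value that was not seen during fit raises an error at transform time. Passing handle_unknown="ignore" to OneHotEncoder is one option if that matters for user input.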