Random Forest Classifier Output

I used a RandomForestClassifier for my prediction model, but the output printed is either 0 or a decimal. What do I need to do for my model to show me 0s and 1s instead of decimals?

Note: I used feature importance and removed the least important columns, but the accuracy is still the same and the output hasn't changed much.

Also, I have my number of estimators set to 1000. Should I increase or decrease this?

edit:

target col    output col
1             0.994
0             0
0             0.355
1             0.768

thanks for reading this, if you did!

Topic prediction random-forest predictive-modeling machine-learning

Category Data Science


What data are you training on? Is your training data binary?

If not, set a threshold above which your target variable should be 1, and 0 otherwise. Then train your RandomForestClassifier on the binary labels. It could be that you are training your classifier on a continuous target variable, and that's why your performance is so bad.
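A minimal sketch of that idea, assuming scikit-learn and NumPy are what you are using. The data here is synthetic and the names `y_cont`, `y_bin` and the 0.5 cut-off are made up for illustration; swap in your own target column and threshold. Note that `predict` returns class labels, while `predict_proba` returns decimals, so printing the latter could explain the output you are seeing.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (assumption: 4 numeric features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

# Hypothetical continuous target in (0, 1).
y_cont = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))

# Binarize the target before training: 1 at or above the threshold, else 0.
threshold = 0.5
y_bin = (y_cont >= threshold).astype(int)

clf = RandomForestClassifier(n_estimators=64, random_state=0)
clf.fit(X, y_bin)

labels = clf.predict(X[:5])              # hard class labels: 0 or 1
proba = clf.predict_proba(X[:5])[:, 1]   # estimated probability of class 1

print(labels)
print(proba)
```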

The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large (Breiman, 2001)

More trees is generally better, but also computationally more expensive, so there is a trade-off. Start low (around 64 trees) and work your way up if the generalization error is still high.
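One way to try this out, sketched under the assumption that you are on scikit-learn: grow the same forest in steps with `warm_start=True` and watch the out-of-bag (OOB) error as a rough stand-in for generalization error. The dataset below is synthetic and the tree counts (64, 128, 256) are only example values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# warm_start=True keeps the trees already grown and only adds new ones;
# oob_score=True estimates accuracy on samples each tree did not see.
clf = RandomForestClassifier(warm_start=True, oob_score=True, random_state=0)

oob_errors = {}
for n_trees in (64, 128, 256):
    clf.set_params(n_estimators=n_trees)
    clf.fit(X, y)
    oob_errors[n_trees] = 1.0 - clf.oob_score_

for n_trees, err in oob_errors.items():
    print(f"{n_trees:4d} trees: OOB error = {err:.3f}")
```

If the error has flattened out by the time you reach a few hundred trees, going up to 1000 is probably only costing you training time.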


Take the numbers given by the model and threshold them. Everything below X (usually 0.5) is mapped to 0, everything at or above X is mapped to 1.
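As a quick sketch in plain Python, using the four output values from the question. The helper name `to_labels` is made up here, and treating a score exactly equal to the threshold as 1 is an assumption; pick whichever side suits your problem.

```python
def to_labels(scores, threshold=0.5):
    """Map each score to 1 if it is at or above the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]

# Output column from the question.
scores = [0.994, 0, 0.355, 0.768]

labels = to_labels(scores)
print(labels)  # [1, 0, 0, 1]
```

With the usual 0.5 cut-off, these four rows come out equal to the target column shown in the question.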
