Random forest model scoring
We are using the random forest algorithm, but we are having some trouble understanding the scoring method it uses.
Take, for example, the following confusion matrices of the test set at different thresholds:
Threshold 45 cm is:
[[67969 48031]
[ 3321 11120]] and the precision is: 0.18799344051632602
Threshold 50 cm is:
[[77642 38358]
[ 4785 9656]] and the precision is: 0.2011080101632834
Threshold 55 cm is:
[[88825 27175]
[ 6796 7645]] and the precision is: 0.2195577254445159
Threshold 60 cm is:
[[100411 15589]
[ 9629 4812]] and the precision is: 0.2358707906463611
Threshold 65 cm is:
[[112421 3579]
[ 13098 1343]] and the precision is: 0.2728565623674755
Threshold 70 cm is:
[[115895 105]
[ 14371 70]] and the precision is: 0.3999999997714286
Threshold 75 cm is:
[[115998 2]
[ 14440 1]] and the precision is: 0.3333333222222226
Threshold 80 cm is:
[[116000 0]
[ 14441 0]] and the precision is: 0.0
Threshold 85 cm is:
[[116000 0]
[ 14441 0]] and the precision is: 0.0
Threshold 90 cm is:
[[116000 0]
[ 14441 0]] and the precision is: 0.0
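To make sure we read these matrices correctly, here is a small sketch (not part of our actual pipeline) that recomputes the precision above from the 45 cm confusion matrix, assuming sklearn's convention of rows = true class and columns = predicted class:

import numpy as np

# 45 cm confusion matrix, rows = true class (0, 1), columns = predicted class (0, 1)
cm = np.array([[67969, 48031],
               [ 3321, 11120]])
tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)   # 11120 / (11120 + 48031) ≈ 0.188
print(precision)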
This is how we used the RF and printed its score:
from sklearn.model_selection import RandomizedSearchCV

# clf, param_grid and tscv (our cross-validation splitter) are defined earlier in our code
grid_clf = RandomizedSearchCV(clf, param_grid, cv=tscv, verbose=10, n_iter=20, n_jobs=-1, scoring='roc_auc')
grid_clf.fit(X_train, y_train)
print(grid_clf.score(X_test, y_test))
The score we got for this model is 0.7350173458471928
As far as I understand, the score when using roc_auc is between 0.5 and 1.
How can such a bad model receive such a good score?
How is this scoring calculated?
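For context, this is how we believe the score is being computed (a sketch of our understanding, not code we have verified against sklearn's internals): with scoring='roc_auc', score() evaluates the ranking of the predicted probabilities for class 1, so no single threshold is involved.

from sklearn.metrics import roc_auc_score

# Probability of class 1 from the refitted best estimator
proba = grid_clf.predict_proba(X_test)[:, 1]
# If our understanding is right, this should reproduce the 0.735... score above
print(roc_auc_score(y_test, proba))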
Provided we predict enough true positives, we don't mind missing some 1's and predicting false positives; we do, of course, mind predicting true negatives.
Can I change the scoring to fit what I believe are better results?
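For example, would something like the following make sense (a hypothetical sketch we have not run; beta=2 is an arbitrary choice that weights recall more heavily than precision)?

from sklearn.metrics import make_scorer, fbeta_score
from sklearn.model_selection import RandomizedSearchCV

# F-beta with beta > 1 favours recall (catching the 1's) over precision
f2_scorer = make_scorer(fbeta_score, beta=2)

grid_clf = RandomizedSearchCV(clf, param_grid, cv=tscv, verbose=10,
                              n_iter=20, n_jobs=-1, scoring=f2_scorer)
grid_clf.fit(X_train, y_train)

Or would one of the built-in string scorers, such as scoring='recall' or scoring='f1', already be closer to what we want than roc_auc?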
Thanks
Topic: scoring, decision-trees, random-forest
Category: Data Science