How can I improve calibration curves?
I am training a binary XGBoost classifier on imbalanced data: 85% class 0 and 14% class 1.
I got this dataset by taking a random sample of 1M rows from around 11M.
When I calibrate, I get the following:
It seems that neither isotonic nor sigmoid calibration improves the calibration much. Any idea how I can improve it?
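from sklearn.calibration import CalibratedClassifierCV

# model is the already-fitted XGBoost classifier; cv="prefit" reuses it without refitting
sig_clf = CalibratedClassifierCV(model, method="sigmoid", cv="prefit")
iso_clf = CalibratedClassifierCV(model, method="isotonic", cv="prefit")
# Fit only the calibrators, on the held-out validation set
sig_clf.fit(x_valid, y_valid)
iso_clf.fit(x_valid, y_valid)
# Positive-class probabilities on the test set, calibrated and uncalibrated
prob_pos_sigmoid = sig_clf.predict_proba(x_test)[:, 1]
prob_pos_iso = iso_clf.predict_proba(x_test)[:, 1]
y_test_uncalibrated = model.predict_proba(x_test)[:, 1]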
The code above produces the probabilities I plot in the graph. I used 'prefit' because I calibrate after the model has already been trained and fitted.
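To judge "doesn't really improve much" with numbers rather than by eye, it may help to compute what a calibration curve plots (mean predicted probability against observed positive rate per bin) together with a Brier score and an expected calibration error. Below is a minimal, standard-library-only sketch. The function names `reliability_bins`, `brier_score` and `expected_calibration_error` are my own illustrations, not scikit-learn APIs, and equal-width binning is an assumption.

```python
def reliability_bins(y_true, y_prob, n_bins=10):
    """Return (mean_predicted, fraction_positive, count) for each non-empty bin.

    Bins are equal-width over [0, 1]; a probability of exactly 1.0 goes in the last bin.
    """
    sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)
        sums[idx] += p
        positives[idx] += y
        counts[idx] += 1
    return [
        (sums[i] / counts[i], positives[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Count-weighted average gap between predicted and observed rate per bin."""
    n = len(y_true)
    return sum(
        count / n * abs(frac_pos - mean_pred)
        for mean_pred, frac_pos, count in reliability_bins(y_true, y_prob, n_bins)
    )
```

With test labels available (say in a hypothetical `y_test`), calling `expected_calibration_error(y_test, prob_pos_iso)` and the same for `prob_pos_sigmoid` and `y_test_uncalibrated` gives one number per curve to compare. The per-bin counts from `reliability_bins` show whether the high-probability bins hold enough samples for the curve there to be trusted.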
Topic probability-calibration class-imbalance scikit-learn python
Category Data Science