LightGBM model predicts only a single class on unseen data

I have built a LightGBM-based machine learning model on molecule data with two classes. The class distribution is as follows: class 0 has 5933 data points and class 1 has 4696. The train and test accuracy I get on this data are around 87% and 82% respectively, and the roc_auc_score is around 81.5%. But when I evaluate the model on an entirely new dataset that it has never seen before, with 94 data points in each of class 0 and class 1, the model predicts class 1 for every sample, i.e. all 188 data points are predicted as class 1. I am trying to understand where I am going wrong. Please let me know what can be done.
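One way to narrow this down is to check whether the new dataset looks like the training data at all, and how far the predicted probabilities sit from the decision threshold. The sketch below is only an illustration under assumptions: the helper names `standardized_mean_shift` and `fraction_predicted_positive` are hypothetical, the features are assumed to be numeric rows in the same column order as the training set, and the numbers are toy values, not results from the model described above.

```python
from statistics import mean, pstdev


def standardized_mean_shift(train_rows, new_rows):
    """Per-feature (mean_new - mean_train) / std_train.

    Large absolute values hint that a feature is distributed differently
    in the new data (or was preprocessed differently).
    """
    shifts = []
    for j in range(len(train_rows[0])):
        train_col = [row[j] for row in train_rows]
        new_col = [row[j] for row in new_rows]
        diff = mean(new_col) - mean(train_col)
        spread = pstdev(train_col)
        if spread > 0:
            shifts.append(diff / spread)
        else:
            # Constant feature in training: any difference is an unbounded shift.
            shifts.append(float("inf") if diff != 0 else 0.0)
    return shifts


def fraction_predicted_positive(probabilities, threshold=0.5):
    """Share of samples whose class-1 probability reaches the threshold."""
    return sum(p >= threshold for p in probabilities) / len(probabilities)


# Toy data: feature 0 is unchanged, feature 1 has drifted far from training.
train_rows = [[0.0, 10.0], [1.0, 12.0], [2.0, 14.0]]
new_rows = [[1.0, 30.0], [1.0, 32.0]]
shifts = standardized_mean_shift(train_rows, new_rows)
print(shifts)

# Hypothetical class-1 probabilities, e.g. the second column of predict_proba.
probs = [0.62, 0.71, 0.55, 0.90]
print(fraction_predicted_positive(probs))        # default 0.5 threshold
print(fraction_predicted_positive(probs, 0.7))   # stricter threshold
```

If many features show a large shift, the new molecules may come from a different region of feature space, or the descriptors may have been computed or scaled differently from the training pipeline. If the probabilities are all only slightly above 0.5, the threshold is worth inspecting alongside the hard class labels.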

Thanks

Topic binary-classification lightgbm generalization predictive-modeling machine-learning

Category Data Science
