How to build an unbiased predictive ML model when the event of interest is rare compared to the total number of records?
I am trying to build a model that will predict the communication loss of a wireless device. For now I am using RandomForestClassifier with Device and Location as the features. I am getting 99% for both the train score and the test score, so I suspect the model is giving a biased result. One reason might be that records of communication loss events are far fewer than records with no communication loss. Some people have advised me that it might not be possible to build a prediction model in this situation, but I would like more suggestions or advice on whether there is anything I can do about it.
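To illustrate why the 99% score worries me, here is a minimal sketch in plain Python. The label counts (10 loss events in 1000 records) are hypothetical and not taken from my real data, and the "model" is a deliberately trivial one that never predicts a communication loss. It only shows that accuracy alone can look excellent while every rare event is missed.

```python
# Hypothetical label counts, NOT from the real dataset:
# 1 = communication loss (rare), 0 = no communication loss.
y_true = [1] * 10 + [0] * 990

# A trivial "model" that always predicts "no communication loss".
y_pred = [0] * len(y_true)


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the rare event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(y_true, y_pred)

# Accuracy rewards the majority class; recall and precision focus on the rare event.
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn) if (tp + fn) else 0.0
precision = tp / (tp + fp) if (tp + fp) else 0.0

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
# prints: accuracy=0.99 recall=0.00 precision=0.00
```

In scikit-learn, `sklearn.metrics.confusion_matrix` and `sklearn.metrics.classification_report` give the same quantities for a fitted classifier on the test set, which should show whether my 99% comes from the majority class alone.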
Topic data-science-model data machine-learning
Category Data Science