Logistic Regression test accuracy vs deployment

Question

henrycmcjo

February 12, 2022, 19:39

I am working on a problem where I make weekly predictions. I gathered the data myself, did some pre-processing, and ended up with 6 features. I split the dataset 60-20-20 into train, holdout, and test sets and trained a Logistic Regression model. I get very good holdout and test accuracy (95%), with only a few false positives and false negatives. However, when I make predictions on new data, the results are nowhere near that: over the past 2 weeks my accuracy has been around 60%. How can this be explained? I believe it is important to note that 1 of my features has a 0.25 correlation with the target variable, while the rest have 0.90. Is this what is causing the misleadingly high test accuracy? Thanks
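For anyone who wants to reproduce the setup, here is a minimal sketch of what is described above, using scikit-learn. It is not the actual code or data: the dataset is synthetic, and the sample count, noise levels, and all variable names are assumptions picked only so that five features correlate with the target at roughly 0.9 and one at roughly 0.25.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumption): 6 features, binary target.
n_samples = 1000
y = rng.integers(0, 2, size=n_samples)

# Noise levels are arbitrary: 0.24 gives a correlation near 0.9 with the
# target, 1.94 gives a correlation near 0.25.
noise_scales = [0.24, 0.24, 0.24, 0.24, 0.24, 1.94]
X = np.column_stack([y + rng.normal(0.0, s, size=n_samples) for s in noise_scales])

# 60-20-20 split: carve off 40%, then halve it into holdout and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y
)
X_hold, X_test, y_hold, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

holdout_acc = accuracy_score(y_hold, model.predict(X_hold))
test_acc = accuracy_score(y_test, model.predict(X_test))

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, model.predict(X_test))

# Pearson correlation of each feature with the target.
feature_corrs = [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]

print(f"holdout accuracy: {holdout_acc:.3f}, test accuracy: {test_acc:.3f}")
print("confusion matrix (test):")
print(cm)
print("feature/target correlations:", np.round(feature_corrs, 2))
```

Note that a random split like this only measures performance on rows drawn from the same pool as the training rows. It says nothing by itself about how the model behaves on weeks that come after the collected data.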

Topic logistic-regression accuracy confusion-matrix scikit-learn classification

Category Data Science
