Logistic Regression test accuracy vs deployment

I am working on a problem where I make weekly predictions. I gathered the data myself, did some pre-processing, and ended up with 6 features. I split the dataset 60-20-20 into train, holdout, and test sets and trained a Logistic Regression model. I get very good holdout and test accuracy (95%), with only a few false positives and false negatives. However, when I make predictions on new data, the results are nowhere near that: over the past 2 weeks of weekly predictions my accuracy has been around 60%. How can this be explained? I believe it is important to note that 1 of my features has a 0.25 correlation with the target variable, while the rest have 0.90. Is this what is causing the misleadingly high test accuracy? Thanks
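For reference, here is a minimal sketch of the setup I am describing, assuming scikit-learn. The data is a synthetic stand-in (my real features are not shown), so the variable names, noise levels, and random seeds are illustrative assumptions, not my actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumption): a binary target,
# five features that track it closely and one that is much noisier.
n_samples = 1000
y = rng.integers(0, 2, size=n_samples)
X = np.column_stack(
    [y + rng.normal(scale=0.3, size=n_samples) for _ in range(5)]
    + [y + rng.normal(scale=2.0, size=n_samples)]
)

# 60-20-20 split: first 60/40, then the remaining 40% is halved.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y
)
X_holdout, X_test, y_holdout, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest
)

model = LogisticRegression().fit(X_train, y_train)

# Accuracy on the two evaluation sets, plus the test confusion matrix
# (rows are true classes, columns are predicted classes).
holdout_acc = accuracy_score(y_holdout, model.predict(X_holdout))
test_acc = accuracy_score(y_test, model.predict(X_test))
cm = confusion_matrix(y_test, model.predict(X_test))

# Pearson correlation of each feature with the target.
feature_target_corr = [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]

print("holdout accuracy:", holdout_acc)
print("test accuracy:", test_acc)
print("confusion matrix:\n", cm)
print("feature/target correlations:", np.round(feature_target_corr, 2))
```

Once the true labels for a new week are known, the same `accuracy_score` call on that week's predictions gives the deployment accuracy I am comparing against.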

Topic logistic-regression accuracy confusion-matrix scikit-learn classification

Category Data Science
