Avoiding overfitting in unsupervised ML

Question

continuous_learner

May 2, 2021 03:30

I am using an unsupervised pattern-matching approach to build a trading strategy. I use the pattern-matching output to decide whether to enter a trade. To choose the pattern parameters, I run several combinations over the entire data set and keep the combination that yields the highest profit. Would this be considered overfitting? If so, how can I avoid it? I looked at several posts on StackOverflow but did not find anything that relates to my particular use case.

Topic pattern-recognition overfitting unsupervised-learning

Category Data Science

Jayaram Iyer · Accepted Answer · May 2, 2021 03:30

You could partition your dataset into train and test sets, then derive the pattern parameters from the train set only. After that, apply the learnt model to the test set to evaluate its performance.

If the model performs much worse on the test set than it did on the train set, that is a sign you have overfit.
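
As a rough illustration of this train/test procedure, here is a minimal sketch. The `strategy_profit` function is a hypothetical stand-in for the pattern matcher (a simple momentum rule with one `lookback` parameter), and the prices are a synthetic random walk; neither comes from the question. The point is only that the parameter is chosen on the train portion and then scored once on the held-out portion.

```python
import random

def strategy_profit(prices, lookback):
    """Hypothetical stand-in for the pattern matcher: go long for one step
    whenever the price is above its value `lookback` steps earlier."""
    profit = 0.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            profit += prices[t + 1] - prices[t]
    return profit

def best_lookback(prices, candidates):
    """Pick the candidate with the highest profit on the given data only."""
    return max(candidates, key=lambda lb: strategy_profit(prices, lb))

# Synthetic random-walk prices, purely for illustration.
rng = random.Random(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + rng.gauss(0, 1))

split = int(len(prices) * 0.7)      # first 70% is train, last 30% is test
train, test = prices[:split], prices[split:]

candidates = range(1, 21)
chosen = best_lookback(train, candidates)   # tuned on the train set only
train_profit = strategy_profit(train, chosen)
test_profit = strategy_profit(test, chosen)
print(f"lookback={chosen} train={train_profit:.2f} test={test_profit:.2f}")
```

A large gap between the train and test profit for the chosen parameter is the warning sign described above. Choosing the parameter on the entire data set, as in the question, leaves no untouched data to reveal that gap.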

Also, since this is a trading strategy and likely involves time series, make sure there is no data leakage: the test set should cover a time frame later than that of the train set.
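
One common way to enforce this ordering is an expanding-window (walk-forward) split, sketched below. This goes a step beyond a single split and is offered as an assumption about what might suit the use case, not as the answerer's method. The helper `walk_forward_splits` and its parameters are hypothetical names, and the samples are assumed to be sorted by time already.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) where every test index is later
    than every train index. Assumes samples are already sorted by time."""
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size   # train window grows each fold
        test_end = train_end + test_size
        yield range(0, train_end), range(train_end, test_end)

for train_idx, test_idx in walk_forward_splits(n_samples=100, n_folds=3, min_train=40):
    # No test sample precedes any train sample, so no look-ahead leakage.
    assert max(train_idx) < min(test_idx)
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Parameters would be re-selected on each train window and evaluated on the test window that follows it, so no future data influences the choice.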
