Automatic detection of ML problem type: Regression or Classification

I am trying to design an algorithm that, based on the training data, automatically detects the type of ML problem: regression or classification.

Needless to say, no such algorithm can be correct in 100% of cases. The goal is to find a heuristic that is wrong in 10% of cases or fewer.

The first obvious, naive idea is to treat the problem as regression when at least 80% of the target values are unique. For small data sets that can be wrong, though. One example is a data set of 125 records labeled with 100 classes: the naive approach would call it a regression problem, when in fact it is multi-class classification.
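To make the failure case concrete, here is a minimal sketch of the naive rule. The function name `guess_problem_type` and the parameter `unique_ratio_threshold` are made up for illustration, and I am assuming the class labels are encoded as integers:

```python
from typing import Hashable, Sequence


def guess_problem_type(
    targets: Sequence[Hashable],
    unique_ratio_threshold: float = 0.8,  # the 80% cut-off from the naive idea
) -> str:
    """Guess 'regression' or 'classification' from the share of unique targets."""
    if len(targets) == 0:
        raise ValueError("targets must not be empty")
    # Fraction of distinct values among all target values.
    unique_ratio = len(set(targets)) / len(targets)
    if unique_ratio >= unique_ratio_threshold:
        return "regression"
    return "classification"


# The problematic case: 125 records, 100 distinct integer class ids.
labels = [i % 100 for i in range(125)]
print(guess_problem_type(labels))  # misclassified by the naive rule
```

Here 100 of the 125 values are distinct, so the ratio reaches the 80% cut-off and the rule answers "regression" even though the targets are class ids.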

Any ideas or links to existing work in this area? Thanks!
