Dealing with an apparently inseparable dataset
I'm attempting to build a model/suite of models to predict a binary target. The exact details of the models aren't important; suffice it to say that I've tried half a dozen different types of models, with comparable results from all of them.
Looking at the predictions on various subsets of the training data/holdout set, it appears that a certain subset of features is important for around 30% of the data, while a different subset is important for the remaining 70%. This separation is fairly easy to detect when the target is known (run a model on subset1, another on subset2, and find the rows where one model does better than the other). Obviously, this is not possible with the test data, since the target is not known there. A simplified sketch of that diagnostic is below.
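For concreteness, here's roughly what that per-row check looks like (the model type, the column names, and the synthetic stand-in data are all placeholders for my actual setup):

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict

# Stand-in data; in reality X_train/y_train are my training set and
# subset1_cols/subset2_cols are the two feature subsets I found
X_arr, y_train = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(10)])
subset1_cols = [f"f{i}" for i in range(5)]
subset2_cols = [f"f{i}" for i in range(5, 10)]

# Out-of-fold probabilities from a model trained on each feature subset
p1 = cross_val_predict(GradientBoostingClassifier(), X_train[subset1_cols],
                       y_train, cv=5, method="predict_proba")[:, 1]
p2 = cross_val_predict(GradientBoostingClassifier(), X_train[subset2_cols],
                       y_train, cv=5, method="predict_proba")[:, 1]

# Per-row log loss for each model (lower = that subset fits the row better)
eps = 1e-15
ll1 = -(y_train * np.log(np.clip(p1, eps, None)) +
        (1 - y_train) * np.log(np.clip(1 - p1, eps, None)))
ll2 = -(y_train * np.log(np.clip(p2, eps, None)) +
        (1 - y_train) * np.log(np.clip(1 - p2, eps, None)))

# Region label: 1 where the subset1 model wins; in my data this is a ~30/70 split
region = (ll1 < ll2).astype(int)
```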
There are clearly (at least) two regions in the data that are substantially different from each other, since a model trained on the whole dataset does worse than the combination of separate models trained on each of these regions.
However, the dataset seems spectacularly resistant to separation: there is no visible separation between the two regions along the principal components (linear PCA and RBF-kernel PCA), and no clear or stable clusters emerge from several different clustering algorithms (KMeans, agglomerative, mean shift).
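Continuing the sketch above, this is the kind of thing I've tried (hyperparameters and cluster counts were explored more thoroughly than shown):

```python
from sklearn.cluster import AgglomerativeClustering, KMeans, MeanShift
from sklearn.decomposition import PCA, KernelPCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

X_scaled = StandardScaler().fit_transform(X_train)

# Principal components, linear and RBF kernel: nothing separates visually
pcs_linear = PCA(n_components=2).fit_transform(X_scaled)
pcs_rbf = KernelPCA(n_components=2, kernel="rbf").fit_transform(X_scaled)

# Clustering with a few different algorithms
labels_km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
labels_agg = AgglomerativeClustering(n_clusters=2).fit_predict(X_scaled)
labels_ms = MeanShift().fit_predict(X_scaled)

# Agreement between the cluster labels and the region labels found earlier;
# low in my case, i.e. the clusters don't track the regions
print(adjusted_rand_score(region, labels_km))
```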
Are there any established methods for separating/clustering data that resist these "normal" methods? The end goal is to figure out which model to use on which rows when the target is unknown, i.e. something like the routing sketched below.
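To make that end goal concrete, here's the kind of routing I'd like to do at test time (again building on the sketch above; whether a gate like this can actually learn the regions from the features alone is exactly what I'm asking):

```python
from sklearn.ensemble import RandomForestClassifier

# Per-region "experts", each fitted on the rows its feature subset explains best
m1 = GradientBoostingClassifier().fit(X_train.loc[region == 1, subset1_cols],
                                      y_train[region == 1])
m2 = GradientBoostingClassifier().fit(X_train.loc[region == 0, subset2_cols],
                                      y_train[region == 0])

# Gate: try to predict the region label itself from the features
gate = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, region)

# Route unseen rows (X_test here is a placeholder for the real test data)
X_test = X_train.sample(100, random_state=1)
route = gate.predict(X_test)
preds = np.where(route == 1,
                 m1.predict(X_test[subset1_cols]),
                 m2.predict(X_test[subset2_cols]))
```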
Tags: domain-adaptation, machine-learning