Which types of machine learning algorithms perform better at extrapolation (in general)?

Assuming that:

  1. the problem lies in the field of natural science, i.e. relationships between variables are physics-based and do not change depending on context
  2. it is a regression-based model

Would it be right to assume that kernelized approaches (e.g. SVMs) would perform better for unseen combinations of predictor variables, compared to neural networks, etc.?

As I understand it, many ML models fail to provide accurate predictions when the new inputs are outside the distribution they were initially trained on. E.g. tree-based methods like random forests, while providing excellent output on trained combinations of predictors, fail when new inputs fall outside the range of the training data. On the other hand, kernels (especially linear ones) project decision boundaries beyond the space of the initial training points. So presumably, kernel-projected thresholds/boundaries help to maintain better accuracy for unseen combinations.
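A minimal sketch of the behavior I have in mind, using scikit-learn and a synthetic linear relationship (assumed here as a stand-in for a physics-based law):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Assumed "physics-based" relationship for illustration: y = 3x plus noise
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(200, 1))
y_train = 3 * X_train.ravel() + rng.normal(0, 0.5, size=200)

# Test points far outside the training range [0, 10]
X_test = np.array([[15.0], [20.0], [25.0]])

forest = RandomForestRegressor(random_state=0).fit(X_train, y_train)
linear_svr = SVR(kernel="linear").fit(X_train, y_train)

print(forest.predict(X_test))      # plateaus near the largest training target (~30)
print(linear_svr.predict(X_test))  # continues the linear trend (~45, 60, 75)
```

The forest cannot predict values beyond what it saw in training, while the linear-kernel SVR extrapolates the trend, which is the contrast my question is about.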

Topic: kernel, predictive-modeling, machine-learning

Category: Data Science


There is no general answer to this question. You cannot say a certain model would perform better without looking at the data and its distribution. For example, if the majority of the data has a linear relationship with the label, then linear regression or other linear models "might" work better.

Usually the best strategy is to try several models on the dataset and see which performs best!
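A minimal sketch of that strategy (the dataset, candidate models, and scoring metric here are assumptions for illustration, not a prescription):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic regression data standing in for your real dataset
X, y = make_regression(n_samples=300, n_features=5, noise=0.5, random_state=0)

# Compare a few candidate regressors by cross-validated R^2
models = {
    "linear": LinearRegression(),
    "svr_rbf": SVR(kernel="rbf"),
    "forest": RandomForestRegressor(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```

If extrapolation matters, also consider holding out an out-of-range slice of the data as the test set rather than relying on random cross-validation splits alone.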
