Which machine learning models allow online training and which don't?

I am working on a project where I have to update my model after every x pieces of feedback. For example, an advertisement is shown in an app; if the person doesn't click on it after seeing it multiple times in a day, that generates a negative example. If they do click, that's a positive example. My initial dataset is not very big (20,000) but it's going to grow significantly in the future. I am starting with models like logistic regression, SVM, XGBoost, etc. I have been asked to have a system in place to update my models with the newly available data every day: only the new data, not the full dataset.

I have been searching, for each model, whether it can be trained online and, if so, how. I can find answers, but I don't understand why some models handle online training well, some handle it poorly, and some don't allow it at all.

I understand that every model trained with gradient descent or a variant of it (RMSProp, Adam, etc.) can easily update its weights when it sees new data. But what about the rest?
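To make the gradient-descent case concrete, here is a minimal sketch in plain Python of logistic regression updated one example at a time. The class name `OnlineLogisticRegression`, the method name `partial_fit`, the learning rate, and the synthetic `make_batch` data are all assumptions for illustration, not part of any particular library. The point is only that each update needs the current weights and the new examples, never the old data.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression trained one example at a time with plain SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, tune for real data

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch):
        """Update the weights using only the examples in `batch`."""
        for x, y in batch:
            # Gradient of the log loss with respect to the linear score.
            error = self.predict_proba(x) - y
            for i, xi in enumerate(x):
                self.weights[i] -= self.learning_rate * error * xi
            self.bias -= self.learning_rate * error


def make_batch(n, rng):
    """Hypothetical daily batch: label is 1 when the two features sum to > 0."""
    batch = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] + x[1] > 0 else 0
        batch.append((x, y))
    return batch


rng = random.Random(0)
model = OnlineLogisticRegression(n_features=2)
for day in range(5):
    # Each day's data is seen once and then discarded.
    model.partial_fit(make_batch(200, rng))
```

Nothing from earlier days is stored between calls; the weights carry everything the model has learned so far.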

Is there a general rule?

Topic online-learning machine-learning

Category Data Science


You can retrain any model on new data. The model may be hosted in the cloud or on premises. I don't know why you say that some models can be trained online and others can't.

Basically, you need to set up a data pipeline that captures the new data and sends it to your model, which is then retrained on it. This process is called (not surprisingly!) retraining. It is an integral part of the data science life cycle. And you can retrain any model, not just neural nets!
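A rough sketch of such a pipeline, in plain Python, might look like the following. Everything here is hypothetical: `RetrainingPipeline`, `ingest`, and the toy `fit_click_rates` function (a per-ad click rate standing in for a real model) are names made up for illustration. Note that this version keeps all past data and refits from scratch on every batch, which works for any model but is different from updating on the new data alone.

```python
from collections import defaultdict


def fit_click_rates(examples):
    """Toy 'model': per-ad click rate computed from scratch over all examples."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for ad_id, label in examples:
        shown[ad_id] += 1
        clicked[ad_id] += label
    return {ad_id: clicked[ad_id] / shown[ad_id] for ad_id in shown}


class RetrainingPipeline:
    """Stores every batch it receives and refits the model on the full history."""

    def __init__(self, fit):
        self.fit = fit        # any function mapping a list of examples to a model
        self.history = []     # grows with every batch
        self.model = None

    def ingest(self, new_batch):
        self.history.extend(new_batch)
        self.model = self.fit(self.history)  # full retrain, not an incremental update
        return self.model


pipeline = RetrainingPipeline(fit_click_rates)
pipeline.ingest([("ad_a", 1), ("ad_a", 0), ("ad_b", 0)])   # day 1 feedback
model = pipeline.ingest([("ad_a", 1), ("ad_b", 0)])        # day 2 feedback
```

The cost of each `ingest` call grows with the size of `history`, which is the trade-off to weigh as the dataset gets larger.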

Cheers!
