Online learning w/ feature weighting/adjusting

Let's say I have a supervised learning problem with a sequence of features and labels. First I train on the training data, and then I stream in new data point by point and do online learning. Is it possible to update the weights or work out the feature importances as each data point comes in? Also, which online learning algorithms would allow this, and can it be done in Python?

Topic online-learning feature-selection data-stream-mining

Category Data Science


Online learning is essentially an optimization approach: the model is updated incrementally as data arrives, which makes it suited to large-scale data and very large feature spaces.

FTRL (Follow-the-Regularized-Leader) is a typical example, closely related to stochastic gradient descent. See the paper http://www.jmlr.org/proceedings/papers/v15/mcmahan11b/mcmahan11b.pdf if you want to know more about it.

Other, more specialized online methods have been developed on top of it, such as TDAP; see the paper http://www.cs.cmu.edu/~epxing/papers/2016/HuaWei_KDD16.pdf for details.

As you said, you want to know the "feature importances" during training. The model changes as the iterations proceed or as data points come in, so at any moment the current model tells you the feature importances as of that point.
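To make that concrete, here is a minimal sketch of a per-coordinate FTRL-Proximal update for logistic regression, in the form it is commonly written down. It is not copied from the linked paper: the class name FTRLProximal, the hyperparameters alpha, beta, l1 and l2, and the toy data stream are all assumptions chosen for illustration. The absolute weights are used as a rough importance proxy, which only makes sense when the features are on comparable scales.

```python
import math
import random


class FTRLProximal:
    """Illustrative per-coordinate FTRL-Proximal for logistic regression.

    Hypothetical helper class; hyperparameter defaults are arbitrary.
    """

    def __init__(self, n_features, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * n_features  # accumulated adjusted gradients
        self.n = [0.0] * n_features  # accumulated squared gradients

    def weight(self, i):
        # Closed-form weight for coordinate i; L1 zeroes out small |z|.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def weights(self):
        return [self.weight(i) for i in range(len(self.z))]

    def update(self, x, y):
        """Consume one (x, y) pair with y in {0, 1}; return the prediction made before updating."""
        w = self.weights()
        score = sum(wi * xi for wi, xi in zip(w, x))
        score = max(-35.0, min(35.0, score))  # avoid overflow in exp
        p = 1.0 / (1.0 + math.exp(-score))
        for i, xi in enumerate(x):
            g = (p - y) * xi  # gradient of the log loss for coordinate i
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w[i]
            self.n[i] += g * g
        return p


# Toy stream (made up): only feature 0 determines the label.
random.seed(0)
model = FTRLProximal(n_features=3)
for _ in range(2000):
    x = [random.gauss(0.0, 1.0) for _ in range(3)]
    y = 1 if x[0] > 0 else 0
    model.update(x, y)

# Importances can be read off after any update, not just at the end.
importances = [abs(w) for w in model.weights()]
```

Because the weights are recomputed from z and n on every call, you can inspect importances after each incoming point at no extra training cost.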

In practice, most implementations are written in Scala or Java on top of Spark, and others in C++ based on OMP, but you can also develop your own online learning method in Python.

Hope this helps :-)


Yes, this can be done in Python. scikit-learn has a few online learning algorithms available, from which you can derive the feature importances. See section 6.1.3, Incremental learning, on the following page:

http://scikit-learn.org/stable/modules/scaling_strategies.html
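As a rough sketch of what that looks like, the example below uses SGDClassifier and its partial_fit method: one call on an initial batch, then one call per streamed point. The synthetic data, the choice of SGDClassifier with its default hinge loss, and the use of absolute coefficients as an importance proxy are my assumptions, not something the linked page prescribes. The proxy is only meaningful when features are on comparable scales.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Initial training batch (synthetic: only feature 0 determines the label).
X_train = rng.normal(size=(200, 3))
y_train = (X_train[:, 0] > 0).astype(int)

clf = SGDClassifier(random_state=0)
# The first partial_fit call must be told every class that can ever appear.
clf.partial_fit(X_train, y_train, classes=np.array([0, 1]))

# Stream new data in one point at a time.
for _ in range(500):
    x_new = rng.normal(size=(1, 3))
    y_new = (x_new[:, 0] > 0).astype(int)
    clf.partial_fit(x_new, y_new)
    # The current coefficients are available after every update.
    importances = np.abs(clf.coef_[0])
```

Other estimators that expose partial_fit can be dropped in the same way; for linear ones, coef_ is updated in place after each call.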
