Difference between novelty, concept drift and anomaly

Concept drift is when the relationship between the input data and the target variable changes over time, i.e. the conditional distribution of the target given the input changes.

Is a novelty an outlier? How should I think about it? What is the difference between concept drift, novelty and anomaly? Is concept drift considered a type of novelty, and if so, how exactly? Can you please explain?



Roughly speaking, all three concepts are related.

Drift means the relationship between input and output is dynamic and changes (stochastically) over (sufficiently long periods of) time. That is, it is not stationary. For example, consumers' criteria for what to buy change over time as people become more eco-conscious. More importantly, drift, when it happens, invalidates the existing model used for prediction.
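To make "invalidates the existing model" concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the one-parameter linear model, the slopes 2.0 and 3.5, and the helper names `fit_slope`, `mean_abs_error` and `make_period`. A model fitted before the drift keeps its old slope and its error grows sharply on data generated after the relationship has changed.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin, y ~ w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mean_abs_error(w, xs, ys):
    """Average absolute prediction error of the model y = w * x."""
    return sum(abs(y - w * x) for x, y in zip(xs, ys)) / len(xs)

def make_period(true_w, n, rng):
    """Synthetic data for one time period with input-output slope true_w."""
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys

rng = random.Random(0)
old_xs, old_ys = make_period(2.0, 200, rng)  # relationship before drift (assumed)
new_xs, new_ys = make_period(3.5, 200, rng)  # relationship after drift (assumed)

w = fit_slope(old_xs, old_ys)                # model trained only on the old period
error_before = mean_abs_error(w, old_xs, old_ys)
error_after = mean_abs_error(w, new_xs, new_ys)

print(f"fitted slope: {w:.2f}")
print(f"error before drift: {error_before:.2f}")
print(f"error after drift:  {error_after:.2f}")
```

Note that the inputs are drawn from the same range in both periods; only the mapping from input to output changes, which is what distinguishes concept drift from a mere change in the inputs.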

An anomaly, also called an outlier, is a very rare, atypical event that happens under exceptional circumstances (while the input-output relationship is considered stationary over time). Think of a white snake: it may occur, but it is not typical of snakes, and if it occurs it does not necessarily mean that the input-output relationship has drifted from the original assumptions (e.g. the assumptions about the color distribution of snakes). Accordingly, an anomaly, when it happens, does not invalidate the existing model used for prediction.
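A rough sketch of the same idea in code, under assumed numbers: the data come from a fixed (stationary) Gaussian, one extreme value is appended, and a simple z-score rule flags it. The threshold of 4 standard deviations and the function name `is_anomaly` are illustrative choices, not a standard. The point is that the single rare value is flagged while the estimated mean of the data barely moves, so the existing description of the data still holds.

```python
import random
import statistics

rng = random.Random(1)

# Stationary process: values scattered around 50 (assumed for illustration).
values = [rng.gauss(50.0, 5.0) for _ in range(500)]
values_with_anomaly = values + [120.0]  # one rare, atypical observation

mean = statistics.fmean(values)
std = statistics.stdev(values)

def is_anomaly(x, mean, std, threshold=4.0):
    """Flag x if it lies more than `threshold` standard deviations from the mean."""
    return abs(x - mean) / std > threshold

flagged = [x for x in values_with_anomaly if is_anomaly(x, mean, std)]

# How much does the single anomaly change the estimated mean?
mean_shift = abs(statistics.fmean(values_with_anomaly) - mean)

print("flagged values:", flagged)
print(f"shift in estimated mean: {mean_shift:.3f}")
```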

Novelty, as far as I understand it, is an umbrella term for something new and unpredictable happening, which may be attributable to anything (drift, anomaly, etc.).

Please note that determining the reason for an observed novelty requires careful analysis (for example, multiple anomalies may mean that drift is what is actually happening)!
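One simple way to start such an analysis, sketched here under assumptions, is to track how often anomalies occur rather than looking at each one in isolation. The sketch assumes the monitored quantity is the prediction error (residual) of the existing model, and the 10% cutoff `SUSPECT_DRIFT_RATE` is a hypothetical value chosen only for this example. An isolated anomaly leaves the rate in a window near zero, while a sustained shift in the residuals pushes it far above the cutoff.

```python
import random
import statistics

def anomaly_rate(window, mean, std, threshold=3.0):
    """Fraction of values in the window lying beyond `threshold` standard deviations."""
    flags = [abs(x - mean) / std > threshold for x in window]
    return sum(flags) / len(window)

rng = random.Random(2)

# Residuals of the existing model while it was still valid (synthetic).
baseline = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mean, std = statistics.fmean(baseline), statistics.stdev(baseline)

# Window 1: same behaviour as the baseline, plus one isolated anomaly.
stable_window = [rng.gauss(0.0, 1.0) for _ in range(200)]
stable_window[10] = 9.0

# Window 2: residuals have shifted, as they would if the relationship drifted.
shifted_window = [rng.gauss(4.0, 1.0) for _ in range(200)]

stable_rate = anomaly_rate(stable_window, mean, std)
shifted_rate = anomaly_rate(shifted_window, mean, std)

SUSPECT_DRIFT_RATE = 0.10  # hypothetical cutoff, not a standard value

for name, rate in [("stable", stable_rate), ("shifted", shifted_rate)]:
    verdict = "suspect drift" if rate > SUSPECT_DRIFT_RATE else "isolated anomalies at most"
    print(f"{name} window: anomaly rate {rate:.2%} -> {verdict}")
```

A high anomaly rate is only a hint, not proof of drift; a burst of bad measurements could produce the same signal, which is why the careful analysis is still needed.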

