Why does the performance of DL models keep increasing with the volume of data, while that of ML models plateaus or even decreases?

I have read some articles and noticed that many of them claim, for example, that DL is better suited to large amounts of data than ML.

Typically:

The performance of machine learning algorithms decreases as the number of data increases

Source

Another one says the performance of ML models will plateau:

Source

As far as I understand, the more data, the better. More data lets us fit complex models without overfitting, and the algorithms learn the underlying patterns better, so they produce more accurate outputs. This should apply to both DL and ML; a quick sketch of what I mean is below.
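For instance, here is the kind of experiment I have in mind (synthetic data and scikit-learn, just to illustrate my understanding, not taken from the cited articles):

```python
# Sketch: how does validation performance of a classical ML model change
# as the training set grows? (Synthetic data, assumed setup.)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Synthetic stand-in for a real dataset.
X, y = make_classification(n_samples=20_000, n_features=20, n_informative=10,
                           random_state=0)

# learning_curve refits the model on increasing fractions of the data
# and cross-validates each fit.
sizes, train_scores, val_scores = learning_curve(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X, y,
    train_sizes=np.linspace(0.05, 1.0, 8),
    cv=3,
    n_jobs=-1,
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:6d}  train_acc={tr:.3f}  val_acc={va:.3f}")
# What I expect: validation accuracy rises with more data and then levels off,
# i.e. more data helps, it just gives diminishing returns.
```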

So I am quite confused by the statements from the cited sources, and I hope you can help me elaborate on this matter.



  1. The claim "Performance of machine learning algorithms decreases as the number of data increases" is definitely wrong, at least in general.

  2. Performance of a DL model also plateaus once its model capacity is reached, just like any ML model. If you think about it, any DL model is characterized by a finite set of parameters (no matter how many), so there must be a limit on its expressiveness and thus its performance. Modern DL models usually have far more parameters than classical ML models, which is the main reason for their superior performance, but that performance is still not without limit. FYI, Andrew Ng draws the complete chart in his DL course on Coursera. A toy sketch illustrating this is below.
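To make point 2 concrete, here is a toy sketch (my own synthetic example with scikit-learn, not the chart from the course): a low-capacity model and a higher-capacity model are trained on growing subsets of the same data, and both eventually flatten out, the higher-capacity one just later and at a higher level.

```python
# Toy sketch: model capacity limits where the performance curve plateaus.
# (Synthetic non-linear data; accuracy on a held-out test set.)
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=50_000, noise=0.35, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

models = {
    "logistic (low capacity)": LogisticRegression(),
    "MLP (higher capacity)": MLPClassifier(hidden_layer_sizes=(64, 64),
                                           max_iter=500, random_state=0),
}

# Train each model on larger and larger subsets of the training data.
for n in [200, 1_000, 5_000, 20_000, 40_000]:
    scores = []
    for name, model in models.items():
        model.fit(X_train[:n], y_train[:n])
        scores.append(f"{name}: {model.score(X_test, y_test):.3f}")
    print(f"n={n:6d}  " + "  ".join(scores))
# Both curves flatten eventually; the higher-capacity model just flattens
# later and at a higher level.
```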

Besides, remember DL is a subset of ML (not an alternative), so make sure you know what it means to "compare" them.

Recommendation: stop taking anything from these sources; they are harmful to your health.


The first source doesn't give any evidence for its claim (that performance decreases as you get more data), so I'd ignore it. As a rule of thumb, the more training data you have, the harder it is to overfit, and I think this applies to all ML algorithms. In other words, you get diminishing returns, but performance shouldn't get worse; the quick experiment below illustrates this.
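As a rough illustration (synthetic data and an unpruned decision tree as a deliberately overfit-prone model, purely my own toy setup):

```python
# Quick check of the "more data -> harder to overfit" rule of thumb.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Label noise (flip_y) makes overfitting easy to see.
X, y = make_classification(n_samples=30_000, n_features=20, n_informative=8,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

for n in [100, 1_000, 5_000, 20_000]:
    tree = DecisionTreeClassifier(random_state=0).fit(X_train[:n], y_train[:n])
    train_acc = tree.score(X_train[:n], y_train[:n])
    test_acc = tree.score(X_test, y_test)
    print(f"n={n:6d}  train={train_acc:.3f}  test={test_acc:.3f}  "
          f"gap={train_acc - test_acc:.3f}")
# The train-test gap (a rough measure of overfitting) shrinks as n grows,
# and test accuracy itself does not get worse with more data.
```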

The second source, the image, matches what we do tend to observe, at least in the fields of image and text processing. I think one explanation is that deep learning algorithms have been better able to take advantage of modern GPUs, which has made it possible to scale their capacity to the point where they can keep learning from huge data sets.
