Why does using Gradient Descent over Stochastic Gradient Descent improve performance?

Currently, I'm running two types of logistic regression.

  1. logistic regression with SGD
  2. logistic regression with GD

implemented as follows:

from sklearn.linear_model import SGDClassifier, LogisticRegression

SGD = SGDClassifier(loss='log_loss', max_iter=1000, penalty='l1', alpha=0.001)  # loss='log' on older scikit-learn versions
logreg = LogisticRegression(solver='liblinear', max_iter=100, penalty='l1', C=0.1)

Never mind the hyperparameters; I've used GridSearchCV and tried multiple combinations.

When calculating accuracy, logistic regression with GD performs better than with SGD. I want to understand why this is the case: is using GD instead of SGD one way to mitigate an underfitting model?

Topic sgd gradient-descent logistic-regression python machine-learning

Category Data Science


Gradient Descent should give better results because every update is computed on your whole data set. Stochastic Gradient Descent looks at one sample (or a small batch) at a time, which makes it useful for big data. Batches (or subsets) make it run faster, but the noisier gradient estimates mean it can converge to a poorer solution (e.g. a local minimum).
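As a minimal sketch of that difference (on a synthetic dataset from make_classification, so the exact numbers are only illustrative): LogisticRegression fits on the full training set at every solver step, while SGDClassifier can be updated one small batch at a time via partial_fit.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "GD"-style: every solver iteration uses the entire training set.
logreg = LogisticRegression(solver='liblinear', penalty='l1', C=0.1, max_iter=100)
logreg.fit(X_train, y_train)

# SGD-style: the weights are updated from one small batch at a time.
sgd = SGDClassifier(loss='log_loss', penalty='l1', alpha=0.001)
for X_batch, y_batch in zip(np.array_split(X_train, 100), np.array_split(y_train, 100)):
    sgd.partial_fit(X_batch, y_batch, classes=np.unique(y_train))

print("full-batch accuracy:", logreg.score(X_test, y_test))
print("mini-batch accuracy:", sgd.score(X_test, y_test))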

On Wikipedia you can find the following quotation:

SGD replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data). Especially in high-dimensional optimization problems, this reduces the computational burden, achieving faster iterations in trade for a lower convergence rate

[Figure: SGD vs GD]


SGD has a regularizing effect and finds a solution faster. GD, on the other hand, looks at the whole data set before taking the next step.

SGD may not reach the optimal global minimum, but GD can. However, GD is not practical with large data.
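A rough way to check that trade-off yourself (on a synthetic dataset, chosen here only for illustration) is to compare both the accuracy and the fit time of the two estimators as the data grows:

import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200000, n_features=100, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare the full-batch solver and SGD on the same split: accuracy vs. fit time.
for name, clf in [
    ("GD (liblinear)", LogisticRegression(solver='liblinear', penalty='l1', C=0.1)),
    ("SGD", SGDClassifier(loss='log_loss', penalty='l1', alpha=0.001, max_iter=1000)),
]:
    start = time.time()
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test),
          "fit time:", round(time.time() - start, 2), "s")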
