How variable alpha changes SGDRegressor behavior for outlier?
I am using SGDRegressor with a constant learning rate and default loss function. I am curious to know how changing the alpha parameter in the function from 0.0001 to 100 will change regressor behavior. Below is the sample code I have:
from sklearn.linear_model import SGDRegressor
out=[(0,2),(21, 13), (-23, -15), (22,14), (23, 14)]
alpha=[0.0001, 1, 100]
N= len(out)
plt.figure(figsize=(20,15))
j=1
for i in alpha:
X= b * np.sin(phi) #Since for every alpha we want to start with original dataset, I included X and Y in this section
Y= a * np.cos(phi)
for num in range(N):
plt.subplot(3, N, j)
X=np.append(X,out[num][0]) # Appending outlier to main X
Y=np.append(Y,out[num][1]) # Appending outlier to main Y
j=j+1 # Increasing J so we move on to next plot
model=SGDRegressor(alpha=i, eta0=0.001, learning_rate='constant',random_state=0)
model.fit(X.reshape(-1, 1), Y) # Fitting the model
plt.scatter(X,Y)
plt.title(alpha = + str(i) + | + Slope : + str(round(model.coef_[0], 4))) #Adding title to each plot
abline(model.coef_[0],model.intercept_) # Plotting the line using abline function
plt.show()
As shown above I had the main datset of X and Y and in each iteration, I am adding a point as an outlier to the main dataset and train the model and plot regression line (hyperplane). Below you can see the result for different values of alpha:
I am looking at results and am still confused and can't make solid conclusion as how alhpa parameter changes the model? what's the effect of alpha? is it causing overfitting? underfitting?
Topic sgd hyperparameter-tuning outlier python
Category Data Science