Why are my ridge regression coefficients completely different from ordinary linear regression coefficients in MATLAB?
I am attempting to implement my own Ridge Regression algorithm, and I am trying to reproduce the coefficients found in a MATLAB tutorial on regression.
Specifically, the MATLAB tutorial page shows:
load carsmall
x1 = Weight;
x2 = Horsepower; % Contains NaN data
y = MPG;
X = [ones(size(x1)) x1 x2 x1.*x2];
b = regress(y,X) % Removes NaN data
b = 4×1
60.7104
-0.0102
-0.1882
0.0000
Above, you can see the first coefficient is about 60 and the rest are close to 0. I am trying to achieve similar results using Ridge Regression on the exact same carsmall data set that ships with MATLAB.
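(As a sanity check, I understand the Statistics and Machine Learning Toolbox also has a built-in ridge function; a minimal sketch, assuming the NaN rows have already been removed. Note that ridge expects the predictor matrix without the column of ones, and passing scaled = 0 returns coefficients on the original data scale with the intercept as the first entry:)
k = 0.1;                      % ridge penalty, same value I use below
Xr = [x1 x2 x1.*x2];          % predictors only, no intercept column
br = ridge(y, Xr, k, 0);      % br(1) is the intercept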
The following is MATLAB code I have written:
load carsmall
x1 = Weight;
x2 = Horsepower; % Contains NaN data
y = MPG;
x3 = x1.*x2;
% remove NaN data
y_nan = find(isnan(y));
x2_nan = find(isnan(x2));
all_nan = [y_nan; x2_nan];
counter = 1;
for m = 1:length(y)
    flag = 0;
    for j = 1:length(all_nan)
        if m == all_nan(j)
            flag = 1;
        end
    end
    if flag < 1   % keep only rows that were not flagged as NaN
        y_clean(counter) = y(m);
        x1_clean(counter) = x1(m);
        x2_clean(counter) = x2(m);
        x3_clean(counter) = x3(m);
        counter = counter + 1;
    end
end
clear x1 x2 x3 y
x1 = x1_clean;
x2 = x2_clean;
x3 = x3_clean;
y = y_clean;
n = length(y);
% at this point, x1, x2, x3, and y should not have any NaN data (i.e. clean)
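% (Aside: the NaN removal above could also be done with logical indexing,
% which is the idiomatic MATLAB approach and mirrors regress dropping
% rows that contain NaN:
%   ok = ~isnan(y) & ~isnan(x2);
%   y = y(ok); x1 = x1(ok); x2 = x2(ok); x3 = x3(ok);
% )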
% normalize the clean data
x1 = x1/max(x1);
x2 = x2/max(x2);
x3 = x3/max(x3);
y = y/max(y);
% gradient descent iterates this many times
max_iterations = 10;
% penalty weight (lambda) in the cost function for Ridge Regression
lambda = .1;
% gradient descent uses this to compute a step size
learning_rate = .001;
% initialize parameters
y_int = 10;
B1 = .1;
B2 = .1;
B3 = 0;
% begin gradient descent iterations
thres_y_int = .01;   % used for stopping condition of gradient descent
for i = 1:max_iterations
    dJ_d_y_int = 0;
    dJ_d_B1 = 0;
    dJ_d_B2 = 0;
    %dJ_d_B3 = 0;
    for j = 1:n
        % these are the partial derivatives of the cost function with
        % respect to the 3 params (y_int, B1, and B2)
        dJ_d_y_int = dJ_d_y_int - 2 * (y(j) - y_int - B1*x1(j) - B2*x2(j) - B3*x3(j));
        dJ_d_B1 = dJ_d_B1 - 2 * x1(j) * (y(j) - y_int - B1*x1(j) - B2*x2(j) - B3*x3(j));
        dJ_d_B2 = dJ_d_B2 - 2 * x2(j) * (y(j) - y_int - B1*x1(j) - B2*x2(j) - B3*x3(j));
        %dJ_d_B3 = dJ_d_B3 - 2 * x3(j) * (y(j) - y_int - B1*x1(j) - B2*x2(j) - B3*x3(j));
    end
    % add the gradient of the ridge penalty (the intercept is not penalized)
    dJ_d_B1 = dJ_d_B1 + 2*lambda*B1;
    dJ_d_B2 = dJ_d_B2 + 2*lambda*B2;
    %dJ_d_B3 = dJ_d_B3 + 2*lambda*B3;
    % step size
    delta_y_int = dJ_d_y_int * learning_rate;
    delta_B1 = dJ_d_B1 * learning_rate;
    delta_B2 = dJ_d_B2 * learning_rate;
    %delta_B3 = dJ_d_B3 * learning_rate;
    % stopping condition
    if abs(delta_y_int) < thres_y_int
        disp('breaking')
        break
    end
    % update parameters
    y_int = y_int - delta_y_int;
    B1 = B1 - delta_B1;
    B2 = B2 - delta_B2;
    %B3 = B3 - delta_B3;
end
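For reference, here is a closed-form ridge solution I would expect the gradient descent above to converge to on the same normalized data (a minimal sketch; the penalty matrix P leaves the intercept unpenalized, matching the gradient updates above):
X_cl = [ones(n,1) x1(:) x2(:) x3(:)];   % normalized design matrix with intercept
P = eye(4); P(1,1) = 0;                 % do not penalize the intercept
b_closed = (X_cl'*X_cl + lambda*P) \ (X_cl'*y(:))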
Running the above program results in the following coefficients:
B1 =
-3.348401550938010
B2 =
-2.504364991046751
y_int =
4.206818888998534
These numbers look nothing like the coefficients found in the MATLAB tutorial, so I think I am doing something wrong. What am I missing?