What does a negative coefficient of determination mean for evaluating ridge regression?

Judging by the negative result being displayed from my ridge.score() I am guessing that I am doing something wrong. Maybe someone could point me in the right direction?

# Create a practice data set for exploring ridge regression
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data_2 = np.array([[1, 2, 0], [3, 4, 1], [5, 6, 0], [1, 3, 1],
                   [3, 5, 1], [1, 7, 0], [1, 8, 1]], dtype=np.float64)

# Separate X and y
x_2 = data_2[:, [0, 1]]
y_2 = data_2[:, 2]

# Train/test split
x_2_train, x_2_test, y_2_train, y_2_test = train_test_split(x_2, y_2, random_state=0)

# Scale the features using statistics computed on the training data only
scaler_2 = StandardScaler()
scaler_2.fit(x_2_train)
x_2_transformed = scaler_2.transform(x_2_train)

# Fit the ridge regression and score it on the scaled test set
ridge_2 = Ridge().fit(x_2_transformed, y_2_train)
x_2_test_scaled = scaler_2.transform(x_2_test)
ridge_2.score(x_2_test_scaled, y_2_test)

Output is: -4.47

EDIT: From reading the scikit-learn docs, this value is the $R^2$ score. I guess the question remains, though: how do we interpret it?

Topic ridge-regression scikit-learn machine-learning

Category Data Science


To understand what a negative coefficient of determination ($R^2$) means, you first need to know what $R^2 = 0$ means.

$R^2 = 0$ means that the squared error of your regressor's fit is the same as the squared error of a fit that always predicts the mean of the targets.

If $R^2$ is negative, your regressor's fit has a higher squared error than the mean fit; that is, it performs worse than always predicting the mean.

$R^2 = 1 - \dfrac{\text{squared error(your fit)}}{\text{squared error(mean fit)}}$
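To see the formula in action, here is a minimal sketch with made-up numbers where the predictions are deliberately worse than the mean fit; scikit-learn's r2_score returns the same value as the hand computation:

import numpy as np
from sklearn.metrics import r2_score

# Hypothetical targets and deliberately bad predictions (illustrative values only)
y_true = np.array([0.0, 1.0, 0.0, 1.0])
y_pred = np.array([1.0, 0.0, 1.0, 0.0])

se_fit = ((y_true - y_pred) ** 2).sum()          # squared error of this fit
se_mean = ((y_true - y_true.mean()) ** 2).sum()  # squared error of the mean fit

print(1 - se_fit / se_mean)      # -3.0: worse than always predicting the mean
print(r2_score(y_true, y_pred))  # -3.0 as well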



A negative value means you are getting a very poor fit, which is not surprising here: with only seven samples, the test set produced by the split can easily have a different distribution than the training set. Cross-validation gives a less split-dependent estimate; see the sketch below.
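As a rough check, here is a minimal sketch that scores the same model with three-fold cross-validation on the full seven-sample data set from the question (the cv=3 fold count is an assumption; any small value works with so few samples). Scaling happens inside the pipeline so each fold is standardised on its own training part:

from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Pipeline: fit the scaler on each training fold, then fit the ridge model
model = make_pipeline(StandardScaler(), Ridge())

# Cross-validated R^2 on all seven samples; expect noisy, possibly negative scores
scores = cross_val_score(model, x_2, y_2, cv=3, scoring="r2")
print(scores, scores.mean())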

From the sklearn documentation:

The coefficient $R^2$ is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a $R^2$ score of 0.0.
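To make that definition concrete, here is a short sketch that computes u, v, and 1 - u/v by hand for the fitted model from the question; the result should match ridge_2.score(x_2_test_scaled, y_2_test):

y_pred = ridge_2.predict(x_2_test_scaled)

u = ((y_2_test - y_pred) ** 2).sum()           # residual sum of squares
v = ((y_2_test - y_2_test.mean()) ** 2).sum()  # total sum of squares

print(1 - u / v)  # same value as ridge_2.score(x_2_test_scaled, y_2_test)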
