The following code fits ordinary linear regression and ridge regression on B = 100 random train/test splits of the Boston data set (70% training, 30% test). For each split it computes the RMSE on the test set and stores it in olr_rmse_all for ordinary linear regression and in ridge_rmse_all for ridge regression.
You can then run any kind of analysis on the two RMSE vectors.
Here I calculate the mean and standard deviation of the 100 RMSE values for each model:
library(MASS)
library(ridge)
data1 = Boston
# Root mean squared error between actual and predicted values
RMSE = function(actual, predicted){
  sqrt(mean((actual - predicted)^2))
}
test_perc = 0.3
B = 100
olr_rmse_all = c()
ridge_rmse_all = c()
for (i in seq_len(B)){
  cat("running for sample = ", i, '\n')
  # Random split: 70% of the rows for training, the remaining rows for testing
  train_rows = sample(nrow(data1), floor((1 - test_perc) * nrow(data1)))
  test_rows = setdiff(seq_len(nrow(data1)), train_rows)
  Boston.train <- data1[train_rows, ]
  Boston.test <- data1[test_rows, ]
  # Fit both models on the training rows only
  olr_model <- lm(medv ~ ., data = Boston.train)
  ridge_model <- linearRidge(medv ~ ., data = Boston.train)
  # Predict on the held-out test rows
  test_predicted_olr = predict(olr_model, newdata = Boston.test)
  test_predicted_ridge = predict(ridge_model, newdata = Boston.test)
  test_actual = Boston.test$medv
  rmse_test_olr = RMSE(test_actual, test_predicted_olr)
  rmse_test_ridge = RMSE(test_actual, test_predicted_ridge)
  cat(paste("olr_rmse: ", rmse_test_olr, "ridge_rmse: ", rmse_test_ridge, sep = " "), '\n')
  # Store the test RMSE of this split
  olr_rmse_all = c(olr_rmse_all, rmse_test_olr)
  ridge_rmse_all = c(ridge_rmse_all, rmse_test_ridge)
}
#-- Mean of RMSE----
olr_rmse_all_mean = mean(olr_rmse_all)
ridge_rmse_all_mean = mean(ridge_rmse_all)
#-- Standard Deviation of RMSE----
olr_rmse_all_sd = sd(olr_rmse_all)
ridge_rmse_all_sd = sd(ridge_rmse_all)
I hope this answers your question.
There is also an R package for bootstrapping called "boot" that you can use instead. I did it without a package to make the steps easier to understand.
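For comparison, here is a rough sketch of how the same idea might look with the boot package. It is only an illustration, not a drop-in replacement for the loop above: boot() resamples rows with replacement, so I assume here that the rows left out of each resample serve as the test set. The helper name oob_rmse and the seed are my own choices, and only the ordinary linear regression is shown.

```r
library(MASS)   # Boston data set
library(boot)   # boot()

# Root mean squared error between actual and predicted values
RMSE <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Statistic for boot(): fit on the resampled rows, score on the rows left out.
# oob_rmse is a hypothetical helper name used only for this sketch.
oob_rmse <- function(data, indices) {
  train <- data[indices, ]
  test  <- data[-unique(indices), ]
  # boot() first calls the statistic on the full data, where no rows are left out
  if (nrow(test) == 0) return(NA_real_)
  model <- lm(medv ~ ., data = train)
  RMSE(test$medv, predict(model, newdata = test))
}

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
boot_out <- boot(data = Boston, statistic = oob_rmse, R = 100)

# boot_out$t holds one RMSE per resample
olr_boot_mean <- mean(boot_out$t)
olr_boot_sd   <- sd(boot_out$t)
```

The values in boot_out$t play the same role as olr_rmse_all above, so the same mean and standard deviation summaries apply. Note that boot_out$t0 is NA in this sketch, because the statistic has no left-out rows when it is evaluated on the original data.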
reference: https://www.statmethods.net/advstats/bootstrapping.html