Is there a quicker solution to Sklearn MAE?

Question

PyNoob

July 29, 2021, 10:38

I am attempting to run RandomForestRegressor on this fairly large dataset:

df_train.describe():

         Unnamed: 0           col1           col2           col3           col4          col5
count  8.886500e+05  888650.000000  888650.000000  888650.000000  888650.000000  888650.000000
mean   5.130409e+05       2.636784       3.845549       4.105381       1.554918       1.221922
std    2.998785e+05       2.296243       1.366518       3.285802       1.375791       1.233717
min    4.000000e+00       1.010000       1.010000       1.010000       0.000000       0.000000
25%    2.484332e+05       1.660000       3.230000       2.390000       1.000000       0.000000
50%    5.233705e+05       2.110000       3.480000       3.210000       1.000000       1.000000
75%    7.692788e+05       2.740000       3.950000       4.670000       2.000000       2.000000
max    1.097490e+06      90.580000      43.420000      99.250000      22.000000      24.000000

df_test.describe():

         Unnamed: 0      col1        col2        col3        col4        col5
count  390.000000  390.000000  390.000000  390.000000         0.0         0.0
mean   194.500000    3.393359    4.016821    3.761385         NaN         NaN
std    112.727548    4.504227    1.720292    3.479109         NaN         NaN
min      0.000000    1.020000    2.320000    1.020000         NaN         NaN
25%     97.250000    1.792500    3.272500    2.220000         NaN         NaN
50%    194.500000    2.270000    3.555000    3.055000         NaN         NaN
75%    291.750000    3.172500    4.060000    4.217500         NaN         NaN
max    389.000000   50.000000   18.200000   51.000000         NaN         NaN

The code runs quickly with MSE, the default criterion for RandomForestRegressor: approximately 21 minutes.

However, when I switch the criterion to MAE, training takes far longer: I ran it on my system for 3 days straight with no end in sight.
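
For reference, a minimal sketch of what switching the criterion might look like. It is not the original code: the data is synthetic, the helper `time_fit` is hypothetical, and the sample size and tree count are kept small so that both fits finish quickly. Note that the criterion names depend on the scikit-learn version ("mse"/"mae" before 1.0, "squared_error"/"absolute_error" from 1.0 on).

```python
import time

import numpy as np
import sklearn
from sklearn.ensemble import RandomForestRegressor

# Criterion names changed in scikit-learn 1.0:
# "mse"/"mae" became "squared_error"/"absolute_error".
_MAJOR_MINOR = tuple(int(p) for p in sklearn.__version__.split(".")[:2])
if _MAJOR_MINOR >= (1, 0):
    MSE, MAE = "squared_error", "absolute_error"
else:
    MSE, MAE = "mse", "mae"


def time_fit(X, y, criterion, n_estimators=10):
    """Hypothetical helper: fit a forest with the given split criterion
    and return the fitted model with the elapsed wall-clock seconds."""
    model = RandomForestRegressor(
        n_estimators=n_estimators, criterion=criterion, random_state=0
    )
    start = time.perf_counter()
    model.fit(X, y)
    return model, time.perf_counter() - start


# Synthetic stand-in for the real data (assumption: three numeric
# features and one numeric target; far fewer rows than 888,650).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 5.0, size=(2000, 3))
y = X.sum(axis=1) + rng.normal(scale=0.1, size=2000)

mse_model, mse_seconds = time_fit(X, y, MSE)
mae_model, mae_seconds = time_fit(X, y, MAE)
print(f"{MSE}: {mse_seconds:.2f}s  {MAE}: {mae_seconds:.2f}s")
```

Timing both criteria on a small sample like this is one way to gauge how the gap grows before committing to a run on the full dataset.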

Is there any way to get MAE to run faster with RandomForestRegressor?

I am running this on a machine with an 8-core Ryzen 3700X and 32 GB of RAM.

Topic mse random-forest scikit-learn python

Category Data Science
