How to train LGBMClassifier using optuna
I am trying to use lgbm with optuna for a classification task. Here is my model:
from optuna.integration import LightGBMPruningCallback
import optuna.integration.lightgbm as lgbm
import optuna


def objective(trial, X_train, y_train, X_test, y_test):
    param_grid = {
        # "device_type": trial.suggest_categorical("device_type", ["gpu"]),
        "n_estimators": trial.suggest_categorical("n_estimators", [10000]),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 20, 3000, step=20),
        "max_depth": trial.suggest_int("max_depth", 3, 12),
        "min_data_in_leaf": trial.suggest_int("min_data_in_leaf", 100, 10000, step=1000),
        "lambda_l1": trial.suggest_int("lambda_l1", 0, 100, step=5),
        "min_gain_to_split": trial.suggest_float("min_gain_to_split", 0, 15),
        "bagging_fraction": trial.suggest_float(
            "bagging_fraction", 0.2, 0.95, step=0.1
        ),
        "bagging_freq": trial.suggest_categorical("bagging_freq", [1]),
        "feature_fraction": trial.suggest_float(
            "feature_fraction", 0.2, 0.95, step=0.1
        ),
        "max_features": trial.suggest_categorical(
            "max_features", choices=["auto", "sqrt", "log2"]
        ),
        "n_jobs": -1,
        "random_state": 1121218,
    }
    model = lgbm.LGBMClassifier(objective="multiclass", **param_grid)
    model.fit(
        X_train,
        y_train,
        eval_set=[(X_test, y_test)],
        eval_metric="multi_logloss",
        early_stopping_rounds=5,
        callbacks=[
            LightGBMPruningCallback(trial, "multi_logloss")
        ],  # Add a pruning callback
    )
    preds = model.predict_proba(X_test)
    return preds, model
I then run the study:
%%time
study = optuna.create_study(direction="maximize", study_name="LGBM Classifier")
func = lambda trial: objective(trial, X_train, y_train, X_test, y_test)
preds, model = study.optimize(func, n_trials=100)
But I get the following error:
RuntimeError: scikit-learn estimators should always specify their parameters in the signature of their __init__ (no varargs).
<class 'optuna.integration._lightgbm_tuner.sklearn.LGBMClassifier'> with constructor (self, *args: Any, **kwargs: Any) -> None doesn't follow this convention.
I understand the error, but I'm not sure what the correct way is to do what I want to do.