Hyper-parameter tuning of NaiveBayes Classifier

I'm fairly new to machine learning. I'm aware of the concept of hyper-parameter tuning for classifiers, and I've come across a couple of examples of this technique. However, I'm trying to use the Naive Bayes classifier in sklearn for a task, and I'm not sure which parameter values I should try.

What I want is something like this, but for the GaussianNB() classifier instead of an SVM:

from sklearn import svm
from sklearn.model_selection import GridSearchCV

# candidate values for each SVM hyper-parameter
C = [0.05, 0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
gamma = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
kernel = ['rbf', 'linear']
hyper = {'kernel': kernel, 'C': C, 'gamma': gamma}

# X, Y are the training features and labels
gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)

I've tried searching for Naive Bayes examples, but couldn't find any. What I have right now is simply this:

from sklearn.naive_bayes import GaussianNB

model = GaussianNB()

What I want is to try different parameters and compare the scores.

Topic hyperparameter-tuning naive-bayes-classifier hyperparameter scikit-learn machine-learning

Category Data Science


from sklearn import svm
from sklearn.model_selection import GridSearchCV

hyper = {'C': [0.05, 0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
         'gamma': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
         'kernel': ['rbf', 'linear']
        }

gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)

gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)
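
The same GridSearchCV pattern carries over to GaussianNB. In sklearn, its only real tunable hyper-parameter is var_smoothing (a stability term added to the feature variances). A minimal sketch, where the logspace grid values are an illustrative choice rather than anything canonical:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# var_smoothing adds a fraction of the largest feature variance to all
# variances; search it on a log scale (grid values are illustrative)
param_grid = {'var_smoothing': np.logspace(0, -9, num=100)}

# X, Y as in the question above
gd = GridSearchCV(estimator=GaussianNB(), param_grid=param_grid, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)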

Sources:

  1. Hyperparameter Tuning of Machine Learning Model in Python
  2. Tuning Hyperparameters of Machine Learning Model | GitHub
  3. https://www.youtube.com/watch?v=AvWfL1Us3Kg

I think the comment is correct: Naive Bayes doesn't have hyper-parameters in the same sense as other ML classifiers do.

You do want to make sure that you use the variant of Naive Bayes that fits your data (Gaussian, Multinomial, Bernoulli, etc.); see the sklearn user guide: https://scikit-learn.org/stable/modules/naive_bayes.html#gaussian-naive-bayes. A quick comparison is sketched below.
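
For example, a minimal sketch comparing the sklearn variants with cross-validation, assuming X and y are your features and labels and that the features are valid for each variant (e.g. non-negative counts for MultinomialNB):

from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# assumes X, y are already defined; MultinomialNB needs non-negative
# features and BernoulliNB expects binary/boolean ones
for model in (GaussianNB(), MultinomialNB(), BernoulliNB()):
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, scores.mean())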

I think one approach to using Naive Bayes in a robust manner might be repeated K-fold cross-validation (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RepeatedStratifiedKFold.html), as sketched below.
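
A minimal sketch, again assuming X and y are your data; the fold and repeat counts are illustrative:

from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# 5-fold stratified CV repeated 10 times with different shuffles
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv)
print(scores.mean(), scores.std())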

Please correct this answer if I'm off base! I'm not very experienced in ML and had this question myself - this was the best answer I could come up with.


I think you will find Optuna good for this; it works with whatever model you want. You might try something like this:

import optuna
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def objective(trial):
    # sample the hyper-parameter you are trying to optimize, e.g.
    # GaussianNB's var_smoothing, on a log scale
    var_smoothing = trial.suggest_float('var_smoothing', 1e-12, 1e-2, log=True)
    model = GaussianNB(var_smoothing=var_smoothing)

    # evaluate the model here, e.g. with cross-validation on X, y
    scores = cross_val_score(model, X, y)

    return scores.mean()  # or whatever metric you want to optimize

study = optuna.create_study(direction='maximize')  # maximize the returned score
study.optimize(objective, n_trials=100)

You can run studies that persist across multiple runs, and you can print out the hyper-parameter values that worked best (study.best_params), etc.
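
For instance, a minimal sketch of a persistent study, reusing the objective function above; the study name and SQLite path are hypothetical:

import optuna

# results are stored in a local SQLite file, so re-running the script
# resumes the same study instead of starting from scratch
study = optuna.create_study(
    study_name='gnb_tuning',            # hypothetical name
    storage='sqlite:///gnb_tuning.db',  # hypothetical path
    direction='maximize',
    load_if_exists=True,
)
study.optimize(objective, n_trials=100)
print(study.best_params)  # best hyper-parameter values found so far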
