Hyper-parameter tuning of NaiveBayes Classifier

I'm fairly new to machine learning. I'm aware of the concept of hyper-parameter tuning for classifiers, and I've come across a couple of examples of this technique. However, I'm trying to use the Naive Bayes classifier in sklearn for a task, and I'm not sure which parameter values I should try.

What I want is something like this, but for the GaussianNB() classifier instead of an SVM:

from sklearn import svm
from sklearn.model_selection import GridSearchCV

# candidate values for each SVM hyper-parameter
C = [0.05, 0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
gamma = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
kernel = ['rbf', 'linear']
hyper = {'kernel': kernel, 'C': C, 'gamma': gamma}

# X, Y are the training features and labels
gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)

I've tried searching for Naive Bayes examples, but couldn't find any. What I have right now is simply this:

from sklearn.naive_bayes import GaussianNB

model = GaussianNB()

What I want is to try different parameters and compare the scores.

Topic hyperparameter-tuning naive-bayes-classifier hyperparameter scikit-learn machine-learning

Category Data Science


from sklearn import svm
from sklearn.model_selection import GridSearchCV

hyper = {'C': [0.05, 0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1],
         'gamma': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
         'kernel': ['rbf', 'linear']
        }

gd = GridSearchCV(estimator=svm.SVC(), param_grid=hyper, verbose=True)

gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)
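
The same GridSearchCV pattern carries over to GaussianNB. In sklearn, its only real tunable hyper-parameter is var_smoothing (a stability term added to the feature variances). A minimal sketch, where the logspace grid values are an illustrative choice rather than anything canonical:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# var_smoothing adds a fraction of the largest feature variance to all
# variances; search it on a log scale (grid values are illustrative)
param_grid = {'var_smoothing': np.logspace(0, -9, num=100)}

# X, Y as in the question above
gd = GridSearchCV(estimator=GaussianNB(), param_grid=param_grid, verbose=True)
gd.fit(X, Y)
print(gd.best_score_)
print(gd.best_estimator_)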

Sources:

  1. Hyperparameter Tuning of Machine Learning Model in Python
  2. Tuning Hyperparameters of Machine Learning Model | GitHub
  3. https://www.youtube.com/watch?v=AvWfL1Us3Kg

I think the comment is correct: Naive Bayes doesn't have hyper-parameters in the same sense as other ML classifiers do.

You do want to make sure that you use the variant of Naive Bayes that fits your data (Gaussian, Multinomial, Bernoulli, etc.); see the sklearn user guide: https://scikit-learn.org/stable/modules/naive_bayes.html#gaussian-naive-bayes. A quick comparison is sketched below.
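
For example, a minimal sketch comparing the sklearn variants with cross-validation, assuming X and y are your features and labels and that the features are valid for each variant (e.g. non-negative counts for MultinomialNB):

from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# assumes X, y are already defined; MultinomialNB needs non-negative
# features and BernoulliNB expects binary/boolean ones
for model in (GaussianNB(), MultinomialNB(), BernoulliNB()):
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, scores.mean())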

I think one approach to using Naive Bayes in a robust manner might be repeated K-fold cross-validation (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RepeatedStratifiedKFold.html), as sketched below.
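
A minimal sketch, again assuming X and y are your data; the fold and repeat counts are illustrative:

from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# 5-fold stratified CV repeated 10 times with different shuffles
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv)
print(scores.mean(), scores.std())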

Please correct this answer if I'm off base! I'm not very experienced in ML and had this question myself - this was the best answer I could come up with.


I think you will find Optuna good for this; it works with whatever model you want. You might try something like this:

import optuna
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def objective(trial):
    # sample the hyper-parameter you are trying to optimize, e.g.
    # GaussianNB's var_smoothing, on a log scale
    var_smoothing = trial.suggest_float('var_smoothing', 1e-12, 1e-2, log=True)
    model = GaussianNB(var_smoothing=var_smoothing)

    # evaluate the model here, e.g. with cross-validation on X, y
    scores = cross_val_score(model, X, y)

    return scores.mean()  # or whatever metric you want to optimize

study = optuna.create_study(direction='maximize')  # maximize the returned score
study.optimize(objective, n_trials=100)

You can run studies that persist across multiple runs, and you can print out the hyper-parameter values that worked best (study.best_params), etc.
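
For instance, a minimal sketch of a persistent study, reusing the objective function above; the study name and SQLite path are hypothetical:

import optuna

# results are stored in a local SQLite file, so re-running the script
# resumes the same study instead of starting from scratch
study = optuna.create_study(
    study_name='gnb_tuning',            # hypothetical name
    storage='sqlite:///gnb_tuning.db',  # hypothetical path
    direction='maximize',
    load_if_exists=True,
)
study.optimize(objective, n_trials=100)
print(study.best_params)  # best hyper-parameter values found so far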
