sklearn.neighbors.KernelDensity - score(X) explanation

For sklearn.neighbors.KernelDensity, the sklearn documentation for its score(X) method says:

Compute the total log-likelihood under the model
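In other words, score(X) sums the per-sample log-densities that score_samples(X) returns. A minimal sketch with made-up toy data (the array X, its shape, and the bandwidth here are assumptions for illustration only):

import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))  # hypothetical toy data

kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(X)

total = kde.score(X)               # total log-likelihood, a single float
per_sample = kde.score_samples(X)  # log-density of each individual sample
assert np.isclose(total, per_sample.sum())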

For the 'gaussian' kernel, I have implemented hyper-parameter tuning of the 'bandwidth' parameter using Bayesian optimization, as follows:

from sklearn.neighbors import KernelDensity
from bayes_opt import BayesianOptimization

# The input data for which 'bandwidth' needs to be tuned-
data.shape
# (2880, 64)

def kde_hyperopt_eval(bandwidth):
    # Initialize a Gaussian KDE model with the candidate bandwidth-
    kde_model = KernelDensity(kernel='gaussian', bandwidth=bandwidth)

    # Fit the KDE model on the training data-
    kde_model.fit(data)

    # Return the total log-likelihood of the data under the model.
    # Per the docs: "Total log-likelihood of the data in X. This is
    # normalized to be a probability density, so the value will be
    # low for high-dimensional data."
    return kde_model.score(data)

optimizer = BayesianOptimization(
    f=kde_hyperopt_eval,
    pbounds={'bandwidth': (0.01, 10)},
)

optimizer.maximize(n_iter=40, init_points=15)

and I get the following result:

|   iter    |  target   | bandwidth |
-------------------------------------
|  1        | -5.644e+0 |  8.527    |
|  2        |  2.142e+0 |  0.3419   |
|  3        | -1.287e+0 |  0.7963   |
|  4        | -5.916e+0 |  9.883    |
|  5        | -3.604e+0 |  2.817    |
|  6        | -5.71e+05 |  8.835    |
|  7        | -5.246e+0 |  6.868    |
|  8        | -4.385e+0 |  4.305    |
|  9        | -5.546e+0 |  8.082    |
|  10       | -5.86e+05 |  9.585    |
|  11       | -5.226e+0 |  6.794    |
|  12       | -5.196e+0 |  6.685    |
|  13       | -9.934e+0 |  0.6771   |
|  14       | -5.766e+0 |  9.111    |
|  15       |  6.816e+0 |  0.2584   |
|  16       |  6.565e+0 |  0.01     |
|  17       |  6.565e+0 |  0.01     |
|  18       |  6.804e+0 |  0.2585   |
|  19       |  6.804e+0 |  0.2585   |
|  20       |  6.804e+0 |  0.2585   |
|  21       |  6.804e+0 |  0.2585   |
|  22       |  6.804e+0 |  0.2585   |
|  23       |  6.804e+0 |  0.2585   |
|  24       |  6.804e+0 |  0.2585   |
|  25       |  6.804e+0 |  0.2585   |
|  26       |  6.804e+0 |  0.2585   |
|  27       |  6.804e+0 |  0.2585   |
|  28       |  6.804e+0 |  0.2585   |
|  29       |  6.804e+0 |  0.2585   |
|  30       |  6.804e+0 |  0.2585   |
|  31       |  6.804e+0 |  0.2585   |
|  32       |  6.804e+0 |  0.2585   |
|  33       |  6.804e+0 |  0.2585   |
|  34       |  6.804e+0 |  0.2585   |
|  35       |  6.804e+0 |  0.2585   |
|  36       |  6.804e+0 |  0.2585   |
|  37       |  6.804e+0 |  0.2585   |
|  38       |  6.804e+0 |  0.2585   |
|  39       |  6.804e+0 |  0.2585   |
|  40       |  6.804e+0 |  0.2585   |
|  41       |  6.804e+0 |  0.2585   |
|  42       |  6.804e+0 |  0.2585   |
|  43       |  6.804e+0 |  0.2585   |
|  44       |  6.804e+0 |  0.2585   |
|  45       |  6.804e+0 |  0.2585   |
|  46       |  6.804e+0 |  0.2585   |
|  47       |  6.804e+0 |  0.2585   |
|  48       |  6.804e+0 |  0.2585   |
|  49       |  6.804e+0 |  0.2585   |
|  50       |  6.804e+0 |  0.2585   |
|  51       |  6.804e+0 |  0.2585   |
|  52       |  6.804e+0 |  0.2585   |
|  53       |  6.804e+0 |  0.2585   |
|  54       |  6.804e+0 |  0.2585   |
|  55       |  6.804e+0 |  0.2585   |
=====================================
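After the search finishes, bayes_opt exposes the best point it found through the optimizer.max property; a short sketch of reading it off:

# Best result found during the search (bayes_opt API):
best = optimizer.max
print(best['target'])               # highest total log-likelihood seen
print(best['params']['bandwidth'])  # the bandwidth that achieved it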

How should I interpret the value returned by the score(X) method of sklearn.neighbors.KernelDensity?

For example, for Mean Squared Error or Mean Absolute Error smaller is better, and for accuracy a higher percentage is better.
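Note that score(X) is the log of a density, not a probability, so it is not bounded the way accuracy is and can legitimately be positive as well as negative. A toy sketch (made-up data) illustrating a positive score:

import numpy as np
from sklearn.neighbors import KernelDensity

X = np.zeros((50, 1))  # 50 identical 1-D points, for illustration only
kde = KernelDensity(kernel='gaussian', bandwidth=0.01).fit(X)
# The density at 0 is far greater than 1, so its log (and hence the
# total log-likelihood) is a large positive number:
print(kde.score(X))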

Topic: density-estimation, hyperparameter-tuning, scikit-learn, machine-learning

Category: Data Science
