sklearn.neighbors.KernelDensity - score(X) explanation

For sklearn.neighbors.KernelDensity, the sklearn KDE documentation describes its score(X) method as:

Compute the total log-likelihood under the model.
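To confirm how score(X) relates to score_samples(X), I checked on toy data (a minimal sanity check, not part of my actual pipeline) that score(X) is just the sum of the per-sample log-densities:

import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X_toy = rng.randn(100, 2)

kde = KernelDensity(kernel = 'gaussian', bandwidth = 1.0).fit(X_toy)

# score(X) equals the sum of the per-sample log-densities-
assert np.isclose(kde.score(X_toy), np.sum(kde.score_samples(X_toy)))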
For the 'gaussian' kernel, I have implemented hyperparameter tuning of the 'bandwidth' parameter using Bayesian optimization, as follows:
from sklearn.neighbors import KernelDensity
from bayes_opt import BayesianOptimization

# The input data for which 'bandwidth' needs to be tuned-
# data.shape == (2880, 64)

def kde_hyperopt_eval(bandwidth):
    # Initialize a KDE model with the candidate bandwidth-
    kde_model = KernelDensity(
        kernel = 'gaussian',
        bandwidth = bandwidth
    )

    # Train KDE model on training data-
    kde_model.fit(data)

    # Compute the total log-likelihood under the model.
    # Returns the log probability:
    '''
    Total log-likelihood of the data in X. This is normalized to be a
    probability density, so the value will be low for high-dimensional
    data.
    '''
    return kde_model.score(data)

optimizer = BayesianOptimization(
    f = kde_hyperopt_eval,
    pbounds = {
        'bandwidth': (0.01, 10)
    }
)

optimizer.maximize(init_points = 15, n_iter = 40)
and I get the following output:
|   iter    |  target   | bandwidth |
-------------------------------------
|  1        | -5.644e+0 |  8.527    |
|  2        |  2.142e+0 |  0.3419   |
|  3        | -1.287e+0 |  0.7963   |
|  4        | -5.916e+0 |  9.883    |
|  5        | -3.604e+0 |  2.817    |
|  6        | -5.71e+05 |  8.835    |
|  7        | -5.246e+0 |  6.868    |
|  8        | -4.385e+0 |  4.305    |
|  9        | -5.546e+0 |  8.082    |
|  10       | -5.86e+05 |  9.585    |
|  11       | -5.226e+0 |  6.794    |
|  12       | -5.196e+0 |  6.685    |
|  13       | -9.934e+0 |  0.6771   |
|  14       | -5.766e+0 |  9.111    |
|  15       |  6.816e+0 |  0.2584   |
|  16       |  6.565e+0 |  0.01     |
|  17       |  6.565e+0 |  0.01     |
|  18       |  6.804e+0 |  0.2585   |
|  19 - 55  |  6.804e+0 |  0.2585   |  (identical rows condensed)
=====================================

(Most entries in the target column appear to be cut off at the column width - e.g. -5.644e+0 is presumably -5.644e+05, matching the untruncated -5.71e+05 on iteration 6.)
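After maximize() finishes, I read the best result back from the optimizer (bayes_opt stores it in optimizer.max):

# Best target value and the bandwidth that produced it-
print(optimizer.max)
# {'target': <best score(data)>, 'params': {'bandwidth': <best bandwidth>}}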
How should I interpret the value returned by the score(X) method of sklearn.neighbors.KernelDensity? For example, in the case of Mean Squared Error or Mean Absolute Error, smaller means better, and for accuracy, a higher percentage is better.
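My current understanding (which I would like confirmed) is that score(X) returns the log of a probability density rather than a probability, so it has no fixed range and can legitimately be positive. A toy sketch illustrating this, assuming tightly clustered 1-D data:

import numpy as np
from sklearn.neighbors import KernelDensity

# 100 identical 1-D points-
X_toy = np.zeros((100, 1))

# A narrow kernel concentrates density > 1 on the points, so the
# total log-likelihood is large and positive-
print(KernelDensity(kernel = 'gaussian', bandwidth = 0.01).fit(X_toy).score(X_toy))

# A wide kernel spreads the density out, so the total log-likelihood
# is negative-
print(KernelDensity(kernel = 'gaussian', bandwidth = 10.0).fit(X_toy).score(X_toy))

So, unlike MSE or accuracy, is it correct that only higher-versus-lower comparisons of score(X) between models evaluated on the same data are meaningful?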