How to calculate NDCG in a recommendation system
This is a question about NDCG, an evaluation metric for recommendations.
I am using the following definitions as my evaluation metrics:
$$DCG = r_1 + \sum\limits_{i=2}^{N}\frac{r_i}{\log_2 i}$$
$$nDCG = \frac{DCG}{DCG_{perfect}}$$
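As a minimal sketch, the two formulas above can be written directly in plain Python. The function names `dcg_from_formula` and `ndcg_from_formula` and the sample relevance list are made up for illustration. This sketch follows the formula literally (linear gains, discount `log2(i)` from rank 2 onward), which is not quite what the NumPy code further down does: that code discounts rank `i` by `log2(i + 1)` and defaults to exponential gains.

```python
import math


def dcg_from_formula(relevances):
    """DCG as written above: r_1 + sum_{i=2}^{N} r_i / log2(i)."""
    if not relevances:
        return 0.0
    total = float(relevances[0])  # rank 1 is not discounted
    for i, r in enumerate(relevances[1:], start=2):
        total += r / math.log2(i)
    return total


def ndcg_from_formula(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    best = dcg_from_formula(sorted(relevances, reverse=True))
    # Guard against an all-zero relevance list (an assumption of this sketch)
    return dcg_from_formula(relevances) / best if best > 0 else 0.0


# Hypothetical relevance labels, listed in the order the items were ranked
ranked = [3, 2, 3, 0, 1]
print(dcg_from_formula(ranked))
print(ndcg_from_formula(ranked))
```

Here `relevances` is assumed to already be in ranked order, so no scores are needed to evaluate the formula itself.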
The code is as follows:
import numpy as np


def dcg_score(y_true, y_score, k=20, gains="exponential"):
    """Discounted cumulative gain (DCG) at rank k

    Parameters
    ----------
    y_true: array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score: array-like, shape = [n_samples]
        Predicted scores.
    k: int
        Rank.
    gains: str
        Whether gains should be "exponential" (default) or "linear".

    Returns
    -------
    DCG@k: float
    """
    # Indices of the items sorted by predicted score, highest first
    order = np.argsort(y_score)[::-1]
    # True relevance of the top-k items, in predicted order
    y_true = np.take(y_true, order[:k])

    if gains == "exponential":
        gains = 2 ** y_true - 1
    elif gains == "linear":
        gains = y_true
    else:
        raise ValueError("Invalid gains option.")

    # highest rank is 1 so +2 instead of +1
    discounts = np.log2(np.arange(len(y_true)) + 2)
    return np.sum(gains / discounts)


def ndcg_score(y_true, y_score, k=20, gains="exponential"):
    """Normalized discounted cumulative gain (NDCG) at rank k

    Parameters
    ----------
    y_true: array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score: array-like, shape = [n_samples]
        Predicted scores.
    k: int
        Rank.
    gains: str
        Whether gains should be "exponential" (default) or "linear".

    Returns
    -------
    NDCG@k: float
    """
    # DCG of the ideal ranking: the items sorted by their true relevance
    best = dcg_score(y_true, y_true, k, gains)
    # DCG of the ranking induced by the predicted scores
    actual = dcg_score(y_true, y_score, k, gains)
    return actual / best
Assume k = 5.
In that case, how should NDCG be calculated for items that could not be recommended within the top k?
For example,
y_true = [5, 4, 3, 2, 1]
y_score = [0, 0, 0, 0, 0]  # 0 means the item could not be recommended within the top 5
In this case, all scores are tied, so the ordering comes only from how argsort breaks the ties:
np.argsort([0, 0, 0, 0])[::-1]
array([3, 2, 1, 0])
So, following the code above,
NDCG@5 = 1.0
This looks strange.
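Part of what makes this hard to interpret is that, when every predicted score is tied, any ordering of the items is consistent with `y_score`, so the value is decided entirely by tie-breaking. The following is a minimal pure-Python sketch of that effect. The helper names `dcg` and `ndcg_for_order` are made up for illustration, and the gains and discount are assumed to match `dcg_score` (exponential gains, `log2(rank + 1)` discount).

```python
import math


def dcg(relevances):
    # Exponential gains with a log2(rank + 1) discount, ranks starting at 1
    return sum((2 ** r - 1) / math.log2(rank + 1)
               for rank, r in enumerate(relevances, start=1))


def ndcg_for_order(y_true, order):
    """NDCG when the items are presented in the given index order."""
    best = dcg(sorted(y_true, reverse=True))
    return dcg([y_true[i] for i in order]) / best


y_true = [5, 4, 3, 2, 1]
# Both orderings are equally valid when all scores are 0
same_order = ndcg_for_order(y_true, [0, 1, 2, 3, 4])
reversed_order = ndcg_for_order(y_true, [4, 3, 2, 1, 0])
print(same_order, reversed_order)
```

The first ordering gives 1.0 because it happens to coincide with the ideal ranking, while the reversed ordering gives a clearly lower value. Which one a real run produces depends on how the sort routine orders tied elements.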
In such a case, should the score be set to 0 and left out of the NDCG calculation?
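To make that alternative concrete, here is a minimal sketch of one possible convention, stated as an assumption rather than an established definition: items whose score equals a sentinel value are treated as not recommended and contribute no gain, so a list with nothing recommended scores 0. The function name `ndcg_recommended_only` and the `missing` parameter are hypothetical.

```python
import math


def dcg(relevances):
    # Exponential gains with a log2(rank + 1) discount, ranks starting at 1
    return sum((2 ** r - 1) / math.log2(rank + 1)
               for rank, r in enumerate(relevances, start=1))


def ndcg_recommended_only(y_true, y_score, k=5, missing=0):
    """NDCG@k where items scored `missing` count as not recommended."""
    # Keep only items that actually received a score, best score first
    ranked = sorted(
        (pair for pair in zip(y_score, y_true) if pair[0] != missing),
        key=lambda pair: pair[0],
        reverse=True,
    )
    actual = dcg([rel for _, rel in ranked[:k]])
    # The ideal ranking still uses every item's true relevance
    best = dcg(sorted(y_true, reverse=True)[:k])
    return actual / best if best > 0 else 0.0


print(ndcg_recommended_only([5, 4, 3, 2, 1], [0, 0, 0, 0, 0]))  # 0.0
```

With this convention the tied all-zero case no longer depends on sort order, at the cost of giving a special meaning to one score value, which may not suit models that can legitimately output 0.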
If you know of any references, a pointer to them would be enough.
Thank you.