When do we use one or the other? My use case: I want to evaluate a linear space to see how good the retrieval results are. I have a data matrix X (m x n) and some weights W (m x 1), and I want to measure the nearest-neighbour retrieval performance of W'X against a ground-truth value Y. Y is continuous, so I can't use simple precision/recall. If I use rank correlation, I will find the correlation …
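Not part of the question, but a minimal sketch of the two candidate metrics on made-up numbers, using scipy's `spearmanr` and scikit-learn's `ndcg_score` as stand-ins for whatever implementation is actually used: rank correlation scores agreement over the whole list, while NDCG@k weights the top of the ranking.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import ndcg_score

# Hypothetical data: one query, five candidate items.
y_true = np.array([0.9, 0.1, 0.4, 0.7, 0.2])   # continuous ground truth Y
y_score = np.array([0.8, 0.3, 0.2, 0.6, 0.1])  # retrieval scores, e.g. from W'X

rho = spearmanr(y_true, y_score).correlation                 # whole-list agreement
ndcg3 = ndcg_score(y_true[None, :], y_score[None, :], k=3)   # top-heavy agreement
print(f"Spearman rho: {rho:.3f}, NDCG@3: {ndcg3:.3f}")
```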
I would like to compare recommendation methods using the NDCG metric on the MovieLens dataset. In a ranking problem, the goal is to rank items by their relevance to a user. Ranking models can be learned from a ratings matrix in which each user rates only a small subset of all items; ratings for the other items are unknown. Collaborative Filtering methods can be used to build a model that generalizes from the training data and predicts ratings for unrated items. Let's consider the following example on a dataset consisting of 5 …
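Not from the question, but a minimal sketch (two made-up users and four items, with scikit-learn's `ndcg_score` standing in for any NDCG implementation) of the usual evaluation loop: rank each user's items by the predicted rating and compare that ranking against the true ratings.

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Hypothetical ratings: rows = users, columns = items; 0 marks an unrated item,
# treated here as relevance 0 for evaluation purposes.
true_ratings = np.array([[5, 3, 0, 4],
                         [4, 0, 2, 5]], dtype=float)
pred_ratings = np.array([[4.8, 2.9, 1.0, 3.5],   # CF model's predicted ratings
                         [3.9, 1.2, 2.5, 4.6]], dtype=float)

# ndcg_score treats each row as one query (user) and averages the per-user NDCG.
print("mean NDCG@3:", ndcg_score(true_ratings, pred_ratings, k=3))
```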
The NDCG (Normalized Discounted Cumulative Gain) metric for ranking is defined as DCG/IDCG, where IDCG is the ideal DCG, and it is said to take values in [0, 1]. However, since the DCG will always be positive for any (positive) predicted scores, this metric will never be 0, and it seems to me that it is very biased towards high values. So much so that, in my experiments, I get an NDCG of ~0.8 out of 1.0 for a (custom-made) prediction …
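A minimal sketch (made-up all-positive relevance values, scikit-learn's `ndcg_score`) of the effect described here: when every item has positive relevance, even completely random rankings score far above 0, so the usable range of NDCG is compressed towards 1.

```python
import numpy as np
from sklearn.metrics import ndcg_score

rng = np.random.default_rng(0)
relevance = rng.uniform(0.1, 1.0, size=(1, 50))   # strictly positive true scores

# Score 1000 random orderings of the 50 items.
random_ndcgs = [
    ndcg_score(relevance, rng.permutation(50)[None, :].astype(float))
    for _ in range(1000)
]
print("mean NDCG of random rankings:", np.mean(random_ndcgs))  # well above 0.8 here
```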
I'm solving a problem of ranking classes for each unique id based on the utilization quantity. I have 6 unique classes in the training and test data. My neural-net model predicts the utilization corresponding to each class, so if there are 10,000 test samples, I have a 10,000 x 6 prediction array and a 10,000 x 6 true-value array. I want to validate the model performance using the NDCG metric. I followed https://www.kaggle.com/davidgasquez/ndcg-scorer to compute NDCG. In there, the shapes for the parameters are as …
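Not the linked Kaggle scorer, but as a point of comparison, a minimal sketch using scikit-learn's `ndcg_score`, which accepts exactly this layout: 2-D arrays of shape (n_samples, n_labels), one row per id and one column per class (the random arrays below are placeholders).

```python
import numpy as np
from sklearn.metrics import ndcg_score

n_samples, n_classes = 10000, 6
rng = np.random.default_rng(0)
y_true = rng.uniform(size=(n_samples, n_classes))   # true utilization per class
y_pred = rng.uniform(size=(n_samples, n_classes))   # model predictions per class

# NDCG@6 is computed per row (per id) and averaged over all rows.
print("NDCG@6:", ndcg_score(y_true, y_pred, k=6))
```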
The problem is recommending stories on a website, shown just below each story, based on how similar the stories are and on some historical data about which recommended stories were or were not clicked. So basically the available data looks like:

    Referrer  Recommendation  Timestamp            Position  Clicked
    url1      url2            2020-07-01 05:11:17  3         0
    url3      url4            2020-07-02 15:11:17  5         1

For example, a user was reading url1 at the given timestamp and, after reading, saw 12 URLs as recommendations. Among them, url2 …
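A minimal sketch (hypothetical field names, scikit-learn's `ndcg_score`) of one common way to score such a log with NDCG: treat each impression as a query, give the clicked slot relevance 1 and every other slot relevance 0, and use the displayed position order as the predicted ranking.

```python
import numpy as np
from sklearn.metrics import ndcg_score

n_slots = 12  # number of recommendations shown below a story
impressions = [
    {"clicked_position": 3},   # e.g. url1 -> url2 clicked at position 3
    {"clicked_position": 5},   # e.g. url3 -> url4 clicked at position 5
]

per_impression_ndcg = []
for imp in impressions:
    relevance = np.zeros((1, n_slots))
    relevance[0, imp["clicked_position"] - 1] = 1.0               # clicked slot only
    shown_order = np.arange(n_slots, 0, -1, dtype=float)[None, :]  # slot 1 ranked first
    per_impression_ndcg.append(ndcg_score(relevance, shown_order))

print("mean NDCG over impressions:", np.mean(per_impression_ndcg))
```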
This is a question about NDCG, which is a recommendation evaluation metric. The following is being used as the evaluation metric for recommendations:

$$DCG = r_1 + \sum\limits_{i=2}^{N}\frac{r_i}{\log_2 i}$$

$$nDCG = \frac{DCG}{DCG_{perfect}}$$

The code is as follows:

    def dcg_score(y_true, y_score, k=20, gains="exponential"):
        """Discounted cumulative gain (DCG) at rank k

        Parameters
        ----------
        y_true : array-like, shape = [n_samples]
            Ground truth (true relevance labels).
        y_score : array-like, shape = [n_samples]
            Predicted scores.
        k : int
            Rank.
        gains : str
            Whether gains should be …
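For reference, a minimal, self-contained sketch (not the asker's actual code beyond what is quoted) that implements the quoted formula, i.e. discount 1 at rank 1 and log2(i) at rank i >= 2, together with the exponential/linear gains option from the snippet; the small example at the end uses made-up labels.

```python
import numpy as np

def dcg_score(y_true, y_score, k=20, gains="exponential"):
    """DCG at rank k: rank items by y_score, take the gains from y_true."""
    order = np.argsort(y_score)[::-1]                # best predicted item first
    y_true = np.take(y_true, order[:k])
    if gains == "exponential":
        gain = 2 ** y_true - 1
    elif gains == "linear":
        gain = y_true
    else:
        raise ValueError("Invalid gains option.")
    # Discounts per the quoted formula: 1 for rank 1, log2(i) for ranks i >= 2.
    discounts = np.maximum(np.log2(np.arange(1, len(gain) + 1)), 1.0)
    return np.sum(gain / discounts)

def ndcg_score(y_true, y_score, k=20, gains="exponential"):
    """nDCG at rank k: DCG of the predicted ranking divided by the ideal DCG."""
    best = dcg_score(y_true, y_true, k, gains)       # DCG_perfect
    return dcg_score(y_true, y_score, k, gains) / best

y_true = np.array([3, 2, 3, 0, 1, 2])
print(ndcg_score(y_true, y_true.astype(float)))       # perfect ranking -> 1.0
print(ndcg_score(y_true, -y_true.astype(float)))      # worst-first ranking -> < 1.0
```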
I am working on a multilabel recommender project and I am trying to evaluate it as a ranking problem. I calculate recall@k and precision@k, and both look quite reasonable: recall increases and precision decreases as I try higher k values, which is expected. However, NDCG@k increases up to a certain k and after that it stays the same. How can we explain such behaviour?
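A minimal sketch (made-up scores, scikit-learn's `ndcg_score`) that reproduces this kind of plateau: once the cut-off k is past the lowest-ranked relevant item, the remaining positions contribute zero gain to both DCG and the ideal DCG, so NDCG@k stops changing even though recall@k and precision@k still move.

```python
import numpy as np
from sklearn.metrics import ndcg_score

# One query with 2 relevant items; the model ranks them 1st and 7th.
y_true = np.array([[1, 1, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=float)
y_score = np.array([[0.9, 0.3, 0.8, 0.7, 0.6, 0.5, 0.4, 0.2, 0.1, 0.05]])

for k in range(1, 11):
    print(k, round(ndcg_score(y_true, y_score, k=k), 3))
# The value changes until k reaches 7 and is constant for every larger k.
```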
One limitation of NDCG, and a way to overcome it (as mentioned in https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Limitations), is: "Normalized DCG does not penalize for missing documents in the result. For example, if a query returns two results with scores 1,1,1 and 1,1,1,1,1 respectively, both would be considered equally good, assuming ideal DCG is computed to rank 3 for the former and rank 5 for the latter. One way to take into account this limitation is to enforce fixed set size for the result …
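A minimal sketch of the fix described in that passage, with a hypothetical helper `ndcg_fixed_size`: pad every returned list with zero-relevance placeholders up to a fixed cut-off n, and compute the ideal DCG over the best n documents that actually exist for the query, so the shorter (1,1,1) result list no longer ties with the (1,1,1,1,1) one. It assumes the standard log2(i+1) discount.

```python
import numpy as np

def ndcg_fixed_size(returned_relevances, all_relevances, n=5):
    """NDCG@n with the result list padded to a fixed size n (hypothetical helper)."""
    discounts = np.log2(np.arange(2, n + 2))           # log2(i+1) for ranks i = 1..n
    # DCG of the returned ranking, padded with zero-relevance placeholders.
    padded = np.zeros(n)
    m = min(len(returned_relevances), n)
    padded[:m] = returned_relevances[:m]
    dcg = np.sum(padded / discounts)
    # Ideal DCG over the best n documents that exist for the query.
    ideal = np.sort(all_relevances)[::-1][:n]
    ideal = np.pad(ideal, (0, n - len(ideal)))
    idcg = np.sum(ideal / discounts)
    return dcg / idcg

all_relevant = np.ones(5)                              # 5 relevant documents exist
print(ndcg_fixed_size(np.ones(3), all_relevant))       # 3 returned -> ~0.72, penalized
print(ndcg_fixed_size(np.ones(5), all_relevant))       # 5 returned -> 1.0
```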
I'm trying to implement xgboost with an objective of rank:ndcg, and I want the target to be between 0 and 3. In my data, for most of the groups there is only one event per group whose target is not 0. I wonder whether the model will learn anything different about this product when I give it a target value of 3 (rather than 1), given that it is the only target larger than 0 in its group, compared with products in a different group that got a target value of 1. …
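Not an answer, but a minimal sketch (synthetic data) of how such grouped training data is typically fed to xgboost's ranker with objective="rank:ndcg": labels are only ever compared with other labels inside the same group, and the label magnitude enters through the NDCG gain.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
n_groups, group_size, n_features = 100, 8, 5
X = rng.normal(size=(n_groups * group_size, n_features))

# One graded positive (target 3) per group, the rest 0; this mimics the setting
# described in the question.
y = np.zeros(n_groups * group_size, dtype=int)
y[::group_size] = 3

groups = np.full(n_groups, group_size)   # sizes of the query groups, in order

ranker = xgb.XGBRanker(objective="rank:ndcg", n_estimators=50)
ranker.fit(X, y, group=groups)
print(ranker.predict(X[:group_size]))    # relative scores within the first group
```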