Fix for NDCG Limitation

  • One limitation of NDCG, and a way to overcome it (as described at https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Limitations), is:

    "Normalized DCG does not penalize for missing documents in the result. For example, if a query returns two results with scores 1,1,1 and 1,1,1,1,1 respectively, both would be considered equally good, assuming ideal DCG is computed to rank 3 for the former and rank 5 for the latter. One way to take into account this limitation is to enforce fixed set size for the result set and use minimum scores for the missing documents. In previous example, we would use the scores 1,1,1,0,0 and 1,1,1,1,1 and quote nDCG as nDCG@5"

  • In the scenario above, the scores are 1,1,1 and 1,1,1,1,1, and both give NDCG = 1.

  • After applying the fix, the scores are 1,1,1,0,0 and 1,1,1,1,1. NDCG is still 1 for both, so the fix does not seem to work.

  • Could you explain where I'm wrong and, if possible, elaborate on the fix for NDCG's limitation of not penalizing missing documents, or point me to a resource? Thanks.


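The idea is that when reporting nDCG@5 for the two results (both returned for the same query), you have to use the same ideal discounted cumulative gain for both. In your example that is the DCG of 1,1,1,1,1, the higher of the two. The result 1,1,1,0,0 then ends up penalized (nDCG@5 < 1) for the missing documents. If you instead compute the ideal DCG from each padded list separately, the padding zeros add nothing to either the DCG or the ideal DCG, so the ratio stays at 1.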
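
To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the common DCG form with gain rel_i and discount log2(i + 1), and it assumes the query has five relevant documents in total, so the ideal list is 1,1,1,1,1. The function names `dcg_at_k` and `ndcg_at_k` are made up for this illustration and do not come from any library.

```python
import math

def dcg_at_k(scores, k):
    """DCG over the first k positions; missing positions are padded with 0."""
    padded = (list(scores) + [0.0] * k)[:k]
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(padded, start=1))

def ndcg_at_k(scores, ideal_scores, k):
    """nDCG@k using an ideal ranking shared by every result for the query."""
    ideal = dcg_at_k(sorted(ideal_scores, reverse=True), k)
    return dcg_at_k(scores, k) / ideal if ideal > 0 else 0.0

# Assumption: the query has five relevant documents, so this is the ideal list.
ideal_pool = [1, 1, 1, 1, 1]

short_result = ndcg_at_k([1, 1, 1], ideal_pool, 5)       # padded to 1,1,1,0,0
full_result = ndcg_at_k([1, 1, 1, 1, 1], ideal_pool, 5)

# What the question computed: the ideal is taken from the padded list itself.
per_list_ideal = ndcg_at_k([1, 1, 1, 0, 0], [1, 1, 1, 0, 0], 5)

print(f"shared ideal, 3 docs:   {short_result:.3f}")
print(f"shared ideal, 5 docs:   {full_result:.3f}")
print(f"per-list ideal, 3 docs: {per_list_ideal:.3f}")
```

With the shared ideal, the three-document result scores roughly 0.72 while the five-document result scores 1. With a per-list ideal, the three-document result goes back to 1, which is the behaviour described in the question.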
