Why is NDCG high even for wrongly ranked predictions?
The NDCG (Normalized Discounted Cumulative Gain) metric for ranking is defined as DCG/IDCG, where IDCG is the ideal (maximum) DCG, so the ratio is said to take values in [0, 1].
However, since the DCG is always positive whenever the relevance labels are positive, this metric can never reach 0, and it seems to me that it is heavily biased towards high values. So much so that, in my experiments, I get an NDCG of ~0.8 out of 1.0 for a (custom-made) prediction that ranks everything the wrong way, i.e. the worst possible ranking.
I can't help but feel that seeing a 0.8 on a [0, 1] scale is misleading when the prediction was one of the worst possible ones. Shouldn't it be 0?
To that end, I noticed that NDCG can be rescaled to genuinely span [0, 1] by defining it as (DCG - WDCG)/(IDCG - WDCG), where WDCG is the worst possible DCG, just as in min-max scaling.
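A minimal sketch of what I mean, assuming linear gains, a log2 position discount, and some made-up graded relevance labels (both the labels and the function names are just for illustration):

```python
import math

def dcg(relevances):
    """DCG with linear gains and a log2(rank + 1) position discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

relevances = [3, 2, 1, 0]  # hypothetical graded relevance labels

ideal = dcg(sorted(relevances, reverse=True))  # IDCG: best possible ordering
worst = dcg(sorted(relevances))                # WDCG: worst possible ordering

# The worst possible ranking still scores well above 0 under vanilla NDCG...
vanilla = worst / ideal

# ...whereas the min-max rescaled version pins the worst ranking to exactly 0.
rescaled = (worst - worst) / (ideal - worst)

print(vanilla, rescaled)
```

With these four labels the worst ranking already gets a vanilla NDCG above 0.6, while the rescaled version gives 0 for the worst ranking and still gives 1 for the ideal one.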
However, I have not been able to find anything online about anyone doing this, and everyone seems to use the vanilla NDCG. I cannot be the only one who has thought of this.
Any thoughts or suggestions? Have you seen this be done before?