How to estimate missing values when calculating NDCG
I would like to compare recommendation methods using the NDCG metric on the MovieLens dataset.
In a ranking problem, the goal is to rank items by their relevance to a user. Ranking models can be learned from a ratings matrix in which each user rates only a small subset of all items; the ratings for the remaining items are unknown.
Collaborative filtering methods can be used to build a model that generalizes from the training data and predicts ratings for unrated items.
Consider the following example on a dataset consisting of 5 movies. User A rated only 3 of them:
- movie 1 - 5 stars
- movie 3 - 3 stars
- movie 4 - 2 stars
The model predicts the following ratings:
- movie 1 - 5 stars
- movie 2 - 4 stars
- movie 3 - 3 stars
- movie 4 - 2 stars
- movie 5 - 1 star
How should NDCG@3 be calculated in this example? Movie 2 gets the second-highest predicted score, but user A has not rated it, even though the ratings of other users suggest it is highly relevant to user A. Assigning movie 2 a 1-star rating as ground truth penalizes the model for recommending a movie that may well be relevant but simply was not rated by the user.
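To make the difference concrete, here is a minimal sketch of two treatments I can think of for the example above: ranking only the movies user A actually rated, and ranking all movies while assigning a fixed relevance to unrated ones. The function names, the linear gain, and the log2 discount are my own assumptions for illustration; I do not know whether any particular paper computes NDCG this way.

```python
import math

def dcg(relevances):
    # Linear gain with a log2 position discount (one common DCG variant; an assumption here).
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg_at_k(predicted, rated, k, unrated_relevance=None):
    # predicted: movie id -> predicted score; rated: movie id -> true rating.
    # unrated_relevance=None ranks only the rated movies;
    # a number ranks all movies and uses that value for unrated ones.
    if unrated_relevance is None:
        candidates = [m for m in predicted if m in rated]
    else:
        candidates = list(predicted)
    gains = {m: rated.get(m, unrated_relevance) for m in candidates}
    ranked = sorted(candidates, key=lambda m: predicted[m], reverse=True)[:k]
    ideal = sorted(gains.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg([gains[m] for m in ranked]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Data from the example: user A's ratings and the model's predictions.
rated = {1: 5, 3: 3, 4: 2}
predicted = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}

ndcg_rated_only = ndcg_at_k(predicted, rated, k=3)
ndcg_unrated_zero = ndcg_at_k(predicted, rated, k=3, unrated_relevance=0)

print(f"rated items only: {ndcg_rated_only:.4f}")
print(f"unrated as 0:     {ndcg_unrated_zero:.4f}")
```

With these assumptions, the rated-only variant gives a perfect score because the model orders movies 1, 3 and 4 correctly, while treating unrated movies as relevance 0 lowers the score to roughly 0.82 because movie 2 occupies the second position. Passing a mean or median rating as `unrated_relevance` would correspond to the imputation idea I ask about below.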
Many papers measure model performance on MovieLens using NDCG, but I have not found details on how they calculate it. What is the best practice for handling this problem? Is it a good idea to estimate an unknown rating from the movie's median or average rating?
Topic movielens ndcg evaluation recommender-system
Category Data Science