How to use the NDCG metric for binary relevance

I am working on a ranking problem: predicting the single correct document for a user query. I would like to use the NDCG metric to evaluate the model.

Given:

Queries (Q), result documents (D), and a relevance score. The relevance score is binary (0 or 1), i.e. out of each query's document list, only one document is marked with relevance score = 1.

Data set example:

query, docs, relevance
{
[1, doc2, 0], [1, doc3, 0], [1, doc4, 0], [1, doc6, 1], [1, doc9, 0]
[2, doc3, 0], [2, doc5, 1], [2, doc10, 0], [2, doc11, 0], [2, doc1, 0]
}

My questions:

1. Is it possible to use the NDCG metric for binary relevance problems?
2. If so, please share some reading notes or suggestions.

Thanks



The nDCG depends on the relevance of each document, as you can see in the Wikipedia definition. I guess you could use 0 and 1 as relevance scores, but then all relevant documents would have the same score of 1, and it wouldn't make much sense to apply the nDCG penalty discounts.
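For reference, nDCG is still well defined with 0/1 labels. The following is a minimal sketch, not a definitive implementation: it assumes the common log2 position discount and, purely for illustration, treats the order of the rows in the question's data set as the model's ranking. The function names `dcg` and `ndcg` are hypothetical helpers, not part of any library.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance labels."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Assumption: the row order in the question is the model's ranking.
ranked_q1 = [0, 0, 0, 1, 0]  # relevant document at rank 4
ranked_q2 = [0, 1, 0, 0, 0]  # relevant document at rank 2

print(ndcg(ranked_q1))
print(ndcg(ranked_q2))
```

With exactly one relevant document per query, the ideal DCG is 1, so under these assumptions the value reduces to 1 / log2(1 + rank of the relevant document).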

A similar measure often used with binary relevance scores is the mean average precision defined as:

$\text{MAP}=\frac{\sum_{q=1}^Q\text{AveP}(q)}{Q}$,

where $Q$ is the number of queries and $\text{AveP}(q)$ is the average precision for query $q$.
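The formula above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: every relevant document appears somewhere in the ranked list, and the row order from the question's data set is treated as the model's ranking. The names `average_precision` and `mean_average_precision` are hypothetical helpers.

```python
def average_precision(relevances):
    """Average of precision@k over the ranks k that hold a relevant document.

    Assumes all relevant documents appear in the ranked list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of AveP(q) over all queries, as in the MAP formula."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Assumption: the row order in the question is the model's ranking.
rankings = [
    [0, 0, 0, 1, 0],  # query 1: relevant document at rank 4
    [0, 1, 0, 0, 0],  # query 2: relevant document at rank 2
]
print(mean_average_precision(rankings))  # (1/4 + 1/2) / 2 = 0.375
```

When each query has exactly one relevant document, AveP(q) is simply the reciprocal of that document's rank, so MAP coincides with the mean reciprocal rank.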

A comprehensive explanation of both nDCG and MAP is available here
