Search / Multiple Choice System evaluation
I have a DB with N items. For each query, my system can either output the ID of an item or say N/A (not found). What are the different ways to evaluate the performance of such a system, and what are the characteristics/tradeoffs of each?
PS. Earlier I came up with this definition:
Ground truth ↓, Prediction → | ID1 | N/A |
---|---|---|
ID2 | TP IF ID1 == ID2 ELSE FP | FN |
N/A | FP | TN |
I would be curious to get some thoughts / feedback on this definition and on whether Precision / Recall can be based on it.
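To make the definition concrete, here is a minimal Python sketch of how the four outcomes in the table could be counted and turned into Precision / Recall. It assumes N/A is represented as `None` and that ground truth and predictions are two aligned lists; the function names `confusion_counts` and `precision_recall` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Hashable, Optional, Sequence, Tuple


def confusion_counts(
    truth: Sequence[Optional[Hashable]],
    pred: Sequence[Optional[Hashable]],
) -> Dict[str, int]:
    """Count TP/FP/FN/TN following the table above (None stands for N/A)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for t, p in zip(truth, pred):
        if p is None:
            # The system answered N/A: correct only if the item is truly absent.
            counts["TN" if t is None else "FN"] += 1
        elif t is not None and p == t:
            # The system returned an ID and it matches the ground truth.
            counts["TP"] += 1
        else:
            # The system returned an ID that is wrong, or the item does not exist.
            counts["FP"] += 1
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Standard formulas; returns 0.0 when a denominator is zero (an assumption)."""
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, made up purely for illustration.
truth = ["a", "b", None, "c", None]
pred = ["a", "x", None, None, "y"]
counts = confusion_counts(truth, pred)
print(counts)  # {'TP': 1, 'FP': 2, 'FN': 1, 'TN': 1}
print(precision_recall(counts))
```

One property of the table as written shows up in this sketch: a wrong ID for an existing item is counted only as FP, so it lowers Precision but leaves Recall (computed as TP / (TP + FN)) unchanged.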
Topic metric evaluation information-retrieval search
Category Data Science