How to determine the "total number of relevant documents" when calculating Recall (in Precision and Recall) if it's not known? Can it be estimated?
On Wikipedia there is a practical example of calculating Precision and Recall:
When a search engine returns 30 pages, only 20 of which are relevant, while failing to return 40 additional relevant pages, its precision is 20/30 = 2/3, which tells us how valid the results are, while its recall is 20/60 = 1/3, which tells us how complete the results are.
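The arithmetic in that example can be written out as a small sketch. The function and parameter names are my own, made up for illustration; the point is that recall needs the count of relevant pages that were *not* returned, which is exactly the number that is unknown in practice.

```python
from fractions import Fraction

def precision_recall(returned_relevant, returned_total, missed_relevant):
    """Precision and recall from raw counts (names are hypothetical)."""
    # Precision: share of returned pages that are relevant.
    precision = Fraction(returned_relevant, returned_total)
    # Recall: share of all relevant pages (returned + missed) that were returned.
    recall = Fraction(returned_relevant, returned_relevant + missed_relevant)
    return precision, recall

# Numbers from the Wikipedia example: 30 returned, 20 relevant, 40 relevant missed.
precision, recall = precision_recall(20, 30, 40)
print(precision, recall)
```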
I absolutely don't understand how one can use Precision and Recall in a real-life scenario if the total number of relevant documents is needed.
For example, in my scenario, I have a set of about 9000 collected documents and I am building a recommender system with several algorithms (TF-IDF, Doc2Vec, LDA, ...). It has to return the top 20 most similar articles for one selected article. Since I am not going to manually count all relevant articles among the 9000 documents for every query, what is a sound way to estimate the total number of relevant articles so that I can calculate Recall and then Average Precision?
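To show where the unknown quantity enters, here is a minimal sketch of Average Precision over a ranked list, using the common definition that divides by the total number of relevant documents. The function name and the toy relevance labels are hypothetical; `total_relevant` is the value I don't know how to obtain.

```python
def average_precision(relevance_flags, total_relevant):
    """Average Precision for one ranked list.

    relevance_flags: list of 0/1 judgments for the ranked results, best first.
    total_relevant:  number of relevant documents in the whole collection
                     (the quantity that has to be known or estimated).
    """
    if total_relevant <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance_flags, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / total_relevant

# Toy example: 5 results, 3 of them relevant, 4 relevant documents overall.
ap = average_precision([1, 0, 1, 1, 0], total_relevant=4)
print(round(ap, 4))
```

If `total_relevant` is replaced by the number of relevant items found in the top 20 only, the score no longer penalizes relevant articles that were never retrieved, which is why the estimate matters.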
The only information I found about this problem is in these lecture notes, where they suggest creating a pool of results:
There are several ways of creating a pool of relevant records: one method is to use all the relevant records found from different searches, another is to manually scan several journals to identify a set of relevant papers.
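As I understand the first method, it could be sketched roughly like this: merge the top-k results of every algorithm into one pool, judge only the pooled documents by hand, and compute recall relative to the relevant documents found in the pool. The document IDs, the judgments, and the function names below are invented for illustration, and the result is a relative recall, not true recall, because relevant documents that no algorithm retrieved are never counted.

```python
def build_pool(runs, k):
    """Union of the top-k results from every algorithm's ranked list."""
    pool = set()
    for ranked in runs.values():
        pool.update(ranked[:k])
    return pool

def relative_recall(ranked, judged_relevant, k):
    """Recall at k measured against the relevant documents found in the pool."""
    if not judged_relevant:
        return 0.0
    hits = set(ranked[:k]) & judged_relevant
    return len(hits) / len(judged_relevant)

# Hypothetical top-3 results of three algorithms for one query article.
runs = {
    "tfidf":   ["a", "b", "c"],
    "doc2vec": ["b", "d", "e"],
    "lda":     ["a", "e", "f"],
}
pool = build_pool(runs, k=3)

# Hypothetical manual judgments, made only on the pooled documents.
judged_relevant = {"a", "b", "d", "e"}

scores = {name: relative_recall(ranked, judged_relevant, k=3)
          for name, ranked in runs.items()}
print(sorted(pool), scores)
```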
But I'm trying to find more information on this method of pools elsewhere.
Common sense tells me the following could work: take, say, 50 random documents, manually count the relevant documents in that random sample, and estimate the total number of relevant documents from that. Can this be a valid approach? I imagine I could do this for a few recommendation queries (although it would be a bit time-consuming) or have some selected test users do it.
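A rough sketch of what that estimate might look like, assuming a simple random sample and using a Wilson score interval to show how uncertain the extrapolation is. The function name and the sample counts are hypothetical. One thing the sketch makes visible: if relevant articles are rare for a given query, a sample of 50 may contain none at all, and the interval stays wide.

```python
import math

def estimate_total_relevant(sample_size, relevant_in_sample, collection_size, z=1.96):
    """Extrapolate the number of relevant documents from a random sample.

    Returns (point_estimate, lower_bound, upper_bound), where the bounds come
    from a Wilson score interval on the sample proportion (z=1.96 ~ 95%).
    """
    n = sample_size
    p = relevant_in_sample / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    low = max(0.0, center - half_width)
    high = min(1.0, center + half_width)
    return collection_size * p, collection_size * low, collection_size * high

# Hypothetical: 5 of 50 sampled documents judged relevant, 9000 documents overall.
point, low, high = estimate_total_relevant(50, 5, 9000)
print(round(point), round(low), round(high))

# Hypothetical: no relevant documents in the sample.
zero_point, zero_low, zero_high = estimate_total_relevant(50, 0, 9000)
print(round(zero_point), round(zero_low), round(zero_high))
```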