Search Query Sample Size Determination for a Validation Set

When designing a search system that searches across N identifiable categories, how many search queries does one need in each category to validate the target metric (DCG) accurately, that is, with balanced variance and bias? Does this number depend on N, on the corpus size, or on both? Pointers to relevant publications would be appreciated. I would also like to understand whether effect size and Bayesian effective sample size play a role here.

Given a set of search queries Q for retrieving documents from a document set D, where the corpus has N possible categories with k(n) documents in category n, the objective is to define a minimal test set minset(Q) such that the DCG obtained is reliable.

Example: Searching for Fruits
Search queries: [Red apple, Green apple]
Category: Apple
Documents: [Red Apple, Gala Apple, Green Apple, ...] (k1 documents)
Search queries: [Alphonso, Haden, Keitt]
Category: Mango
Documents: [Keitt, Kent, Haden, ...] (k2 documents)
.
.
.
The search results will be ranked, and the accuracy of the search is measured with DCG. How many search queries are needed in each category to obtain a reliable DCG, as a function of the number of categories and the size of the document set?
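
To make the quantities in the question concrete, here is a minimal sketch in Python. It is not an answer, only one common way to frame the problem. It assumes the standard DCG definition (sum of rel_i / log2(i + 1)), treats the queries within one category as independent draws, and uses a normal approximation. The function names, the pilot scores and the default thresholds are all hypothetical.

```python
import math
from statistics import NormalDist, stdev


def dcg(relevances, k=None):
    """DCG of one ranked result list: sum of rel_i / log2(i + 1), ranks from 1."""
    rels = relevances[:k] if k is not None else relevances
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))


def queries_for_margin(pilot_scores, margin, confidence=0.95):
    """Queries needed so the confidence interval of the mean DCG in one
    category has half-width <= margin (normal approximation, variance
    estimated from a small pilot set of per-query DCG scores)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    s = stdev(pilot_scores)
    return math.ceil((z * s / margin) ** 2)


def queries_for_effect(effect_size, alpha=0.05, power=0.8):
    """Queries needed for a paired comparison of two rankers to detect a
    standardized effect size d = mean(diff) / sd(diff), two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical pilot: per-query DCG scores for the "Apple" category.
pilot = [1.0, 2.0, 3.0, 4.0, 5.0]
n_margin = queries_for_margin(pilot, margin=0.5)
n_effect = queries_for_effect(effect_size=0.5)
```

In this framing neither N nor the corpus size appears explicitly. They would enter only through their influence on the per-query variance of DCG within each category, which is an assumption of the approximation rather than an established result for this setting.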

Topic learning-to-rank search-engine sampling search statistics

Category Data Science
