Approximate the maximum dot product between a vector and a set of vectors using only a single-vector representation of the set

Question

Curious Ion

June 11, 2021, 12:35

Given a vector $q$ and a set of vectors $D = \{d_1, d_2, \ldots, d_l\}$, is there a way to construct functions $QF$ and $DF$ such that $QF(q)^T DF(D) \approx \max_i(q^T d_i)$?

Use case:

I want to build an information retrieval system in which each document is represented by an arbitrary but small ($100$) number of vectors and the query is represented by a single vector. Ideally, I would sort the documents by $\max_i(q^Td_i)$, but storing all the vectors and computing every $q^Td_i$ term for every document at query time does not scale. Is there a way to combine the $d_i$ into a single vector and use that vector to approximate this score?
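To make the target concrete, here is a minimal sketch of the exact score next to one naive single-vector stand-in. Everything in it is an assumption for illustration, not a proposed answer: the helper names (`max_dot_score`, `mean_pool`), the dimension of 8, the random Gaussian data, and the choice of between 1 and 100 vectors per document. Mean pooling is used only because it is the simplest possible $DF$ (with $QF$ as the identity). Since $q^T \mathrm{mean}(d_i)$ is the average of the $q^T d_i$, it can never exceed $\max_i(q^T d_i)$, so it is a lower bound rather than a good approximation.

```python
import random

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def max_dot_score(q, doc_vectors):
    """Exact score max_i q^T d_i: one dot product per stored vector."""
    return max(dot(q, d) for d in doc_vectors)

def mean_pool(doc_vectors):
    """Hypothetical DF: collapse a document's vectors into their average."""
    n = len(doc_vectors)
    return [sum(column) / n for column in zip(*doc_vectors)]

def random_vector(rng, dim):
    """Illustrative data only: i.i.d. standard Gaussian coordinates."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

rng = random.Random(0)
dim = 8  # assumed embedding dimension
q = random_vector(rng, dim)

# Each document holds an arbitrary number of vectors (assumed: 1 to 100).
docs = [
    [random_vector(rng, dim) for _ in range(rng.randint(1, 100))]
    for _ in range(5)
]

exact = [max_dot_score(q, D) for D in docs]        # what we want to rank by
pooled = [dot(q, mean_pool(D)) for D in docs]      # single-vector stand-in

for i, (e, p) in enumerate(zip(exact, pooled)):
    print(f"doc {i}: exact max = {e:.3f}, mean-pooled = {p:.3f}")
```

The exact score costs one dot product per stored vector, while the pooled score costs one per document, which is the trade-off the question is about. Comparing the two printed columns shows how much ranking information a plain average can lose.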

Topic vector-space-models information-retrieval

Category Data Science
