How can I train a model to modify a vector by rewarding the model based on the modified vectors nearest neighbors?
I am experimenting with a document retrieval system in which I have documents represented as vectors. When queries come in, they are turned to vectors by the same method as used for the documents. The query vector's k nearest neighbors are retrieved as the results. Each query has a known answer string.
In order to improve performance, I am now looking to create a model that modifies the query vector. What I was looking to do was use a model that rewarded the model for each of the top k nearest neighbor's corresponding documents that contain the answer string and punish when the string is not present.
In looking for a solution I have been mainly finding results related to multiple rounds of discrete decision making, so I am also not sure if this would count as reinforcement learning or something else entirely. Thank you for any help.