Can I use multi-armed bandits to optimize how two algorithms are weighted when creating a composite score?

I'm aware that multi-armed bandits are great for evaluating multiple models, and from what I understand, they are mainly used to pick a specific model.

I would still like to evaluate two models but I want to do it differently. Take a look at this simple equation:

W_A * RecoScore_A + W_B * RecoScore_B = CompScore

Rather than optimizing which model is chosen for a given user, I'd like to optimize the weights W_A and W_B themselves.
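To make the idea concrete, here is a rough sketch of what I have in mind. It treats each candidate weight pair as one arm of an epsilon-greedy bandit. Everything here is an assumption for illustration: the grid of weights in `WEIGHT_ARMS`, the constraint that W_A + W_B = 1, the choice of epsilon-greedy over other bandit strategies, and the idea that the reward is something like a click (1) or no click (0). The names `comp_score` and `EpsilonGreedyWeights` are hypothetical.

```python
import random

# Hypothetical candidate weight pairs (W_A, W_B); assumed here to sum to 1.
WEIGHT_ARMS = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]


def comp_score(w_a, w_b, reco_score_a, reco_score_b):
    """CompScore = W_A * RecoScore_A + W_B * RecoScore_B."""
    return w_a * reco_score_a + w_b * reco_score_b


class EpsilonGreedyWeights:
    """Epsilon-greedy bandit in which each arm is one weight pair."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = [0] * len(self.arms)   # pulls per arm
        self.means = [0.0] * len(self.arms)  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon; otherwise exploit the best mean so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        return max(range(len(self.arms)), key=lambda i: self.means[i])

    def update(self, arm, reward):
        # Incrementally update the running mean reward for this arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

For each request, the loop would be: call `select()` to get an arm index, rank items by `comp_score` using that arm's weights, show the recommendations, then call `update()` with the observed reward (assumed to be a click or similar signal).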

I'm wondering if this makes sense and if you have seen any literature related to this. I'm having trouble finding anything online.
