What is a good reward function when the objective is to minimize both the average and the variance?
I am trying to formulate a problem in which the goal is to minimize the average resource allocated to different users. Due to some inherent properties of the environment, the allocation can be reduced easily for some users but not for others, which raises a fairness issue. While the main objective is to minimize the average resource consumed across all users, I also want the allocation to be fair, meaning the variance of the resource allocation should be low.
So is average + variance a proper reward function? By proper I mean: does it capture what I am trying to achieve, namely a low average while ensuring some degree of fairness? I have seen optimization problems formulated as x*average + y*variance, where x + y = 1. Would this kind of formulation be better suited to my case?
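To make the weighted formulation concrete, here is a minimal sketch of what I have in mind. The function name `reward`, the default weight `x=0.5`, and the sample allocations are all illustrative assumptions on my part. I am also assuming the agent maximizes reward, so the weighted cost is negated, and that the population variance (NumPy's default) is the intended fairness measure.

```python
import numpy as np

def reward(allocations, x=0.5):
    """Negated weighted sum of mean and variance (higher is better).

    `x` weights the average and `y = 1 - x` weights the variance,
    so x + y = 1 as in the formulation above. Both the name and the
    default weight are hypothetical choices for illustration.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    allocations = np.asarray(allocations, dtype=float)
    y = 1.0 - x
    # np.var defaults to the population variance (ddof=0).
    cost = x * allocations.mean() + y * allocations.var()
    return -cost  # assumption: the agent maximizes reward

# Two made-up allocations with the same average but different spread.
fair = reward([2.0, 2.0, 2.0, 2.0])    # mean 2, variance 0
unfair = reward([0.0, 0.0, 4.0, 4.0])  # mean 2, variance 4
print(fair, unfair)
```

With equal averages, the even allocation gets the higher reward, which is the behaviour I am after. Note that the variance is in squared units of the resource, so the two terms are not on the same scale and the choice of x and y depends on how the allocations are scaled.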
Topic reward objective-function machine-learning
Category Data Science