How to formulate the reward of an RL agent with two objectives
I have started learning reinforcement learning and am trying to apply it to my use case. I am developing an RL agent that should maintain the temperature at a particular value and minimize the energy consumption of the equipment by choosing among the actions available to it.
I am trying to formulate a reward function for it.
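# energy and temp_act are measured at each step; temp_setpoint is the target temperature
import numpy as np
energy_coeff = -10  # weight on energy use
temp_coeff = -10  # weight on temperature deviation
temp_penalty = np.abs(temp_setpoint - temp_act)
reward = energy_coeff * energy + temp_coeff * temp_penalty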
This is the reward function I am using, but intuitively I feel it could be better, because the absolute values of energy and temp_penalty are on different scales. How do I take this scaling problem into account when structuring a reward?
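To make the scaling concern concrete, here is a minimal sketch of one possible approach: divide each term by an assumed typical maximum so both penalties fall roughly in [0, 1] before weighting. The names ENERGY_MAX, TEMP_ERR_MAX, scaled_reward, energy_weight and temp_weight are hypothetical and the numeric values are placeholders, not measurements from my system. It uses the built-in abs, which behaves like np.abs for scalars.

```python
# Hypothetical scales (assumptions, not measured values): the largest
# energy reading and temperature error expected in a single step.
ENERGY_MAX = 5.0
TEMP_ERR_MAX = 10.0


def clip01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def scaled_reward(energy, temp_act, temp_setpoint,
                  energy_weight=0.5, temp_weight=0.5):
    """Negative weighted sum of two penalties, each normalized to [0, 1]."""
    energy_term = clip01(energy / ENERGY_MAX)
    temp_term = clip01(abs(temp_setpoint - temp_act) / TEMP_ERR_MAX)
    return -(energy_weight * energy_term + temp_weight * temp_term)
```

With both terms on a comparable scale, the two weights would express the relative importance of the objectives directly rather than also having to compensate for units. Whether equal weights are appropriate is an open choice here.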
Topic discounted-reward dqn monte-carlo q-learning reinforcement-learning
Category Data Science