Deep Reinforcement Learning for dynamic pricing

Question

Karthik Rajkumar

March 15, 2019, 12:55

I am trying to implement a Deep Q-Network (DQN) model for dynamic pricing in logistics. I can define the following:

State space: origin, destination, shipment type, customer, product type, commodity of the shipment, available capacity, etc.
Action space: the price itself, which can range from 0 to infinity; the agent has to determine the price directly.
Reward signal: rewards can be based on similar offers made to other customers, seasonality, and remaining capacity.
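One practical point about the action space: a standard DQN chooses among a finite set of actions, so an unbounded price range usually has to be capped and discretized first. Below is a minimal sketch of how the state and the action could be encoded. The location codes, the price bounds, the grid size, and the helper names (`one_hot`, `encode_state`, `action_to_price`) are all hypothetical and only serve as an illustration.

```python
import numpy as np

# Hypothetical vocabulary for the categorical state features (not from the question).
LOCATIONS = ["AMS", "FRA", "SIN"]

# Hypothetical price bounds (per kg) and grid size; a standard DQN needs a finite action set.
PRICE_MIN, PRICE_MAX, N_PRICES = 0.5, 5.0, 10
PRICE_GRID = np.linspace(PRICE_MIN, PRICE_MAX, N_PRICES)


def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot vector over its vocabulary."""
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    vec[vocabulary.index(value)] = 1.0
    return vec


def encode_state(origin, destination, remaining_kg, total_kg=10_000.0):
    """Build a flat state vector: one-hot origin, one-hot destination, capacity fraction."""
    capacity_fraction = np.array([remaining_kg / total_kg], dtype=np.float32)
    return np.concatenate(
        [one_hot(origin, LOCATIONS), one_hot(destination, LOCATIONS), capacity_fraction]
    )


def action_to_price(action_index):
    """Map a discrete action index chosen by the agent to a price per kg."""
    return float(PRICE_GRID[action_index])
```

The other features from the state space (customer, product type, commodity) could be appended in the same way. If a truly continuous price is required, an actor-critic method for continuous actions would be the usual alternative to DQN.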

I am planning to use a multi-layer perceptron that takes the state as input and outputs the price.
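Note that in a DQN the network normally does not output the price directly. It outputs one Q-value per discrete action, and the price is the action with the highest Q-value. A minimal NumPy sketch of that forward pass with epsilon-greedy selection follows. The layer sizes, the random initialization, and the names `q_values` and `select_action` are assumptions for illustration, and training (replay buffer, target network, gradient updates) is left out.

```python
import numpy as np

# Hypothetical sizes: 7 state features, 32 hidden units, 10 discrete price levels.
STATE_DIM, HIDDEN, N_PRICES = 7, 32, 10

rng = np.random.default_rng(0)

# Randomly initialized weights of a one-hidden-layer MLP (untrained).
params = {
    "W1": rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN)),
    "b1": np.zeros(HIDDEN),
    "W2": rng.normal(0.0, 0.1, (HIDDEN, N_PRICES)),
    "b2": np.zeros(N_PRICES),
}


def q_values(state, params):
    """Forward pass: state vector -> ReLU hidden layer -> one Q-value per price level."""
    hidden = np.maximum(0.0, state @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]


def select_action(state, params, epsilon, rng):
    """Epsilon-greedy: random price index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(N_PRICES))
    return int(np.argmax(q_values(state, params)))
```

In TensorFlow the same structure would be two dense layers, with the output layer sized to the number of price levels.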

I am not sure how to define the reward function. Could you help me define a mathematical formula for the reward, given that the action is the price?

-- UPDATE --

The part of the state that evolves over time is the remaining capacity. Suppose the capacity at the initial time step is 10,000 kg. It decreases over time as shipments are accepted, and when the capacity is used up and no more shipments can be taken, the episode ends.
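The episode described above could be sketched as a small environment like the one below. Everything beyond the 10,000 kg starting capacity is an assumption: the shipment sizes, the linear acceptance model, the 5.0 per kg price at which no customer accepts, and the revenue-based reward (price times accepted weight) are hypothetical placeholders, not a recommended reward design.

```python
import random


class CapacityPricingEnv:
    """Toy episode: quote a price per request until the capacity is used up."""

    def __init__(self, total_capacity_kg=10_000.0, seed=0):
        self.total_capacity_kg = total_capacity_kg
        self.rng = random.Random(seed)
        self.remaining_kg = total_capacity_kg

    def reset(self):
        """Start a new episode with the full capacity available."""
        self.remaining_kg = self.total_capacity_kg
        return self.remaining_kg

    def step(self, price_per_kg):
        """Quote a price for one incoming request; return (remaining_kg, reward, done)."""
        # Hypothetical request size, capped by what still fits.
        shipment_kg = min(self.rng.uniform(100.0, 1000.0), self.remaining_kg)
        # Hypothetical demand model: acceptance probability falls linearly to 0 at 5.0 per kg.
        accept_prob = max(0.0, 1.0 - price_per_kg / 5.0)
        accepted = self.rng.random() < accept_prob

        reward = 0.0
        if accepted:
            # Assumed reward: revenue earned from the accepted shipment.
            reward = price_per_kg * shipment_kg
            self.remaining_kg -= shipment_kg

        done = self.remaining_kg <= 0.0
        return self.remaining_kg, reward, done
```

With this setup a low price fills the capacity quickly at little revenue, while a high price earns more per kg but is rejected more often, so the agent has to trade the two off. Terms for seasonality or for offers made to similar customers could be added to the reward in the same place. In practice a limit on the number of requests per episode would also be needed, since a price that is never accepted would otherwise never end the episode.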

The agent will have to find an optimal price based on the following rewards.

Topic deepmind dqn tensorflow reinforcement-learning deep-learning

Category Data Science
