ML that learns to predict and play a simple wagering game

Question

JDOE

September 22, 2017, 20:58

I have a simple game I'm building for fun, just to see how well ML can work with simple data sets.

Basically, each turn of the game goes like this:

Computer generates a random number $x$, and does not show the player.
Player wagers that they can guess a number lower than $x$. Call the wager amount $w$.
Player tries to guess a low number $g$.
If $g \lt x$, then player gains $wg$ points.
If $g \ge x$, then player loses their wager, $w$ points.

If player has funds $f_t$ at start of turn $t$ then another way to put this is:

$$f_{t+1} = \begin{cases} f_t + wg, & \text{if } g \lt x\\ f_t - w, & \text{otherwise} \end{cases}$$

Here's an example of play:

Start with $f=50$
Turn 1, $x=11$, player wagers $w=8$ and guesses $g=9$. Player gains $+72$ points, so $f=122$ at the end of the turn.
Turn 2, $x=10.5, w=4, g=7$, points change by $+28$, $f=150$.
Turn 3, $x=20, w=1, g=6$, points change by $+6$, $f=156$.
Turn 4, $x=2, w=10, g=15$, points change by $-10$, $f=146$.
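As a sanity check, the turn rule can be sketched as a small Python function and replayed against the example turns. This is only an illustration: the names `play_turn` and `replay` are made up here, and a tie ($g = x$) is treated as a loss, following the turn rules listed earlier.

```python
def play_turn(funds, x, w, g):
    """Return funds after one turn: gain w*g if g < x, otherwise lose w."""
    if g < x:
        return funds + w * g
    return funds - w


def replay(start_funds, turns):
    """Apply a list of (x, w, g) turns and return the funds after each one."""
    history = []
    funds = start_funds
    for x, w, g in turns:
        funds = play_turn(funds, x, w, g)
        history.append(funds)
    return history


# The four example turns, as (x, w, g) tuples.
example_turns = [(11, 8, 9), (10.5, 4, 7), (20, 1, 6), (2, 10, 15)]
print(replay(50, example_turns))  # prints [122, 150, 156, 146]
```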

I wanted to use ML to try and predict this by feeding it pre-generated turns to see if it can find any patterns. There are only a few variables involved, so I figured it shouldn't be overly complicated. Ideally I would like for the ML to learn how to play the game.

I am wondering what type of ML is applicable to this kind of problem? It is not clear to me where to start, even though I have made the game simple. I have more complex games that I made in the past that I would like to try to apply ML to as well.

Topic game python machine-learning

Category Data Science

Neil Slater · Accepted Answer · September 22, 2017, 20:58

There are a few different data science and ML techniques you could throw at the numbers output by this game. You could try to analyse human players' style. You could generate a table of expected gains/losses by ignoring the actual amount gained or lost and estimating the probability distribution of $x$, even though the player never sees it, just from knowing whether $x \gt g$ or $x \le g$ over enough examples.
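A rough sketch of that table idea follows. It assumes the logged data is just a list of (guess, won) pairs, and, purely to have something to run, that $x$ is a uniform integer from 1 to 20; the function name `expected_gain_table` and the simulated data are illustrative, not part of the original game.

```python
import random
from collections import defaultdict


def expected_gain_table(records):
    """records: iterable of (g, won) pairs, where won is True when g < x.

    Returns {g: (estimated P(win), expected gain per point wagered)}.
    """
    counts = defaultdict(lambda: [0, 0])  # g -> [wins, plays]
    for g, won in records:
        counts[g][0] += int(won)
        counts[g][1] += 1
    table = {}
    for g, (wins, plays) in sorted(counts.items()):
        p_win = wins / plays
        # A win pays g per point wagered; a loss costs 1 per point wagered.
        table[g] = (p_win, p_win * g - (1 - p_win))
    return table


# Simulated play log. The distribution of x is an assumption made for the demo
# and is never shown to the estimator, which only sees (g, won).
rng = random.Random(0)
records = []
for _ in range(5000):
    x = rng.randint(1, 20)
    g = rng.randint(1, 20)
    records.append((g, g < x))

table = expected_gain_table(records)
for g, (p_win, gain) in table.items():
    print(g, round(p_win, 2), round(gain, 2))
```

The guess with the highest expected gain per point wagered is then a reasonable candidate for $g$, independent of the wager size.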

If your end goal is to find out how a computer might play such a game, then one clear machine learning choice would be Reinforcement Learning (RL). It is not the only way for a computer to play a game - there are very many optimisation techniques for this. However, it is a machine learning approach - it learns from data observations - and in many variations it includes numerical analysis that you may find interesting, such as chance of winning from a certain position, or predicted future rewards.

RL is not a single algorithm, or even a single approach. Instead it is a way of framing a problem, and all ways of solving that kind of problem are considered to be RL. For RL to work, a problem needs to be framed as a Markov Decision Process (MDP). The good news is that your example game, and similar ones, are good examples of MDPs already.

In RL, the way that an agent plays a game is called its policy. The usual goals of RL include measuring the performance of a specific policy, or discovering the best policy.
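To make "measuring the performance of a specific policy" concrete, here is a minimal Monte Carlo sketch. It assumes $x$ is a uniform integer from 1 to 10 and that an episode lasts 10 turns; both are choices made for this example, as are the names `evaluate_policy`, `cautious` and `bolder`. A policy here is simply a function from current funds to a $(w, g)$ pair.

```python
import random


def evaluate_policy(policy, episodes=10_000, turns=10, start_funds=50, seed=0):
    """Estimate the average final funds when playing a fixed policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        funds = start_funds
        for _ in range(turns):
            w, g = policy(funds)
            x = rng.randint(1, 10)  # assumed distribution of the hidden number
            funds += w * g if g < x else -w
        total += funds
    return total / episodes


def cautious(funds):
    return (1, 2)  # always wager 1 and guess 2


def bolder(funds):
    return (1, 5)  # always wager 1 and guess 5


print(evaluate_policy(cautious), evaluate_policy(bolder))
```

Under these assumptions the expected gain per turn is $0.8 \cdot 2 - 0.2 = 1.4$ for the first policy and $0.5 \cdot 5 - 0.5 = 2.0$ for the second, so the estimates should come out close to 64 and 70.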

So, where to start? First, if you want to try an RL process on your game you should probably do a few things:

1. Understand what in the game is the State, what are the Actions and what are the Rewards.
   - The state is anything the agent knows about that can affect the outcome. Probably the only thing to worry about here is the current funds $f$, although this may not have a huge influence unless $w$ can always go up to $f$. Note that each distinct value of $f$ is treated as a different state.
   - The action is the agent's choice of both $w$ and $g$. Note that each specific combination is considered a different action. A large number of actions can make learning harder and require more advanced RL algorithms, so I recommend that you initially try variations of your game with a limited number of choices.
   - The most obvious choice of reward is the change to $f$ at the end of each turn. It doesn't have to be, though; it depends on what you consider a "winning condition" to be for the game. See below.
2. Simplify the game options a little. Set a small limit on the possible ranges of $x, w, g$. This will make it easier to try things out initially.
3. Set a clear goal that you want to achieve, and make sure that the game structure works for that. You may need to adjust the game representation for even a simple change such as "get the most reward in 10 turns", because that might encourage large wagers at certain steps, and the agent will need to know how many turns it has left (so your state becomes the combination $(f, t)$). Another variation that may work is to set the goal of getting to a certain amount of funds, such as 1000. In that case, the reward would not be the increase in $f$, but +1 for reaching the target and 0 for any other result. That might radically change the behaviour of the agent, which could be interesting to experiment with.
4. Look up simple Reinforcement Learning algorithms. I suggest you start with tabular methods, perhaps Monte Carlo Control or Q-Learning. They can typically be implemented in a few tens of lines in Python/Numpy.
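For a sense of scale, here is a minimal tabular Q-learning sketch in plain Python for a cut-down version of the game. Everything specific is an assumption made for the example: $x$ is a uniform integer from 1 to 9, wagers are limited to 1, 2 or 3, an episode is 10 turns, the reward is the change in $f$, and the state is only the turn index $t$ (wagers here do not depend on funds, so this is close to a bandit problem per turn; a fuller version would add $f$ to the state). The names `q_update` and `train` are illustrative.

```python
import random

X_MAX = 9                                  # assumed: x is uniform on 1..X_MAX
WAGERS = (1, 2, 3)                         # assumed small set of allowed wagers
GUESSES = tuple(range(1, X_MAX + 1))
ACTIONS = [(w, g) for w in WAGERS for g in GUESSES]
TURNS = 10                                 # assumed episode length


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """One Q-learning backup: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(episodes=20_000, alpha=0.05, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # q[t][a]: estimated total future reward for taking action index a at turn t.
    q = [[0.0] * len(ACTIONS) for _ in range(TURNS)]
    for _ in range(episodes):
        for t in range(TURNS):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                     # explore
            else:
                a = max(range(len(ACTIONS)), key=q[t].__getitem__)  # exploit
            w, g = ACTIONS[a]
            x = rng.randint(1, X_MAX)
            reward = w * g if g < x else -w
            done = t == TURNS - 1
            q_update(q, t, a, reward, t + 1, done, alpha, gamma)
    return q


if __name__ == "__main__":
    q = train()
    for t in (0, TURNS - 1):
        best = max(range(len(ACTIONS)), key=q[t].__getitem__)
        print("turn", t, "greedy (w, g):", ACTIONS[best])
```

Under the uniform assumption the expected reward per turn is $w\,g\,(8 - g)/9$, which is largest at $w = 3, g = 4$, so the greedy action should tend towards that. Because the rewards are noisy and the step size is constant, a run may settle on a neighbouring guess instead.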

A useful resource for RL is Sutton & Barto's Reinforcement Learning: An Introduction. The draft of the second edition is free to download.
