Learning a board game using a genetic neural network

I've never really done any practical machine learning; this is just a hobby for me.

I'm trying to create a process using a neural network to learn the board game 7 Wonders. Here's how I want this experiment to be done:

  1. Take all inputs (I've calculated 1278 of them to start with).
  2. Send the inputs through a neural network with an arbitrary number of hidden layers and randomly initialized weights, and calculate values for all 231 possible actions (only some of which are valid at any given time; if the highest-valued action is invalid, pick the next best valid output by Q-value).
  3. For an initial round, keep the same hidden layers with their biases/weights fixed for the entirety of each game, play 100 games, and take the average score as the fitness this set of biases/weights produces.
  4. Mutate the hidden layers' biases/weights by a small amount.
  5. Run another 100 games using the new set of biases/weights. If this set produces better scores, use this new set as the next generation (a new successful generation). Otherwise, mutate a new set based on the original parent.

As I understand this, what I've described so far is stochastic gradient descent. This can get stuck in suboptimal minima, so to avoid that I want to introduce new parents to mate, each produced by the same process. After an arbitrary number of generations for two parents, I'd combine their biases/weights and reapply the process above to the new child, repeating until a child is produced that almost always wins games.
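In Python-ish terms, the loop I have in mind looks roughly like this (the network shape, mutation size, and the `play_game` function that simulates a full game of 7 Wonders are placeholders I'd still have to fill in, and biases are omitted for brevity):

```python
import numpy as np

N_INPUTS, N_HIDDEN, N_OUTPUTS = 1278, 64, 231  # hidden size is an arbitrary choice

def random_weights():
    # one hidden layer to start: input->hidden and hidden->output matrices
    return [np.random.randn(N_INPUTS, N_HIDDEN) * 0.1,
            np.random.randn(N_HIDDEN, N_OUTPUTS) * 0.1]

def play_game(weights):
    # placeholder: simulate one full game of 7 Wonders with these weights
    # and return the final score
    raise NotImplementedError

def evaluate(weights, n_games=100):
    # fitness = average score over n_games complete games (step 3)
    return float(np.mean([play_game(weights) for _ in range(n_games)]))

def mutate(weights, sigma=0.05):
    # perturb every weight by a small amount of Gaussian noise (step 4)
    return [w + np.random.randn(*w.shape) * sigma for w in weights]

parent = random_weights()
parent_fitness = evaluate(parent)
for generation in range(1000):
    child = mutate(parent)
    child_fitness = evaluate(child)
    if child_fitness > parent_fitness:   # keep the child only if it scores better (step 5)
        parent, parent_fitness = child, child_fitness
```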

My questions are as follows:

  1. Does this experiment as I've described make sense?
  2. I'm trying to write this all in Python. I've been trying to use sklearn's MLPClassifier, but I can't seem to figure out how to randomly initialize my hidden layer, nor how to manipulate it manually. I'm not trying to train an MLP with a supervised approach (as far as I understand), which is what this class in sklearn seems to be used for. Is this a good library to use? Does anyone have a suggestion on a different library for this?

Tags: game, neural-network, python

Category: Data Science


As I understand this, what I've described so far is stochastic gradient descent.

Without any way to generate a gradient, or any mention of using gradients, this is not gradient descent. Your choice of the word "mutate", plus terms like "parents" and "mate", leads me to believe that you want to train your neural networks using some kind of genetic algorithm (GA). In general, using a GA like this is not gradient descent but an alternative to it.

Using GAs to train neural networks is viable, and is often chosen for exactly the reasons you describe: you don't have a dataset of correct outputs for the neural network, but you can rank networks as better or worse by some score, and the goal is to maximise that score.

One algorithm that does close to what you want is NEAT, and there is an implementation of it called NEAT-Python.
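For orientation, the core loop of NEAT-Python looks roughly like the sketch below. The config file name and the `score_game` function (which would play some games with the candidate network and return an average score) are placeholders you would need to supply, so treat this as the shape of the code rather than something runnable as-is:

```python
import neat  # pip install neat-python

def score_game(net):
    # placeholder: play one or more games, calling net.activate(inputs) on the
    # 1278 game-state inputs to choose among the 231 actions, and return the
    # average final score
    raise NotImplementedError

def eval_genomes(genomes, config):
    # NEAT-Python calls this once per generation; fitness is whatever you assign
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        genome.fitness = score_game(net)

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     "neat_config.ini")  # a config file you write (input/output sizes, population size, ...)
population = neat.Population(config)
winner = population.run(eval_genomes, 300)  # evolve for up to 300 generations
```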

Some things to bear in mind:

  • NEAT works best when it is allowed control over the neural network architecture as well as the weights. This is its main innovation over a simpler hand-rolled GA/neural-network combination.

  • NEAT may not scale well to complex problems with many inputs and outputs. You won't find many NEAT-based chess or Go playing bots.

With this in mind, you may find other approaches to self-play learning and game playing agents more productive:

  • Tree search algorithms. For a two-player zero-sum game with perfect information you might be able to use Negamax (see the sketch after this list), or for a more advanced approach you could use Monte Carlo Tree Search.

  • Reinforcement learning (RL). This is a large topic to study, but in short this will give you a way to use gradient descent to train the neural network. The RL part would be an "outer" component that collects data that is then used in a similar way to supervised learning. There are plenty of pre-made and example agents for RL for most neural network frameworks.
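To make the tree-search suggestion concrete, here is a minimal negamax sketch. The `GameState` interface it assumes (`is_terminal`, `score`, `legal_moves`, `apply`) is hypothetical and would have to be implemented for your game; note that plain negamax only fits two-player zero-sum perfect-information games:

```python
def negamax(state, depth):
    # Return the best achievable score for the player to move, searching `depth` plies.
    # Assumes a hypothetical GameState with is_terminal(), score() (from the
    # point of view of the player to move), legal_moves() and apply(move).
    if depth == 0 or state.is_terminal():
        return state.score()
    best = float("-inf")
    for move in state.legal_moves():
        # whatever is good for the opponent is bad for us, hence the negation
        best = max(best, -negamax(state.apply(move), depth - 1))
    return best

def best_move(state, depth=4):
    # pick the move whose resulting position is worst for the opponent
    return max(state.legal_moves(),
               key=lambda move: -negamax(state.apply(move), depth - 1))
```

In practice you would add alpha-beta pruning and a heuristic evaluation for non-terminal positions, but the core idea is just this recursion.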

The famous Alpha Zero uses both these approaches in combination, and is actually a relatively simple algorithm at its core. There is a Python implementation of the general agent that you could use and read code from.

To address your questions more directly:

Does this experiment as I've described make sense?

Yes, you have described a well-known approach to training neural networks when the problem to solve is a task that can be evaluated, as opposed to a direct function to approximate. Other approaches are possible, and may be more suited to your game (just a guess, it depends on the game).

I'm trying to write this all in Python. I've been trying to use sklearn's MLPClassifier [...] Is this a good library to use? Does anyone have a suggestion on a different library for this?

You probably want something lower-level so that you have direct access to the variables that contain the weights. I would suggest using TensorFlow/Keras, or PyTorch. Both of these also have examples that include game-playing agents that train via self-play and RL, in case you decide to change your approach.
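As a sketch of the kind of direct weight access you would need, here is how you could build a small network in PyTorch and nudge its weights with Gaussian noise; the layer sizes and mutation scale are arbitrary choices for illustration, not recommendations:

```python
import torch
import torch.nn as nn

# a small feed-forward network: 1278 game-state inputs -> 231 action values
model = nn.Sequential(
    nn.Linear(1278, 256),
    nn.ReLU(),
    nn.Linear(256, 231),
)

def mutate_(model, sigma=0.02):
    # add small Gaussian noise to every weight and bias, in place
    with torch.no_grad():
        for param in model.parameters():
            param.add_(torch.randn_like(param) * sigma)

def action_values(model, state_vector):
    # state_vector: 1-D tensor of length 1278 encoding the current game state
    with torch.no_grad():
        return model(state_vector)

mutate_(model)
q = action_values(model, torch.zeros(1278))
print(q.shape)  # torch.Size([231])
```

From there, your evolutionary loop is just: copy the model (e.g. with `copy.deepcopy`), mutate the copy, evaluate both by playing games, and keep the fitter one.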

An example of a Python library that applies genetic algorithms to train TensorFlow or PyTorch neural networks is PyGAD; you could either use it directly or study how it works and adapt the approach for your game.
