A Neural Network that plays TicTacToe

[Diagram of a BPN neural network]

Hungarian translation courtesy of Zsolt Boros

Estonian translation courtesy of Johanne Teerink

Turkish translation courtesy of Zoltan Solak

I find the concept of neural networks quite fascinating, but what could I really do with one? I had developed a couple of different types of networks and played with them, doing the simple classical experiments, such as teaching a Back Propagation Network to recognize the alphabet with each letter written on a simple 5 by 7 grid. I ran variations on those letters through the trained network, and sure enough, it recognized most of them. That's all been done before. What could I do differently?

Then it came to me. Now that I had An Invincible Genetic Algorithm for playing Xs and Os, I could use this to teach a neural network the game. I had reduced the complexity of the game down to 17 boards in a -database- for the O player, and a corresponding table of 17 responses, one for each position. For a BPN neural network, the task of memorizing those 17 boards and their responses should be simpler than memorizing the alphabet. It would need an input layer of only 18 units and an output layer of only 9 units. I chose 15 units for the hidden layer. Perhaps this would be the world's simplest neural net capable of playing TicTacToe.

Now there are nine squares on a board, and each can take on the value of X, O, or blank. The network would need 9 pairs of inputs to 'see' a board. Each square then would be attached to an input pair. Input A of the pair would be activated for an X, say, and input B activated if there was an O, and neither input activated in the case of a blank square.

There would be 9 outputs. The neural network would be trained by being presented each board in the database of 'exemplars' at its inputs, and required to activate the correct one of the 9 outputs, namely the response the genetic algorithm had evolved for that particular pattern.

In the end, it was simple. The network converged on the data set on the first try. The BPN network learned all the boards to a high degree of confidence.
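As a rough illustration of this kind of 18-15-9 back propagation network, here is a hypothetical sketch using NumPy, with sigmoid units, squared-error gradients, and plain gradient descent on a single toy exemplar. The original program's actual implementation (Skapura's) surely differs in its details:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 18-15-9 BPN: small random starting weights.
W1 = rng.uniform(-0.5, 0.5, (18, 15))   # input -> hidden
W2 = rng.uniform(-0.5, 0.5, (15, 9))    # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(inputs, targets, epochs=2000, lr=0.5):
    global W1, W2
    for _ in range(epochs):
        # forward pass
        h = sigmoid(inputs @ W1)
        out = sigmoid(h @ W2)
        # backward pass: squared-error deltas through the sigmoid derivative
        d_out = (out - targets) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        W1 -= lr * inputs.T @ d_h

# Toy exemplar (not from the real 17-board database): X in the centre
# square (square 4, input pair 8/9), with square 0 as O's trained response.
x = np.zeros((1, 18)); x[0, 8] = 1.0   # input A of square 4 is active
t = np.zeros((1, 9));  t[0, 0] = 1.0   # one-hot target: respond in square 0
train(x, t)
out = sigmoid(sigmoid(x @ W1) @ W2)
print(out.argmax())   # the most active output is square 0 after training
```

With only 17 boards (and later a few dozen) to memorize, a network of this size converges quickly, which matches the experience reported above.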

In the first version of the game, I rotated and/or reflected each board to match the configuration in the training set before sending it to the neural net. Of course it then played a perfect game. I quickly realized that this was really dumb. I was using the neural net as a database where a simple lookup table would do. This demonstrated nothing of the purpose or power of a neural network beyond the fact that I had succeeded in training it.

The whole idea of using a BPN neural network is to employ its pattern recognition capability. One presents it with some examples of the patterns one wants it to learn to recognize, and the neural net extrapolates from that set of exemplars to classify an unknown pattern.

How could I make this a worthwhile exercise? I decided it would be interesting to see if the neural network might be able to recognize the pattern of a game situation it was trained on even when that board was rotated by some degree.

Consider a pattern of Xs and Os on the game board. You can rotate that board 90, 180, and 270 degrees. It will still be exactly the same game situation, but depending on the degree of symmetry, the pattern will appear different with each rotation. Similarly, you can reflect that board in a mirror (a 180 degree lateral rotation in the Z plane) and the pattern will look different (if it doesn't have bilateral symmetry). Furthermore, after reflecting the board you may be able to do three more 90 degree rotations in the X, Y plane, again depending on the lack of symmetry in the pattern. In total there are up to eight different views possible of the same board.
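The eight views can be enumerated mechanically: the four rotations of the board plus the four rotations of its mirror image. A small sketch (function names and board representation are my own):

```python
# Enumerate the up-to-eight views of a 3x3 board, stored row by row
# at indices 0..8. Symmetric boards produce fewer distinct views,
# which is why a set is used.

def rotate(board):
    """Rotate the board 90 degrees clockwise."""
    return tuple(board[i] for i in (6, 3, 0, 7, 4, 1, 8, 5, 2))

def mirror(board):
    """Reflect the board left to right."""
    return tuple(board[i] for i in (2, 1, 0, 5, 4, 3, 8, 7, 6))

def all_views(board):
    """Return the set of distinct views: rotations of board and its mirror."""
    views = set()
    for b in (tuple(board), mirror(board)):
        for _ in range(4):
            views.add(b)
            b = rotate(b)
    return views

# A board with no symmetry yields all eight distinct views.
print(len(all_views("XO  X   O")))  # → 8
```

Training on only some of these views and testing on the rest is exactly the generalization experiment described next.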

I trained the neural net on only two views, the view of the board as it happened to appear in the database, and the reflected view. These two views represented only one quarter of all possible views. Would the neural net be able to classify correctly a board rotated to a position it had never seen it in before?

The answer was yes, though with such limited examples of a very complex notion, it was clearly quite a struggle. The recognition rate was only 30%, but this was twice as good as chance alone could account for. This was also good enough to tip the balance in favour of the neural network winning. With two players making random moves, the player who goes first has a big advantage over the other, and wins at a rate of about two games to one. My neural net played O, the player designated to go second, and yet beat X more than half of the time.

Now you will realize that whenever the board just happens to be in the same configuration as in the training set, the neural net never makes a mistake. To see what kind of advantage this alone gives, I ran an experiment: when the board matched the training set, I used the response from the neural net, and when the board was in a position the neural net had never seen, I simply made a random move. Tests showed that against a random player this is not good enough to give the neural net the advantage, and his opponent (who always goes first) wins about 53% of all games. With the trained neural network also recognizing 30% of the patterns it had never seen, the neural net wins about 52% of the time, eliminating X's big advantage of the first move.

For the next version, I increased the training set of exemplars to give yet another view of the original list of boards. Now besides the original 17 boards and their 180 degree lateral rotations in the Z plane, I added a 180 degree vertical rotation in the Z plane. To put that another way, I had the original board list, their mirror images, and their inversions. Beyond this, I added all 9 possible first X moves (O's response to the first move being of supreme importance). Again, the network converged on this expanded set of exemplars.

In another 100,000 game tournament with the same random player, this time O won 87% of all games won by either side, leaving only 13% of the wins for X - the random player who always goes first.

Of all moves sent to the neural net, 41% were exactly the same positions the neural net was trained on, and thus a direct lookup. However, 58% of all moves sent to the neural net were rotations he had never seen before. Of these boards, the neural net now responded with the correct move a whopping 81% of the time. Of the 19% he failed to classify, more than half were illegal moves and had to be replaced by random moves. Also, a very small number - about 1% - of boards generated during the games were boards for which the neural net had never received any kind of training, and a random response was generated in these cases as well. In total then, about 7% of all moves are actually random moves.
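The fallback described above, replacing an illegal network move with a random legal one, might look something like this. This is a hypothetical sketch; `net_output` and `choose_move` are my own names, standing in for the real nine output activations and the game driver:

```python
import random

def choose_move(board, net_output):
    """board: 9 squares ('X', 'O', ' '); net_output: 9 output activations.
    Take the network's preferred square; if that square is already occupied
    (an illegal move), fall back to a random legal move instead."""
    best = max(range(9), key=lambda i: net_output[i])
    if board[best] == ' ':
        return best                      # network's choice is legal
    legal = [i for i in range(9) if board[i] == ' ']
    return random.choice(legal)          # replace illegal move at random

board = ['X', ' ', 'O',
         ' ', 'X', ' ',
         ' ', ' ', 'O']
output = [0.9, 0.1, 0.0, 0.2, 0.0, 0.1, 0.3, 0.1, 0.0]
print(choose_move(board, output))  # square 0 is taken, so a random legal square
```

The same random fallback handles the roughly 1% of boards the network was never trained on in any orientation, which is how the total of about 7% random moves arises.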

I then put the network code into a DLL, saved the weights gained from training in a file so that the network could be reconstituted, and whipped up a little interface so you can see what it's like to play against the neural network. He is a wily opponent! There is also an option to play against the genetic algorithm, but you will quickly discover it is futile to even think about beating that. I have also added the option to play against the random number generator. To make it more interesting, an algorithm was added to ensure that the player gets no simple wins and that the computer never misses a chance to throw a simple block. This 'opponent' is the very same one used to grow the genetic algorithms and test the neural networks.

You can download the application -here- [vers. 1.2 - 43 Kb]. Just unzip the files into a folder and click on the exe to play. There is nothing to install and nothing to clutter your Windows registry. I used a standard implementation of the BPN network by David M. Skapura, the author of an excellent book I read on neural networks. The source code for this is available -here- [6 Kb].
