This post is the sixth in the series of posts focused on the design and implementation of a port in ClojureScript of the game named Tribolo. You can try the game at the following address.
Our first post described the game and identified the following responsibilities, each of them will be developed in a different post:
- Game logic: the rules governing transitions between game states
- Rendering: transforming the current game state into HTML and SVG
- Store: the in-memory data-base holding the application state
- Game loop: coordinates and control of the different aspects
- AI logic: selection of the best next move for an AI player
Among these different parts, we already implemented the game logic as well as the rendering and the game loop. The goal of this post will be to implement a decent artificial intelligence for our Tribolo game.
Reminder on the target architecture
Our different responsibilities are interconnected through the following high level architecture (you can refer to our first post for more details):
Store ----------------> Rendering ---------------> Reagent ^ (reaction) ^ (send hiccup) | | | | (listen) | (update) | \-------------------- Game loop | | ,-------------------------| (use to update store) | | v v Game logic <-------------- AI logic
The last missing piece to complete the game is our AI. Until now, we assumed the existence of a function find-best-move that would implement this AI.
This function takes as input a game and returns what it believes to be the best coordinate for the current player to play at.
The goal of today is to find a satisfying implementation for this function.
Designing our AI
Before diving in the implementation, we will start by thinking a bit about what we want the achieve, and how we want to achieve it.
Choice of the weapon
There are many different types of games. Some games naturally encourage specific pattern for the development of their associated AI. Our game follows a structure such that it is easily possible to build a Game Tree for it.
- A turn represents a valid point in time in a complete game
- A transition is a valid progression from one turn a next turn
One overall good strategy to explore a Game Tree is to use the Minimax algorithm. This algorithm was originally created to deal with two player games, so we will need to adapt it to make it work for the 3 players of Tribolo.
No Alpha – No Beta
The issue with the Minimax algorithm is the combinatorial explosion due to the high branching factor of the Game Tree. One good complement that allows to reduce the search tree is to use Alpha Beta pruning.
As for the Minimax, this algorithm was originally developed to deal with two player games. But unlike the Minimax, the Alpha Beta pruning is pretty hard to adapt to deal with three player games. You can check this paper for more details.
In addition to this, the rules of Tribolo are such that there is no clear alternating sequence of players. A transition might make another player unable to play, skipping his turn.
This means the next turns might not have the same player. Since the player drives whether we should minimize or maximise, this in turn means that there is no guarantied interleaving of min and max phases.
All these concerns conspire to make it really hard to come up with an implementation of Alpha-Beta pruning for Tribolo. So we will have to be do with a Minimax implementation, adapted for three players, without branch pruning.
High level AI
Our Minimax algorithm will not be able to go very deep in the Game Tree. But we might be able to offset this by choosing good evaluation functions. An evaluation function is defined in Wikipedia as a function used by game-playing programs to estimate the value or goodness of a position in the minimax algorithm.
In general though, there is no single evaluation function that will fit all situations best. One common approach is to select different evaluation functions depending on the stage of the game.
- Are we in in the early game? Or in the late game?
- Is the AI in a specific dangerous situation?
- Is the opponent in a specific dangerous situation?
- Is the AI loosing? Or is it winning?
Our AI will try to adopt different evaluation functions to deal with some specific situations. We will call these AI strategies. Selecting the right strategy can have a big impact on how good the AI performs.
One obvious evaluation function for Tribolo is to use the score of the game. Namely, we could optimize for the number of cells owned by the player. This is a good strategy for the late game especially, in which we only care about winning.
Among the specific situation, there are cases in which a player is in great danger of not being able to play a single move in the following turns. We will call this situation being trapped. We might change the evaluation function to maximize the number of available transitions when such a situation seems to come at horizon. This strategy is very useful in early or middle game, but much less useful in late game.
Last but not least, we can make the AI more aggressive against the winning player. To compensate for the lack of pruning in our Minimax implementation, the AIs will both ally against the player (to ruin his score) when the player is winning. This strategy is valid in all stages of the game.
These three strategies are the ones we will implement for our Tribolo game. Our design will need to support them (and an easy way to add more in general).
We determined that our algorithm of choice will be a variation on Minimax. We need to adapt the simple Minimax to include the support for:
- Three players instead of two players
- Non standard alternation of players
- Different evaluation functions
AI Strategy Protocol
Based on the fact players might alternate in different patterns, and based on the fact we want to support AI alliance against the player, the MIN and MAX parts of the Minimax algorithm will need to be parameterized.
In addition, we have to support different kind of evaluation functions. This leads us to define a protocol for an evaluation strategy that contains two methods:
- eval-turn represents the minimax evaluation function for a turn
- maximizing? indicates whether or not we want to min or max the scores
We call this protocol AIStrategy:
For instance, a strategy where both AI play against the human player would provide an implementation of maximizing? that would return true for turns played by the AI and false otherwise. It would naturally lead to both AI helping each other.
Our next task is to implement a variant of a fixed depth Minimax implementation, parameterized by an implementation of the AIStrategy protocol we defined in the previous section.
This algorithm can be described by a nice recursion. The base case is when we reach a leaf of a search tree (the maximum depth we want to search for, or the end of the game), in which case we apply the eval-turn to give a score to the turn. Otherwise, we recur and apply a min or a max on the scores returned by each of the sub-trees.
This can be summarized with the following piece of code, which makes use of the function minimax-step to implement the recursion (which we will see after):
The implementation of minimax-step follows the open recursion technic. Instead of having minimax calling itself directly, we provide to minimax-step the function to call to recur. Thanks to this, we will be able to reuse this open recursive definition to implement our AI Strategies later.
Our minimax-step function will also be parameterizable with different min and max implementations. This flexibility will also help us a lot later.
Note 1: We do not need to worry about blowing the stack with such recursions. The search tree grows exponentially with the search depth, so we will never be able to search very deep anyway.
Note 2: If you are curious about open recursion and its different usage, you can read more about it in this previous article in which we used it to help with the implementation memoization.
Top level search
Our minimax function attributes a score to an turn by looking at the game tree starting from it. To find our best move, we need to call our minimax function for each reachable turn from our initial turn.
We should be able to re-use our minimax-step with one twist. As we need to keep track of the coordinate to play, we want to return tuples containing both a move and a score and maximize by score alone.
To customize the comparison, we create a variant minimax-step-by that accepts a projection for the comparison:
Using this minimax-step-by, we can implement our top search game-tree-search function. It takes an AIStrategy, a turn from which to start the search, and a maximum depth search. It returns what it believes to be the best move to play.
High level AI
Our Minimax-based AI algorithm is now ready and implemented by game-tree-search. Our next job is to implement useful AIStrategies to use in this game tree search.
Our first AIStrategy consists in optimizing the score of a chosen player. Because our turns contains the transitions to the next turns, we can be clever and use this data to go one level deeper in the search.
We will use our open recursive minimax-step function, but instead of calling our minimax recursively, we provide it with a function that computes the scores taking into account the transition. This computation is much cheaper than accounting for the whole transition.
We extract this logic inside the eval-next-score-of function, that takes as input the players whose scores should be summed and optimized for.
Based on this previous function, we can easily derive the implementations of our two AI that care about the scores:
- The optimize-own-score-ai which optimize the score of the playing AI
- The optmize-ai-scores-ai which optimize the scores of both AI (allied)
Our second family of AIStrategy is about avoiding being trapped with no move left to play. The corresponding evaluation function will maximize the number of available transition instead of the score. This is pretty simple to implement:
Assembling the pieces
Both our Minimax algorithm and our AI evaluation functions are ready. We only need to plug these parts together.
We first implement a High Level AI that will select the appropriate AI strategy to use depending on the game configuration.
Note: the details of the different predicates used in the function above are available in the following GitHub Gist.
Our last task is to complete our find-best-move function, which was the goal of this post. This functions is the only public function of the whole AI namespace and does:
- Select the appropriate AI strategy using the high-level-ai function
- Run a game tree search using game-tree-search and the selected AI
Note: The resulting search is performed with a depth of 2 which is quite low. In reality, because our score evaluation function also looks ahead, we get a search tree of depth 3. This strikes a balance between the need for the AI to run on the Web browser of low power devices (tablets or smart phones) and the difficulty.
We could have chosen different depth for different devices, but it would make the game harder on PC than on smartphones. Apart from being annoying to implement, it is not quite fair. To go deeper, we would instead need to add pruning to our game tree.
With this last post focusing on the implementation of the artificial intelligence, our Tribolo game is now fully complete. You can check the source code in the following GitHub repository.
I hope you did enjoy the series and the game that we built through it. I hope it did give you some ideas, made you discover at least one thing, or was as much fun as it was for me to implement it.
I would like to thank Manwe56 for our discussions on how to build an smart enough AI. His great insights helped me in designing my first ever artificial intelligence. In particular, I owe him the concept of high level AI, which helped me tremendously.
Manwe56 is regularly present in the top 1% of some of the major competitive programming contests. He is also the author of the competitive-programming GitHub repository that contains some generic implementation of the most common AI algorithms for both Java and C++.