Building a Clojurescript game: AI

This post is the sixth in the series of posts focused on the design and implementation of a port in ClojureScript of the game named Tribolo. You can try the game at the following address.

Our first post described the game and identified the following responsibilities, each of them will be developed in a different post:

Game logic: the rules governing transitions between game states
Rendering: transforming the current game state into HTML and SVG
Store: the in-memory data-base holding the application state
Game loop: coordinates and control of the different aspects
AI logic: selection of the best next move for an AI player

Among these different parts, we already implemented the game logic as well as the rendering and the game loop. The goal of this post will be to implement a decent artificial intelligence for our Tribolo game.

Reminder on the target architecture

Our different responsibilities are interconnected through the following high level architecture (you can refer to our first post for more details):

   Store ----------------> Rendering ---------------> Reagent
     ^      (reaction)         ^      (send hiccup)
     |                         |
     |                         | (listen)
     |       (update)          |
     \-------------------- Game loop
                               |
                               |
     ,-------------------------| (use to update store)
     |                         |
     v                         v
 Game logic <-------------- AI logic

The last missing piece to complete the game is our AI. Until now, we assumed the existence of a function find-best-move that would implement this AI.

This function takes as input a game and returns what it believes to be the best coordinate for the current player to play at.

	(s/fdef find-best-move
	:args (s/cat :game ::game/game)
	:ret ::board/coord)

view raw triboard.game-loop.ai-expectations.cljs hosted with ❤ by GitHub

The goal of today is to find a satisfying implementation for this function.

Designing our AI

Before diving in the implementation, we will start by thinking a bit about what we want the achieve, and how we want to achieve it.

Choice of the weapon

There are many different types of games. Some games naturally encourage specific pattern for the development of their associated AI. Our game follows a structure such that it is easily possible to build a Game Tree for it.

A Game Tree is defined as a directed graph whose nodes are positions in a game and whose edges are moves in Wikipedia. This is pretty much the structure followed by our Game Logic API:

A turn represents a valid point in time in a complete game
A transition is a valid progression from one turn a next turn

One overall good strategy to explore a Game Tree is to use the Minimax algorithm. This algorithm was originally created to deal with two player games, so we will need to adapt it to make it work for the 3 players of Tribolo.

No Alpha – No Beta

The issue with the Minimax algorithm is the combinatorial explosion due to the high branching factor of the Game Tree. One good complement that allows to reduce the search tree is to use Alpha Beta pruning.

As for the Minimax, this algorithm was originally developed to deal with two player games. But unlike the Minimax, the Alpha Beta pruning is pretty hard to adapt to deal with three player games. You can check this paper for more details.

In addition to this, the rules of Tribolo are such that there is no clear alternating sequence of players. A transition might make another player unable to play, skipping his turn.

This means the next turns might not have the same player. Since the player drives whether we should minimize or maximise, this in turn means that there is no guarantied interleaving of min and max phases.

All these concerns conspire to make it really hard to come up with an implementation of Alpha-Beta pruning for Tribolo. So we will have to be do with a Minimax implementation, adapted for three players, without branch pruning.

High level AI

Our Minimax algorithm will not be able to go very deep in the Game Tree. But we might be able to offset this by choosing good evaluation functions. An evaluation function is defined in Wikipedia as a function used by game-playing programs to estimate the value or goodness of a position in the minimax algorithm.

In general though, there is no single evaluation function that will fit all situations best. One common approach is to select different evaluation functions depending on the stage of the game.

Are we in in the early game? Or in the late game?
Is the AI in a specific dangerous situation?
Is the opponent in a specific dangerous situation?
Is the AI loosing? Or is it winning?

Our AI will try to adopt different evaluation functions to deal with some specific situations. We will call these AI strategies. Selecting the right strategy can have a big impact on how good the AI performs.

AI Strategies

One obvious evaluation function for Tribolo is to use the score of the game. Namely, we could optimize for the number of cells owned by the player. This is a good strategy for the late game especially, in which we only care about winning.

Among the specific situation, there are cases in which a player is in great danger of not being able to play a single move in the following turns. We will call this situation being trapped. We might change the evaluation function to maximize the number of available transitions when such a situation seems to come at horizon. This strategy is very useful in early or middle game, but much less useful in late game.

Last but not least, we can make the AI more aggressive against the winning player. To compensate for the lack of pruning in our Minimax implementation, the AIs will both ally against the player (to ruin his score) when the player is winning. This strategy is valid in all stages of the game.

These three strategies are the ones we will implement for our Tribolo game. Our design will need to support them (and an easy way to add more in general).

Minimax implementation

We determined that our algorithm of choice will be a variation on Minimax. We need to adapt the simple Minimax to include the support for:

Three players instead of two players
Non standard alternation of players
Different evaluation functions

AI Strategy Protocol

Based on the fact players might alternate in different patterns, and based on the fact we want to support AI alliance against the player, the MIN and MAX parts of the Minimax algorithm will need to be parameterized.

In addition, we have to support different kind of evaluation functions. This leads us to define a protocol for an evaluation strategy that contains two methods:

eval-turn represents the minimax evaluation function for a turn
maximizing? indicates whether or not we want to min or max the scores

We call this protocol AIStrategy:

	(defprotocol AIStrategy
	(eval-turn
	[this turn]
	"Heuristic to score the value of a turn")
	(maximizing?
	[this turn]
	"Whether the turn should minimize or maximize the transitions"))

view raw triboard.ai.ai-strategy.cljs hosted with ❤ by GitHub

For instance, a strategy where both AI play against the human player would provide an implementation of maximizing? that would return true for turns played by the AI and false otherwise. It would naturally lead to both AI helping each other.

Minimax

Our next task is to implement a variant of a fixed depth Minimax implementation, parameterized by an implementation of the AIStrategy protocol we defined in the previous section.

This algorithm can be described by a nice recursion. The base case is when we reach a leaf of a search tree (the maximum depth we want to search for, or the end of the game), in which case we apply the eval-turn to give a score to the turn. Otherwise, we recur and apply a min or a max on the scores returned by each of the sub-trees.

This can be summarized with the following piece of code, which makes use of the function minimax-step to implement the recursion (which we will see after):

	(defn minimax
	"Implements the minimax recursion:
	* Call the leaf node evaluation if the depth is zero
	* Otherwise goes one level deeper"
	[ai turn depth]
	(if (or (zero? depth) (turn/game-over? turn))
	(eval-turn ai turn)
	(minimax-step ai turn
	(fn [_ transition]
	(minimax ai
	(turn/next-turn turn transition)
	(dec depth))))))

view raw triboard.ai.minimax.cljs hosted with ❤ by GitHub

The implementation of minimax-step follows the open recursion technic. Instead of having minimax calling itself directly, we provide to minimax-step the function to call to recur. Thanks to this, we will be able to reuse this open recursive definition to implement our AI Strategies later.

Our minimax-step function will also be parameterizable with different min and max implementations. This flexibility will also help us a lot later.

	(defn minimax-step
	"One stage of the minimax algorithm:
	* Apply the maximizing or mininizing step to all transitions of the turn
	* Evaluate the lower level using the on-transition function"
	[ai turn open-recur
	& {:keys [max-fn min-fn]
	:or {max-fn max, min-fn min}}]
	(apply
	(if (maximizing? ai turn) max-fn min-fn)
	(map
	(fn [[coord transition]] (open-recur coord transition))
	(turn/transitions turn))))

view raw triboard.ai.minimax-step.cljs hosted with ❤ by GitHub

Note 1: We do not need to worry about blowing the stack with such recursions. The search tree grows exponentially with the search depth, so we will never be able to search very deep anyway.

Note 2: If you are curious about open recursion and its different usage, you can read more about it in this previous article in which we used it to help with the implementation memoization.

Top level search

Our minimax function attributes a score to an turn by looking at the game tree starting from it. To find our best move, we need to call our minimax function for each reachable turn from our initial turn.

We should be able to re-use our minimax-step with one twist. As we need to keep track of the coordinate to play, we want to return tuples containing both a move and a score and maximize by score alone.

To customize the comparison, we create a variant minimax-step-by that accepts a projection for the comparison:

	(defn minimax-step-by
	"One stage of the minimax algorithm, with custom comparison functions"
	[key-fn ai turn open-recur]
	(minimax-step
	ai turn open-recur
	:min-fn (partial min-key key-fn)
	:max-fn (partial max-key key-fn)))

view raw triboard.ai.minimax-step-by.cljs hosted with ❤ by GitHub

Using this minimax-step-by, we can implement our top search game-tree-search function. It takes an AIStrategy, a turn from which to start the search, and a maximum depth search. It returns what it believes to be the best move to play.

	(defn- game-tree-search
	"The top level of the minimax algorithm
	* Trigger sub-game-trees minimax evaluations
	* Remember the transition that led to the max"
	[ai turn depth]
	(first
	(ai-algo/minimax-step-by
	second ai turn
	(fn [coord transition]
	(let [new-turn (turn/next-turn turn transition)]
	[coord (ai-algo/minimax ai new-turn depth)]))
	)))

view raw triboard.ai.tree-search.cljs hosted with ❤ by GitHub

High level AI

Our Minimax-based AI algorithm is now ready and implemented by game-tree-search. Our next job is to implement useful AIStrategies to use in this game tree search.

Optimizing scores

Our first AIStrategy consists in optimizing the score of a chosen player. Because our turns contains the transitions to the next turns, we can be clever and use this data to go one level deeper in the search.

We will use our open recursive minimax-step function, but instead of calling our minimax recursively, we provide it with a function that computes the scores taking into account the transition. This computation is much cheaper than accounting for the whole transition.

We extract this logic inside the eval-next-score-of function, that takes as input the players whose scores should be summed and optimized for.

	(defn- eval-next-score-of
	"Perform a last minimax step on the next scores following the turn
	* Allows to see one level deeper for simple scoring strategies
	* While being fast (the transition does not need to be followed)"
	[ai {:keys [scores] :as turn} players]
	(minimax/minimax-step ai turn
	(fn [_ transition]
	(let [scores (scores/update-scores scores transition)]
	(apply + (map #(get scores %) players)))
	)))

view raw triboard.ai.strategies.eval-next-score-of.cljs hosted with ❤ by GitHub

Based on this previous function, we can easily derive the implementations of our two AI that care about the scores:

The optimize-own-score-ai which optimize the score of the playing AI
The optmize-ai-scores-ai which optimize the scores of both AI (allied)

	(defn optimize-own-score-ai
	"Create an AI strategy to optmize its own score: good late game play"
	[player]
	(reify minimax/AIStrategy
	(eval-turn [this turn] (eval-next-score-of this turn [player]))
	(maximizing? [_ turn] (= (:player turn) player))
	))

	(defn optmize-ai-scores-ai
	"Create an AI strategy to optmize the AI scores: good cheat when the player wins"
	[]
	(reify minimax/AIStrategy
	(eval-turn [this turn] (eval-next-score-of this turn [:red :green]))
	(maximizing? [_ turn] (not= (:player turn) :blue))
	))

view raw triboard.ai.strategies.max-ai-scores.cljs hosted with ❤ by GitHub

Avoiding traps

Our second family of AIStrategy is about avoiding being trapped with no move left to play. The corresponding evaluation function will maximize the number of available transition instead of the score. This is pretty simple to implement:

	(defn optimize-own-choices-ai
	"Create an AI strategy to optmize the AI choices: avoid being trapped with no moves left"
	[player]
	(reify minimax/AIStrategy
	(eval-turn [_ turn]
	(count (get (transition/all-transitions (:board turn)) player)))
	(maximizing? [_ turn] (= (:player turn) player))
	))

view raw triboard.ai.strategies.avoid-traps.cljs hosted with ❤ by GitHub

Assembling the pieces

Both our Minimax algorithm and our AI evaluation functions are ready. We only need to plug these parts together.

We first implement a High Level AI that will select the appropriate AI strategy to use depending on the game configuration.

	(defn- high-level-ai
	"High level AI: choose the right evaluation function to play"
	[{:keys [player scores] :as turn}]
	(cond
	(human-player-winning? scores) (strategies/optmize-ai-scores-ai)
	(in-late-game? scores) (strategies/optimize-own-score-ai player)
	(limited-move-options? turn) (strategies/optimize-own-choices-ai player)
	:else (strategies/optimize-own-score-ai player)
	))

view raw triboard.ai.high-level-ai.cljs hosted with ❤ by GitHub

Note: the details of the different predicates used in the function above are available in the following GitHub Gist.

Our last task is to complete our find-best-move function, which was the goal of this post. This functions is the only public function of the whole AI namespace and does:

Select the appropriate AI strategy using the high-level-ai function
Run a game tree search using game-tree-search and the selected AI

	(defn find-best-move
	"Find the best available move for the current player"
	[game]
	(let [turn (game/current-turn game)
	ai (high-level-ai turn)]
	(game-tree-search ai turn 1)))

view raw triboard.ai.find-best-move.cljs hosted with ❤ by GitHub

Note: The resulting search is performed with a depth of 2 which is quite low. In reality, because our score evaluation function also looks ahead, we get a search tree of depth 3. This strikes a balance between the need for the AI to run on the Web browser of low power devices (tablets or smart phones) and the difficulty.

We could have chosen different depth for different devices, but it would make the game harder on PC than on smartphones. Apart from being annoying to implement, it is not quite fair. To go deeper, we would instead need to add pruning to our game tree.

Conclusion

With this last post focusing on the implementation of the artificial intelligence, our Tribolo game is now fully complete. You can check the source code in the following GitHub repository.

I hope you did enjoy the series and the game that we built through it. I hope it did give you some ideas, made you discover at least one thing, or was as much fun as it was for me to implement it.

Special thanks

I would like to thank Manwe56 for our discussions on how to build an smart enough AI. His great insights helped me in designing my first ever artificial intelligence. In particular, I owe him the concept of high level AI, which helped me tremendously.

Manwe56 is regularly present in the top 1% of some of the major competitive programming contests. He is also the author of the competitive-programming GitHub repository that contains some generic implementation of the most common AI algorithms for both Java and C++.

Reminder on the target architecture

Designing our AI

Choice of the weapon

No Alpha – No Beta

High level AI

AI Strategies

Minimax implementation

AI Strategy Protocol

Minimax

Top level search

High level AI

Optimizing scores

Avoiding traps

Assembling the pieces

Conclusion

Special thanks

Share this:

Related

Leave a comment Cancel reply