I bet this AI can beat you at Connect4:
How does it work? I’ll spend the rest of this article explaining it.
Rules of Connect4
Before we can teach a computer to play, we first need to lay out the rules of a Connect4 game. A complete Connect4 game works as a sequence of states, and the transitions between the states are called moves. We can start by modelling what a state consists of:
nextPlayervariable, representing who’s turn it is to move next. This is always equal either to 1 (for player-1) or 2 (for player-2).
- A key-value dictionary of $6*7=42$ spot states, where the value of each spot state is either 0 (meaning an empty spot), 1 (spot taken by player-1) or 2 (spot taken by player-2), and the keys are the following ordered pairs:
(0,0), (0,1), (0,2), (0,3), (0,4), (0,5), (0,6)
(1,0), (1,1), (1,2), (1,3), (1,4), (1,5), (1,6)
(2,0), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6)
(3,0), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6)
(4,0), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6)
(5,0), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6)
Then, we need to define what valid moves are. This is fairly straight-forward in Connect4: on any given turn, a player only has at most 7 choices—they must choose one out of the seven columns. And the only restriction is that they can only move in columns which aren’t already full.
Finally, we need to define what the “end” (i.e., end-of-game) states are. There are two types of end states: a win state, or a stalemate state. A state is in a “win” state for a player if that player has four pieces in a row. A stalemate means all the spots have been filled, and neither player has won.
With these rules, we already pretty much have a working game of human-vs-human Connect4:
Here’s the cool thing: in order for us to write a really strong Connect4 AI, we’re going to use the Minimax algorithm (explained later), and the only thing this algorithm asks from us is that, in addition to defining the game rules, we define a single function, called the heuristic function—this is a function which inspects a single state of the game, and gives out a single number that represents an opinion of who is winning, and by how much.
The heuristic function I came up with is as follows: first, I slice the board into sets of four-spots-in-a-row—note that there are many of these sets and that they (highly) overlap. For each set, if a player is the only player who has pieces inside that set, then that player will receive a point for each piece they have in that set. This means pieces will be counted multiple times if they appear in multiple sets which are only occupied by that player. The total number of points is the heuristic value for that player.
Try playing against the computer again, and this time it will show your heuristic value minus the opponents heuristic value. Positive means you’re winning, negative means you’re losing:
Pretty frustrating, isn’t it?
The Minimax Algorithm
The real “brain” of the AI is the Minimax algorithm. You give this algorithm a start state, a list of allowed next states, and an heuristic function, and it will miraculously give you back a super-intelligent AI.
MIT professor Patrick Winston describes this algorithm in detail, better than I possibly could: https://www.youtube.com/watch?v=STjW3eH0Cik.
The quick explanation is that you tell the Minimax algorithm how many moves ahead, $i$, to look, and the algorithm will exhaustively project all possible states $i$ moves ahead, then work backwards to find the sequence of moves which maximizes the player’s heuristic value while minimizing the opponent’s heuristic value, while respecting the fact that—and this is key—the opponent will be doing the same thing.