CS 387: GAME AI
Transcript of CS 387: GAME AI
CS 387: GAME AI
BOARD GAMES
5/24/2016
Instructor: Santiago Ontañón ([email protected])
Class website: https://www.cs.drexel.edu/~santi/teaching/2016/CS387/intro.html
Reminders
• Check the BBVista site for the course regularly
• Also: https://www.cs.drexel.edu/~santi/teaching/2016/CS387/intro.html
• Thursday: project 4 submission deadline
Outline • Board Games • Game Tree Search • Portfolio Search • Monte Carlo Search • UCT
Game AI Architecture (diagram)
• The AI consists of: World Interface (perception), Strategy, Decision Making, and Movement
So far, we have seen:
• Perception
• Movement (steering behaviors): FPS, car driving
• Pathfinding: FPS, RTS, RPG, etc.
• Decision Making: FPS, RPG, RTS, etc.
• Tactics and Strategy: FPS, RTS
• PCG: many genres
Board Games • Main characteristic: turn-based
• The AI has a lot of time to decide the next move
Board Games • Not just chess…
Board Games • From an AI point of view:
• Turn-based • Discrete actions • Complete information (mostly)
• Those features make these games amenable to game tree search!
Outline • Board Games • Game Tree Search • Portfolio Search • Monte Carlo Search • UCT
Game Tree (diagram)
• From the Current Situation, each Player 1 action leads to a successor state with utility U(s)
• Pick the action that leads to the state with maximum expected utility
Game Tree (diagram)
• From the Current Situation, a Player 1 action is followed by a Player 2 action, leading to states with utilities U(s)
• Game trees capture the effects of successive action executions
• Pick the action that leads to the state with maximum expected utility, after taking into account what the other players might do
• In this example, we look ahead only one Player 1 action and one Player 2 action, but we could grow the tree arbitrarily deep
Minimax Principle (diagram)
• Three Player 1 actions (max), each followed by two Player 2 actions (min); leaf utilities: U(s) = -1, 0 | -1, 0 | 0, 0
• Positive utility is good for player 1, and negative for player 2
• Player 1 chooses actions that maximize U; player 2 chooses actions that minimize U
• Looking only at the utility values, which move should player 1 choose?
• Backing up the min over each branch gives U(s) = -1, -1, 0 at the Player 1 level, so player 1 (max) should choose the third action
Minimax Algorithm

Minimax(state, player, MAX_DEPTH)
    IF MAX_DEPTH == 0 OR state is terminal RETURN (U(state), -)
    BestAction = null
    BestScore = null
    FOR Action in actions(player, state)
        (Score, Action2) = Minimax(result(Action, state), nextplayer(player), MAX_DEPTH - 1)
        IF BestScore == null || (player == 1 && Score > BestScore) || (player == 2 && Score < BestScore)
            BestScore = Score
            BestAction = Action
    ENDFOR
    RETURN (BestScore, BestAction)
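The pseudocode above can be sketched in runnable Python. The nested-list game tree, the `evaluate` heuristic, and integer action indices below are illustrative stand-ins for a real game's state representation, utility function U, `actions`, and `result`:

```python
def evaluate(state):
    # Stand-in heuristic U for states cut off before terminal: average the
    # children's values (a real engine would use a game-specific U).
    if not isinstance(state, list):
        return state
    return sum(evaluate(child) for child in state) / len(state)

def minimax(state, player, max_depth):
    # States are nested lists; leaves are utilities for player 1.
    # Actions are child indices. Returns (score, best_action).
    if not isinstance(state, list):          # terminal state
        return state, None
    if max_depth == 0:                       # depth cutoff
        return evaluate(state), None
    best_score, best_action = None, None
    for action, child in enumerate(state):
        score, _ = minimax(child, 3 - player, max_depth - 1)
        if (best_score is None
                or (player == 1 and score > best_score)
                or (player == 2 and score < best_score)):
            best_score, best_action = score, action
    return best_score, best_action

# The minimax-principle example from the slides: three player-1 moves,
# each answered by two player-2 replies, with the slide's leaf utilities.
tree = [[-1, 0], [-1, 0], [0, 0]]
print(minimax(tree, 1, 2))  # -> (0, 2): player 1 picks the third action
```

Running it on the slide's example tree reproduces the backed-up values: the min player forces -1 in the first two branches and 0 in the third, so max chooses action 2.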
Minimax Algorithm • Needs:
• A utility function U
• A way to determine which actions a player can execute in a given state
• MAX_DEPTH controls how deep the search tree will be:
  • The size of the tree is exponential in MAX_DEPTH
  • The branching factor is the number of moves that can be executed per state
• The higher MAX_DEPTH, the better the AI will play
• There are ways to increase speed: alpha-beta pruning
![Page 29: CS 387: GAME AI](https://reader030.fdocuments.net/reader030/viewer/2022012604/6199759509ecbe1d6d3daff4/html5/thumbnails/29.jpg)
Minimax Algorithm • Needs:
• Utility function U • Way to determine which actions can a player execute in a given state
• MAX_DEPTH controls how deep is the search tree going to be: • Size of the tree is exponential in MAX_DEPTH • Branching factor is the number of moves that can be executed per
state
• The higher MAX_DEPTH, the better the AI will play
• There are ways to increase speed: alpha-beta pruning
• Given: branching factor B, maximum tree depth D
• What is the time complexity? What is the memory complexity?
Successes of Minimax
• Deep Blue defeated Kasparov in Chess (1997)
• Checkers was completely solved by Jonathan Schaeffer (2007): if neither player makes a mistake, the game is a draw (like tic-tac-toe)
• Go:
  • Using a variant of minimax, based on Monte Carlo Tree Search
  • In 2011 the program Zen19S reached 4 dan (professional humans are rated between 1 and 9 dan)
  • In 2016 AlphaGo defeated Lee Sedol (one of the best players in the world)
Interesting Uses of Minimax • “bastet” (Bastard Tetris):
http://blahg.res0l.net/2009/01/bastet-bastard-tetris/
Iterative Deepening
• As described before, minimax receives a MAX_DEPTH, and it is impossible to predict how much time it will take to execute
• In a game, the AI has a certain amount of time (e.g., 20 seconds) that it can use to decide the next move
• Solution: iterative deepening
Iterative Deepening • Idea:
• Open the tree at depth 1 • If there is still time, open it at depth 2 • If there is still time, open it at depth 3 • Etc.
If we end up searching up to depth, say 5, how much time is wasted?
• Given branching factor B, each iteration is on average B times larger than the previous
• For typical values of B (larger than 10), the extra cost of iterative deepening is therefore negligible
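The loop itself can be sketched in Python. The toy `minimax` on nested-list trees and the demo depth cap are illustrative; a real engine would search an actual game state and abort an iteration that overruns the deadline mid-search:

```python
import time

def minimax(state, player, depth):
    # Toy minimax: states are nested lists, leaves are player-1 utilities.
    # A non-terminal node reached at depth 0 is scored 0 (a crude stand-in
    # for a real evaluation function).
    if not isinstance(state, list):
        return state, None
    if depth == 0:
        return 0, None
    best_score, best_action = None, None
    for action, child in enumerate(state):
        score, _ = minimax(child, 3 - player, depth - 1)
        if (best_score is None
                or (player == 1 and score > best_score)
                or (player == 2 and score < best_score)):
            best_score, best_action = score, action
    return best_score, best_action

def iterative_deepening(state, player, time_budget, max_cap=10):
    # Search depth 1, then 2, then 3, ... until the time budget expires
    # (or a demo cap is hit); return the result of the last completed depth.
    deadline = time.monotonic() + time_budget
    best_action, depth = None, 1
    while depth <= max_cap:
        _, best_action = minimax(state, player, depth)
        depth += 1
        if time.monotonic() >= deadline:
            break
    return best_action

tree = [[-1, 0], [-1, 0], [0, 0]]
print(iterative_deepening(tree, 1, 1.0))  # -> 2 (the third action)
```

Note that each deeper iteration fully redoes the shallower work; the bullet above explains why that waste is negligible when B is large.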
Alpha-Beta Pruning
• Not all the nodes in the search tree are relevant for deciding the next move
• Example tree (max root over three min nodes): leaf utilities 5 2 4 | 1 3 4 | 2 6 1; min-node values 2, 1, 1; root value 2
• What would happen if one of the unexplored leaves in a pruned branch was higher? What would happen if it was lower? NOTHING!
• In the second branch, the two remaining leaves (3 and 4) are irrelevant; they do not have to be explored! This is because the branch's first leaf is a "1", which is lower than the value of the best branch found so far (2)
Alpha-Beta Algorithm

Initial call: alphabeta(state, MAX_DEPTH, α = -infinity, β = +infinity, player)

alphabeta(state, MAX_DEPTH, α, β, player)
    if MAX_DEPTH == 0 or state is a terminal state
        return U(state)
    if player == 1
        for action in actions(player, state)
            α := max(α, alphabeta(result(action, state), MAX_DEPTH-1, α, β, 2))
            if β ≤ α break
        return α
    else
        for action in actions(player, state)
            β := min(β, alphabeta(result(action, state), MAX_DEPTH-1, α, β, 1))
            if β ≤ α break
        return β
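A runnable Python sketch of alpha-beta on the same kind of toy nested-list tree used earlier (the tree encoding and the depth-0 fallback score are illustrative assumptions, not from the slides):

```python
def alphabeta(state, depth, alpha, beta, player):
    # Alpha-beta on nested-list toy trees (leaves = player-1 utilities).
    # Returns the minimax value of `state`; like the slide's version it
    # returns only the value, not the action.
    if not isinstance(state, list):   # terminal: leaf utility
        return state
    if depth == 0:                    # crude stand-in evaluation at cutoff
        return 0
    if player == 1:
        for child in state:
            alpha = max(alpha, alphabeta(child, depth - 1, alpha, beta, 2))
            if beta <= alpha:
                break   # beta cutoff: min already has a better option elsewhere
        return alpha
    else:
        for child in state:
            beta = min(beta, alphabeta(child, depth - 1, alpha, beta, 1))
            if beta <= alpha:
                break   # alpha cutoff
        return beta

# The pruning example tree: a max root over three min nodes.
tree = [[5, 2, 4], [1, 3, 4], [2, 6, 1]]
print(alphabeta(tree, 2, float("-inf"), float("inf"), 1))  # -> 2
```

On this tree the second branch is cut off after its first leaf (1 ≤ α = 2), exactly the pruning the example above describes.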
Alpha-Beta Pruning
• Does pruning occur independently of the order in which nodes are visited?
Alpha-Beta Pruning
• Notice that pruning depends on the order in which the children are explored
• In the third branch of the example tree (leaves 2, 6, 1), if we expand the "1" first, then "2" and "6" do not have to be explored
Alpha-Beta Pruning
• How to decide a good order for children expansion?
• Idea: iterative deepening
  • Explore first the child that was selected as the best move in the previous iteration of iterative deepening
  • With this modification, iterative deepening is actually faster in practice than just opening the tree at a given depth!
• Other domain-specific heuristics exist for well-known games such as Chess.
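One way to implement this move ordering is a small helper (a hypothetical sketch; `previous_best` would be the best action from the last completed iterative-deepening iteration):

```python
def ordered_actions(actions, previous_best):
    # Put the previous iteration's best move first, so alpha-beta obtains a
    # strong bound early; the remaining moves keep their original order.
    actions = list(actions)
    if previous_best in actions:
        actions.remove(previous_best)
        actions.insert(0, previous_best)
    return actions

print(ordered_actions(["a", "b", "c", "d"], "c"))  # -> ['c', 'a', 'b', 'd']
```

The search loop would then iterate over `ordered_actions(actions(player, state), previous_best)` instead of the raw action list.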
Outline • Board Games • Game Tree Search • Portfolio Search • Monte Carlo Search • UCT
What is an Action?
• Consider a complex board game like Settlers or Scrabble: what is the set of actions a player can perform in her turn?
Way too many actions to consider in minimax!!
Portfolio Search
• Consider the game of Monopoly
• The set of possible actions is too large (just imagine all possible deals we can offer any player!)
• However, we can do the following: devise 3 or 4 strategies to play the game:
  A. Never do any deal nor build any house; just roll the dice and buy streets.
  B. Never do any deal, but build one house in the most expensive street we can.
  C. Never do any deal, but build as many houses as we can, in the cheapest street we can.
  D. Do not build houses, but offer a deal to get the cheapest full set we could get by trading a single card with one player (offering her some predefined amount of money, a factor of the price of what we are getting).
Certainly, these different strategies would do better in different situations.
The key idea of portfolio search is to consider these strategies as the “actions”.
Minimax Portfolio Search • At each level of the tree, use each of the predefined strategies
to generate the next action, and only consider those actions.
• (Diagram: the only children of each node are the actions proposed by strategies A, B, and C)
This simple idea can make minimax search feasible in games whose action set is too large to search the whole tree.
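A minimal sketch of the action-generation step, assuming each portfolio strategy is a function that proposes one action for the current state (all names here are hypothetical):

```python
def portfolio_actions(state, player, strategies):
    # Instead of enumerating all legal moves, ask each portfolio strategy
    # for the single action it would play here, and deduplicate the answers.
    proposed = []
    for strategy in strategies:
        action = strategy(state, player)
        if action not in proposed:
            proposed.append(action)
    return proposed

# Toy strategies that ignore the state: two of them agree, so the node
# has only two children instead of the game's full action set.
strategy_a = lambda state, player: "buy_street"
strategy_b = lambda state, player: "build_house"
strategy_c = lambda state, player: "buy_street"
print(portfolio_actions(None, 1, [strategy_a, strategy_b, strategy_c]))
# -> ['buy_street', 'build_house']
```

Minimax portfolio search is then ordinary minimax where the FOR loop iterates over `portfolio_actions(...)` rather than `actions(player, state)`.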
Simple Portfolio Search
• Forget about game trees; just do this: given a set of strategies S:
  • For each s1 in S:
    • For each s2 in S:
      • Simulate a game for D game cycles where player 1 uses s1 and player 2 uses s2
  • Compute the average reward obtained by each strategy s1, and select the one that achieved the highest average
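The nested loop above can be sketched directly in Python. The `simulate` callback and the toy payoff table are assumptions standing in for a real game-specific forward simulation:

```python
from itertools import product

def simple_portfolio_search(state, strategies, depth, simulate):
    # Pit every pair (s1 for player 1, s2 for player 2) in a forward
    # simulation of `depth` game cycles, then return the s1 with the
    # highest average reward. `simulate(state, s1, s2, depth)` is an
    # assumed game-specific function returning player 1's reward.
    totals = {s: 0.0 for s in strategies}
    for s1, s2 in product(strategies, repeat=2):
        totals[s1] += simulate(state, s1, s2, depth)
    return max(strategies, key=lambda s: totals[s] / len(strategies))

# Toy demo: the reward depends only on the strategy pair (payoff table).
payoff = {("aggressive", "aggressive"): 0, ("aggressive", "defensive"): 2,
          ("defensive", "aggressive"): -1, ("defensive", "defensive"): 1}
demo = lambda state, s1, s2, depth: payoff[(s1, s2)]
print(simple_portfolio_search(None, ["aggressive", "defensive"], 10, demo))
# -> aggressive (average reward 1.0 vs 0.0)
```

Note that this runs only S² simulations of one playout each; it never builds a tree, which is where its speed (and its weaker play) comes from.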
Portfolio Search • Imagine this situation:
• Branching factor B • Search up to a depth D • We have a set of S strategies (S << B)
• What is the time / memory complexity of: • Minimax? • Minimax portfolio search? • Simple portfolio search?
• What is the time / memory complexity of:
  • Minimax? Time: B^D; memory: D
  • Minimax portfolio search? Time: S^D; memory: D
  • Simple portfolio search? Time: D · S²; memory: 2 (constant)
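Plugging in illustrative numbers (B = 30, D = 4, S = 4; these values are assumptions, not from the slides) shows the size of the gap:

```python
B, D, S = 30, 4, 4     # assumed branching factor, depth, portfolio size
print(B ** D)          # minimax node evaluations: 810000
print(S ** D)          # minimax portfolio search: 256
print(D * S ** 2)      # simple portfolio search: 64 simulation steps
```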
• In terms of play strength: minimax > minimax portfolio search > simple portfolio search
• In terms of computational needs: minimax > minimax portfolio search > simple portfolio search
• Thus, if you can use minimax, that is the simplest thing to do. But if you cannot (due to CPU constraints), portfolio search is a good option to consider.