Backtracking and Game Trees
-
Upload
kelly-baird -
Category
Documents
-
view
32 -
download
1
description
Transcript of Backtracking and Game Trees
Backtracking and Game Trees
15-211: Fundamental Data Structures and Algorithms
Margaret Reid-Miller5 April 2005
2
Announcements
HW5 now available!
May do in groups of two.
Review in recitation
No fancy data structures except trie!
Due Monday 11:59 pm
Backtracking
X O XO X OX O O C D
B
A
4
Backtracking
An algorithm-design technique
“Organized brute force”
Explore the space of possible answers
Do it in an organized way
Example: maze• Try S, E, N, W
IN
OUT
5
Backtracking is useful …
… when a problem is too hard to be solved directly 8-queens problem Knight’s tour
… when we have limited time and can accept a good but potentially not optimal solution Game playing (second part of lecture) Planning (not in this class)
6
Backtracking decisions
Develop answers by identifying a set of successive decisions. Maze
Where do I go now: N, S, E or W?
8 Queens Where do I put the next queen?
Knight’s tour Where do I jump next?
7
Backtracking decisions
Develop answers by identifying a set of successive decisions.
Pure Backtracking Decisions can be binary: ok or
impossible
Cannot put an “X” in two places
Can move the rook two spaces forward
8
Backtracking decisions
Develop answers by identifying a set of successive decisions.
Pure Backtracking Decisions can be binary: ok or
impossible
Heuristic Backtracking Decisions can have goodness:
Good to sacrifice a pawn to take a bishop
Bad to sacrifice the queen to take a pawn
Bad not to block in tic-tac-toe
9
Backtracking technique
When you reach a dead-end
A
B
C
10
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
A
B
C
11
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
Undo its consequences
A
B
C
A
B
12
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
Undo its consequences
Is there a new choice? If not, you are at another dead-end
A
B
C
A
B
13
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
A
B
C
A
B
14
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
Undo its consequences
A
B
C
A
B
A
15
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
Undo its consequences
Is there a new choice? If so, try that
A
B
C
A
B
A
E
16
Backtracking technique
When you reach a dead-endWithdraw the most recent choice
Undo its consequences
Is there a new choice? If so, try that If not, you are at another dead-end
Search tree
A
B
C
E
17
Search trees
The tree
Is rarely explicit as a data structure!
How do we keep track of our search?
Use a stack (“control stack”)
Represents a stack of tentative decisions
Tracks decisions to be undone when backtracking.
Can be implicit in recursive calls
18
No optimality guarantees
If there are multiple solutions, simple backtracking will find one of them, but not necessarily the optimal.
No guarantees that we’ll reach a solution quickly.
19
Backtracking Summary
Organized brute force
Formulate problem so that answer amounts to taking a set of successive decisions
When stuck, go back, try the remaining alternatives
Does not give optimal solutions
Games
X O XO X OX O O
21
Why Make Computers Play Games?
People have been fascinated by games since the dawn of civilization.
Idealization of real-world problems containing adversary situations
Typically rules are very simple
State of the world is fully accessible
Example: auctions
22
No, not Quake (although interesting research there, too)
Simpler Strategy/Board Game Chess
Checkers
Othello
Go
Let’s look at how we might go about playing Tic-Tac-Toe
What Kind of Games?
23
Consider this position
We are playing X, and it is now our turn.
24
Let’s write out all possibilities
Each number represents a position after each legalmove we have.
25
Now let’s look at their options
Here we are looking at all of the opponent responses to the first possible move we could make.
26
Now let’s look at their options
Opponent options after our second possibility. Not good again…
27
Now let’s look at their options
Struggling…
28
More interesting case
Now they don’t have a way to win on their next move. So now we have to consider our responses to their responses.
29
Our options
We have a win for any move they make. So the original position in purple is an X win.
30
Finishing it up…
They win again if we take our fifth move.
31
Summary of the Analysis
So which move should we make? ;-)
32
A Tic Tac Toe Game Tree
moves
moves
moves
33
A Tic Tac Toe Game Tree
moves
moves
moves
Nodes•Denote board configurations•Include a score
Edges•Denote legal moves
Path•Successive moves by the players
34
A path in a game tree
KARPOV-KASPAROV, 1985
1.e4 c52.Nf3 e63.d4 cxd44.Nxd4 Nc6 5.Nb5 d6 6.c4 Nf6 7.N1c3 a6 8.Na3 d5!? 9.cxd5 exd5 10.exd5 Nb4 11.Be2!?N Bc5! 12.0-0 0-0 13.Bf3 Bf5 14.Bg5 Re8! 15.Qd2 b5 16.Rad1 Nd3! 17.Nab1? h6! 18.Bh4 b4! 19.Na4 Bd6 20.Bg3 Rc8
21.b3 g5!! 22.Bxd6 Qxd6 23.g3 Nd7! 24.Bg2 Qf6! 25.a3 a5 26.axb4 axb4 27.Qa2 Bg6 28.d6 g4! 29.Qd2 Kg7 30.f3 Qxd6 31.fxg4 Qd4+ 32.Kh1 Nf6 33.Rf4 Ne4 34.Qxd3 Nf2+ 35.Rxf2 Bxd3 36.Rfd2 Qe3! 37.Rxd3 Rc1!! 38.Nb2 Qf2! 39.Nd2 Rxd1+40.Nxd1 Re1+ White resigned
35
Two players
Deterministic no randomness, e.g., dice
Perfect information No hidden cards or hidden Chess pieces…
Finite Game must end in finite number of moves
Zero sum Total winnings of all players
Properties of Tic Tac Toe
36
Games Not Considered
Poker Hidden cards
Backgammon not deterministic (e.g., dice)
Prisoner’s dilemma, game of chicken Not zero-sum
Have rich and beautiful theories and algorithms, though. (see CS 15-859)
37
Two-player games
We can define the value (goodness) of a certain game state (board).
What about the non-final board?
Look at board, assign value
Look at children in game tree, assign value
1 0 -1 ?
38
How do we play?
Traverse the “game tree”.
Enumerate all possible moves at each node.
Start by evaluating the terminal positions, where the game is over.
Propogate the values up the tree, assuming we pick the best move for us and the opponent picks the best move for her.
39
How to play?
moves
moves
moves
1 0 0 0 0 1
1 0 0 0 0 1
0 00
0
40
How do we play?
What is the rule to compute the value of nonterminal node?
If it is our turn (X) then we pick the maximum outcome from the children.
If it is the opponent’s turn (O) then we pick the minimum outcome from the children.
This process is known as the Minimax algorithm.
We play the move that gave us the maximum at the root.
41
More generally
Player A (us) maximize goodness.
Player B (opponent) minimizes goodness
Player A maximize
Player B minimize
Player A maximize
5
9 7
8 2
a b
c d
42
Games Trees are useful…
Provide “lookahead” to determine what move to make next
Build whole tree = we know how to play perfectly
5
9 7
8 2
a b
c d
43
But there’s one problem…
Games have large search trees:
Tic-Tac-Toe
There are 9 ways we can make the first move and opponent has 8 possible moves he can make.
Then when it is our turn again, we can make 7 possible moves for each of the 8 moves and so on….
9! = 362,880
44
Or chess…
Suppose we look at 16 possible moves at each step
And suppose we explore to a depth of 50 turns
Then the tree will have 1650=10120 nodes!
DeepThought (1990) searched to a depth of 10
45
So…
Need techniques to avoid enumerating and evaluating the whole tree
Heuristic search
46
Heuristic search
1. Organize search to eliminate large sets of possibilities Chess: Consider major moves first
2. Explore decisions in order of likely success Chess: Use a library of known strategies
3. Save time by guessing search outcomes Chess: Estimate the quality of a situation
Count pieces
Count pieces, using a weighting scheme
Count pieces, using a weighting scheme, considering threats
47
Mini-Max Algorithm
48
Let’s see it in action
10 2 12 16 2 7 -5 -80
10 16 7 -5
10 -5
10Max (Player A)
Max(Player A)
Min (Player B)
Evaluation function applied to the leaves!
49
Minimax, in reality
Rarely do we reach an actual leaf
Use estimator functions to statically guess goodness of missing subtrees
Max A moves
Min B moves
Max A moves2 7 1 8
50
Minimax, in reality
Rarely do we reach an actual leaf
Use estimator functions to statically guess goodness of missing subtrees
Max A moves
Min B moves
Max A moves
2
2 7
1
1 8
51
Minimax, in reality
Rarely do we reach an actual leaf
Use estimator functions to statically guess goodness of missing subtrees
Max A moves
Min B moves
Max A moves
2
2
2 7
1
1 8
52
Minimax
Trade-off
Quality vs. Speed
Quality: deeper search
Speed: use of estimator functions
Balancing
Relative costs of move generation and estimator functions
Quality and cost of estimation function
53
Minimax algorithm - pseudo code
Max( P, depth ) { if( depth == 0 ) return estimator(P) m = - for each possible move P’ m = max( m, Min(P’, depth-1) ) return m}
Min( P, depth ) { if( depth == 0 ) return estimator(P) m = for each possible move P’ m = min( m, Max(P’, depth-1) ) return m}
54
NegaMax Algorithm
Rewrite from the point of view of moving player:
negaMax( P, depth ) { if (depth == 0) return estimator(P) m = -
for each possible move P’ v = -negaMax( P’, depth-1 ) if m < v then m = v return m}
Can you prove these two versions are the same?
55
How fast?
Minimax is pretty slow even for a modest depth.
It is basically a brute force search.
What is the running time?
Each level of the tree has some average b moves per level. We have d levels. So the running time is O(bd).
56
Can we speed this up?
Let us observe the following game tree.
2
2
2 7
1
1
What do we know about the root?
Max
Min
Max
What do we know about the root’s right child?
57
Alpha Beta Pruning
58
Pruning
Minimax sometimes evaluates nodes that have no impact on the search.
E.g., Evaluating moves in Chess
My 1st move: I’m ahead a rook
My 2nd move: Opponent 1st move leaves me down a queen.
No need to look at Opponent’s 2nd move to find out it is check mate. Already know my 1st move is better.
59
Alpha Beta Pruning
Idea: Track “window” of expectations. Two variables:
– Best score so far at a max node: increases
At a child min node:• Parent wants max. To affect the parent’s current , our
cannot drop below .
– Best score so far at a min node: decreases
At a child max node. • Parent wants min. To affect the parent’s current , our
cannot get above the parent’s .
Either case: If :• Stop searching further subtrees of that child. They do
not matter!
Start the process with an infinite window ( = -, = ).
60
Alpha-Beta algorithm - pseudo code
MaxAB( P, depth, , ) { if( depth == 0 ) return estimator(P)
for each possible move P’ && < = max(, MinAB( P’, depth-1, , )) return }
MinAB( P, depth, , ) { if( depth == 0 ) return estimator(P)
for each possible move P’ && < = min(, MaxAB( P’, depth-1, , )) return }
61
Alpha-Beta algorithm - Alternative
AB( P, depth, , ) { if( depth == 0 ) return estimator(P)
for each possible move P’ && < v = -AB( P’, depth-1, -, - )
if < v then = v
return }
In MinAB replace with - and with -, and negate the return value. Get MaxAB.
62
10 2 12
10 12
10
Alpha Beta Example
Max
Max
Min
Min
=10
= 12
> !
63
Alpha Beta Example
10 2 12 2 7
10 12 7
10 7
10Max
Max
Min
Min
= 10
=7
> !
64
Alpha Beta Pruning
Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is
within the root (Alpha, Beta) range.
Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.
65
Alpha Beta Pruning
Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is
within the root (Alpha, Beta) range.
Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.
66
Alpha Beta Pruning
Theorem: Let v(P) be the value of position P. Let X be the value returned by AB(, ).Then one of the following holds:
v(p) ≤ and X ≤
< v(P) < and X = v(P)
≤ v(P) and X ≥
The proof is by induction, and is an easy but tedious case analysis.
67
Alpha Beta speedup
Claim: The optimal Alpha Beta search tree is O(bd/2) nodes or the square root of the number of nodes in the regular Minimax tree.Can enable twice the depth
The speedup is greatly dependent on the order in which you consider moves at each node. Why?
68
Summary
Backtracking
Organized brute force
Answer to problem = set of successive decisions
When stuck, go back, try the remaining alternatives
No optimality guarantees
69
Summary
Game playing
Game trees
Mini-max algorithm
Optimization
Alpha-beta pruning
Heuristics search