Backtracking and Game Trees

Backtracking and Game Trees

15-211: Fundamental Data Structures and Algorithms

Margaret Reid-Miller5 April 2005

2

Announcements

HW5 now available!

May do in groups of two.

Review in recitation

No fancy data structures except trie!

Due Monday 11:59 pm

Backtracking

X O XO X OX O O C D

B

A

4

Backtracking

An algorithm-design technique

“Organized brute force”

Explore the space of possible answers

Do it in an organized way

Example: maze• Try S, E, N, W

IN

OUT

5

Backtracking is useful …

… when a problem is too hard to be solved directly 8-queens problem Knight’s tour

… when we have limited time and can accept a good but potentially not optimal solution Game playing (second part of lecture) Planning (not in this class)

6

Backtracking decisions

Develop answers by identifying a set of successive decisions. Maze

Where do I go now: N, S, E or W?

8 Queens Where do I put the next queen?

Knight’s tour Where do I jump next?

7


Develop answers by identifying a set of successive decisions.

Pure Backtracking Decisions can be binary: ok or

impossible

Cannot put an “X” in two places

Can move the rook two spaces forward

8


Develop answers by identifying a set of successive decisions.

Pure Backtracking Decisions can be binary: ok or

impossible

Heuristic Backtracking Decisions can have goodness:

Good to sacrifice a pawn to take a bishop

Bad to sacrifice the queen to take a pawn

Bad not to block in tic-tac-toe

9

Backtracking technique

When you reach a dead-end

A

B

C

10


When you reach a dead-endWithdraw the most recent choice

A

B

C

11



Undo its consequences

A

B

C

A

B

12




Is there a new choice? If not, you are at another dead-end

A

B

C

A

B

13



A

B

C

A

B

14




A

B

C

A

B

A

15




Is there a new choice? If so, try that

A

B

C

A

B

A

E

16




Is there a new choice? If so, try that If not, you are at another dead-end

Search tree

A

B

C

E

17

Search trees

The tree

Is rarely explicit as a data structure!

How do we keep track of our search?

Use a stack (“control stack”)

Represents a stack of tentative decisions

Tracks decisions to be undone when backtracking.

Can be implicit in recursive calls

18

No optimality guarantees

If there are multiple solutions, simple backtracking will find one of them, but not necessarily the optimal.

No guarantees that we’ll reach a solution quickly.

19

Backtracking Summary

Organized brute force

Formulate problem so that answer amounts to taking a set of successive decisions

When stuck, go back, try the remaining alternatives

Does not give optimal solutions

Games

X O XO X OX O O

21

Why Make Computers Play Games?

People have been fascinated by games since the dawn of civilization.

Idealization of real-world problems containing adversary situations

Typically rules are very simple

State of the world is fully accessible

Example: auctions

22

No, not Quake (although interesting research there, too)

Simpler Strategy/Board Game Chess

Checkers

Othello

Go

Let’s look at how we might go about playing Tic-Tac-Toe

What Kind of Games?

23

Consider this position

We are playing X, and it is now our turn.

24

Let’s write out all possibilities

Each number represents a position after each legalmove we have.

25

Now let’s look at their options

Here we are looking at all of the opponent responses to the first possible move we could make.

26


Opponent options after our second possibility. Not good again…

27


Struggling…

28

More interesting case

Now they don’t have a way to win on their next move. So now we have to consider our responses to their responses.

29

Our options

We have a win for any move they make. So the original position in purple is an X win.

30

Finishing it up…

They win again if we take our fifth move.

31

Summary of the Analysis

So which move should we make? ;-)

32

A Tic Tac Toe Game Tree

moves

moves

moves

33

A Tic Tac Toe Game Tree

moves

moves

moves

Nodes•Denote board configurations•Include a score

Edges•Denote legal moves

Path•Successive moves by the players

34

A path in a game tree

KARPOV-KASPAROV, 1985

1.e4 c52.Nf3 e63.d4 cxd44.Nxd4 Nc6 5.Nb5 d6 6.c4 Nf6 7.N1c3 a6 8.Na3 d5!? 9.cxd5 exd5 10.exd5 Nb4 11.Be2!?N Bc5! 12.0-0 0-0 13.Bf3 Bf5 14.Bg5 Re8! 15.Qd2 b5 16.Rad1 Nd3! 17.Nab1? h6! 18.Bh4 b4! 19.Na4 Bd6 20.Bg3 Rc8

21.b3 g5!! 22.Bxd6 Qxd6 23.g3 Nd7! 24.Bg2 Qf6! 25.a3 a5 26.axb4 axb4 27.Qa2 Bg6 28.d6 g4! 29.Qd2 Kg7 30.f3 Qxd6 31.fxg4 Qd4+ 32.Kh1 Nf6 33.Rf4 Ne4 34.Qxd3 Nf2+ 35.Rxf2 Bxd3 36.Rfd2 Qe3! 37.Rxd3 Rc1!! 38.Nb2 Qf2! 39.Nd2 Rxd1+40.Nxd1 Re1+ White resigned

35

Two players

Deterministic no randomness, e.g., dice

Perfect information No hidden cards or hidden Chess pieces…

Finite Game must end in finite number of moves

Zero sum Total winnings of all players

Properties of Tic Tac Toe

36

Games Not Considered

Poker Hidden cards

Backgammon not deterministic (e.g., dice)

Prisoner’s dilemma, game of chicken Not zero-sum

Have rich and beautiful theories and algorithms, though. (see CS 15-859)

37

Two-player games

We can define the value (goodness) of a certain game state (board).

What about the non-final board?

Look at board, assign value

Look at children in game tree, assign value

1 0 -1 ?

38

How do we play?

Traverse the “game tree”.

Enumerate all possible moves at each node.

Start by evaluating the terminal positions, where the game is over.

Propogate the values up the tree, assuming we pick the best move for us and the opponent picks the best move for her.

39

How to play?

moves

moves

moves

1 0 0 0 0 1

1 0 0 0 0 1

0 00

0

40

How do we play?

What is the rule to compute the value of nonterminal node?

If it is our turn (X) then we pick the maximum outcome from the children.

If it is the opponent’s turn (O) then we pick the minimum outcome from the children.

This process is known as the Minimax algorithm.

We play the move that gave us the maximum at the root.

41

More generally

Player A (us) maximize goodness.

Player B (opponent) minimizes goodness

Player A maximize

Player B minimize

Player A maximize

5

9 7

8 2

a b

c d

42

Games Trees are useful…

Provide “lookahead” to determine what move to make next

Build whole tree = we know how to play perfectly

5

9 7

8 2

a b

c d

43

But there’s one problem…

Games have large search trees:

Tic-Tac-Toe

There are 9 ways we can make the first move and opponent has 8 possible moves he can make.

Then when it is our turn again, we can make 7 possible moves for each of the 8 moves and so on….

9! = 362,880

44

Or chess…

Suppose we look at 16 possible moves at each step

And suppose we explore to a depth of 50 turns

Then the tree will have 1650=10120 nodes!

DeepThought (1990) searched to a depth of 10

45

So…

Need techniques to avoid enumerating and evaluating the whole tree

Heuristic search

46

Heuristic search

1. Organize search to eliminate large sets of possibilities Chess: Consider major moves first

2. Explore decisions in order of likely success Chess: Use a library of known strategies

3. Save time by guessing search outcomes Chess: Estimate the quality of a situation

Count pieces

Count pieces, using a weighting scheme

Count pieces, using a weighting scheme, considering threats

47

Mini-Max Algorithm

48

Let’s see it in action

10 2 12 16 2 7 -5 -80

10 16 7 -5

10 -5

10Max (Player A)

Max(Player A)

Min (Player B)

Evaluation function applied to the leaves!

49

Minimax, in reality

Rarely do we reach an actual leaf

Use estimator functions to statically guess goodness of missing subtrees

Max A moves

Min B moves

Max A moves2 7 1 8

50

Minimax, in reality



Max A moves

Min B moves

Max A moves

2

2 7

1

1 8

51

Minimax, in reality



Max A moves

Min B moves

Max A moves

2

2

2 7

1

1 8

52

Minimax

Trade-off

Quality vs. Speed

Quality: deeper search

Speed: use of estimator functions

Balancing

Relative costs of move generation and estimator functions

Quality and cost of estimation function

53

Minimax algorithm - pseudo code

Max( P, depth ) { if( depth == 0 ) return estimator(P) m = - for each possible move P’ m = max( m, Min(P’, depth-1) ) return m}

Min( P, depth ) { if( depth == 0 ) return estimator(P) m = for each possible move P’ m = min( m, Max(P’, depth-1) ) return m}

54

NegaMax Algorithm

Rewrite from the point of view of moving player:

negaMax( P, depth ) { if (depth == 0) return estimator(P) m = -

for each possible move P’ v = -negaMax( P’, depth-1 ) if m < v then m = v return m}

Can you prove these two versions are the same?

55

How fast?

Minimax is pretty slow even for a modest depth.

It is basically a brute force search.

What is the running time?

Each level of the tree has some average b moves per level. We have d levels. So the running time is O(bd).

56

Can we speed this up?

Let us observe the following game tree.

2

2

2 7

1

1

What do we know about the root?

Max

Min

Max

What do we know about the root’s right child?

57

Alpha Beta Pruning

58

Pruning

Minimax sometimes evaluates nodes that have no impact on the search.

E.g., Evaluating moves in Chess

My 1st move: I’m ahead a rook

My 2nd move: Opponent 1st move leaves me down a queen.

No need to look at Opponent’s 2nd move to find out it is check mate. Already know my 1st move is better.

59

Alpha Beta Pruning

Idea: Track “window” of expectations. Two variables:

– Best score so far at a max node: increases

At a child min node:• Parent wants max. To affect the parent’s current , our

cannot drop below .

– Best score so far at a min node: decreases

At a child max node. • Parent wants min. To affect the parent’s current , our

cannot get above the parent’s .

Either case: If :• Stop searching further subtrees of that child. They do

not matter!

Start the process with an infinite window ( = -, = ).

60

Alpha-Beta algorithm - pseudo code

MaxAB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < = max(, MinAB( P’, depth-1, , )) return }

MinAB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < = min(, MaxAB( P’, depth-1, , )) return }

61

Alpha-Beta algorithm - Alternative

AB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < v = -AB( P’, depth-1, -, - )

if < v then = v

return }

In MinAB replace with - and with -, and negate the return value. Get MaxAB.

62

10 2 12

10 12

10

Alpha Beta Example

Max

Max

Min

Min

=10

= 12

> !

63

Alpha Beta Example

10 2 12 2 7

10 12 7

10 7

10Max

Max

Min

Min

= 10

=7

> !

64

Alpha Beta Pruning

Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is

within the root (Alpha, Beta) range.

Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.

65

Alpha Beta Pruning

Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is

within the root (Alpha, Beta) range.

Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.

66

Alpha Beta Pruning

Theorem: Let v(P) be the value of position P. Let X be the value returned by AB(, ).Then one of the following holds:

v(p) ≤ and X ≤

< v(P) < and X = v(P)

≤ v(P) and X ≥

The proof is by induction, and is an easy but tedious case analysis.

67

Alpha Beta speedup

Claim: The optimal Alpha Beta search tree is O(bd/2) nodes or the square root of the number of nodes in the regular Minimax tree.Can enable twice the depth

The speedup is greatly dependent on the order in which you consider moves at each node. Why?

68

Summary

Backtracking

Organized brute force

Answer to problem = set of successive decisions

When stuck, go back, try the remaining alternatives

No optimality guarantees

69

Summary

Game playing

Game trees

Mini-max algorithm

Optimization

Alpha-beta pruning

Heuristics search

Backtracking and Game Trees

Documents

Transcript of Backtracking and Game Trees