Backtracking and Game Trees

69
Backtracking and Game Trees 15-211: Fundamental Data Structures and Algorithms Margaret Reid-Miller 5 April 2005

description

Backtracking and Game Trees. 15-211: Fundamental Data Structures and Algorithms. Margaret Reid-Miller 5 April 2005. Announcements. HW5 now available! May do in groups of two. Review in recitation No fancy data structures except trie! Due Monday 11:59 pm. A. X O X O X O X O O. B. C. - PowerPoint PPT Presentation

Transcript of Backtracking and Game Trees

Page 1: Backtracking and Game Trees

Backtracking and Game Trees

15-211: Fundamental Data Structures and Algorithms

Margaret Reid-Miller5 April 2005

Page 2: Backtracking and Game Trees

2

Announcements

HW5 now available!

May do in groups of two.

Review in recitation

No fancy data structures except trie!

Due Monday 11:59 pm

Page 3: Backtracking and Game Trees

Backtracking

X O XO X OX O O C D

B

A

Page 4: Backtracking and Game Trees

4

Backtracking

An algorithm-design technique

“Organized brute force”

Explore the space of possible answers

Do it in an organized way

Example: maze• Try S, E, N, W

IN

OUT

Page 5: Backtracking and Game Trees

5

Backtracking is useful …

… when a problem is too hard to be solved directly 8-queens problem Knight’s tour

… when we have limited time and can accept a good but potentially not optimal solution Game playing (second part of lecture) Planning (not in this class)

Page 6: Backtracking and Game Trees

6

Backtracking decisions

Develop answers by identifying a set of successive decisions. Maze

Where do I go now: N, S, E or W?

8 Queens Where do I put the next queen?

Knight’s tour Where do I jump next?

Page 7: Backtracking and Game Trees

7

Backtracking decisions

Develop answers by identifying a set of successive decisions.

Pure Backtracking Decisions can be binary: ok or

impossible

Cannot put an “X” in two places

Can move the rook two spaces forward

Page 8: Backtracking and Game Trees

8

Backtracking decisions

Develop answers by identifying a set of successive decisions.

Pure Backtracking Decisions can be binary: ok or

impossible

Heuristic Backtracking Decisions can have goodness:

Good to sacrifice a pawn to take a bishop

Bad to sacrifice the queen to take a pawn

Bad not to block in tic-tac-toe

Page 9: Backtracking and Game Trees

9

Backtracking technique

When you reach a dead-end

A

B

C

Page 10: Backtracking and Game Trees

10

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

A

B

C

Page 11: Backtracking and Game Trees

11

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

Undo its consequences

A

B

C

A

B

Page 12: Backtracking and Game Trees

12

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

Undo its consequences

Is there a new choice? If not, you are at another dead-end

A

B

C

A

B

Page 13: Backtracking and Game Trees

13

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

A

B

C

A

B

Page 14: Backtracking and Game Trees

14

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

Undo its consequences

A

B

C

A

B

A

Page 15: Backtracking and Game Trees

15

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

Undo its consequences

Is there a new choice? If so, try that

A

B

C

A

B

A

E

Page 16: Backtracking and Game Trees

16

Backtracking technique

When you reach a dead-endWithdraw the most recent choice

Undo its consequences

Is there a new choice? If so, try that If not, you are at another dead-end

Search tree

A

B

C

E

Page 17: Backtracking and Game Trees

17

Search trees

The tree

Is rarely explicit as a data structure!

How do we keep track of our search?

Use a stack (“control stack”)

Represents a stack of tentative decisions

Tracks decisions to be undone when backtracking.

Can be implicit in recursive calls

Page 18: Backtracking and Game Trees

18

No optimality guarantees

If there are multiple solutions, simple backtracking will find one of them, but not necessarily the optimal.

No guarantees that we’ll reach a solution quickly.

Page 19: Backtracking and Game Trees

19

Backtracking Summary

Organized brute force

Formulate problem so that answer amounts to taking a set of successive decisions

When stuck, go back, try the remaining alternatives

Does not give optimal solutions

Page 20: Backtracking and Game Trees

Games

X O XO X OX O O

Page 21: Backtracking and Game Trees

21

Why Make Computers Play Games?

People have been fascinated by games since the dawn of civilization.

Idealization of real-world problems containing adversary situations

Typically rules are very simple

State of the world is fully accessible

Example: auctions

Page 22: Backtracking and Game Trees

22

No, not Quake (although interesting research there, too)

Simpler Strategy/Board Game Chess

Checkers

Othello

Go

Let’s look at how we might go about playing Tic-Tac-Toe

What Kind of Games?

Page 23: Backtracking and Game Trees

23

Consider this position

We are playing X, and it is now our turn.

Page 24: Backtracking and Game Trees

24

Let’s write out all possibilities

Each number represents a position after each legalmove we have.

Page 25: Backtracking and Game Trees

25

Now let’s look at their options

Here we are looking at all of the opponent responses to the first possible move we could make.

Page 26: Backtracking and Game Trees

26

Now let’s look at their options

Opponent options after our second possibility. Not good again…

Page 27: Backtracking and Game Trees

27

Now let’s look at their options

Struggling…

Page 28: Backtracking and Game Trees

28

More interesting case

Now they don’t have a way to win on their next move. So now we have to consider our responses to their responses.

Page 29: Backtracking and Game Trees

29

Our options

We have a win for any move they make. So the original position in purple is an X win.

Page 30: Backtracking and Game Trees

30

Finishing it up…

They win again if we take our fifth move.

Page 31: Backtracking and Game Trees

31

Summary of the Analysis

So which move should we make? ;-)

Page 32: Backtracking and Game Trees

32

A Tic Tac Toe Game Tree

moves

moves

moves

Page 33: Backtracking and Game Trees

33

A Tic Tac Toe Game Tree

moves

moves

moves

Nodes•Denote board configurations•Include a score

Edges•Denote legal moves

Path•Successive moves by the players

Page 34: Backtracking and Game Trees

34

A path in a game tree

KARPOV-KASPAROV, 1985

1.e4 c52.Nf3 e63.d4 cxd44.Nxd4 Nc6 5.Nb5 d6 6.c4 Nf6 7.N1c3 a6 8.Na3 d5!? 9.cxd5 exd5 10.exd5 Nb4 11.Be2!?N Bc5! 12.0-0 0-0 13.Bf3 Bf5 14.Bg5 Re8! 15.Qd2 b5 16.Rad1 Nd3! 17.Nab1? h6! 18.Bh4 b4! 19.Na4 Bd6 20.Bg3 Rc8

21.b3 g5!!  22.Bxd6 Qxd6 23.g3 Nd7!  24.Bg2 Qf6! 25.a3 a5 26.axb4 axb4 27.Qa2 Bg6 28.d6 g4! 29.Qd2 Kg7 30.f3 Qxd6 31.fxg4 Qd4+ 32.Kh1 Nf6  33.Rf4 Ne4 34.Qxd3 Nf2+ 35.Rxf2 Bxd3 36.Rfd2 Qe3! 37.Rxd3 Rc1!! 38.Nb2 Qf2! 39.Nd2 Rxd1+40.Nxd1 Re1+   White resigned

Page 35: Backtracking and Game Trees

35

Two players

Deterministic no randomness, e.g., dice

Perfect information No hidden cards or hidden Chess pieces…

Finite Game must end in finite number of moves

Zero sum Total winnings of all players

Properties of Tic Tac Toe

Page 36: Backtracking and Game Trees

36

Games Not Considered

Poker Hidden cards

Backgammon not deterministic (e.g., dice)

Prisoner’s dilemma, game of chicken Not zero-sum

Have rich and beautiful theories and algorithms, though. (see CS 15-859)

Page 37: Backtracking and Game Trees

37

Two-player games

We can define the value (goodness) of a certain game state (board).

What about the non-final board?

Look at board, assign value

Look at children in game tree, assign value

1 0 -1 ?

Page 38: Backtracking and Game Trees

38

How do we play?

Traverse the “game tree”.

Enumerate all possible moves at each node.

Start by evaluating the terminal positions, where the game is over.

Propogate the values up the tree, assuming we pick the best move for us and the opponent picks the best move for her.

Page 39: Backtracking and Game Trees

39

How to play?

moves

moves

moves

1 0 0 0 0 1

1 0 0 0 0 1

0 00

0

Page 40: Backtracking and Game Trees

40

How do we play?

What is the rule to compute the value of nonterminal node?

If it is our turn (X) then we pick the maximum outcome from the children.

If it is the opponent’s turn (O) then we pick the minimum outcome from the children.

This process is known as the Minimax algorithm.

We play the move that gave us the maximum at the root.

Page 41: Backtracking and Game Trees

41

More generally

Player A (us) maximize goodness.

Player B (opponent) minimizes goodness

Player A maximize

Player B minimize

Player A maximize

5

9 7

8 2

a b

c d

Page 42: Backtracking and Game Trees

42

Games Trees are useful…

Provide “lookahead” to determine what move to make next

Build whole tree = we know how to play perfectly

5

9 7

8 2

a b

c d

Page 43: Backtracking and Game Trees

43

But there’s one problem…

Games have large search trees:

Tic-Tac-Toe

There are 9 ways we can make the first move and opponent has 8 possible moves he can make.

Then when it is our turn again, we can make 7 possible moves for each of the 8 moves and so on….

9! = 362,880

Page 44: Backtracking and Game Trees

44

Or chess…

Suppose we look at 16 possible moves at each step

And suppose we explore to a depth of 50 turns

Then the tree will have 1650=10120 nodes!

DeepThought (1990) searched to a depth of 10

Page 45: Backtracking and Game Trees

45

So…

Need techniques to avoid enumerating and evaluating the whole tree

Heuristic search

Page 46: Backtracking and Game Trees

46

Heuristic search

1. Organize search to eliminate large sets of possibilities Chess: Consider major moves first

2. Explore decisions in order of likely success Chess: Use a library of known strategies

3. Save time by guessing search outcomes Chess: Estimate the quality of a situation

Count pieces

Count pieces, using a weighting scheme

Count pieces, using a weighting scheme, considering threats

Page 47: Backtracking and Game Trees

47

Mini-Max Algorithm

Page 48: Backtracking and Game Trees

48

Let’s see it in action

10 2 12 16 2 7 -5 -80

10 16 7 -5

10 -5

10Max (Player A)

Max(Player A)

Min (Player B)

Evaluation function applied to the leaves!

Page 49: Backtracking and Game Trees

49

Minimax, in reality

Rarely do we reach an actual leaf

Use estimator functions to statically guess goodness of missing subtrees

Max A moves

Min B moves

Max A moves2 7 1 8

Page 50: Backtracking and Game Trees

50

Minimax, in reality

Rarely do we reach an actual leaf

Use estimator functions to statically guess goodness of missing subtrees

Max A moves

Min B moves

Max A moves

2

2 7

1

1 8

Page 51: Backtracking and Game Trees

51

Minimax, in reality

Rarely do we reach an actual leaf

Use estimator functions to statically guess goodness of missing subtrees

Max A moves

Min B moves

Max A moves

2

2

2 7

1

1 8

Page 52: Backtracking and Game Trees

52

Minimax

Trade-off

Quality vs. Speed

Quality: deeper search

Speed: use of estimator functions

Balancing

Relative costs of move generation and estimator functions

Quality and cost of estimation function

Page 53: Backtracking and Game Trees

53

Minimax algorithm - pseudo code

Max( P, depth ) { if( depth == 0 ) return estimator(P) m = - for each possible move P’ m = max( m, Min(P’, depth-1) ) return m}

Min( P, depth ) { if( depth == 0 ) return estimator(P) m = for each possible move P’ m = min( m, Max(P’, depth-1) ) return m}

Page 54: Backtracking and Game Trees

54

NegaMax Algorithm

Rewrite from the point of view of moving player:

negaMax( P, depth ) { if (depth == 0) return estimator(P) m = -

for each possible move P’ v = -negaMax( P’, depth-1 ) if m < v then m = v return m}

Can you prove these two versions are the same?

Page 55: Backtracking and Game Trees

55

How fast?

Minimax is pretty slow even for a modest depth.

It is basically a brute force search.

What is the running time?

Each level of the tree has some average b moves per level. We have d levels. So the running time is O(bd).

Page 56: Backtracking and Game Trees

56

Can we speed this up?

Let us observe the following game tree.

2

2

2 7

1

1

What do we know about the root?

Max

Min

Max

What do we know about the root’s right child?

Page 57: Backtracking and Game Trees

57

Alpha Beta Pruning

Page 58: Backtracking and Game Trees

58

Pruning

Minimax sometimes evaluates nodes that have no impact on the search.

E.g., Evaluating moves in Chess

My 1st move: I’m ahead a rook

My 2nd move: Opponent 1st move leaves me down a queen.

No need to look at Opponent’s 2nd move to find out it is check mate. Already know my 1st move is better.

Page 59: Backtracking and Game Trees

59

Alpha Beta Pruning

Idea: Track “window” of expectations. Two variables:

– Best score so far at a max node: increases

At a child min node:• Parent wants max. To affect the parent’s current , our

cannot drop below .

– Best score so far at a min node: decreases

At a child max node. • Parent wants min. To affect the parent’s current , our

cannot get above the parent’s .

Either case: If :• Stop searching further subtrees of that child. They do

not matter!

Start the process with an infinite window ( = -, = ).

Page 60: Backtracking and Game Trees

60

Alpha-Beta algorithm - pseudo code

MaxAB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < = max(, MinAB( P’, depth-1, , )) return }

MinAB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < = min(, MaxAB( P’, depth-1, , )) return }

Page 61: Backtracking and Game Trees

61

Alpha-Beta algorithm - Alternative

AB( P, depth, , ) { if( depth == 0 ) return estimator(P)

for each possible move P’ && < v = -AB( P’, depth-1, -, - )

if < v then = v

return }

In MinAB replace with - and with -, and negate the return value. Get MaxAB.

Page 62: Backtracking and Game Trees

62

10 2 12

10 12

10

Alpha Beta Example

Max

Max

Min

Min

=10

= 12

> !

Page 63: Backtracking and Game Trees

63

Alpha Beta Example

10 2 12 2 7

10 12 7

10 7

10Max

Max

Min

Min

= 10

=7

> !

Page 64: Backtracking and Game Trees

64

Alpha Beta Pruning

Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is

within the root (Alpha, Beta) range.

Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.

Page 65: Backtracking and Game Trees

65

Alpha Beta Pruning

Does Alpha Beta ever return a different root value than Minimax? No! As long as the root value is

within the root (Alpha, Beta) range.

Alpha Beta does the same thing Minimax does, except it is able to detect parts of the tree that make no difference.

Page 66: Backtracking and Game Trees

66

Alpha Beta Pruning

Theorem: Let v(P) be the value of position P. Let X be the value returned by AB(, ).Then one of the following holds:

v(p) ≤ and X ≤

< v(P) < and X = v(P)

≤ v(P) and X ≥

The proof is by induction, and is an easy but tedious case analysis.

Page 67: Backtracking and Game Trees

67

Alpha Beta speedup

Claim: The optimal Alpha Beta search tree is O(bd/2) nodes or the square root of the number of nodes in the regular Minimax tree.Can enable twice the depth

The speedup is greatly dependent on the order in which you consider moves at each node. Why?

Page 68: Backtracking and Game Trees

68

Summary

Backtracking

Organized brute force

Answer to problem = set of successive decisions

When stuck, go back, try the remaining alternatives

No optimality guarantees

Page 69: Backtracking and Game Trees

69

Summary

Game playing

Game trees

Mini-max algorithm

Optimization

Alpha-beta pruning

Heuristics search