COMP-4640: Intelligent & Interactive Systems Game Playing


Transcript of COMP-4640: Intelligent & Interactive Systems Game Playing

Page 1: COMP-4640: Intelligent & Interactive Systems Game Playing

COMP-4640: Intelligent & Interactive Systems Game Playing

A game can be formally defined as a search problem with:

- An initial state
- A set of operators (actions or moves)
- A terminal test
- A utility (payoff) function
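The four components above can be sketched in code. The following is a minimal illustration only, assuming a toy Nim-style game that is not part of the slides: players alternately remove 1 or 2 stones, and whoever takes the last stone wins. The names `Nim`, `NimState`, `actions`, `result`, `is_terminal`, and `utility` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NimState:
    stones: int   # stones left in the pile
    to_move: str  # "MAX" or "MIN"


class Nim:
    """Toy game (an assumption for illustration): take 1 or 2 stones; last stone wins."""

    def __init__(self, stones=5):
        # Initial state: full pile, MAX moves first.
        self.initial = NimState(stones, "MAX")

    def actions(self, state):
        # Set of operators: the legal moves in this state.
        return [n for n in (1, 2) if n <= state.stones]

    def result(self, state, action):
        # Apply an operator and hand the turn to the other player.
        next_player = "MIN" if state.to_move == "MAX" else "MAX"
        return NimState(state.stones - action, next_player)

    def is_terminal(self, state):
        # Terminal test: the game ends when the pile is empty.
        return state.stones == 0

    def utility(self, state):
        # Payoff from MAX's point of view. The player who just moved
        # took the last stone, so if it is MIN's turn, MAX has won.
        return +1 if state.to_move == "MIN" else -1
```

The utility is only meaningful for terminal states, which matches the zero-sum +1 / -1 convention used later in the slides.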

Page 2: COMP-4640: Intelligent & Interactive Systems Game Playing

1. Multi-agent environment
– Multi-player games involve planning and acting in environments populated by other active agents
– Agents use a sense/plan/act architecture that does not plan too far into the unpredictable future
– But with proper information, an agent can construct a plan that considers the effects of the actions of other agents
– In AI we will consider the special case of games that are:
• deterministic
• turn-taking
• two-player
• zero-sum, with perfect information

2. Zero-Sum Games
– Either one of the players wins (and the other loses), or a draw results
– +1 win, -1 loss, 0 draw

3. The agents' utility functions make the games adversarial


Page 3: COMP-4640: Intelligent & Interactive Systems Game Playing

Multi-agent environment: Robot Soccer


Page 4: COMP-4640: Intelligent & Interactive Systems Game Playing
Page 5: COMP-4640: Intelligent & Interactive Systems Game Playing

Game tree (2-player, deterministic, turns)

Page 6: COMP-4640: Intelligent & Interactive Systems Game Playing

The Minimax Algorithm

Page 7: COMP-4640: Intelligent & Interactive Systems Game Playing


The Minimax Algorithm
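Since the slide figures are not reproduced in this transcript, here is a minimal sketch of minimax, assuming the game tree is given as nested Python lists whose leaves are utility values from MAX's point of view. The function name `minimax` and the example tree are illustrative assumptions, not taken from the slides.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of a tree given as nested lists of leaf utilities."""
    # A bare number is a terminal state: return its utility.
    if isinstance(node, (int, float)):
        return node
    # Otherwise evaluate every child with the other player to move.
    values = [minimax(child, not maximizing) for child in node]
    # MAX picks the largest backed-up value, MIN the smallest.
    return max(values) if maximizing else min(values)


# Hypothetical two-ply tree: MAX moves at the root, MIN at the next level.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # the MIN nodes back up 3, 2, 2, so MAX gets 3
```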

Page 8: COMP-4640: Intelligent & Interactive Systems Game Playing

• The evaluation function:
– Must agree with the utility function on terminal states (goal states)
– Must be of reasonable complexity so that it can be computed quickly (this is a trade-off between accuracy and time)
– Should be accurate

• The performance of the game-playing system depends on the accuracy ("goodness") of the evaluation function

Page 9: COMP-4640: Intelligent & Interactive Systems Game Playing


• One problem with using minimax is that it may not be feasible to search the whole game tree for a minimax decision (move or action)

• Using depth-limited search may speed up the minimax decision process, but instead of using the utility function one would need to construct an evaluation function.

• This evaluation function would provide an estimate of the expected utility of a game position
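As a rough illustration of such an estimate, the sketch below scores a chess position by material alone. This is an assumption for illustration: the slides do not specify any particular evaluation function, the piece values are the conventional textbook ones, and the string encoding of a position (uppercase for MAX's pieces, lowercase for MIN's) is hypothetical.

```python
# Conventional material values (an assumption, not taken from the slides).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_eval(pieces):
    """Estimate a position from MAX's side by summing material.

    `pieces` is a string of piece letters: uppercase belongs to MAX,
    lowercase to MIN.
    """
    score = 0
    for piece in pieces:
        value = PIECE_VALUES[piece.lower()]
        score += value if piece.isupper() else -value
    return score


# MAX: king, queen, rook, three pawns (17). MIN: king, two rooks, two pawns (12).
print(material_eval("KQRPPPkrrpp"))  # 5
```

A function like this is cheap to compute, which is the accuracy-versus-time trade-off in miniature: it ignores everything about piece placement.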

Page 10: COMP-4640: Intelligent & Interactive Systems Game Playing

Properties of minimax
• Complete? Yes (if the tree is finite)
• Optimal? Yes (against an optimal opponent)
• Time complexity? O(b^m)
• Space complexity? O(bm) (depth-first exploration)

• For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible
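To get a feel for the size of b^m with the chess figures quoted above, the following sketch computes the order of magnitude. The helper name `log10_nodes` is hypothetical.

```python
import math


def log10_nodes(b, m):
    """Base-10 logarithm of b**m, the node count of a full tree of depth m."""
    return m * math.log10(b)


# Chess with b = 35 and m = 100: roughly 10^154 nodes.
print(round(log10_nodes(35, 100)))  # 154

# By contrast, a 4-ply search with b = 35 is about 1.5 million nodes.
print(35 ** 4)  # 1500625
```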

Page 11: COMP-4640: Intelligent & Interactive Systems Game Playing

Once we have developed a good evaluation function, we must also consider:

• The depth-limit

• The Horizon Problem
– Difficult to eliminate
– Arises when a program is facing a move by the opponent that causes serious damage and is ultimately unavoidable
– Stalling pushes the move over the horizon to a place where it can't be detected


Page 12: COMP-4640: Intelligent & Interactive Systems Game Playing

• Once we have an evaluation function and a depth-limit we can then re-apply minimax search.

• However, even with depth-limited search, minimax may still be inefficient.

• Minimax will expand nodes that need not be searched.

• By making our search method more efficient, we will be able to search at deeper levels of our game tree.


Page 13: COMP-4640: Intelligent & Interactive Systems Game Playing

Alpha-Beta Pruning

1. Search below a MIN node may be alpha-pruned if its beta value is ≤ the alpha value of some MAX ancestor.

2. Search below a MAX node may be beta-pruned if its alpha value is ≥ the beta value of some MIN ancestor.

Page 14: COMP-4640: Intelligent & Interactive Systems Game Playing

Alpha-Beta Pruning (αβ prune)

• Rules of Thumb
– α is the highest max value found so far
– β is the lowest min value found so far
– If Min is on top: alpha prune
– If Max is on top: beta prune
– You will only have alpha prunes at the Min level
– You will only have beta prunes at the Max level
– See detailed algorithm p167
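The rules of thumb can be sketched as follows. This is a minimal illustration, not the textbook algorithm referenced above: it assumes the game tree is given as nested lists with numeric leaves, and the `visited` parameter is a hypothetical addition used only to show which leaves are actually examined.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)  # record each leaf that is evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)  # highest max value found so far
            if alpha >= beta:          # beta prune (at a MAX level)
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)        # lowest min value found so far
        if beta <= alpha:              # alpha prune (at a MIN level)
            break
    return value


# Hypothetical two-ply tree: MAX at the root, MIN below.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, visited=seen))  # 3
print(seen)  # [3, 12, 8, 2, 14, 5, 2]
```

In this example the leaves 4 and 6 are never examined: once the second MIN node sees the value 2, it cannot beat the 3 that MAX already has, so the rest of that branch is alpha-pruned.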

Page 15: COMP-4640: Intelligent & Interactive Systems Game Playing


Page 16: COMP-4640: Intelligent & Interactive Systems Game Playing

Alpha-Beta Pruning

[Figure: example game tree, first step, with a β value marked]

Page 17: COMP-4640: Intelligent & Interactive Systems Game Playing

Alpha-Beta Pruning

[Figure: example game tree, continued, with α and β values marked]

Page 18: COMP-4640: Intelligent & Interactive Systems Game Playing

Alpha-Beta Pruning

[Figure: example game tree, continued, with two α values and one β value marked]

Page 19: COMP-4640: Intelligent & Interactive Systems Game Playing


Page 20: COMP-4640: Intelligent & Interactive Systems Game Playing

[Figure: example game tree]

Page 21: COMP-4640: Intelligent & Interactive Systems Game Playing

[Figure: example game tree with α and β marked]

Page 22: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

• Initial values of leaves indicate board state
• Use percentage chance based upon roll for first calculated value
• Min eval f(n) selects Min value
• The second roll uses different assigned percentage chance
• Max eval f(n) selects Max value
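A minimal sketch of expecti-minimax follows, assuming a tree encoding that is not from the slides: a leaf is a number, and an internal node is a pair `(kind, children)` where `kind` is `"max"`, `"min"`, or `"chance"`, and the children of a chance node are `(probability, child)` pairs. The example reuses the two chance-node calculations that appear in the slide figures, (3*1.0) and (0*0.67 + 6*0.33); the overall tree shape here is an assumption.

```python
def expectiminimax(node):
    """Value of a game tree that mixes MAX, MIN, and chance nodes."""
    # A bare number is a leaf: the value of the board state.
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Weight each outcome by the probability of the roll.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind}")


certain = ("chance", [(1.0, 3)])            # (3*1.0) = 3
risky = ("chance", [(0.67, 0), (0.33, 6)])  # (0*0.67 + 6*0.33), about 2
root = ("max", [certain, risky])

print(expectiminimax(risky))  # approximately 1.98
print(expectiminimax(root))   # MAX prefers the certain branch
```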

Page 23: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

[Figure: expecti-minimax tree, first step; chance node (3*1.0) = 3]

Page 24: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

[Figure: expecti-minimax tree, continued; chance node (3*1.0) = 3]

Page 25: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

[Figure: expecti-minimax tree, continued; chance nodes (3*1.0) = 3 and (0*0.67 + 6*0.33) ≈ 2]

Page 26: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

[Figure: expecti-minimax tree, continued; chance nodes (3*1.0) = 3 and (0*0.67 + 6*0.33) ≈ 2]

Page 27: COMP-4640: Intelligent & Interactive Systems Game Playing

Game of Chance: Expecti-minimax

[Figure: expecti-minimax tree, final step; chance node (3*1.0) = 3 and two chance nodes with (0*0.67 + 6*0.33) ≈ 2]

Page 28: COMP-4640: Intelligent & Interactive Systems Game Playing

Cutting off search

MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval

Does it work in practice?
b^m = 10^6, b = 35 ⇒ m = 4

4-ply lookahead is a hopeless chess player!
– 4-ply ≈ human novice
– 8-ply ≈ typical PC, human master
– 12-ply ≈ Deep Blue, Kasparov
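The two substitutions (Cutoff? for Terminal?, Eval for Utility) can be sketched as below. This is an illustration under stated assumptions: the function name `minimax_cutoff` and its callable parameters are hypothetical, and the toy game (a running total to which each player adds 1 or 3, with no terminal states) exists only to make the example runnable.

```python
def minimax_cutoff(state, depth, maximizing, actions, result, cutoff, evaluate):
    """Depth-limited minimax: stop where cutoff() says so and use evaluate()."""
    # Cutoff? replaces Terminal?, and Eval replaces Utility.
    if cutoff(state, depth):
        return evaluate(state)
    values = [
        minimax_cutoff(result(state, a), depth + 1, not maximizing,
                       actions, result, cutoff, evaluate)
        for a in actions(state)
    ]
    return max(values) if maximizing else min(values)


# Toy game (an assumption): the state is a running total, each move adds 1 or 3.
LIMIT = 3  # search 3 plies ahead


def actions(state):
    return [1, 3]


def result(state, action):
    return state + action


def cutoff(state, depth):
    # A real cutoff test would also return True for terminal states.
    return depth >= LIMIT


def evaluate(state):
    return state  # estimate: MAX likes large totals


print(minimax_cutoff(0, 0, True, actions, result, cutoff, evaluate))  # 7
```

With MAX moving first and third, the best line is +3, +1, +3, giving 7: MIN's only influence is to add the smaller amount on its turn.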

Page 29: COMP-4640: Intelligent & Interactive Systems Game Playing

Deterministic games in practice

• Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

• Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and uses undisclosed methods for extending some lines of search up to 40 ply.

• Othello: human champions refuse to compete against computers, which are too good.

• Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Page 30: COMP-4640: Intelligent & Interactive Systems Game Playing

http://www.research.ibm.com/deepblue/