Wang/CIS 630 2
Games
n Two-agent, perfect-information games
• Two players move in turn until one wins (the other thereby loses), or the result is a draw
• Both players have perfect knowledge of the environment
• An idealized model of multiple-agent interaction, and maps exactly onto many common games, including chess
Minimax search terminology
n Two players: MAX and MIN
n Task is to find the next “best” move for MAX
n MAX moves first (next); the two players move alternately thereafter
n A ply of ply-depth k in a game tree consists of nodes at depths 2k and 2k+1
• The extent of search in a game tree is usually given in terms of ply depth
Why chess is hard
n Estimate of the number of nodes in the complete search graph for chess: ≈ 10^40
n Branching factor ≈ 35
n Search to termination in such games is impossible (it would take ≈ 10^22 centuries)
n This is why chess is an empirical phenomenon that has retained its fascination over centuries
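The time estimate above can be checked with back-of-the-envelope arithmetic (a sketch; the 10^9 nodes-per-second machine speed is an assumption, not a figure from the slides):

```python
# Rough check of the "10^22 centuries" claim, assuming a machine that
# examines 10^9 nodes per second (an assumed rate).
nodes = 1e40                       # estimated nodes in the chess search graph
rate = 1e9                         # assumed nodes examined per second
seconds_per_century = 100 * 365.25 * 24 * 3600
centuries = nodes / rate / seconds_per_century
print(f"{centuries:.1e}")          # prints 3.2e+21, i.e. on the order of 10^22
```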
Evaluation functions
n Best we can do: search to a specific depth, and apply a static evaluation function to the resulting state
• Measures the “worth” of being in that state
• Adopt a convention: if the state is favorable to MAX, the evaluation function is positive; if favorable to MIN, negative
n The evaluation function is game-specific, extracting features from the current state of the game
• E.g., in chess, piece count is important, weighted by type of piece and position
The minimax procedure
n When it is MAX’s turn, assume MAX will choose the best (highest-valued) node among his options
n So the backed-up value of a MAX node that is the parent of MIN tip nodes is the maximum of the static evaluations of those tip nodes
n The backed-up value of a MIN node that is the parent of MAX tip nodes is the minimum of its children’s values
Minimax procedure, cont.
n Minimax applies the static evaluation function to the leaf nodes, and “backs up” the values level by level through the game tree
• What if the rival is not smart? Minimax assumes an optimal opponent, so the backed-up value is a guarantee: a weaker rival can only do better for MAX
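The backing-up procedure can be sketched in a few lines of Python (a sketch; the nested-list tree representation, with numbers as leaf evaluations, is an assumption for illustration):

```python
# Minimal minimax sketch. A node is either a number (the static evaluation
# of a leaf) or a list of child nodes (an assumed representation).
def minimax(node, maximizing):
    if isinstance(node, (int, float)):
        return node                       # leaf: return its static evaluation
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX moves at the root; MIN moves at the next level down.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # MIN backs up 3, 2, 2; MAX backs up 3
```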
Tic-tac-toe example
n Consider the following evaluation function e(p):
• If p is not a winning position for either player, e(p) = (number of complete rows, columns, or diagonals still open for MAX) − (number of complete rows, columns, or diagonals still open for MIN)
• If p is a win for MAX, e(p) = +infinity; if p is a win for MIN, e(p) = −infinity
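This evaluation function is straightforward to implement directly (a sketch; the 9-character board-string representation and the X/O mark symbols are assumptions):

```python
import math

# The 8 winning lines of tic-tac-toe, as index triples into a 9-char board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def e(board, max_mark="X", min_mark="O"):
    """e(p) from the slide: board is a 9-character string, ' ' for empty."""
    for line in LINES:
        cells = [board[i] for i in line]
        if cells == [max_mark] * 3:
            return math.inf               # win for MAX
        if cells == [min_mark] * 3:
            return -math.inf              # win for MIN
    open_max = sum(all(board[i] != min_mark for i in line) for line in LINES)
    open_min = sum(all(board[i] != max_mark for i in line) for line in LINES)
    return open_max - open_min

print(e("    X    "))  # X in the center: 8 lines open for X, 4 for O -> 4
```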
Alpha-beta procedure
n Basic ideas behind alpha-beta:
• Don’t separate generation of the search tree from evaluation of nodes (do these together)
• If you know a move is going to be worse than the best alternative in hand, don’t waste time finding out how much worse it is
Alpha-beta cutoff
n Maintain two threshold values:
• α: greatest lower bound on this side (the best MAX can already guarantee)
• β: least upper bound on the rival side (the best MIN will allow)
• Cut off any branch whose value is < α or > β
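Put together, the cutoff rule looks like this in code (a sketch over the same assumed nested-list tree representation as plain minimax, with numbers as leaf evaluations):

```python
import math

# Alpha-beta sketch. A node is a number (leaf evaluation) or a list of
# child nodes (an assumed representation).
def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:       # beta cutoff: MIN will never allow this line
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:       # alpha cutoff: MAX already has something better
                break
        return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # 3 -- same answer as plain minimax, fewer leaves visited
```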
Efficiency of alpha-beta search
n The best case for alpha-beta occurs when nodes are ordered perfectly
• Impossible to guarantee this!
n In that case, the number of leaf nodes generated is O(b^(d/2)), where b is the branching factor and d the search depth
• Or: in the same time, a search without alpha-beta would only reach depth d/2
• Or: alpha-beta reduces the effective branching factor to about √b
n Happily, in practice alpha-beta comes close to achieving this optimal reduction
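To see what the O(b^(d/2)) bound buys, plug in chess-like numbers (a sketch; the exact best-case leaf count 2·b^(d/2) − 1 for even depth comes from Knuth and Moore's analysis, and b = 35, d = 8 are illustrative assumptions):

```python
b, d = 35, 8                       # assumed branching factor and search depth
plain = b ** d                     # leaves a full minimax search evaluates
best = 2 * b ** (d // 2) - 1       # best-case alpha-beta leaf count (even d)
print(f"minimax: {plain:.2e}, alpha-beta best case: {best:.2e}")
print(f"effective branching factor: {best ** (1 / d):.2f}")  # approaches sqrt(35) ~ 5.9 as d grows
```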
Other heuristics
n Progressive deepening
• Analyze the situation at progressively greater depths until time runs out
• Cheap, because the leaf level dominates the work: with branching factor b, (# of nodes at the leaf level) / (# of nodes before the leaf level) ≈ b − 1
n Singular extension
• Continue the search if one move is much better (or worse) than the rest
• Avoids the horizon effect caused by a fixed search depth
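A progressive-deepening driver can be sketched as follows (the depth-limited `search` function is a hypothetical stand-in for an alpha-beta search that returns its best move):

```python
import time

def progressive_deepening(search, budget_seconds):
    """Search at depths 1, 2, 3, ... and keep the deepest completed result."""
    deadline = time.monotonic() + budget_seconds
    best_move, depth = None, 1
    while time.monotonic() < deadline:
        best_move = search(depth)   # hypothetical depth-limited search
        depth += 1
    return best_move

# With a trivial stand-in "search", the driver returns whatever the deepest
# completed iteration produced before the budget expired.
move = progressive_deepening(lambda depth: depth, budget_seconds=0.01)
```

The payoff is an anytime algorithm: whenever the clock runs out, the move from the deepest fully completed search is available.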
Deep Blue vs. Garry Kasparov
n In May 1997, Deep Blue beat Garry Kasparov 3.5 to 2.5 in a six-game match
Features of Deep Blue
n Specific features of Deep Blue:
• Alpha-beta minimax search
• Progressive deepening for handling time constraints
• Singular extension to handle the horizon effect
• A complex static evaluation function, encoding ~6,000 features
• A weighted evaluation function “tuned” automatically against a library of 900 grandmaster games, and manually against grandmaster Joel Benjamin