Adversarial Search - cs.duke.edu
Transcript of Adversarial Search - cs.duke.edu
![Page 2: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/2.jpg)
Games
“Chess is the Drosophila of Artificial Intelligence”Kronrod, c. 1966
TuroChamp, 1948
![Page 3: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/3.jpg)
Why Study Games?Of interest:• Many human activities (especially intellectual ones) can be
modeled as games.• Prestige.!Convenient:• Perfect information.• Concise, precise rules.• Well defined “score”.
![Page 4: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/4.jpg)
“Solved” GamesA game is solved if an optimal strategy is known.!Strong solved: all positions.Weakly solved: some (start) positions.!
![Page 5: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/5.jpg)
Typical Game SettingGames are usually:
• 2 player• Alternating• Zero-sum
• Gain for one loss for another.• Perfect information
!Very much like search:
• Start state• Successor function• Terminal states (many)• Objective function
but alternating control.
![Page 6: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/6.jpg)
Game Trees
o oo…
…o o ox x
x
player 1 moves
player 1 moves
player 2 moves
ox
ox
ox
…o o
o
![Page 7: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/7.jpg)
Key Differences vs. Search
p1
p2 p2 p2
p1p1p1
…
only get score here
you select to max score
they select to min score
![Page 8: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/8.jpg)
Minimax AlgorithmMax player: select action to maximize return.Min player: select action to minimize return.!This is optimal for both players (if zero sum).Assumes perfect play, worst case.!Can run as depth first:
• Time O(bd)• Space O(bd)
![Page 9: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/9.jpg)
Minimaxp1
p2 p2 p2
p1 p1 p1 p1 p1 p110 5 -3 20 -5 2
max
min
5 -3 -5
5
![Page 10: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/10.jpg)
In PracticeDepth is too deep.
• 10s to 100s of moves.Breadth is too broad.
• Chess: 35, Go: 361.!Full search never terminates for non-trivial games.!Solution: substitute evaluation function.
• Like a heuristic - estimate value.• Perhaps run to fixed depth then estimate.
![Page 11: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/11.jpg)
Search Control• Horizon Effects
• What if something interesting at horizon + 1?• How do you know?
!• When to generate more nodes?• How to selectively expand the frontier?• How to allocate fixed move time?
![Page 12: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/12.jpg)
PruningSingle most useful search control method:
• Throw away whole branches.• Use the min-max behavior.
!• Cutoff search at min nodes where max can force a better
outcome.!
• Cutoff search at max nodes when min can force a worse outcome.!
Resulting algorithm: alpha-beta pruning.
![Page 13: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/13.jpg)
Alpha-Betap1
p2 p2 p2
p1 p1 p1 p1 p1 p110 5 -3 20 -5 2
max
min
5
![Page 14: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/14.jpg)
Alpha-BetaEmpirically, has the effect of reducing the branching factor by a square root for many problems.!Effectively doubles the search horizon.!Alpha-beta makes the difference between novice and expert computer game players. Most successful players use alpha-beta.
![Page 15: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/15.jpg)
Deep Blue (1997)
480 Special Purpose Chips200 million positions/secSearch depth 6-8 moves (up to 20)
![Page 16: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/16.jpg)
Games TodayWorld champion level:• Backgammon• Chess• Checkers (solved)• Othello• Some poker types:“Heads-up Limit Hold’em Poker is Solved”, Bowling et al., Science, January 2015.
!Perform well:• Bridge• Other poker types!Far off: Go
![Page 17: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/17.jpg)
![Page 18: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/18.jpg)
Go
![Page 19: Adversarial Search - cs.duke.edu](https://reader030.fdocuments.net/reader030/viewer/2022012604/61996fad16f1c07ab8160a4f/html5/thumbnails/19.jpg)
Very Recently
Fan HuiEuropean Go
Champion
AlphaGo(Google Deepmind)
0 - 5