Enhancements for Multi-Player Monte-Carlo Tree Search
J. (Pim) A.M. Nijssen, Mark H.M. Winands
29 September 2010
5 October 2010 Enhancements for Multi-Player Monte-Carlo Tree Search 2
Overview
• Introduction
• Progressive History
• MP-MCTS-Solver
• Test domains
• Experiments and Results
• Conclusions
• Future Research
Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – More than 2 players
  – Techniques
    • maxⁿ (Luckhardt and Irani, 1986)
    • Paranoid (Sturtevant and Korf, 2000)
  – Games
    • Chinese Checkers
    • Hearts
Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Best-first search technique
  – Monte Carlo simulations
  – Four phases
    • Selection (UCT)
    • Expansion (1 node per sample)
    • Playout (ε-greedy)
    • Backpropagation
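The four phases can be sketched as follows. This is a minimal illustration, not the authors' Java implementation: the counting game (players alternately add 1 or 2, and whoever reaches `TARGET` wins), the greedy playout heuristic, and all names are assumptions made for the example. Only C = 0.2 and ε = 0.05 are taken from the experiments slide.

```python
import math
import random

N_PLAYERS = 3    # assumption: toy three-player setting
TARGET = 10      # assumption: hypothetical counting game, reaching TARGET wins
C = 0.2          # exploration constant from the experiments slide
EPSILON = 0.05   # playout randomness from the experiments slide


class Node:
    def __init__(self, total, player, parent=None):
        self.total = total                  # counter of the toy game
        self.player = player                # player to move in this node
        self.parent = parent
        self.children = {}                  # move -> Node
        self.visits = 0
        self.scores = [0.0] * N_PLAYERS     # one accumulated score per player

    def moves(self):
        return [] if self.total >= TARGET else [1, 2]


def uct_child(node):
    """Selection: pick the child with the highest UCT value for the mover."""
    def value(child):
        mean = child.scores[node.player] / child.visits
        return mean + C * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=value)


def playout(total, player, rng):
    """Playout: epsilon-greedy moves until the game ends; returns a reward list."""
    while total < TARGET:
        if rng.random() < EPSILON:
            move = rng.choice([1, 2])                 # random move
        else:
            move = 2 if total + 2 >= TARGET else 1    # toy greedy heuristic
        total += move
        player = (player + 1) % N_PLAYERS
    winner = (player - 1) % N_PLAYERS                 # player who moved last
    reward = [0.0] * N_PLAYERS
    reward[winner] = 1.0
    return reward


def mcts(root, iterations, seed=0):
    rng = random.Random(seed)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while node.moves() and len(node.children) == len(node.moves()):
            node = uct_child(node)
        # 2. Expansion: add one node per sample.
        untried = [m for m in node.moves() if m not in node.children]
        if untried:
            move = rng.choice(untried)
            child = Node(node.total + move, (node.player + 1) % N_PLAYERS, node)
            node.children[move] = child
            node = child
        # 3. Playout from the new node.
        reward = playout(node.total, node.player, rng)
        # 4. Backpropagation: update every node on the path to the root.
        while node is not None:
            node.visits += 1
            for p in range(N_PLAYERS):
                node.scores[p] += reward[p]
            node = node.parent
    # Final move choice: the most visited root child.
    return max(root.children, key=lambda m: root.children[m].visits)
```

Calling `mcts(Node(0, 0), 500)` returns the move (1 or 2) whose child was visited most often. Choosing the most visited child is one common final-move rule; the slides do not say which rule the authors used.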
Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Stores tuple of size N in nodes
  – Game returns tuple of size N
    • Winner gets a score of 1, losers get a score of 0
    • Score is split in case of multiple winners
      – e.g. [½, ½, 0] is returned if Players 1 and 2 both win
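The scoring rule can be written as a small helper. This is a sketch only: the function name and the 0-based player indices are assumptions for the example, so Players 1 and 2 from the slide become indices 0 and 1.

```python
from fractions import Fraction


def reward_tuple(winners, n_players):
    """Return one score per player; the winners share a total score of 1."""
    if not winners:
        raise ValueError("at least one winner is required")
    share = Fraction(1, len(winners))
    # Winners get an equal share, losers get 0.
    return [share if p in winners else Fraction(0) for p in range(n_players)]
```

For example, `reward_tuple({0, 1}, 3)` gives `[Fraction(1, 2), Fraction(1, 2), Fraction(0, 1)]`, which is the [½, ½, 0] case above.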
Introduction
• Enhancements for Multi-Player Monte-Carlo Tree Search
  – Progressive History
  – Multi-Player Monte-Carlo Tree Search Solver
Progressive History
• Combination of Progressive Bias (Chaslot et al., 2008) and the history heuristic (Schaeffer, 1983)
• Move selection strategy uses action information
• More information available
• Information is less accurate
• Influence decreases over time
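Progressive History

$$v_i = \frac{s_i}{n_i} + C \times \sqrt{\frac{\ln(n_p)}{n_i}} + \frac{s_a}{n_a} \times \frac{W}{n_i - s_i + 1}$$

• UCT: the first two terms, s_i/n_i + C × √(ln(n_p)/n_i)
• History heuristic: the factor s_a/n_a
• Progressive Bias: the factor W/(n_i − s_i + 1)
• Divide by number of losses: the n_i − s_i in the denominator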
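The selection value can be computed directly from node and history statistics. The sketch below assumes the usual reading of the symbols, which the slide does not spell out: s_i and n_i are the score and visit count of child i, n_p is the visit count of its parent, and s_a and n_a are the accumulated score and play count of move a in the history table. The function name, the default W, and the handling of n_a = 0 are assumptions for the example.

```python
import math


def progressive_history_value(s_i, n_i, n_p, s_a, n_a, C=0.2, W=5.0):
    """UCT value plus a history bonus that fades as the child collects losses."""
    uct = s_i / n_i + C * math.sqrt(math.log(n_p) / n_i)
    # History heuristic: average score of move a over all earlier plays.
    history = s_a / n_a if n_a > 0 else 0.0   # assumption: no bonus without data
    # Progressive Bias weight, divided by the number of losses (n_i - s_i) plus 1.
    bias = W / (n_i - s_i + 1)
    return uct + history * bias
```

With W = 0 the function reduces to plain UCT. For a fixed history score, the bonus shrinks as n_i − s_i grows, which matches the "influence decreases over time" point.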
MP-MCTS-Solver
• Multi-Player version of MCTS-Solver (Winands et al., 2008)
• Updating game-theoretical values
• Update rules
  – Standard (mate in one, one winner)
  – Paranoid
  – First winner
MP-MCTS-Solver
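[Figure: example search tree with nodes A to I and levels labeled Player 3 and Player 1. Nodes carry proven value tuples such as [1,0,0] and [0,1,0]. For the node marked [?], the Paranoid rule gives [0,1,0] and the First winner rule gives [1,0,0].]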
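The three update rules can be sketched as a function that decides whether a node is proven from the proven winners of its children. This follows one reading of the rules and is not the authors' code. The assumptions are: the standard rule proves a node only on a win for the player to move or when all children share one winner, the paranoid rule removes the root player from the candidate winners, and the first-winner rule picks the candidate who moves soonest after the current player. Players are 0-based and all names are hypothetical.

```python
def proven_winner(child_winners, mover, root_player, n_players, rule="standard"):
    """Return the proven winner of a node, or None if nothing can be proven.

    child_winners holds, per child, the proven winner (player index) or None
    when that child has no game-theoretical value yet.
    """
    # A proven win for the player to move in any child proves the node.
    if mover in child_winners:
        return mover
    # Otherwise every child must be proven before anything can be concluded.
    if not child_winners or any(w is None for w in child_winners):
        return None
    winners = set(child_winners)
    if len(winners) == 1:
        return winners.pop()          # all children lead to the same winner
    if rule == "standard":
        return None                   # different winners: leave unproven
    if rule == "paranoid":
        winners.discard(root_player)  # the mover will not let the root player win
        return winners.pop() if len(winners) == 1 else None
    if rule == "first_winner":
        # Choose the candidate winner who is next in the turn order.
        order = [(mover + k) % n_players for k in range(1, n_players)]
        return next(p for p in order if p in winners)
    raise ValueError(f"unknown rule: {rule}")
```

In a three-player case where Player 3 (index 2) is to move, Player 1 (index 0) is assumed to be the root player, and the children are proven wins for Players 1 and 2, the call `proven_winner([0, 1], 2, 0, 3, rule)` returns `None` for `"standard"`, `1` for `"paranoid"` (the tuple [0,1,0]), and `0` for `"first_winner"` (the tuple [1,0,0]).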
Test domains
• Multi-player games
• Zero-sum
• Perfect information

• Focus
• Chinese Checkers
Focus
• Capturing pieces by creating stacks
• Goal
  – Total number of pieces captured
  – Number of pieces captured from each opponent
Focus
• Moving
  – Only stacks one owns
  – Orthogonally
  – Move as many squares as the number of pieces
  – Maximum stack size is 5
• Capture pieces by creating larger stacks
Chinese Checkers
• Goal: move pieces to other side of the board
• Move pieces to adjacent fields or jump over other pieces
  – Sequential jumps
Experiments and Results
• Processor: AMD64 2.4 GHz
• Programming language: Java 6
• MCTS settings: C = 0.2, ε = 0.05
• Time: 2.5 s per turn
• 3360 games per tournament
• All possible configurations
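One plausible reading of "all possible configurations" is every seating of two player types in which both types appear at least once. The slide does not confirm this, so the sketch below treats it as an assumption; the type labels "A" and "B" are hypothetical.

```python
from itertools import product


def seat_configurations(n_players, player_types=("A", "B")):
    """All seatings of the player types in which more than one type takes part."""
    return [
        seats
        for seats in product(player_types, repeat=n_players)
        if len(set(seats)) > 1   # skip seatings where every seat has the same type
    ]
```

Under this assumption there are 2, 6 and 14 configurations for 2, 3 and 4 players. The 3360 games per tournament divide evenly over each of these counts, which is consistent with this reading but does not prove it.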
Experiments and Results
• Progressive History in Focus

| W | 2 players | 3 players | 4 players |
|---|---|---|---|
| 0 | 52.0% | 51.2% | 50.8% |
| 0.05 | 59.0% | 61.1% | 57.5% |
| 0.1 | 59.8% | 63.0% | 58.9% |
| 0.25 | 61.3% | 62.9% | 59.4% |
| 0.5 | 64.1% | 65.5% | 59.9% |
| 1 | 66.0% | 65.4% | 58.2% |
| 3 | 62.2% | 65.2% | 59.6% |
| 5 | 57.9% | 63.8% | 59.6% |
| 7.5 | 51.3% | 60.6% | 57.1% |
| 10 | 47.4% | 57.8% | 56.9% |
Experiments and Results
• Progressive History in Chinese Checkers

| W | 2 players | 3 players | 4 players |
|---|---|---|---|
| 0.25 | 52.8% | 59.0% | 56.9% |
| 0.5 | 58.2% | 62.8% | 58.3% |
| 1 | 67.8% | 63.5% | 61.9% |
| 3 | 79.9% | 66.7% | 66.4% |
| 5 | 83.5% | 65.8% | 66.8% |
| 10 | 83.2% | 65.3% | 69.6% |
| 15 | 81.0% | 65.0% | 69.2% |
| 20 | 60.8% | 60.2% | 63.2% |
Experiments and Results
• Divide by number of losses

| Game | 2 players | 3 players | 4 players |
|---|---|---|---|
| Focus | 64.8% | 61.0% | 52.0% |
| Chinese Checkers | 57.6% | 54.8% | 53.9% |
Experiments and Results
• MP-MCTS-Solver in Focus

| Update rule | 2 players | 3 players | 4 players |
|---|---|---|---|
| Standard | 53.0% | 54.9% | 53.3% |
| Paranoid | 51.9% | 50.4% | 44.9% |
| First winner | 52.8% | 51.5% | 43.4% |
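To judge how far a win rate must be from 50% to count as significant with 3360 games, a normal-approximation confidence interval gives a rough guide. This is a sketch under the assumption that each game is an independent win-or-loss trial, which ignores shared wins; the slides do not state which test the authors used, and the function name is hypothetical.

```python
import math


def win_rate_interval(win_rate, games=3360, z=1.96):
    """Approximate 95% confidence interval for a win rate (normal approximation)."""
    half_width = z * math.sqrt(win_rate * (1.0 - win_rate) / games)
    return win_rate - half_width, win_rate + half_width
```

Near a 50% win rate the half-width is about 1.7 percentage points, so under these assumptions a win rate such as 53.0% has an interval that lies entirely above 50%.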
Conclusions
• Progressive History
  – Significant enhancement in Chinese Checkers and Focus
  – Dividing by number of losses in Progressive Bias part increases performance
• MP-MCTS-Solver
  – Small but significant enhancement in Focus
  – Standard update rule works best
Future Research
• Test Progressive History in other games
• Compare Progressive History with similar techniques, like RAVE, prior knowledge (Gelly and Silver, 2007), Gibbs Sampling (Björnsson and Finnsson, 2009), etc.
• Create new update rules for MP-MCTS-Solver