AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch...

5

5/30/17 1 AlphaGo and Monte Carlo Tree Search Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade 1 ©Sham Kakade 2017 A feat previously thought to be (a decade?) away!!! 2016 26 AlphaGo deep RL defeats Lee Sedol (4-1)

Upload
others
Category

Documents
view
6
download
0

Embed Size (px):

Transcript of AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch...

Page 1: AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

1

AlphaGo and Monte Carlo Tree Search

Machine Learning for Big Data CSE547/STAT548, University of Washington

Sham Kakade

1©Sham Kakade 2017

A feat previously thought to be (a decade?) away!!!

9/28/16

12

2011

25

http://www.youtube.com/watch?v=WFR3lOm_xhE

2016

26

AlphaGo deep RL defeats Lee Sedol (4-1)

Page 2: AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

2

Chess vs. Alpha Go

1997, AI named “Deep Blue” beat chess world champion.

Search space: 𝑏": 𝑏 = 35, 𝑑 = 80Search space: 𝑏": 𝑏 = 250, 𝑑 = 150

Key Points

• Given current game state, – how to decide next action “a”?– How to evaluate each legal action?

Page 3: AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

3

Monte Carlo Tree Search (MCTS)• A popular heuristic search algorithm for game play– By lots of simulations and select the most visited action.

Evaluation

Node value: Wining rate

Key point: how to calculate this valueOne playout to simulate the game

Monte Carlo Tree Search (MCTS)• AlphaGo

– Key point: how to calculate the value for each action (path)

Page 4: AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

4

The visited number for this edge

The mean value of all simulations passing through it.

Winning rate

Winning rate is predicted by deep reinforcement learning andpredicted by fast rollout playout

Action probability predicted by supervised and reinforcement learning

V(s) is estimated in two different ways.

(Fast) Methods to evaluate V

30 million games

30 million self-‐play games

Page 5: AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

5

Thanks!

• ML for big data: – lots of different methods/challenges in the wild.– Participate in the ML community.

• Posters: – Looking forward to the poster session.

• Have a great summer!

• Some slides modified from: prlab.tudelft.nl/sites/default/files/deepmind_go_nature.pptx

Acknowledgments

©Sham Kakade 2017

neural networks and tree search - Semantic Scholar€¦ · The final version of AlphaGo used 40 search threads, 48 CPUs, and 8 GPUs. Distributed version of AlphaGo that exploited

neural networks and tree search - Semantic Scholar€¦ · The final version of AlphaGo used 40 search threads, 48 CPUs, and 8 GPUs. Distributed version of AlphaGo that exploited

Mastering the game of Go without human knowledge · Alphago Zero (This paper) The second Alphago paper Mastering the game of Go without human knowledge 100 - 0 Alphago Lee David Silver,

Mastering the game of Go without human knowledge · Alphago Zero (This paper) The second Alphago paper Mastering the game of Go without human knowledge 100 - 0 Alphago Lee David Silver,

PH:DOX* 2018: ALPHAGO · ALPHAGO, om filmen . Ingen havde troet, at en kunstig intelligens ville kunne mestre verdens mest avancerede spil, Go – et brætspil så komplekst, at det

PH:DOX* 2018: ALPHAGO · ALPHAGO, om filmen . Ingen havde troet, at en kunstig intelligens ville kunne mestre verdens mest avancerede spil, Go – et brætspil så komplekst, at det

Scaling’Up’LASSO’Solvers’...4/30/15 1 ©EmilyFox2015’ 1 Machine’Learning’for’Big’Data’’ CSE547/STAT548,’University’of’Washington’ EmilyFox April’30th,2015

Scaling’Up’LASSO’Solvers’...4/30/15 1 ©EmilyFox2015’ 1 Machine’Learning’for’Big’Data’’ CSE547/STAT548,’University’of’Washington’ EmilyFox April’30th,2015

AlphaGo, 새로운 시대의 시작

AlphaGo, 새로운 시대의 시작

Neuroverkot ja AlphaGo Zero - jyx.jyu.fi

Neuroverkot ja AlphaGo Zero - jyx.jyu.fi

AlphaGo vs AlphaGo€¦ · · 2016-09-12AlphaGo vs AlphaGo Game 3: “Freedom” ... strategy fashioned over time through many selfplay games. ... compensates for Black's losses

AlphaGo vs AlphaGo€¦ · · 2016-09-12AlphaGo vs AlphaGo Game 3: “Freedom” ... strategy fashioned over time through many selfplay games. ... compensates for Black's losses

AlphaGo – Artificial Intelligencecis.csuohio.edu/~sschung/CIS601/Presentation1... · AlphaGo –Artificial Intelligence CIS 601 - Graduate Seminar Presentation ... Employing Deep

AlphaGo – Artificial Intelligencecis.csuohio.edu/~sschung/CIS601/Presentation1... · AlphaGo –Artificial Intelligence CIS 601 - Graduate Seminar Presentation ... Employing Deep

lsh-randproj annotated 2 - University of Washington...4/20/17 2 ©Sham,Kakade,2017 3 MachineLearningforBigData,, CSE547/STAT548,University,of,Washington Sham,Kakade April,18,2017 LocalityZSensitiveHashing

lsh-randproj annotated 2 - University of Washington...4/20/17 2 ©Sham,Kakade,2017 3 MachineLearningforBigData,, CSE547/STAT548,University,of,Washington Sham,Kakade April,18,2017 LocalityZSensitiveHashing

UNDERSTANDING & GENERALIZING ALPHAGO ZERO

UNDERSTANDING & GENERALIZING ALPHAGO ZERO

Alice Gao Lecture 1 - University of Waterlooa23gao/cs486_f18/slides/lec01_intro_ai_nos… · Go More than 10360 positions AlphaGo, Google DeepMind AlphaGo v.s. Lee Sedol (9-dan rank)

Alice Gao Lecture 1 - University of Waterlooa23gao/cs486_f18/slides/lec01_intro_ai_nos… · Go More than 10360 positions AlphaGo, Google DeepMind AlphaGo v.s. Lee Sedol (9-dan rank)

DeepBlue, AlphaGo, and AI?€¦ · Chess vs. Alpha Go •Will the technical advances (underlying AlphaGo) have broader implications? 1997, AI named Deep Blue beat chess world champion.

DeepBlue, AlphaGo, and AI?€¦ · Chess vs. Alpha Go •Will the technical advances (underlying AlphaGo) have broader implications? 1997, AI named Deep Blue beat chess world champion.

AlphaGo vs AlphaGo - deepmind … · viewers of the match will remember, AlphaGo has a strong proclivity for the Chinese opening. In this game, Black approaches the corner at …

AlphaGo vs AlphaGo - deepmind … · viewers of the match will remember, AlphaGo has a strong proclivity for the Chinese opening. In this game, Black approaches the corner at …

AlphaGo Zero 解説

AlphaGo Zero 解説

Neurónové siete a AlphaGo - cuni.cz

Neurónové siete a AlphaGo - cuni.cz

Machine Learning Techniques at the core of ... - Meetupfiles.meetup.com/20381738/AlphaGo meet up 2016.pdf · AlphaGo ML System Conclusion Machine Learning Techniques at the core of

Machine Learning Techniques at the core of ... - Meetupfiles.meetup.com/20381738/AlphaGo meet up 2016.pdf · AlphaGo ML System Conclusion Machine Learning Techniques at the core of

AlphaGo vs AlphaGo - Game 2 (PDF) - deepmind · PDF fileAlphaGo vs AlphaGo Game 2: “Slaying the Dragon” Commentary by Fan Hui ... White's peep at 2 is a difficult tesuji for Black

AlphaGo vs AlphaGo - Game 2 (PDF) - deepmind · PDF fileAlphaGo vs AlphaGo Game 2: “Slaying the Dragon” Commentary by Fan Hui ... White's peep at 2 is a difficult tesuji for Black

Intro Logistic Regression Gradient Descent + SGDIntro Logistic Regression Gradient Descent + SGD Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade

Intro Logistic Regression Gradient Descent + SGDIntro Logistic Regression Gradient Descent + SGD Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade

cse547, math547 DISCRETE MATHEMATICS

cse547, math547 DISCRETE MATHEMATICS

Languages

Pages

Legal

Copyright © 2022 FDOCUMENTS