game - Arizona State University
Human-aware Robotics
Game Theory
• 2019/04/22
Announcement:
• Slides for this lecture are here: http://www.public.asu.edu/~yzhan442/teaching/CSE591S19-HAR/Lectures/game.pdf
Slides are largely based on information from Mike Conlin

Game theory
The study of strategic decision making. More formally, it is the study of mathematical models of conflict and cooperation between intelligent rational decision-makers.
Tic-Tac-Toe: a zero-sum game

Game theory
Game theory is, in essence, the science of strategic thinking: a way of making the best decision possible based on the way you expect other people to act. It was once the domain of Nobel Prize-winning economists and big thinkers on geopolitics, but now parents are getting in on the act. Though game theory assumes, as a technical matter, that its players are rational, it applies just as well to not-always-rational children.
A key lesson in game theory, says Barry Nalebuff, a professor at the Yale School of Management, is to understand the perspective of the other players. It isn't about what you would do in another person's shoes, he says; it's about what they would do in their shoes. "Good game theory," he says, "appreciates the quirks and features that make us unique and takes us as we are."

Game theory applications
• Games, of course
• National Defense – Terrorism and Cold War
• Auctions
• Sports – Cards, Cycling, and race car driving
• Politics – positions taken and $$/time spent on campaigning
• Personnel management
• …

Cake cutting
The party is over, and you're down to the last bit of cake. All three of your children want it. If you're familiar with game theory, you might think of the classic strategy in which one person cuts the cake and the other chooses the slice. But how do you divide it three ways without anyone throwing a fit?
You want a strategy where everyone feels that they are being equally treated!

Game theory terminology
Simultaneous Move Game – Game in which each player makes decisions without knowledge of the other players’ decisions (e.g., the prisoner’s dilemma)
Sequential Move Game – Game in which one player makes a move after observing the other player’s move (e.g., Stackelberg game).
Strategy – In game theory, a decision rule that describes the actions a player will take at each decision point.
Normal Form Game – A representation of a game indicating the players, their possible strategies, and the payoffs resulting from alternative strategies.
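
A normal form game can be written down as a small lookup table. The sketch below is one possible encoding, assuming a two-player game; the `Game` type, the helper names, and the matching-pennies example are illustrative choices and do not come from the slides.

```python
from typing import Dict, List, Tuple

# Assumed encoding: (row action, column action) -> (row payoff, column payoff).
Game = Dict[Tuple[str, str], Tuple[float, float]]

# Matching pennies, an illustrative game that is not taken from the slides.
matching_pennies: Game = {
    ("Heads", "Heads"): (1, -1),
    ("Heads", "Tails"): (-1, 1),
    ("Tails", "Heads"): (-1, 1),
    ("Tails", "Tails"): (1, -1),
}


def strategies(game: Game) -> Tuple[List[str], List[str]]:
    """Return the row player's and the column player's pure strategies."""
    rows = sorted({row for row, _ in game})
    cols = sorted({col for _, col in game})
    return rows, cols


def is_zero_sum(game: Game) -> bool:
    """True if the two payoffs cancel out in every cell."""
    return all(row_pay + col_pay == 0 for row_pay, col_pay in game.values())


print(strategies(matching_pennies))
print(is_zero_sum(matching_pennies))
```

The three ingredients of the definition (players, strategies, payoffs) map onto the tuple positions, the dictionary keys, and the dictionary values respectively.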

Prisoner’s dilemma

| Peter’s options \ Martha’s options | Don’t Confess | Confess |
| --- | --- | --- |
| Don’t Confess | P: 2 years, M: 2 years | P: 10 years, M: 1 year |
| Confess | P: 1 year, M: 10 years | P: 6 years, M: 6 years |

What is Peter’s best option if Martha doesn’t confess?
What is Peter’s best option if Martha confesses?

Prisoner’s dilemma
What is Martha’s best option if Peter doesn’t confess?
What is Martha’s best option if Peter confesses?
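
The best-option questions for Peter and Martha can be checked mechanically. The following is a minimal sketch, assuming each prisoner simply wants to minimize their own prison term; the names `YEARS`, `ACTIONS`, and `best_response` are hypothetical helpers, and the numbers are the ones in the prisoner's dilemma table.

```python
# Prison terms in years, keyed by (Peter's action, Martha's action).
# Each value is (Peter's years, Martha's years).
YEARS = {
    ("Don't Confess", "Don't Confess"): (2, 2),
    ("Don't Confess", "Confess"): (10, 1),
    ("Confess", "Don't Confess"): (1, 10),
    ("Confess", "Confess"): (6, 6),
}
ACTIONS = ["Don't Confess", "Confess"]
PETER, MARTHA = 0, 1


def best_response(player, other_action):
    """Action that gives `player` the fewest years, given the other's action."""
    def years(own_action):
        if player == PETER:
            cell = (own_action, other_action)
        else:
            cell = (other_action, own_action)
        return YEARS[cell][player]
    return min(ACTIONS, key=years)


for other in ACTIONS:
    print("Peter, if Martha plays", other, "->", best_response(PETER, other))
    print("Martha, if Peter plays", other, "->", best_response(MARTHA, other))
```

Under these assumptions every call returns the same action, which is what motivates the notion of a dominant strategy.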

Prisoner’s dilemma
Dominant Strategy – A strategy that results in the highest payoff to a player regardless of the opponent’s action.
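
The definition can be turned into a direct check. This sketch assumes that "highest payoff" means "fewest years in prison" and tests strict dominance; `years` and `dominant_strategy` are hypothetical helper names, and the data is the prisoner's dilemma table.

```python
# Prison terms in years, keyed by (Peter's action, Martha's action).
YEARS = {
    ("Don't Confess", "Don't Confess"): (2, 2),
    ("Don't Confess", "Confess"): (10, 1),
    ("Confess", "Don't Confess"): (1, 10),
    ("Confess", "Confess"): (6, 6),
}
ACTIONS = ["Don't Confess", "Confess"]
PETER, MARTHA = 0, 1


def years(player, own, other):
    """Years served by `player` when playing `own` against `other`."""
    cell = (own, other) if player == PETER else (other, own)
    return YEARS[cell][player]


def dominant_strategy(player):
    """Return the action that is strictly best against every opposing action, or None."""
    for own in ACTIONS:
        alternatives = [a for a in ACTIONS if a != own]
        if all(years(player, own, other) < years(player, alt, other)
               for alt in alternatives for other in ACTIONS):
            return own
    return None


print(dominant_strategy(PETER), dominant_strategy(MARTHA))
```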

Nash Equilibrium
A set of strategies, one per player, in which no player can improve her payoff by unilaterally changing her own strategy, given the other player’s strategy.
• Every player is playing a best response to the other player’s strategy: e.g., for NE (A, B), B is a best response to the row agent’s strategy A, and A is a best response to the column agent’s strategy B.

Prisoner’s dilemma
A pure strategy Nash Equilibrium here is (Confess, Confess)
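
To see why (Confess, Confess) qualifies, one can test every strategy pair for profitable unilateral deviations. This is a sketch under the assumption that each prisoner minimizes their own years; `is_nash` is a hypothetical helper, not something defined in the lecture.

```python
# Prison terms in years, keyed by (Peter's action, Martha's action).
YEARS = {
    ("Don't Confess", "Don't Confess"): (2, 2),
    ("Don't Confess", "Confess"): (10, 1),
    ("Confess", "Don't Confess"): (1, 10),
    ("Confess", "Confess"): (6, 6),
}
ACTIONS = ["Don't Confess", "Confess"]


def is_nash(peter, martha):
    """True if neither prisoner can serve fewer years by deviating alone."""
    peter_years, martha_years = YEARS[(peter, martha)]
    peter_ok = all(YEARS[(alt, martha)][0] >= peter_years for alt in ACTIONS)
    martha_ok = all(YEARS[(peter, alt)][1] >= martha_years for alt in ACTIONS)
    return peter_ok and martha_ok


equilibria = [(p, m) for p in ACTIONS for m in ACTIONS if is_nash(p, m)]
print(equilibria)
```

Note that (Don't Confess, Don't Confess) gives both prisoners a shorter sentence, yet it fails the check because each one can cut their own term from 2 years to 1 by confessing.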

Nash Equilibrium
Theorem: Strictly dominated strategies cannot be a part of a Nash equilibrium.
• If all players have a strictly dominant strategy, there is a unique Nash equilibrium.
• Weakly dominated strategies may be part of Nash equilibria.
• A NE may not always be associated with a dominant strategy.
Does a dominant strategy always exist?
C dominates D
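
The claim that weakly dominated strategies may appear in a Nash equilibrium is easy to miss, so here is a small illustration. The 2x2 game below is a made-up example (it is not from the slides), and `weakly_dominated_row` and `is_nash` are hypothetical helpers.

```python
# Hypothetical game: (row action, column action) -> (row payoff, column payoff).
GAME = {
    ("A", "L"): (1, 1),
    ("A", "R"): (0, 0),
    ("B", "L"): (0, 0),
    ("B", "R"): (0, 0),
}
ROWS, COLS = ["A", "B"], ["L", "R"]


def weakly_dominated_row(row):
    """True if some other row is never worse and sometimes strictly better."""
    for other in ROWS:
        if other == row:
            continue
        never_worse = all(GAME[(other, c)][0] >= GAME[(row, c)][0] for c in COLS)
        sometimes_better = any(GAME[(other, c)][0] > GAME[(row, c)][0] for c in COLS)
        if never_worse and sometimes_better:
            return True
    return False


def is_nash(row, col):
    """True if neither player gains from a unilateral deviation."""
    row_pay, col_pay = GAME[(row, col)]
    return (all(GAME[(r, col)][0] <= row_pay for r in ROWS)
            and all(GAME[(row, c)][1] <= col_pay for c in COLS))


print(weakly_dominated_row("B"), is_nash("B", "R"))
```

Row B is weakly dominated by A, but (B, R) still passes the equilibrium check because deviating alone leaves each player's payoff unchanged at 0.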

BK vs McD

| McDonalds’ options \ Burger King’s options | Enter Tempe Marketplace | Don’t Enter Tempe Marketplace |
| --- | --- | --- |
| Enter Tempe Marketplace | PM = -30, PBK = -40 | PM = 50, PBK = 0 |
| Don’t Enter Tempe Marketplace | PM = 0, PBK = 40 | PM = 0, PBK = 0 |

Is there a dominant strategy for BK?
Is there a dominant strategy for McD?

BK vs McD
Is there a pure strategy Nash Equilibrium?

BK vs McD
Yes, there are 2: (Enter, Don’t Enter) and (Don’t Enter, Enter), written as (McDonalds’ action, Burger King’s action).
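
Both equilibria can be found by brute force over the four strategy pairs. The sketch below uses the payoffs from the BK vs McD table; the dictionary layout and the names `PROFIT`, `is_nash`, and `has_dominant_strategy` are assumptions made for illustration.

```python
# Payoffs keyed by (McDonald's action, Burger King's action) -> (PM, PBK).
PROFIT = {
    ("Enter", "Enter"): (-30, -40),
    ("Enter", "Don't Enter"): (50, 0),
    ("Don't Enter", "Enter"): (0, 40),
    ("Don't Enter", "Don't Enter"): (0, 0),
}
ACTIONS = ["Enter", "Don't Enter"]


def is_nash(mcd, bk):
    """True if neither chain can raise its own payoff by deviating alone."""
    pm, pbk = PROFIT[(mcd, bk)]
    mcd_ok = all(PROFIT[(alt, bk)][0] <= pm for alt in ACTIONS)
    bk_ok = all(PROFIT[(mcd, alt)][1] <= pbk for alt in ACTIONS)
    return mcd_ok and bk_ok


def has_dominant_strategy(player):
    """True if one action is strictly best for `player` (0 = McD, 1 = BK) in every case."""
    def pay(own, other):
        cell = (own, other) if player == 0 else (other, own)
        return PROFIT[cell][player]
    return any(
        all(pay(own, other) > pay(alt, other)
            for alt in ACTIONS if alt != own for other in ACTIONS)
        for own in ACTIONS
    )


equilibria = [(m, b) for m in ACTIONS for b in ACTIONS if is_nash(m, b)]
print(equilibria)
print(has_dominant_strategy(0), has_dominant_strategy(1))
```

In this game neither chain has a dominant strategy, yet two pure strategy equilibria exist.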

Nash Equilibrium
A Nash Equilibrium can exist even when no player has a dominant strategy.
Is there always a pure strategy Nash Equilibrium?

Worker Monitoring

| Manager’s options \ Worker’s options | Work | Shirk |
| --- | --- | --- |
| Monitor | M: -1, W: 1 | M: 1, W: -1 |
| Don’t Monitor | M: 1, W: -1 | M: -1, W: 1 |

Is there a dominant strategy for the worker?
Is there a dominant strategy for the manager?

Worker Monitoring
Is there a pure strategy Nash Equilibrium?
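
One way to explore this question is to look, in every cell of the worker monitoring table, for a player who would gain by switching. The sketch below assumes the payoffs shown in that table; `who_deviates` is a hypothetical helper name.

```python
# Payoffs keyed by (manager's action, worker's action) -> (manager, worker).
PAYOFF = {
    ("Monitor", "Work"): (-1, 1),
    ("Monitor", "Shirk"): (1, -1),
    ("Don't Monitor", "Work"): (1, -1),
    ("Don't Monitor", "Shirk"): (-1, 1),
}
MANAGER_ACTIONS = ["Monitor", "Don't Monitor"]
WORKER_ACTIONS = ["Work", "Shirk"]


def who_deviates(manager, worker):
    """Name a player with a profitable unilateral deviation, or None at an equilibrium."""
    m_pay, w_pay = PAYOFF[(manager, worker)]
    if any(PAYOFF[(alt, worker)][0] > m_pay for alt in MANAGER_ACTIONS):
        return "Manager"
    if any(PAYOFF[(manager, alt)][1] > w_pay for alt in WORKER_ACTIONS):
        return "Worker"
    return None


deviators = {(m, w): who_deviates(m, w)
             for m in MANAGER_ACTIONS for w in WORKER_ACTIONS}
for cell, player in deviators.items():
    print(cell, "->", player)
```

With these payoffs every cell has a player who wants to switch, so no pair of pure strategies is stable.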

Mixed Strategy Nash Equilibrium
Definition: A mixed strategy is a strategy whereby a player randomizes over two or more available actions in order to keep rivals from being able to predict his or her actions.
John Nash proved that every finite game (finitely many players, each with finitely many actions) has at least one Nash Equilibrium in mixed strategies.

Mixed Strategy Nash Equilibrium
How to compute the mixed strategy?

Mixed Strategy
Let PM be the probability that the manager monitors and PW the probability that the worker works.

| Manager’s options \ Worker’s options | Work (PW) | Shirk (1-PW) |
| --- | --- | --- |
| Monitor (PM) | M: -1, W: 1 | M: 1, W: -1 |
| Don’t Monitor (1-PM) | M: 1, W: -1 | M: -1, W: 1 |

Mixed Strategy
Consider the manager’s strategy: if the worker best-responds with a mixed strategy, the manager must have made the worker indifferent between Work and Shirk!

Mixed Strategy
Manager selects PM to make Worker indifferent between working and shirking (i.e., same expected payoff).
• Worker’s expected payoff from working: PM * (1) + (1 - PM) * (-1) = -1 + 2 * PM
• Worker’s expected payoff from shirking: PM * (-1) + (1 - PM) * (1) = 1 - 2 * PM
Worker’s expected payoff is the same from working and shirking if PM = 0.5. This expected payoff is 0 (-1 + 2 * 0.5 = 0 and 1 - 2 * 0.5 = 0). Therefore, the worker’s best response is to work, to shirk, or to randomize between working and shirking.
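
The indifference condition for the worker can be solved programmatically. This sketch assumes the worker's payoffs from the worker monitoring table and uses exact fractions; `worker_expected` and `indifference_pm` are hypothetical helper names.

```python
from fractions import Fraction

# Worker's payoffs, keyed by (manager's action, worker's action).
WORKER = {
    ("Monitor", "Work"): 1,
    ("Monitor", "Shirk"): -1,
    ("Don't Monitor", "Work"): -1,
    ("Don't Monitor", "Shirk"): 1,
}


def worker_expected(action, pm):
    """Worker's expected payoff from `action` when the manager monitors with probability pm."""
    return pm * WORKER[("Monitor", action)] + (1 - pm) * WORKER[("Don't Monitor", action)]


def indifference_pm():
    """Solve worker_expected('Work', pm) == worker_expected('Shirk', pm) for pm."""
    def line(action):
        # Each expected payoff is linear in pm: intercept + slope * pm.
        intercept = WORKER[("Don't Monitor", action)]
        slope = WORKER[("Monitor", action)] - intercept
        return intercept, slope

    (a_work, b_work), (a_shirk, b_shirk) = line("Work"), line("Shirk")
    return Fraction(a_shirk - a_work, b_work - b_shirk)


pm = indifference_pm()
print(pm, worker_expected("Work", pm), worker_expected("Shirk", pm))
```

The two lines are -1 + 2 * PM and 1 - 2 * PM, matching the expressions in the bullets, and they cross at PM = 1/2.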

Mixed Strategy
Worker selects PW to make Manager indifferent between monitoring and not monitoring.
• Manager’s expected payoff from monitoring: PW * (-1) + (1 - PW) * (1) = 1 - 2 * PW
• Manager’s expected payoff from not monitoring: PW * (1) + (1 - PW) * (-1) = -1 + 2 * PW
Manager’s expected payoff is the same from monitoring and not monitoring if PW = 0.5. Therefore, the manager’s best response is to monitor, to not monitor, or to randomize between monitoring and not monitoring.
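
For completeness, the step from the two expected payoffs to PW = 0.5 can be written out as follows, where $P_W$ denotes the slide's PW:

```latex
\begin{align*}
1 - 2P_W &= -1 + 2P_W \\
4P_W &= 2 \\
P_W &= \tfrac{1}{2}, \qquad
\text{manager's expected payoff} = 1 - 2 \cdot \tfrac{1}{2} = 0 .
\end{align*}
```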

Mixed Strategy
• Worker works with probability 0.5 and shirks with probability 0.5 (i.e., PW = 0.5)
• Manager monitors with probability 0.5 and doesn’t monitor with probability 0.5 (i.e., PM = 0.5)
Neither the Worker nor the Manager can increase their expected payoff by playing some other strategy (expected payoff for both is zero). They are both playing a best response to the other player’s strategy.
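
The claim that no deviation helps can be spot-checked numerically. The sketch below assumes the original worker monitoring payoffs and tries a handful of alternative mixing probabilities for each player while the other stays at 0.5; `expected` and the probability grid are illustrative choices.

```python
from fractions import Fraction as F

# Payoffs keyed by (manager's action, worker's action) -> (manager, worker).
PAYOFF = {
    ("Monitor", "Work"): (-1, 1),
    ("Monitor", "Shirk"): (1, -1),
    ("Don't Monitor", "Work"): (1, -1),
    ("Don't Monitor", "Shirk"): (-1, 1),
}


def expected(pm, pw):
    """Expected (manager, worker) payoffs when the manager monitors with
    probability pm and the worker works with probability pw."""
    weights = {
        ("Monitor", "Work"): pm * pw,
        ("Monitor", "Shirk"): pm * (1 - pw),
        ("Don't Monitor", "Work"): (1 - pm) * pw,
        ("Don't Monitor", "Shirk"): (1 - pm) * (1 - pw),
    }
    manager = sum(w * PAYOFF[cell][0] for cell, w in weights.items())
    worker = sum(w * PAYOFF[cell][1] for cell, w in weights.items())
    return manager, worker


half = F(1, 2)
grid = [F(0), F(1, 4), half, F(3, 4), F(1)]  # includes the pure strategies 0 and 1

# Manager deviates while the worker keeps PW = 0.5, and vice versa.
manager_payoffs = [expected(pm, half)[0] for pm in grid]
worker_payoffs = [expected(half, pw)[1] for pw in grid]
print(manager_payoffs, worker_payoffs)
```

Every entry is 0, the same as the equilibrium payoff, so none of the sampled deviations is an improvement.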

Mixed Strategy
What if the monitoring cost decreases? The manager’s payoffs from monitoring become -.5 (instead of -1) when the worker works and 1.5 (instead of 1) when the worker shirks.

| Manager’s options \ Worker’s options | Work (PW) | Shirk (1-PW) |
| --- | --- | --- |
| Monitor (PM) | M: -.5, W: 1 | M: 1.5, W: -1 |
| Don’t Monitor (1-PM) | M: 1, W: -1 | M: -1, W: 1 |

Mixed Strategy Nash Equilibrium
• Worker works with probability 0.625 and shirks with probability 0.375 (i.e., PW = 0.625)
• Manager monitors with probability 0.5 and doesn’t monitor with probability 0.5 (i.e., PM = 0.5)
The decrease in monitoring costs does not change the probability that the manager monitors. However, it increases the probability that the worker works.
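
These probabilities can be reproduced with the same indifference argument. The sketch below assumes that the lower monitoring cost changes only the manager's Monitor payoffs, to -1/2 (worker works) and 3/2 (worker shirks); that reading of the slide, and the helper names, are assumptions.

```python
from fractions import Fraction as F

# (manager's action, worker's action) -> (manager payoff, worker payoff).
# The -1/2 and 3/2 entries are an assumed reading of the reduced monitoring cost.
PAYOFF = {
    ("Monitor", "Work"): (F(-1, 2), F(1)),
    ("Monitor", "Shirk"): (F(3, 2), F(-1)),
    ("Don't Monitor", "Work"): (F(1), F(-1)),
    ("Don't Monitor", "Shirk"): (F(-1), F(1)),
}


def worker_work_probability():
    """PW that makes the manager indifferent between Monitor and Don't Monitor."""
    m_mw = PAYOFF[("Monitor", "Work")][0]
    m_ms = PAYOFF[("Monitor", "Shirk")][0]
    m_dw = PAYOFF[("Don't Monitor", "Work")][0]
    m_ds = PAYOFF[("Don't Monitor", "Shirk")][0]
    # Solve PW*m_mw + (1-PW)*m_ms == PW*m_dw + (1-PW)*m_ds for PW.
    return (m_ds - m_ms) / (m_mw - m_ms - m_dw + m_ds)


def manager_monitor_probability():
    """PM that makes the worker indifferent between Work and Shirk."""
    w_mw = PAYOFF[("Monitor", "Work")][1]
    w_ms = PAYOFF[("Monitor", "Shirk")][1]
    w_dw = PAYOFF[("Don't Monitor", "Work")][1]
    w_ds = PAYOFF[("Don't Monitor", "Shirk")][1]
    # Solve PM*w_mw + (1-PM)*w_dw == PM*w_ms + (1-PM)*w_ds for PM.
    return (w_ds - w_dw) / (w_mw - w_dw - w_ms + w_ds)


print(float(worker_work_probability()), float(manager_monitor_probability()))
```

PM depends only on the worker's payoffs, which did not change, while PW depends only on the manager's payoffs, which did. That is why the cost reduction shifts the worker's mix rather than the manager's.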