12/7/15
Linear Programming and Game Theory
Ron Parr, CPS 570
With thanks to Vince Conitzer for some content
What are Linear Programs?
• Linear programs are constrained optimization problems
• Constrained optimization problems ask us to maximize or minimize a function subject to mathematical constraints on the variables
  – Convex programs have convex objective functions and convex constraints
  – Linear programs (special case of convex programs) have linear objective functions and linear constraints
• LPs = generic language for a wide range of problems
• LP solvers = widely available hammers
• Entire classes and vast expertise invested in making problems look like nails
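Linear programs: example
maximize 3x + 2y
subject to
  4x + 2y ≤ 16
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0
• Make reproductions of 2 paintings
• Painting 1:
  – Sells for $30
  – Requires 4 units of blue, 1 green, 1 red
• Painting 2:
  – Sells for $20
  – Requires 2 blue, 2 green, 1 red
• We have 16 units blue, 8 green, 5 red
• Here x and y are the numbers of reproductions of paintings 1 and 2; the objective is revenue in units of $10, and the three constraints are the blue, green, and red supplies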
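Solving the linear program graphically
maximize 3x + 2y
subject to
  4x + 2y ≤ 16
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0
[Figure: feasible region in the x-y plane, both axes running from 0 to 8]
• Optimal solution: x = 3, y = 2
• Feasible region = region not violating constraints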
Linear Programs in General
• Linear constraints, linear objective function
  – Maximize (minimize): $f(x)$, a linear function of vector $x$
  – Subject to: $Ax \le b$, for a matrix $A$
• Can swap maximize/minimize, ≤/≥; can add equality
• View as search: searches space of values of x
• Alternatively: search for tight constraints w/ high objective function value
What Happens In Higher Dimensions (1): Understanding the Feasible Region
[Figure: 2-D feasible region, both axes running from 0 to 8]
• Inequality w/ 2 variables -> one side of a line
• 3 variables -> one side of a plane
• k variables -> one side of a hyperplane
What Happens In Higher Dimensions (2): lines -> hyperplanes
• Physical intuition:
  – Objective function defines "down"
  – Feasible region is a "bowl"
  – Want to find lowest point on the rotated bowl
[Image: crystal bowl, http://www.rubylane.com/item/623546-4085/Orrefors-x22Zenithx22-Pattern-Crystal-Bowl]
Solving linear programs (1)
• If an optimal solution exists, there is one at a vertex of the feasible region
  – Why?
  – Assume you are not at a vertex; you can always push further in a direction that improves the objective function (or at least doesn't hurt)
  – How many vertices does a k x n matrix imply?
• Dumb(est) algorithm:
  – Given n variables, k constraints
  – Check all k-choose-n = O(k^n) possible vertices
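
The dumb(est) algorithm can be sketched for the two-variable painting example. This is an illustration only, not a practical solver: the function name `solve_2d_lp` and the `(a, b, r)` encoding of a constraint `a*x + b*y <= r` are assumptions made for this sketch, and it presumes the feasible region is bounded so that the best vertex is the optimum.

```python
from fractions import Fraction
from itertools import combinations

def solve_2d_lp(c, constraints):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= r for each (a, b, r).

    Brute force: intersect every pair of constraint lines, keep the feasible
    intersection points (the vertices), and return the best (value, x, y).
    Returns None if no feasible vertex exists.
    """
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines never meet in a single point
        # Cramer's rule for the intersection of the two lines
        x = Fraction(r1 * b2 - r2 * b1, det)
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best

# Painting example; x >= 0 and y >= 0 are written as -x <= 0 and -y <= 0.
PAINTING = [(4, 2, 16), (1, 2, 8), (1, 1, 5), (-1, 0, 0), (0, -1, 0)]
painting_solution = solve_2d_lp((3, 2), PAINTING)
print(painting_solution)
```

With k = 5 constraints and n = 2 variables there are only 10 candidate pairs; the count grows as k-choose-n, which is why this approach does not scale.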
Solving linear programs (2)
• Smarter algorithm (simplex)
  – Pick a vertex
  – Repeatedly hop to neighboring vertices (those with one different tight constraint) that improve the objective function
  – Guaranteed to find solution (no local optima)
  – May take exponential time in worst case (though rarely)
• Still smarter algorithm (interior-point methods)
  – Move inside the interior of the feasible region, in a direction that increases the objective function
  – Stop when no further improvements possible
  – Tricky to get the details right, but weakly polynomial time
Solving LPs in Practice
• Use commercial products like cplex or gurobi
• Do not try to implement an LP solver yourself!
• Do not use matlab's linprog for anything other than small problems. Really. No, REALLY!
Modified LP
maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0
• Optimal solution: x = 2.5, y = 2.5
• Solution value = 7.5 + 5 = 12.5
• Half paintings?
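Integer (linear) program
maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0, integer
  y ≥ 0, integer
[Figure: feasible region with integer points, both axes running from 0 to 8]
• Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)
• Optimal IP solution: x = 2, y = 3 (objective 12)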
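
Because this integer program is tiny, the IP optimum can be checked by plain enumeration. The sketch below is only a sanity check under that assumption (the function name `solve_integer_program` is made up here); real integer programs need a proper solver.

```python
def solve_integer_program():
    """Enumerate all integer points of the modified painting problem."""
    best = None
    for x in range(0, 4):       # 4x <= 15 with y >= 0 forces x <= 3
        for y in range(0, 5):   # x + 2y <= 8 with x >= 0 forces y <= 4
            feasible = 4 * x + 2 * y <= 15 and x + 2 * y <= 8 and x + y <= 5
            if feasible:
                value = 3 * x + 2 * y
                if best is None or value > best[0]:
                    best = (value, x, y)
    return best  # (objective, x, y)

ip_solution = solve_integer_program()
print(ip_solution)
```

The enumeration lands on x = 2, y = 3 with objective 12, strictly below the LP relaxation's 12.5.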
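Mixed integer (linear) program
maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0, integer
[Figure: feasible region restricted to integer values of y, both axes running from 0 to 8]
• Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)
• Optimal IP solution: x = 2, y = 3 (objective 12)
• Optimal MIP solution: x = 2.75, y = 2 (objective 12.25)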
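
For the mixed case, one way to verify the stated optimum is to enumerate the integer variable y and, for each value, push the continuous variable x as far as the constraints allow (the objective increases with x). This is a hand-rolled check for this specific problem, with a hypothetical function name, not a general MIP method.

```python
from fractions import Fraction

def solve_mixed_integer_program():
    """y is integer, x is continuous; returns the best (objective, x, y)."""
    best = None
    for y in range(0, 5):  # x + 2y <= 8 with x >= 0 forces y <= 4
        # Largest x permitted by each of the three constraints for this y
        x = min(Fraction(15 - 2 * y, 4), Fraction(8 - 2 * y), Fraction(5 - y))
        if x < 0:
            continue  # no feasible x for this y
        value = 3 * x + 2 * y
        if best is None or value > best[0]:
            best = (value, x, y)
    return best

mip_solution = solve_mixed_integer_program()
print(mip_solution)
```

The result sits between the IP and LP optima: 12 ≤ 12.25 ≤ 12.5, as expected when only some variables are restricted to integers.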
Solving (M)IPs
• (Mixed) integer programs are NP-hard to solve
• Intuition: constraint surface is jagged; no obvious way to avoid checking an exponential number of assignments to integer variables
• In practice:
  – Constraints often give clues on how to restrict the number of solutions considered
  – Smart solvers (cplex, gurobi) can sometimes find solutions to large (M)IPs surprisingly quickly (and sometimes surprisingly slowly)
LP Trick (one of many)
• Suppose you have a huge number of constraints, but a small number of variables (k >> n)
• Constraint generation:
  – Start with a subset of the constraints
  – Find solution to simplified LP
  – Find most violated constraint, add back to LP
  – Repeat
• Why does this work?
  – If missing constraints are unviolated, then adding them back wouldn't change the solution
  – Sometimes terminates after adding in only a fraction of total constraints
  – No guarantees, but often helpful in practice
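
The constraint generation loop can be sketched on the painting LP. Everything here is illustrative: `solve_2d_lp` is a toy brute-force vertex solver standing in for a real LP solver, constraints are `(a, b, r)` triples meaning `a*x + b*y <= r`, and the initial subset is assumed to keep the simplified LP bounded.

```python
from fractions import Fraction
from itertools import combinations

def solve_2d_lp(c, constraints):
    """Toy solver: best feasible pairwise intersection, as (value, x, y)."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue
        x = Fraction(r1 * b2 - r2 * b1, det)
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best

def constraint_generation(c, all_constraints, initial):
    """Solve with a subset, add the most violated constraint, repeat."""
    active = list(initial)
    while True:
        value, x, y = solve_2d_lp(c, active)
        # Amount by which each constraint is violated at the current solution
        worst = max(all_constraints, key=lambda k: k[0] * x + k[1] * y - k[2])
        if worst[0] * x + worst[1] * y - worst[2] <= 0:
            return value, x, y, active  # nothing violated: done
        active.append(worst)

ALL = [(4, 2, 16), (1, 2, 8), (1, 1, 5), (-1, 0, 0), (0, -1, 0)]
START = [(-1, 0, 0), (0, -1, 0), (1, 1, 5)]  # assumed bounded starting subset
cg_value, cg_x, cg_y, cg_active = constraint_generation((3, 2), ALL, START)
print(cg_value, cg_x, cg_y, len(cg_active))
```

On this example the loop stops after adding only the blue-paint constraint; the green-paint constraint `x + 2y <= 8` is never needed because it is not violated at the final solution.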
Duality
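• For every LP there is an equivalent "dual" problem
• Solution to primal can be used to reconstruct solution to dual, and vice versa
• LP duality:
  – Primal: \[ \text{minimize } c^T x \quad \text{subject to } Ax = b,\; x \ge 0 \]
  – Dual: \[ \text{maximize } b^T y \quad \text{subject to } A^T y \le c \quad (y \text{ unrestricted in sign}) \]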
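
A short derivation shows why the dual objective bounds the primal objective (weak duality). It assumes the standard-form pair: primal $\min c^T x$ subject to $Ax = b$, $x \ge 0$, and dual $\max b^T y$ subject to $A^T y \le c$.

```latex
% Weak duality: for any primal-feasible x and dual-feasible y,
% the dual objective never exceeds the primal objective.
\begin{align*}
b^T y &= (Ax)^T y        && \text{since } Ax = b \\
      &= x^T (A^T y)     \\
      &\le x^T c         && \text{since } x \ge 0 \text{ and } A^T y \le c \\
      &= c^T x .
\end{align*}
```

Strong duality says that when both problems are feasible, the two optimal values coincide, which is what allows a solution of one problem to certify optimality for the other.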
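MDP Solved as an LP
• Issue: turn the non-linear max into a collection of linear constraints
\[ V(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \Big] \]
• MINIMIZE:
\[ \sum_s V(s) \]
• Subject to:
\[ \forall s,a:\; V(s) \ge R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \]
• Optimal action has tight constraints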
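
The claim that optimal actions have tight constraints can be checked numerically. The sketch below uses value iteration, not an LP solver, to obtain V for a made-up two-state MDP (the states, actions, rewards, and transitions are all hypothetical), then measures the slack of every LP constraint.

```python
GAMMA = 0.9
STATES = (0, 1)
ACTIONS = ("stay", "go")
# Hypothetical deterministic MDP: rewards R[(s, a)] and transitions P[(s, a)]
R = {(0, "stay"): 0.0, (0, "go"): 1.0, (1, "stay"): 2.0, (1, "go"): 0.0}
P = {(0, "stay"): {0: 1.0}, (0, "go"): {1: 1.0},
     (1, "stay"): {1: 1.0}, (1, "go"): {0: 1.0}}

def backup(V, s, a):
    """Right-hand side of the LP constraint for the pair (s, a)."""
    return R[(s, a)] + GAMMA * sum(p * V[s2] for s2, p in P[(s, a)].items())

def value_iteration(sweeps=2000):
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        V = {s: max(backup(V, s, a) for a in ACTIONS) for s in STATES}
    return V

V = value_iteration()
# Slack of each constraint V(s) >= R(s,a) + gamma * sum_s' P(s'|s,a) V(s')
slack = {(s, a): V[s] - backup(V, s, a) for s in STATES for a in ACTIONS}
tight = {s: [a for a in ACTIONS if abs(slack[(s, a)]) < 1e-9] for s in STATES}
print(V, tight)
```

Every slack is non-negative (V is feasible for the LP), and in each state the zero-slack action is the greedy one.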
What is Game Theory? I
• Very general mathematical framework to study situations where multiple agents interact, including:
  – Popular notions of games
  – Everything up to and including multistep, multiagent, simultaneous move, partial information games
  – Example Duke CS research: aiming sensors to catch hiding enemies, assigning guards to posts
  – Can even include negotiating, posturing and uncertainty about the players and game itself
• von Neumann and Morgenstern (1944) was a major launching point for modern game theory
• Nash: existence of equilibria in general sum games (wikipedia)
What is game theory? II
• Study of settings where multiple agents each have
  – Different preferences (utility functions),
  – Different actions
• Each agent's utility (potentially) depends on all agents' actions
  – What is optimal for one agent depends on what other agents do
  – Can be circular
• Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act
• Useful for acting and (potentially) predicting behavior of others
• Not necessarily descriptive
Real World Game Theory Examples
• War
• Auctions
• Animal behavior
• Networking protocols
• Peer to peer networking behavior
• Road traffic
• Mechanism design:
  – Suppose we want people to do X?
  – How to engineer the situation so they will act that way?
Covered Today
• 2 player, zero sum simultaneous move games
• Example: Rock, Paper, Scissors
• Linear programming solution
Linear Programs (max formulation)
\[ \text{maximize: } c^T x \quad \text{subject to: } Ax \le b,\; x \ge 0 \]
• Note: min formulation also possible
  – Min: $c^T x$
  – Subject to: $Ax \ge b$
• Some use equality as the canonical representation (introducing slack variables)
• LP tricks
  – Multiply by -1 to reverse inequalities
  – Can easily introduce equality constraints, or arbitrary domain constraints
Rock, Paper, Scissors Zero Sum Formulation
• In zero sum games, one player's loss is other's gain
• Payoff matrix (rows: our move, columns: opponent's move):
         R    P    S
    R    0   -1    1
    P    1    0   -1
    S   -1    1    0
• Minimax solution maximizes worst case outcome
Rock, Paper, Scissors Equations
• R, P, S = probability that we play rock, paper, or scissors respectively (R + P + S = 1)
• U is our expected utility
• Bounding our utility:
  – Opponent rock case: U ≤ P – S
  – Opponent paper case: U ≤ S – R
  – Opponent scissors case: U ≤ R – P
• Want to maximize U subject to constraints
• Solution: (1/3, 1/3, 1/3)
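
The three bounds say that U can be at most the worst of the three opponent cases. A minimal sketch (the function name `worst_case_utility` is chosen here for illustration) evaluates that worst case for a given mixed strategy:

```python
from fractions import Fraction

def worst_case_utility(R, P, S):
    """Largest U satisfying U <= P - S, U <= S - R, and U <= R - P."""
    return min(P - S, S - R, R - P)

third = Fraction(1, 3)
uniform_value = worst_case_utility(third, third, third)
# A strategy that favors rock can be exploited by an opponent playing paper.
rock_heavy_value = worst_case_utility(Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
print(uniform_value, rock_heavy_value)
```

The three bounds always sum to zero, so their minimum can never exceed 0; the uniform strategy achieves exactly 0, which is why it is the maximizer.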
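Rock, Paper, Scissors LP Formulation
• Our variables are: $x = [U, R, P, S]^T$
• We want:
  – Maximize U
  – U ≤ P – S
  – U ≤ S – R
  – U ≤ R – P
  – R + P + S = 1
• How do we make this fit the following form?
\[ \text{maximize: } c^T x \quad \text{subject to: } Ax \le b,\; x \ge 0 \]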
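Rock Paper Scissors LP Formulation
\[ x = [U, R, P, S]^T \]
\[ A = \begin{pmatrix} 1 & 0 & -1 & 1 \\ 1 & 1 & 0 & -1 \\ 1 & -1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & -1 & -1 & -1 \end{pmatrix} \]
\[ b = [0, 0, 0, 1, -1]^T \]
\[ c = [1, 0, 0, 0]^T \]
\[ \text{maximize: } c^T x \quad \text{subject to: } Ax \le b,\; x \ge 0 \]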
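
As a quick check of the matrix encoding, the sketch below verifies that x = [0, 1/3, 1/3, 1/3] satisfies Ax ≤ b and x ≥ 0, and that raising U above 0 at the uniform strategy breaks feasibility. The helper name `is_feasible` is an assumption of this sketch; it only tests points and does not solve the LP.

```python
from fractions import Fraction

# Rows 1-3: U <= P - S, U <= S - R, U <= R - P, moved to the left-hand side.
# Rows 4-5: R + P + S <= 1 and -(R + P + S) <= -1 together force equality.
A = [[1, 0, -1, 1],
     [1, 1, 0, -1],
     [1, -1, 1, 0],
     [0, 1, 1, 1],
     [0, -1, -1, -1]]
b = [0, 0, 0, 1, -1]
c = [1, 0, 0, 0]

def is_feasible(x):
    """True if x >= 0 componentwise and A x <= b row by row."""
    rows_ok = all(sum(a * v for a, v in zip(row, x)) <= bi
                  for row, bi in zip(A, b))
    return rows_ok and all(v >= 0 for v in x)

third = Fraction(1, 3)
x_star = [0, third, third, third]
objective = sum(ci * xi for ci, xi in zip(c, x_star))
print(is_feasible(x_star), objective)
```

Note that x ≥ 0 also forces U ≥ 0; that is harmless here because the value of the game is 0, but a game with a negative value would need U to be unrestricted (or the payoffs shifted).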
Rock, Paper, Scissors Solution
• If we feed this LP to an LP solver we get:
  – R = P = S = 1/3
  – U = 0
• Solution for the other player is:
  – The same…
  – By symmetry
• This is the minimax solution
• This is also an equilibrium
  – No player has an incentive to deviate
  – (Defined more precisely later)
Tangent: Why is RPS Fun?
• OK, it's not…
• Why might RPS be fun?
  – Try to exploit non-randomness in your friends
  – Try to be random yourself
Minimax Solutions in General
• What do we know about minimax solutions?
  – Can a suboptimal opponent trick minimax?
  – When should we abandon minimax?
• Minimax solutions for 2-player zero-sum games can always be found by solving a linear program
• The minimax solutions will also be equilibria
• For general sum games:
  – Minimax does not apply
  – Equilibria may not be unique
  – Need to search for equilibria using more computationally intensive methods
Outline
• Digression: Linear Programming
• 2 player, zero sum simultaneous move games
• Example: Rock, Paper, Scissors
• Linear programming solution
• General sum games
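“Chicken”
• Two players drive cars towards each other
• If one player goes straight, that player wins
• If both go straight, they both die
• Payoff matrix (rows: player 1, columns: player 2; S = go straight):
          D        S
    D    0, 0    -1, 1
    S    1, -1   -5, -5
• Not zero-sum
[Figure: two cars with D and S arrows. Source: wikipedia]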
Rock-paper-scissors – Seinfeld variant
MICKEY: All right, rock beats paper! (Mickey smacks Kramer's hand for losing)
KRAMER: I thought paper covered rock.
MICKEY: Nah, rock flies right through paper.
KRAMER: What beats rock?
MICKEY: (looks at hand) Nothing beats rock.
• Payoff matrix (rows and columns in the order rock, paper, scissors):
          R        P        S
    R    0, 0    1, -1    1, -1
    P   -1, 1    0, 0    -1, 1
    S   -1, 1    1, -1    0, 0
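Dominance
• Player i's strategy $s_i$ strictly dominates $s_i'$ if
  – for any $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
• $s_i$ weakly dominates $s_i'$ if
  – for any $s_{-i}$, $u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$; and
  – for some $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
• -i = "the player(s) other than i"
• Seinfeld RPS payoff matrix:
     0, 0    1, -1    1, -1
    -1, 1    0, 0    -1, 1
    -1, 1    1, -1    0, 0
[Figure annotations on the matrix: "strict dominance" and "weak dominance"]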
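
The two definitions translate directly into code. In this sketch the row player's payoffs in the Seinfeld variant are stored per strategy, in the order (opponent plays R, P, S); the dictionary layout and function names are assumptions of the example.

```python
def strictly_dominates(payoffs, s, t):
    """s beats t against every opposing strategy."""
    return all(a > b for a, b in zip(payoffs[s], payoffs[t]))

def weakly_dominates(payoffs, s, t):
    """s is never worse than t and is strictly better at least once."""
    pairs = list(zip(payoffs[s], payoffs[t]))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

# Row player's payoffs in Seinfeld RPS, where rock also beats paper.
SEINFELD = {"R": [0, 1, 1], "P": [-1, 0, -1], "S": [-1, 1, 0]}

rock_over_paper = strictly_dominates(SEINFELD, "R", "P")
rock_over_scissors_strict = strictly_dominates(SEINFELD, "R", "S")
rock_over_scissors_weak = weakly_dominates(SEINFELD, "R", "S")
print(rock_over_paper, rock_over_scissors_strict, rock_over_scissors_weak)
```

Rock strictly dominates paper, but only weakly dominates scissors: against an opponent playing paper, rock and scissors both earn 1.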
Prisoner's Dilemma
• Pair of criminals has been caught
• District attorney has evidence to convict them of a minor crime (1 year in jail); knows that they committed a major crime together (3 years in jail) but cannot prove it
• Offers them a deal:
  – If both confess to the major crime, they each get a 1 year reduction
  – If only one confesses, that one gets 3 years reduction
• Payoff matrix:
                      confess    don't confess
    confess           -2, -2        0, -3
    don't confess     -3, 0        -1, -1
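“Should I buy an SUV?”
• Payoff matrix:
                  SUV         other car
    SUV         -10, -10      -7, -11
    other car   -11, -7       -8, -8
• Purchasing + gas cost: 5 for the SUV, 3 for the other car
• Accident cost: 5 each when both drive the same kind of vehicle; 2 for the SUV and 8 for the other car when they differ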
“2/3 of the average” game
• Everyone writes down a number between 0 and 100
• Person closest to 2/3 of the average wins
• Example:
  – A says 50
  – B says 10
  – C says 90
  – Average(50, 10, 90) = 50
  – 2/3 of average = 33.33
  – A is closest (|50 - 33.33| = 16.67), so A wins
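
The scoring rule in the example is easy to reproduce. This is a small sketch with a hypothetical function name; ties are not handled specially (the first closest player in dictionary order is returned).

```python
def two_thirds_winner(guesses):
    """Return (winner, target) where target is 2/3 of the average guess."""
    target = 2 * sum(guesses.values()) / (3 * len(guesses))
    winner = min(guesses, key=lambda name: abs(guesses[name] - target))
    return winner, target

winner, target = two_thirds_winner({"A": 50, "B": 10, "C": 90})
print(winner, round(target, 2))
```

B's guess of 10 is 23.33 away from the target and C's guess of 90 is 56.67 away, so A's distance of 16.67 is the smallest.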
Iterated dominance
• Iterated dominance: remove (strictly/weakly) dominated strategy, repeat
• Iterated strict dominance on Seinfeld's RPS:
     0, 0    1, -1    1, -1
    -1, 1    0, 0    -1, 1
    -1, 1    1, -1    0, 0
  reduces, after removing paper for both players, to
     0, 0    1, -1
    -1, 1    0, 0
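
Iterated strict dominance for a two-player game can be sketched as a loop that keeps deleting strictly dominated rows and columns. The function name and the `u1[r][c]` / `u2[r][c]` payoff layout are assumptions of this sketch, and only domination by pure strategies is considered.

```python
def iterated_strict_dominance(u1, u2):
    """u1[r][c], u2[r][c]: payoffs to the row and column player.

    Returns the indices of the rows and columns that survive.
    """
    rows = list(range(len(u1)))
    cols = list(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for r in rows:
            # Is some other surviving row strictly better in every column?
            if any(all(u1[o][c] > u1[r][c] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        for c in cols:
            # Same test for the column player
            if any(all(u2[r][o] > u2[r][c] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

# Seinfeld RPS, strategies indexed 0 = rock, 1 = paper, 2 = scissors
U1 = [[0, 1, 1], [-1, 0, -1], [-1, 1, 0]]
U2 = [[-v for v in row] for row in U1]  # zero-sum
surviving = iterated_strict_dominance(U1, U2)
print(surviving)
```

The first pass removes paper for both players, giving the 2x2 rock/scissors game; in that smaller game rock strictly dominates scissors, so continuing the process leaves only (rock, rock).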
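“2/3 of the average” game revisited
[Figure: number line from 0 to 100 with marks at (2/3)*100, (2/3)*(2/3)*100, …]
• Numbers above (2/3)*100: dominated
• Numbers between (2/3)*(2/3)*100 and (2/3)*100: dominated after removal of (originally) dominated strategies
• …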
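Mixed strategies
• Mixed strategy for player i = probability distribution over player i's (pure) strategies
• E.g. 1/3, 1/3, 1/3
• Example of dominance by a mixed strategy:
    1/2    3, 0    0, 0
    1/2    0, 0    3, 0
           1, 0    1, 0
• Playing the first two rows with probability 1/2 each gives the row player an expected payoff of 1.5 against either column, which beats the third row's payoff of 1, even though neither of the first two rows dominates the third on its own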
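
The dominance-by-mixture claim can be verified by computing the mixture's expected payoff against each column. The helper name `mixed_payoffs` is made up for this sketch; only the row player's payoffs matter here.

```python
from fractions import Fraction

def mixed_payoffs(payoffs, weights):
    """Expected payoff of a mixed row strategy against each pure column."""
    n_cols = len(payoffs[0])
    return [sum(w * row[c] for w, row in zip(weights, payoffs))
            for c in range(n_cols)]

ROW_PAYOFFS = [[3, 0], [0, 3], [1, 1]]
half = Fraction(1, 2)
mixture = mixed_payoffs(ROW_PAYOFFS, [half, half, 0])
dominates_third = all(m > t for m, t in zip(mixture, ROW_PAYOFFS[2]))
print(mixture, dominates_third)
```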
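Best Responses
• Let A be a matrix of player 1's payoffs
• Let $\sigma_2$ be a mixed strategy for player 2
• $A\sigma_2$ = vector of expected payoffs for each strategy for player 1
• Highest entry indicates best response for player 1
• Any mixture of ties is also BR
• Generalizes to >2 players
• Example matrix (chicken), with $\sigma_2$ a distribution over the columns:
     0, 0    -1, 1
     1, -1   -5, -5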
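
Computing $A\sigma_2$ and reading off the best responses takes a few lines. This sketch uses player 1's payoffs from the chicken matrix (rows and columns ordered D, S); the function names are illustrative, and exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction

def expected_payoffs(A, sigma2):
    """The vector A * sigma2: player 1's expected payoff for each row."""
    return [sum(a * p for a, p in zip(row, sigma2)) for row in A]

def best_responses(A, sigma2):
    """Indices of the rows whose expected payoff is maximal."""
    v = expected_payoffs(A, sigma2)
    return [i for i, value in enumerate(v) if value == max(v)]

CHICKEN_ROW = [[0, -1], [1, -5]]  # player 1's payoffs, index 0 = D, 1 = S

br_to_pure_d = best_responses(CHICKEN_ROW, [1, 0])
br_to_mix = best_responses(CHICKEN_ROW, [Fraction(4, 5), Fraction(1, 5)])
print(br_to_pure_d, br_to_mix)
```

Against an opponent who always plays D, going straight is the unique best response; against the 4/5, 1/5 mixture both rows tie, so any mixture of them is also a best response.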
Nash equilibrium [Nash 50]
• A vector of strategies (one for each player) = a strategy profile
• Strategy profile $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is a Nash equilibrium if each $\sigma_i$ is a best response to $\sigma_{-i}$
  – That is, for any i, for any $\sigma_i'$, $u_i(\sigma_i, \sigma_{-i}) \ge u_i(\sigma_i', \sigma_{-i})$
• Does not say anything about multiple agents changing their strategies at the same time
• In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]
• (Note - singular: equilibrium, plural: equilibria)
Equilibrium Strategies vs. Best Responses
• equilibrium strategy -> best response?
• best response -> equilibrium strategy?
• Consider Rock-Paper-Scissors
  – Is (1/3, 1/3, 1/3) a best response to (1/3, 1/3, 1/3)?
  – Is (1, 0, 0) a best response to (1/3, 1/3, 1/3)?
  – Is (1, 0, 0) a strategy for any equilibrium?
     0, 0    -1, 1    1, -1
     1, -1    0, 0   -1, 1
    -1, 1     1, -1   0, 0
Nash equilibria of “chicken”
          D        S
    D    0, 0    -1, 1
    S    1, -1   -5, -5
• (D, S) and (S, D) are Nash equilibria
  – They are pure-strategy Nash equilibria: nobody randomizes
  – They are also strict Nash equilibria: changing your strategy will make you strictly worse off
• No other pure-strategy Nash equilibria
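
Pure-strategy Nash equilibria of a small two-player game can be found by checking every cell for profitable unilateral deviations. The function name and the `u1[r][c]` / `u2[r][c]` layout are assumptions of this sketch.

```python
def pure_nash_equilibria(u1, u2):
    """All (row, col) cells where neither player gains by deviating alone."""
    n_rows, n_cols = len(u1), len(u1[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            row_ok = all(u1[r][c] >= u1[o][c] for o in range(n_rows))
            col_ok = all(u2[r][c] >= u2[r][o] for o in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

# Chicken, index 0 = D, 1 = S
CHICKEN_U1 = [[0, -1], [1, -5]]
CHICKEN_U2 = [[0, 1], [-1, -5]]
chicken_eq = pure_nash_equilibria(CHICKEN_U1, CHICKEN_U2)

# Standard rock-paper-scissors has no pure-strategy equilibrium
RPS_U1 = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
RPS_U2 = [[-v for v in row] for row in RPS_U1]
rps_eq = pure_nash_equilibria(RPS_U1, RPS_U2)
print(chicken_eq, rps_eq)
```

For chicken this returns the cells (D, S) and (S, D); for rock-paper-scissors the list is empty, so any equilibrium there must be mixed.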
Equilibrium Selection
          D        S
    D    0, 0    -1, 1
    S    1, -1   -5, -5
• (D, S) and (S, D) are Nash equilibria
• Which do you play?
• What if player 1 assumes (S, D), player 2 assumes (D, S)?
• Play is (S, S) = (-5, -5)!!!
• This is the equilibrium selection problem
Rock-paper-scissors
     0, 0    -1, 1    1, -1
     1, -1    0, 0   -1, 1
    -1, 1     1, -1   0, 0
• Any pure-strategy Nash equilibria?
• It has a mixed-strategy Nash equilibrium: both players put probability 1/3 on each action
Nash equilibria of “chicken”…
          D        S
    D    0, 0    -1, 1
    S    1, -1   -5, -5
• Is there a Nash equilibrium that uses mixed strategies, say, where player 1 uses a mixed strategy?
• If a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses
• So we need to make player 1 indifferent between D and S
• Player 1's utility for playing D = $-p^c_S$
• Player 1's utility for playing S = $p^c_D - 5p^c_S = 1 - 6p^c_S$
• So we need $-p^c_S = 1 - 6p^c_S$, which means $p^c_S = 1/5$
• Then, player 2 needs to be indifferent as well
• Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S))
  – People may die! Expected utility -1/5 for each player
• $p^c_S$ = probability that column player plays S
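
The indifference condition is a single linear equation, so it can be solved in closed form for any 2x2 game. In this sketch `a[r][c]` is the row player's payoff and `p` is the probability that the column player picks the second column; the function names are hypothetical, and the formula assumes the denominator is non-zero.

```python
from fractions import Fraction

def indifference_probability(a):
    """Probability p of column 2 that makes the row player indifferent.

    Solves a[0][0]*(1-p) + a[0][1]*p == a[1][0]*(1-p) + a[1][1]*p for p.
    """
    numerator = a[1][0] - a[0][0]
    denominator = (a[0][1] - a[0][0]) - (a[1][1] - a[1][0])
    return Fraction(numerator, denominator)

def row_utility(a, row, p):
    """Row player's expected payoff for a pure row against the mixture p."""
    return a[row][0] * (1 - p) + a[row][1] * p

CHICKEN_ROW = [[0, -1], [1, -5]]  # index 0 = D, 1 = S
p_s = indifference_probability(CHICKEN_ROW)
utility_d = row_utility(CHICKEN_ROW, 0, p_s)
utility_s = row_utility(CHICKEN_ROW, 1, p_s)
print(p_s, utility_d, utility_s)
```

For chicken this reproduces the slide's numbers: the column player goes straight with probability 1/5, and both of player 1's pure strategies then earn -1/5.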
Computational Issues
• Zero-sum games: solved efficiently as LP
• General sum games may require exponential time (in # of actions) to find a single equilibrium (no known efficient algorithm and good reasons to suspect that none exists)
• Some better news: despite bad worst-case complexity, many games can be solved quickly
Game Theory Issues
• How descriptive is game theory?
  – Some evidence that people play equilibria
  – Also, some evidence that people act irrationally
  – If it is computationally intractable to solve for equilibria of large games, seems unlikely that people are doing this
• How reasonable is (basic) game theory?
  – Are payoffs known?
  – Are situations really simultaneous move with no information about how the other player will act?
  – Are situations really single-shot? (repeated games)
  – How is equilibrium selection handled in practice?
Extensions
• Partial information
• Uncertainty about the game parameters, e.g., payoffs (Bayesian games)
• Repeated games: simple learning algorithms can converge to equilibria in some repeated games
• Multistep games with distributions over next states (game theory + MDPs = stochastic games)
• Multistep + partial information (partially observable stochastic games)
• Game theory is so general that it can encompass essentially all aspects of strategic, multiagent behavior, e.g., negotiating, threats, bluffs, coalitions, bribes, etc.