12/7/15

Linear Programming and Game Theory

Ron Parr
CPS 570

With thanks to Vince Conitzer for some content

What are Linear Programs?

•  Linear programs are constrained optimization problems
•  Constrained optimization problems ask us to maximize or minimize a function subject to mathematical constraints on the variables
–  Convex programs have convex objective functions and convex constraints
–  Linear programs (special case of convex programs) have linear objective functions and linear constraints
•  LPs = generic language for a wide range of problems
•  LP solvers = widely available hammers
•  Entire classes and vast expertise invested in making problems look like nails

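Linear programs: example

maximize 3x + 2y
subject to
    4x + 2y ≤ 16
    x + 2y ≤ 8
    x + y ≤ 5
    x ≥ 0
    y ≥ 0

•  Make reproductions of 2 paintings
•  Painting 1:
–  Sells for $30
–  Requires 4 units of blue, 1 green, 1 red
•  Painting 2:
–  Sells for $20
–  Requires 2 blue, 2 green, 1 red
•  We have 16 units blue, 8 green, 5 red
•  x = number of reproductions of painting 1, y = number of reproductions of painting 2; the objective is revenue in units of $10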

Solving the linear program graphically

maximize 3x + 2y
subject to
    4x + 2y ≤ 16
    x + 2y ≤ 8
    x + y ≤ 5
    x ≥ 0
    y ≥ 0

[Figure: feasible region of the LP; both axes run from 0 to 8]

•  Optimal solution: x = 3, y = 2
•  Feasible region = region not violating constraints

Linear Programs in General

•  Linear constraints, linear objective function
–  Maximize (minimize): f(x), a linear function of the vector x
–  Subject to: Ax ≤ b, where A is a matrix
•  Can swap maximize/minimize, ≤/≥; can add equality
•  View as search: searches space of values of x
•  Alternatively: search for tight constraints w/ high objective function value

What Happens In Higher Dimensions (1): Understanding the Feasible Region

[Figure: feasible region of the example LP; both axes run from 0 to 8]

What Happens In Higher Dimensions (2): lines -> hyperplanes

•  Inequality w/ 2 variables -> one side of a line
•  3 variables -> one side of a plane
•  k variables -> one side of a hyperplane
•  Physical intuition:
–  Objective function defines "down"
–  Feasible region is a "bowl"
–  Want to find lowest point on the rotated bowl

[Image: crystal bowl, http://www.rubylane.com/item/623546-4085/Orrefors-x22Zenithx22-Pattern-Crystal-Bowl]

Solving linear programs (1)

•  Optimal solutions (when they exist) can always be found at vertices of the feasible region
–  Why?
–  Assume you are not at a vertex: you can always push further in a direction that improves the objective function (or at least doesn't hurt)
–  How many vertices does a k x n matrix imply?
•  Dumb(est) algorithm:
–  Given n variables, k constraints
–  Check all k-choose-n = O(k^n) possible vertices
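The dumb(est) algorithm can be sketched for the two-painting example, where n = 2 and every candidate vertex is the intersection of two constraint boundaries. This is a minimal illustration, not a practical solver; the names `CONSTRAINTS`, `OBJECTIVE`, and `solve_by_vertex_enumeration` are made up for this sketch, and it assumes the LP is bounded and has at least one feasible vertex.

```python
from fractions import Fraction
from itertools import combinations

# Painting LP, one row (a, b, r) per constraint, meaning a*x + b*y <= r.
# The last two rows encode x >= 0 and y >= 0 as -x <= 0 and -y <= 0.
CONSTRAINTS = [(4, 2, 16), (1, 2, 8), (1, 1, 5), (-1, 0, 0), (0, -1, 0)]
OBJECTIVE = (3, 2)  # maximize 3x + 2y


def solve_by_vertex_enumeration(constraints, objective):
    """Return (best_value, (x, y)) over all feasible vertices, or None if there are none."""
    best = None
    # k-choose-n candidate vertices: here every pair of constraints (n = 2).
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries do not meet in a single point
        # Cramer's rule for the point where both constraints are tight.
        x = Fraction(r1 * b2 - r2 * b1, det)
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best


value, (x, y) = solve_by_vertex_enumeration(CONSTRAINTS, OBJECTIVE)
print(f"optimal vertex: x={x}, y={y}, objective={value}")
```

With 5 constraints and 2 variables this checks 10 pairs; the count grows as k-choose-n, which is why the approach does not scale.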

Solving linear programs (2)

•  Smarter algorithm (simplex)
–  Pick a vertex
–  Repeatedly hop to neighboring (one different tight constraint) vertices that improve the objective function
–  Guaranteed to find solution (no local optima)
–  May take exponential time in worst case (though rarely)
•  Still smarter algorithm (interior point methods)
–  Move inside the interior of the feasible region, in a direction that increases the objective function
–  Stop when no further improvements possible
–  Tricky to get the details right, but weakly polynomial time

Solving LPs in Practice

•  Use commercial products like cplex or gurobi
•  Do not try to implement an LP solver yourself!
•  Do not use matlab's linprog for anything other than small problems. Really. No, REALLY!

Modified LP

maximize 3x + 2y
subject to
    4x + 2y ≤ 15
    x + 2y ≤ 8
    x + y ≤ 5
    x ≥ 0
    y ≥ 0

•  Optimal solution: x = 2.5, y = 2.5
•  Solution value = 7.5 + 5 = 12.5
•  Half paintings?

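Integer (linear) program

maximize 3x + 2y
subject to
    4x + 2y ≤ 15
    x + 2y ≤ 8
    x + y ≤ 5
    x ≥ 0, integer
    y ≥ 0, integer

[Figure: feasible region of the program; both axes run from 0 to 8]

•  Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)
•  Optimal IP solution: x = 2, y = 3 (objective 12)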
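Because this integer program is tiny, its optimum can be confirmed by brute force. The sketch below simply enumerates every integer point that could be feasible; `feasible` and `best_ip` are names chosen for this illustration, and the enumeration is only viable because x + y ≤ 5 bounds both variables.

```python
def feasible(x, y):
    """Constraints of the modified program (blue supply reduced to 15)."""
    return (4 * x + 2 * y <= 15 and x + 2 * y <= 8 and x + y <= 5
            and x >= 0 and y >= 0)


# x + y <= 5 with x, y >= 0 bounds both variables by 5,
# so a 6 x 6 grid covers every integer candidate.
candidates = [(x, y) for x in range(6) for y in range(6) if feasible(x, y)]
best_ip = max(candidates, key=lambda p: 3 * p[0] + 2 * p[1])
best_ip_value = 3 * best_ip[0] + 2 * best_ip[1]
print("best integer point:", best_ip, "objective:", best_ip_value)

# Rounding the LP optimum (2.5, 2.5) is not a reliable shortcut:
print("round up (3, 3) feasible?", feasible(3, 3))
print("round down (2, 2) objective:", 3 * 2 + 2 * 2)
```

Rounding up leaves the feasible region and rounding down gives a worse objective than the true integer optimum, which is one way to see why integer programs need their own machinery.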

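Mixed integer (linear) program

maximize 3x + 2y
subject to
    4x + 2y ≤ 15
    x + 2y ≤ 8
    x + y ≤ 5
    x ≥ 0
    y ≥ 0, integer

[Figure: feasible region of the program; both axes run from 0 to 8]

•  Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)
•  Optimal IP solution: x = 2, y = 3 (objective 12)
•  Optimal MIP solution: x = 2.75, y = 2 (objective 12.25)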
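For this small mixed integer program, one way to check the optimum is to try every integer value of y and, for each, take the largest continuous x the constraints allow (the objective coefficient on x is positive, so larger x is always better). This is a sketch for this specific problem only; `best_x_for` is a hypothetical helper, not part of any solver.

```python
from fractions import Fraction


def best_x_for(y):
    """Largest x >= 0 allowed by the three resource constraints for a fixed y, or None."""
    x = min(Fraction(15 - 2 * y, 4), Fraction(8 - 2 * y), Fraction(5 - y))
    return x if x >= 0 else None


best = None
for y in range(0, 6):  # x + y <= 5 and x >= 0 imply y <= 5
    x = best_x_for(y)
    if x is None:
        continue  # no feasible x for this y
    value = 3 * x + 2 * y
    if best is None or value > best[0]:
        best = (value, x, y)

mip_value, mip_x, mip_y = best
print(f"x={float(mip_x)}, y={mip_y}, objective={float(mip_value)}")
```

Real solvers do not enumerate like this; they branch on integer variables and use LP relaxations to prune, but the enumeration makes the relationship between the LP, IP, and MIP optima easy to see.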

Solving (M)IPs

•  (Mixed) integer programs are NP-hard to solve
•  Intuition: constraint surface is jagged; no obvious way to avoid checking an exponential number of assignments to integer variables
•  In practice:
–  Constraints often give clues on how to restrict the number of solutions considered
–  Smart solvers (cplex, gurobi) can sometimes find solutions to large (M)IPs surprisingly quickly (and surprisingly slowly)

LP Trick (one of many)

•  Suppose you have a huge number of constraints, but a small number of variables (k >> n)
•  Constraint generation:
–  Start with a subset of the constraints
–  Find solution to simplified LP
–  Find most violated constraint, add back to LP
–  Repeat
•  Why does this work?
–  If missing constraints are unviolated, then adding them back wouldn't change the solution
–  Sometimes terminates after adding in only a fraction of total constraints
–  No guarantees, but often helpful in practice
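The constraint generation loop can be sketched on the painting LP. Everything beyond the three paint constraints is an assumption made for illustration: the box x ≤ 10, y ≤ 10 keeps the first simplified LP bounded, the two extra constraints x + y ≤ 20 and 2x + y ≤ 30 stand in for "many constraints that never matter", and the toy two-variable solver `solve_2d_lp` replaces a real LP solver.

```python
from fractions import Fraction
from itertools import combinations


def solve_2d_lp(constraints, objective):
    """Maximize over rows (a, b, r), meaning a*x + b*y <= r, by checking every vertex."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue
        x = Fraction(r1 * b2 - r2 * b1, det)
        y = Fraction(a1 * r2 - a2 * r1, det)
        if all(a * x + b * y <= r for a, b, r in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


def constraint_generation(initial, held_back, objective):
    """Add the most violated held-back constraint until none is violated."""
    active, pending = list(initial), list(held_back)
    while True:
        value, x, y = solve_2d_lp(active, objective)
        violations = [(a * x + b * y - r, (a, b, r)) for a, b, r in pending]
        worst = max(violations, default=(0, None))
        if worst[0] <= 0:
            return value, (x, y), active  # nothing violated: solution is valid for the full LP
        active.append(worst[1])
        pending.remove(worst[1])


# x >= 0, y >= 0 plus a hypothetical bounding box.
NONNEG_AND_BOX = [(-1, 0, 0), (0, -1, 0), (1, 0, 10), (0, 1, 10)]
HELD_BACK = [(4, 2, 16), (1, 2, 8), (1, 1, 5), (1, 1, 20), (2, 1, 30)]

value, point, active = constraint_generation(NONNEG_AND_BOX, HELD_BACK, (3, 2))
added = len(active) - len(NONNEG_AND_BOX)
print(point, value, f"{added} of {len(HELD_BACK)} held-back constraints added")
```

In this run the two loose constraints are never added, which is the effect the slide describes: the loop can stop once the current solution violates nothing, even though some constraints were never looked at by the solver.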

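Duality

•  For every LP there is an equivalent "Dual" problem
•  Solution to primal can be used to reconstruct solution to dual, and vice versa
•  LP duality:

$$
\begin{aligned}
\text{minimize:}\quad & c^T x \\
\text{subject to:}\quad & Ax \ge b \\
& x \ge 0
\end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{aligned}
\text{maximize:}\quad & b^T y \\
\text{subject to:}\quad & A^T y \le c \\
& y \ge 0
\end{aligned}
$$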
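As a small numerical check of duality, consider the painting LP. It is a maximization with Ax ≤ b, so its dual is the minimization of b^T y subject to A^T y ≥ c, y ≥ 0. The dual vector `y` below is a candidate supplied for this sketch (an assumption, verified by the code rather than derived by it); if both points are feasible and the two objective values agree, both are optimal.

```python
from fractions import Fraction

A = [[4, 2], [1, 2], [1, 1]]  # paint used per reproduction (rows: blue, green, red)
b = [16, 8, 5]                # available paint
c = [3, 2]                    # objective coefficients

x = [Fraction(3), Fraction(2)]                  # primal optimum from the graphical solution
y = [Fraction(1, 2), Fraction(0), Fraction(1)]  # candidate dual solution (assumed, checked below)

# Primal feasibility: Ax <= b and x >= 0.
primal_feasible = (
    all(sum(A[i][j] * x[j] for j in range(2)) <= b[i] for i in range(3))
    and min(x) >= 0
)
# Dual feasibility: A^T y >= c and y >= 0.
dual_feasible = (
    all(sum(A[i][j] * y[i] for i in range(3)) >= c[j] for j in range(2))
    and min(y) >= 0
)
primal_value = sum(c[j] * x[j] for j in range(2))
dual_value = sum(b[i] * y[i] for i in range(3))

print("primal feasible:", primal_feasible, "value:", primal_value)
print("dual feasible:", dual_feasible, "value:", dual_value)
```

The dual values can be read as prices per unit of paint: green has price 0 because the green constraint is slack at the optimum (3 + 2·2 = 7 < 8), while blue and red are fully used.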

MDP Solved as an LP

$$V(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \Big]$$

Issue: Turn the non-linear max into a collection of linear constraints

MINIMIZE: $\sum_s V(s)$

$$\forall s,a:\quad V(s) \ge R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s')$$

Optimal action has tight constraints
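The claim that the optimal action has tight constraints can be checked numerically on a toy problem. The 2-state, 2-action MDP below is entirely hypothetical, and the sketch computes V with value iteration rather than by solving the LP; it then measures the slack V(s) − [R(s,a) + γ Σ P(s'|s,a) V(s')] of every LP constraint.

```python
GAMMA = 0.9
# Hypothetical MDP: R[s][a] is the reward, P[s][a][s2] the transition probability.
R = [[0.0, 1.0], [2.0, 0.0]]
P = [
    [[1.0, 0.0], [0.2, 0.8]],
    [[0.0, 1.0], [0.9, 0.1]],
]


def q_value(V, s, a):
    """Right-hand side of the LP constraint for the pair (s, a)."""
    return R[s][a] + GAMMA * sum(P[s][a][s2] * V[s2] for s2 in range(len(V)))


def value_iteration(iterations=2000):
    """Repeatedly apply the Bellman backup; converges because GAMMA < 1."""
    V = [0.0, 0.0]
    for _ in range(iterations):
        V = [max(q_value(V, s, a) for a in range(2)) for s in range(2)]
    return V


V = value_iteration()
# slack[s][a] >= 0 means the constraint holds; slack near 0 means it is tight.
slack = [[V[s] - q_value(V, s, a) for a in range(2)] for s in range(2)]
for s in range(2):
    print(f"V({s}) = {V[s]:.4f}, slacks = {[round(v, 4) for v in slack[s]]}")
```

In each state exactly the maximizing action has (numerically) zero slack, and every other constraint holds with room to spare, which is how the optimal policy can be read off an LP solution.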

What is Game Theory? I

•  Very general mathematical framework to study situations where multiple agents interact, including:
–  Popular notions of games
–  Everything up to and including multistep, multiagent, simultaneous move, partial information games
–  Example Duke CS research: aiming sensors to catch hiding enemies, assigning guards to posts
–  Can even include negotiating, posturing and uncertainty about the players and game itself
•  von Neumann and Morgenstern (1944) was a major launching point for modern game theory
•  Nash: Existence of equilibria in general sum games (Wikipedia)

What is game theory? II

•  Study of settings where multiple agents each have
–  Different preferences (utility functions),
–  Different actions
•  Each agent's utility (potentially) depends on all agents' actions
–  What is optimal for one agent depends on what other agents do
–  Can be circular
•  Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act
•  Useful for acting and (potentially) predicting behavior of others
•  Not necessarily descriptive

Real World Game Theory Examples

•  War
•  Auctions
•  Animal behavior
•  Networking protocols
•  Peer to peer networking behavior
•  Road traffic
•  Mechanism design:
–  Suppose we want people to do X?
–  How to engineer the situation so they will act that way?

Covered Today

•  2 player, zero sum simultaneous move games
•  Example: Rock, Paper, Scissors
•  Linear programming solution

Linear Programs (max formulation)

$$
\begin{aligned}
\text{maximize:}\quad & c^T x \\
\text{subject to:}\quad & Ax \le b \\
& x \ge 0
\end{aligned}
$$

•  Note: min formulation also possible
–  Min: c^T x
–  Subject to: Ax ≥ b
•  Some use equality as the canonical representation (introducing slack variables)
•  LP tricks
–  Multiply by -1 to reverse inequalities
–  Can easily introduce equality constraints, or arbitrary domain constraints

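Rock, Paper, Scissors Zero Sum Formulation

•  In zero sum games, one player's loss is other's gain
•  Payoff matrix (rows: our move, columns: opponent's move, entries: our payoff):

$$
\begin{array}{c|rrr}
  & R & P & S \\ \hline
R & 0 & -1 & 1 \\
P & 1 & 0 & -1 \\
S & -1 & 1 & 0
\end{array}
$$

•  Minimax solution maximizes worst case outcome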

Rock, Paper, Scissors Equations

•  R, P, S = probability that we play rock, paper, or scissors respectively (R + P + S = 1)
•  U is our expected utility
•  Bounding our utility:
–  Opponent rock case: U ≤ P - S
–  Opponent paper case: U ≤ S - R
–  Opponent scissors case: U ≤ R - P
•  Want to maximize U subject to constraints
•  Solution: (1/3, 1/3, 1/3)
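The three bounds say that U can be at most the smallest of P − S, S − R, and R − P, so maximizing U means maximizing that minimum. The sketch below searches a grid of mixed strategies for the one with the best worst case; the grid resolution `N = 30` is an arbitrary choice for this illustration (a multiple of 3, so that the uniform strategy lies on the grid).

```python
from fractions import Fraction


def worst_case_utility(r, p, s):
    """Smallest expected payoff over the opponent's pure replies: rock, paper, scissors."""
    return min(p - s, s - r, r - p)


N = 30  # grid resolution (arbitrary for this sketch)
grid = [
    (Fraction(i, N), Fraction(j, N), Fraction(N - i - j, N))
    for i in range(N + 1)
    for j in range(N + 1 - i)
]
best_strategy = max(grid, key=lambda q: worst_case_utility(*q))
best_value = worst_case_utility(*best_strategy)
print("best strategy:", best_strategy, "worst-case utility:", best_value)
print("always rock:", worst_case_utility(1, 0, 0))
```

The three bounds always sum to zero, so their minimum can never exceed 0, and it equals 0 only when R = P = S. Any lopsided strategy, such as always playing rock, has a strictly negative worst case.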

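Rock, Paper, Scissors LP Formulation

•  Our variables are: x = [U, R, P, S]^T
•  We want:
–  Maximize U
–  U ≤ P - S
–  U ≤ S - R
–  U ≤ R - P
–  R + P + S = 1
•  How do we make this fit:

$$
\begin{aligned}
\text{maximize:}\quad & c^T x \\
\text{subject to:}\quad & Ax \le b \\
& x \ge 0
\end{aligned}
\qquad ?
$$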

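Rock Paper Scissors LP Formulation

$$x = [U, R, P, S]^T$$

$$
A = \begin{pmatrix}
1 & 0 & -1 & 1 \\
1 & 1 & 0 & -1 \\
1 & -1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
0 & -1 & -1 & -1
\end{pmatrix}
\qquad
b = [0, 0, 0, 1, -1]^T
\qquad
c = [1, 0, 0, 0]^T
$$

$$
\begin{aligned}
\text{maximize:}\quad & c^T x \\
\text{subject to:}\quad & Ax \le b \\
& x \ge 0
\end{aligned}
$$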
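To see how the rows of A encode the earlier inequalities, the sketch below checks two candidate vectors x = [U, R, P, S] against Ax ≤ b and x ≥ 0. It only tests feasibility of given points; it does not solve the LP, and `is_feasible` is a helper invented for this illustration.

```python
from fractions import Fraction

A = [
    [1, 0, -1, 1],    # U - P + S <= 0, i.e. U <= P - S (opponent plays rock)
    [1, 1, 0, -1],    # U + R - S <= 0, i.e. U <= S - R (opponent plays paper)
    [1, -1, 1, 0],    # U - R + P <= 0, i.e. U <= R - P (opponent plays scissors)
    [0, 1, 1, 1],     # R + P + S <= 1
    [0, -1, -1, -1],  # -(R + P + S) <= -1; with the row above this forces R + P + S = 1
]
b = [0, 0, 0, 1, -1]
c = [1, 0, 0, 0]      # the objective c^T x picks out U


def is_feasible(x):
    """Check Ax <= b row by row, plus x >= 0."""
    rows_ok = all(sum(a * v for a, v in zip(row, x)) <= rhs for row, rhs in zip(A, b))
    return rows_ok and all(v >= 0 for v in x)


third = Fraction(1, 3)
uniform = [0, third, third, third]  # U = 0 with the uniform strategy
always_rock = [0, 1, 0, 0]          # U = 0 while always playing rock

print("uniform feasible:", is_feasible(uniform))
print("always rock feasible:", is_feasible(always_rock))
```

The equality R + P + S = 1 is expressed as a pair of opposite inequalities. Note that x ≥ 0 also forces U ≥ 0; that is harmless here because the value of this game is 0, but a game with a negative value would need U to be left unrestricted or shifted.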

Rock, Paper, Scissors Solution

•  If we feed this LP to an LP solver we get:
–  R = P = S = 1/3
–  U = 0
•  Solution for the other player is:
–  The same…
–  By symmetry
•  This is the minimax solution
•  This is also an equilibrium
–  No player has an incentive to deviate
–  (Defined more precisely later)

Tangent: Why is RPS Fun?

•  OK, it's not…
•  Why might RPS be fun?
–  Try to exploit non-randomness in your friends
–  Try to be random yourself

Minimax Solutions in General

•  What do we know about minimax solutions?
–  Can a suboptimal opponent trick minimax?
–  When should we abandon minimax?
•  Minimax solutions for 2-player zero-sum games can always be found by solving a linear program
•  The minimax solutions will also be equilibria
•  For general sum games:
–  Minimax does not apply
–  Equilibria may not be unique
–  Need to search for equilibria using more computationally intensive methods

Outline

•  Digression: Linear Programming
•  2 player, zero sum simultaneous move games
•  Example: Rock, Paper, Scissors
•  Linear programming solution
•  General sum games

“Chicken”

•  Two players drive cars towards each other
•  If one player goes straight, that player wins
•  If both go straight, they both die

            D          S
   D      0, 0      -1, 1
   S      1, -1     -5, -5

•  Not zero-sum

Source: Wikipedia

Rock-paper-scissors, Seinfeld variant

            R          P          S
   R      0, 0      1, -1      1, -1
   P     -1, 1      0, 0      -1, 1
   S     -1, 1      1, -1      0, 0

MICKEY: All right, rock beats paper! (Mickey smacks Kramer's hand for losing)
KRAMER: I thought paper covered rock.
MICKEY: Nah, rock flies right through paper.
KRAMER: What beats rock?
MICKEY: (looks at hand) Nothing beats rock.

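Dominance

•  Player i's strategy $s_i$ strictly dominates $s_i'$ if
–  for any $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
•  $s_i$ weakly dominates $s_i'$ if
–  for any $s_{-i}$, $u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$; and
–  for some $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
•  $-i$ = "the player(s) other than i"

            R          P          S
   R      0, 0      1, -1      1, -1
   P     -1, 1      0, 0      -1, 1
   S     -1, 1      1, -1      0, 0

•  Strict dominance: rock strictly dominates paper
•  Weak dominance: rock weakly dominates scissors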
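The two definitions translate directly into comparisons between rows of the row player's payoff matrix. The sketch below applies them to the Seinfeld variant; it assumes the rows and columns are ordered rock, paper, scissors, and the function names are chosen for this illustration.

```python
# Row player's payoffs in the Seinfeld variant (assumed order: rock, paper, scissors).
U = [
    [0, 1, 1],    # rock
    [-1, 0, -1],  # paper
    [-1, 1, 0],   # scissors
]
ROCK, PAPER, SCISSORS = 0, 1, 2


def strictly_dominates(payoffs, s, t):
    """s is strictly better than t against every opponent strategy."""
    return all(a > b for a, b in zip(payoffs[s], payoffs[t]))


def weakly_dominates(payoffs, s, t):
    """s is never worse than t, and strictly better against at least one opponent strategy."""
    pairs = list(zip(payoffs[s], payoffs[t]))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


print("rock strictly dominates paper:", strictly_dominates(U, ROCK, PAPER))
print("rock strictly dominates scissors:", strictly_dominates(U, ROCK, SCISSORS))
print("rock weakly dominates scissors:", weakly_dominates(U, ROCK, SCISSORS))
```

Rock fails to strictly dominate scissors only because both earn 1 against paper; that single tie is the difference between the two notions here.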

Prisoner's Dilemma

•  Pair of criminals has been caught
•  District attorney has evidence to convict them of a minor crime (1 year in jail); knows that they committed a major crime together (3 years in jail) but cannot prove it
•  Offers them a deal:
–  If both confess to the major crime, they each get a 1 year reduction
–  If only one confesses, that one gets 3 years reduction

                      confess     don't confess
   confess            -2, -2         0, -3
   don't confess      -3, 0         -1, -1

“Should I buy an SUV?”

             SUV          car
   SUV    -10, -10     -7, -11
   car    -11, -7      -8, -8

•  Purchasing + gas cost: 5 for an SUV, 3 for a car
•  Accident cost: 5 each for SUV vs. SUV; 5 each for car vs. car; in car vs. SUV, 8 for the car and 2 for the SUV

“2/3 of the average” game

•  Everyone writes down a number between 0 and 100
•  Person closest to 2/3 of the average wins
•  Example:
–  A says 50
–  B says 10
–  C says 90
–  Average(50, 10, 90) = 50
–  2/3 of average = 33.33
–  A is closest (|50 - 33.33| = 16.67), so A wins

Iterated dominance

•  Iterated dominance: remove (strictly/weakly) dominated strategy, repeat
•  Iterated strict dominance on Seinfeld's RPS:

            R          P          S
   R      0, 0      1, -1      1, -1
   P     -1, 1      0, 0      -1, 1
   S     -1, 1      1, -1      0, 0

   After removing paper for both players:

            R          S
   R      0, 0      1, -1
   S     -1, 1      0, 0
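The elimination procedure can be sketched for two-player games as a loop that keeps deleting strictly dominated rows and columns until nothing changes. This is a minimal illustration that only considers dominance by pure strategies; the matrices assume the order rock, paper, scissors, and `iterated_strict_dominance` is a name made up here.

```python
def iterated_strict_dominance(u1, u2):
    """u1[r][c], u2[r][c]: payoffs to the row and column player.

    Returns the indices of the surviving rows and columns.
    """
    rows, cols = list(range(len(u1))), list(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            # Row r goes if some other surviving row beats it against every surviving column.
            if any(all(u1[o][c] > u1[r][c] for c in cols) for o in rows if o != r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            # Same test for the column player, using the column player's payoffs.
            if any(all(u2[r][o] > u2[r][c] for r in rows) for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols


# Seinfeld variant (assumed order: rock, paper, scissors); zero-sum, so u2 = -u1.
u1 = [[0, 1, 1], [-1, 0, -1], [-1, 1, 0]]
u2 = [[-v for v in row] for row in u1]

rows, cols = iterated_strict_dominance(u1, u2)
print("surviving rows:", rows, "surviving columns:", cols)
```

Once paper is gone, rock strictly dominates scissors in the remaining 2 x 2 game, so the process continues until only rock is left for each player. For strict dominance the order in which strategies are removed does not affect the final result.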

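“2/3 of the average” game revisited

[Figure: number line from 0 to 100 with marks at (2/3)*100, (2/3)*(2/3)*100, …]

•  Numbers above (2/3)*100: dominated
•  Numbers between (2/3)*(2/3)*100 and (2/3)*100: dominated after removal of (originally) dominated strategies
•  …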
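The shrinking thresholds follow a simple recurrence: if no surviving guess exceeds B, the average cannot exceed B, so the target cannot exceed (2/3)·B and guesses above that are eliminated next. A small sketch of that recurrence, with `surviving_upper_bounds` as a hypothetical helper name:

```python
from fractions import Fraction


def surviving_upper_bounds(rounds, top=100):
    """Upper bound on the surviving guesses after each round of elimination."""
    bounds = [Fraction(top)]
    for _ in range(rounds):
        bounds.append(bounds[-1] * Fraction(2, 3))
    return bounds


bounds = surviving_upper_bounds(20)
print([float(v) for v in bounds[:3]])  # 100, (2/3)*100, (2/3)*(2/3)*100
print(float(bounds[-1]))               # bound after 20 rounds
```

The bound shrinks geometrically toward 0, which is why repeated elimination drives the surviving guesses toward 0.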

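Mixed strategies

•  Mixed strategy for player i = probability distribution over player i's (pure) strategies
•  E.g. 1/3, 1/3, 1/3
•  Example of dominance by a mixed strategy:

   1/2     3, 0     0, 0
   1/2     0, 0     3, 0
           1, 0     1, 0

•  Playing each of the first two rows with probability 1/2 gives the row player an expected payoff of 1.5 against either column, which strictly beats the third row's payoff of 1, even though neither of the first two rows dominates the third on its own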

Best Responses

•  Let A be a matrix of player 1's payoffs
•  Let σ2 be a mixed strategy for player 2
•  Aσ2 = vector of expected payoffs for each strategy for player 1
•  Highest entry indicates best response for player 1
•  Any mixture of ties is also BR
•  Generalizes to >2 players

   (σ2 is a distribution over the columns)

            D          S
   D      0, 0      -1, 1
   S      1, -1     -5, -5
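The product Aσ2 is easy to compute by hand or in code. The sketch below does it for the row player in "chicken", with rows and columns ordered D, S as in the table; `expected_payoffs` and `best_responses` are illustrative names, and exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction

# Player 1's payoffs in "chicken"; rows and columns ordered D, S.
A = [[0, -1], [1, -5]]
ACTIONS = ["D", "S"]


def expected_payoffs(A, sigma2):
    """A times sigma2: expected payoff of each pure strategy of player 1."""
    return [sum(a * q for a, q in zip(row, sigma2)) for row in A]


def best_responses(A, sigma2):
    """Indices of all pure strategies that achieve the highest expected payoff."""
    payoffs = expected_payoffs(A, sigma2)
    top = max(payoffs)
    return [i for i, v in enumerate(payoffs) if v == top]


for sigma2 in ([1, 0], [0, 1], [Fraction(4, 5), Fraction(1, 5)]):
    names = [ACTIONS[i] for i in best_responses(A, sigma2)]
    print(sigma2, "->", expected_payoffs(A, sigma2), "best response(s):", names)
```

Against an opponent who plays D with probability 4/5, both pure strategies tie, so any mixture of D and S is also a best response.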

Nash equilibrium [Nash 50]

•  A vector of strategies (one for each player) = a strategy profile
•  Strategy profile $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is a Nash equilibrium if each $\sigma_i$ is a best response to $\sigma_{-i}$
–  That is, for any i, for any $\sigma_i'$, $u_i(\sigma_i, \sigma_{-i}) \ge u_i(\sigma_i', \sigma_{-i})$
•  Does not say anything about multiple agents changing their strategies at the same time
•  In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]
•  (Note - singular: equilibrium, plural: equilibria)

Equilibrium Strategies vs. Best Responses

•  equilibrium strategy -> best response?
•  best response -> equilibrium strategy?
•  Consider Rock-Paper-Scissors
–  Is (1/3, 1/3, 1/3) a best response to (1/3, 1/3, 1/3)?
–  Is (1, 0, 0) a best response to (1/3, 1/3, 1/3)?
–  Is (1, 0, 0) a strategy for any equilibrium?

      0, 0     -1, 1      1, -1
      1, -1     0, 0     -1, 1
     -1, 1      1, -1     0, 0

Nash equilibria of “chicken”

            D          S
   D      0, 0      -1, 1
   S      1, -1     -5, -5

•  (D, S) and (S, D) are Nash equilibria
–  They are pure-strategy Nash equilibria: nobody randomizes
–  They are also strict Nash equilibria: changing your strategy will make you strictly worse off
•  No other pure-strategy Nash equilibria
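Pure-strategy equilibria of a small game can be found by checking every profile against the definition: neither player may gain by switching unilaterally. The sketch below does this for "chicken"; the dictionary layout and the name `is_pure_nash` are choices made for this illustration.

```python
from itertools import product

ACTIONS = ["D", "S"]
# PAYOFFS[(row_action, column_action)] = (row player's utility, column player's utility)
PAYOFFS = {
    ("D", "D"): (0, 0), ("D", "S"): (-1, 1),
    ("S", "D"): (1, -1), ("S", "S"): (-5, -5),
}


def is_pure_nash(row, col):
    """True if neither player can do strictly better by deviating alone."""
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok


pure_equilibria = [profile for profile in product(ACTIONS, repeat=2) if is_pure_nash(*profile)]
print("pure-strategy Nash equilibria:", pure_equilibria)
```

(D, D) fails because either player would rather go straight, and (S, S) fails because either player would rather dodge the crash.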

Equilibrium Selection

            D          S
   D      0, 0      -1, 1
   S      1, -1     -5, -5

•  (D, S) and (S, D) are Nash equilibria
•  Which do you play?
•  What if player 1 assumes (S, D), player 2 assumes (D, S)
•  Play is (S, S) = (-5, -5)!!!
•  This is the equilibrium selection problem

Rock-paper-scissors

      0, 0     -1, 1      1, -1
      1, -1     0, 0     -1, 1
     -1, 1      1, -1     0, 0

•  Any pure-strategy Nash equilibria?
•  It has a mixed-strategy Nash equilibrium: both players put probability 1/3 on each action

Nash equilibria of “chicken”…

            D          S
   D      0, 0      -1, 1
   S      1, -1     -5, -5

•  Is there a Nash equilibrium that uses mixed strategies, say, where player 1 uses a mixed strategy?
•  If a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses
•  So we need to make player 1 indifferent between D and S
•  Player 1's utility for playing D = $-p^c_S$
•  Player 1's utility for playing S = $p^c_D - 5p^c_S = 1 - 6p^c_S$
•  So we need $-p^c_S = 1 - 6p^c_S$, which means $p^c_S = 1/5$
•  Then, player 2 needs to be indifferent as well
•  Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S))
–  People may die! Expected utility -1/5 for each player
•  $p^c_S$ = probability that column player plays S
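The indifference argument generalizes to any 2 x 2 game in which the row player mixes between both rows. Writing q for the probability that the column player picks the second column, the condition A[0][0](1 − q) + A[0][1]q = A[1][0](1 − q) + A[1][1]q can be solved for q directly. The sketch below does that for "chicken"; `indifference_probability` is an illustrative name, and the formula assumes the denominator is nonzero.

```python
from fractions import Fraction

# Row player's payoffs in "chicken"; rows and columns ordered D, S.
A = [[0, -1], [1, -5]]


def indifference_probability(A):
    """Probability q of the second column that makes the row player indifferent.

    Solves A[0][0]*(1-q) + A[0][1]*q == A[1][0]*(1-q) + A[1][1]*q for q.
    """
    numerator = A[1][0] - A[0][0]
    denominator = (A[0][1] - A[0][0]) - (A[1][1] - A[1][0])
    return Fraction(numerator, denominator)


q = indifference_probability(A)  # column player's probability of S
utility_d = A[0][0] * (1 - q) + A[0][1] * q
utility_s = A[1][0] * (1 - q) + A[1][1] * q
crash_probability = q * q        # both go straight; the row player mixes the same way by symmetry

print("p(S) =", q)
print("utility of D:", utility_d, "utility of S:", utility_s)
print("probability both go straight:", crash_probability)
```

Both pure strategies earn the same expected utility at this q, which is exactly what allows player 1 to randomize; since the game is symmetric, the same calculation gives player 2's indifference condition.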

Computational Issues

•  Zero-sum games: solved efficiently as LP
•  General sum games may require exponential time (in # of actions) to find a single equilibrium (no known efficient algorithm and good reasons to suspect that none exists)
•  Some better news: despite bad worst-case complexity, many games can be solved quickly

Game Theory Issues

•  How descriptive is game theory?
–  Some evidence that people play equilibria
–  Also, some evidence that people act irrationally
–  If it is computationally intractable to solve for equilibria of large games, seems unlikely that people are doing this
•  How reasonable is (basic) game theory?
–  Are payoffs known?
–  Are situations really simultaneous move with no information about how the other player will act?
–  Are situations really single-shot? (repeated games)
–  How is equilibrium selection handled in practice?

Extensions

•  Partial information
•  Uncertainty about the game parameters, e.g., payoffs (Bayesian games)
•  Repeated games: simple learning algorithms can converge to equilibria in some repeated games
•  Multistep games with distributions over next states (game theory + MDPs = stochastic games)
•  Multistep + partial information (partially observable stochastic games)
•  Game theory is so general that it can encompass essentially all aspects of strategic, multiagent behavior, e.g., negotiating, threats, bluffs, coalitions, bribes, etc.
