What are Linear Programs? Linear Programming and Game ... · 12/7/15 4 Mixed integer (linear)...


12/7/15

Linear Programming and Game Theory

Ron Parr
CPS 570

With thanks to Vince Conitzer for some content

What are Linear Programs?

• Linear programs are constrained optimization problems
• Constrained optimization problems ask us to maximize or minimize a function subject to mathematical constraints on the variables
  – Convex programs have convex objective functions and convex constraints
  – Linear programs (special case of convex programs) have linear objective functions and linear constraints
• LPs = generic language for a wide range of problems
• LP solvers = widely available hammers
• Entire classes and vast expertise invested in making problems look like nails

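Linear programs: example

maximize 3x + 2y
subject to
  4x + 2y ≤ 16
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0

• Make reproductions of 2 paintings (x copies of painting 1, y copies of painting 2; objective in units of $10)
• Painting 1:
  – Sells for $30
  – Requires 4 units of blue, 1 green, 1 red
• Painting 2:
  – Sells for $20
  – Requires 2 blue, 2 green, 1 red
• We have 16 units blue, 8 green, 5 red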

Solving the linear program graphically

maximize 3x + 2y
subject to
  4x + 2y ≤ 16
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0

[Figure: feasible region plotted on axes running from 0 to 8]

Optimal solution: x = 3, y = 2

Feasible region = region not violating constraints

Linear Programs in General

• Linear constraints, linear objective function
  – Maximize (minimize): f(x), a linear function of the vector x
  – Subject to: Ax ≤ b, where A is a matrix
• Can swap maximize/minimize, ≤/≥; can add equality
• View as search: searches the space of values of x
• Alternatively: search for tight constraints w/ high objective function value

What Happens In Higher Dimensions (1): Understanding the Feasible Region

[Figure: feasible region plotted on axes running from 0 to 8]

Intuition:
• Objective function defines “down”
• Feasible region is a “bowl”
• Want to find lowest point on the rotated bowl

What Happens In Higher Dimensions (2): lines -> hyperplanes

• Inequality w/ 2 variables -> one side of a line
• 3 variables -> one side of a plane
• k variables -> one side of a hyperplane
• Physical intuition: [Image: crystal bowl, http://www.rubylane.com/item/623546-4085/Orrefors-x22Zenithx22-Pattern-Crystal-Bowl]

Solving linear programs (1)

• Optimal solutions always exist at vertices of the feasible region (when the LP is feasible and bounded)
  – Why?
  – Assume you are not at a vertex; you can always push further in a direction that improves the objective function (or at least doesn’t hurt)
  – How many vertices does a k×n matrix imply?
• Dumb(est) algorithm:
  – Given n variables, k constraints
  – Check all k-choose-n = O(k^n) possible vertices
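
The dumb(est) algorithm can be tried on the painting example. The following is only a minimal sketch, assuming n = 2 variables, a bounded LP, and constraints written as a1*x + a2*y ≤ b; the names `CONSTRAINTS`, `OBJECTIVE` and `best_vertex` are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a1, a2, b) means a1*x + a2*y <= b.
CONSTRAINTS = [
    (4, 2, 16),   # blue paint
    (1, 2, 8),    # green paint
    (1, 1, 5),    # red paint
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
OBJECTIVE = (3, 2)  # maximize 3x + 2y


def best_vertex(objective, constraints):
    """Return (value, x, y) for the best feasible vertex, or None."""
    best = None
    # With n = 2 variables, a candidate vertex is where 2 constraints are tight.
    for (a1, a2, b1), (a3, a4, b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if det == 0:
            continue  # parallel lines: no unique intersection
        # Cramer's rule for the 2x2 system of tight constraints.
        x = Fraction(b1 * a4 - a2 * b2, det)
        y = Fraction(a1 * b2 - a3 * b1, det)
        # Keep the intersection only if it satisfies every constraint.
        if all(p * x + q * y <= r for p, q, r in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


print(best_vertex(OBJECTIVE, CONSTRAINTS))  # value 13 at x = 3, y = 2
```

With 5 constraints and 2 variables this checks 5-choose-2 = 10 candidate intersections, which is why the approach stops being practical as k and n grow.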

Solving linear programs (2)

• Smarter algorithm (simplex)
  – Pick a vertex
  – Repeatedly hop to neighboring (one different tight constraint) vertices that improve the objective function
  – Guaranteed to find solution (no local optima)
  – May take exponential time in worst case (though rarely)
• Still smarter algorithm
  – Move inside the interior of the feasible region, in a direction that increases the objective function
  – Stop when no further improvements possible
  – Tricky to get the details right, but weakly polynomial time

Solving LPs in Practice

• Use commercial products like CPLEX or Gurobi
• Do not try to implement an LP solver yourself!
• Do not use MATLAB’s linprog for anything other than small problems. Really. No – REALLY!

Modified LP

maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0

Optimal solution: x = 2.5, y = 2.5

Solution value = 7.5 + 5 = 12.5

Half paintings?

Integer (linear) program

maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0, integer
  y ≥ 0, integer

[Figure: feasible region plotted on axes running from 0 to 8]

Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)

Optimal IP solution: x = 2, y = 3 (objective 12)
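
Because this integer program is tiny, its optimum can be checked by brute force. The sketch below simply enumerates integer points; the helper name `feasible` is made up for the illustration, and enumeration like this is not how real IP solvers work.

```python
def feasible(x, y):
    """Constraints of the integer program (nonnegativity comes from the ranges)."""
    return 4 * x + 2 * y <= 15 and x + 2 * y <= 8 and x + y <= 5


# x + y <= 5 with x, y >= 0 means neither variable can exceed 5.
candidates = [(x, y) for x in range(6) for y in range(6) if feasible(x, y)]
best = max(candidates, key=lambda p: 3 * p[0] + 2 * p[1])
best_value = 3 * best[0] + 2 * best[1]

print(best, best_value)  # (2, 3) 12
```

Note that rounding the LP solution (2.5, 2.5) up to (3, 3) or (3, 2) is infeasible, and rounding down to (2, 2) gives only 10, so the IP optimum is not found by naive rounding.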

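Mixed integer (linear) program

maximize 3x + 2y
subject to
  4x + 2y ≤ 15
  x + 2y ≤ 8
  x + y ≤ 5
  x ≥ 0
  y ≥ 0, integer

[Figure: feasible region plotted on axes running from 0 to 8]

Optimal LP solution: x = 2.5, y = 2.5 (objective 12.5)

Optimal IP solution: x = 2, y = 3 (objective 12)

Optimal MIP solution: x = 2.75, y = 2 (objective 12.25)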
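
The MIP optimum can be checked the same way: fix the integer variable y, then push the continuous variable x as far as the constraints allow (x has a positive objective coefficient, so larger is better). This is a sketch for this one example only; `best_x_for` is a hypothetical helper name.

```python
from fractions import Fraction


def best_x_for(y):
    """Largest x >= 0 allowed once y is fixed, or None if no such x exists."""
    upper = min(Fraction(15 - 2 * y, 4),  # from 4x + 2y <= 15
                Fraction(8 - 2 * y),      # from x + 2y <= 8
                Fraction(5 - y))          # from x + y <= 5
    return upper if upper >= 0 else None


best = None
for y in range(0, 5):  # x + 2y <= 8 with x >= 0 forces y <= 4
    x = best_x_for(y)
    if x is None:
        continue
    value = 3 * x + 2 * y
    if best is None or value > best[0]:
        best = (value, x, y)

print(best)  # value 49/4 = 12.25 at x = 11/4 = 2.75, y = 2
```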

Solving (M)IPs

• (Mixed) integer programs are NP-hard to solve
• Intuition: constraint surface is jagged; no obvious way to avoid checking an exponential number of assignments to integer variables
• In practice:
  – Constraints often give clues on how to restrict the number of solutions considered
  – Smart solvers (CPLEX, Gurobi) can sometimes find solutions to large (M)IPs surprisingly quickly (and surprisingly slowly)

LP Trick (one of many)

• Suppose you have a huge number of constraints, but a small number of variables (k >> n)
• Constraint generation:
  – Start with a subset of the constraints
  – Find solution to simplified LP
  – Find most violated constraint, add back to LP
  – Repeat
• Why does this work?
  – If missing constraints are unviolated, then adding them back wouldn’t change the solution
  – Sometimes terminates after adding in only a fraction of total constraints
  – No guarantees, but often helpful in practice
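
The constraint-generation loop can be sketched on the painting LP. Everything here is an illustrative assumption rather than part of the slides: the toy 2-variable solver `solve_lp_2d` (vertex enumeration, standing in for a real LP solver), the loose bounding box that keeps the first simplified LP bounded, and the two extra redundant constraints added to the pool.

```python
from fractions import Fraction
from itertools import combinations


def solve_lp_2d(objective, constraints):
    """Maximize over a bounded 2-variable LP by checking all vertices.

    Each constraint (a1, a2, b) means a1*x + a2*y <= b.
    """
    best = None
    for (a1, a2, b1), (a3, a4, b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if det == 0:
            continue
        x = Fraction(b1 * a4 - a2 * b2, det)
        y = Fraction(a1 * b2 - a3 * b1, det)
        if all(p * x + q * y <= r for p, q, r in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


def constraint_generation(objective, initial, pool):
    """Add the most violated pool constraint until none is violated."""
    active, added = list(initial), []
    while True:
        value, x, y = solve_lp_2d(objective, active)
        remaining = [c for c in pool if c not in added]
        # Violation of a1*x + a2*y <= b: how far the left side exceeds b.
        worst = max(remaining, key=lambda c: c[0] * x + c[1] * y - c[2],
                    default=None)
        if worst is None or worst[0] * x + worst[1] * y - worst[2] <= 0:
            return (value, x, y), added
        active.append(worst)
        added.append(worst)


# Assumption: a loose bounding box keeps the first, simplified LP bounded.
initial = [(-1, 0, 0), (0, -1, 0), (1, 0, 100), (0, 1, 100)]
# The three paint constraints plus two hypothetical redundant ones.
pool = [(4, 2, 16), (1, 2, 8), (1, 1, 5), (1, 0, 6), (1, 3, 30)]

solution, added = constraint_generation((3, 2), initial, pool)
print(solution)  # value 13 at x = 3, y = 2
print(added)     # only the three paint constraints were ever added
```

The two redundant constraints are never violated by any intermediate solution, so the loop stops without adding them, which is the saving the trick relies on.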

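Duality

• For every LP there is an equivalent “Dual” problem
• Solution to primal can be used to reconstruct solution to dual, and vice versa
• LP duality:

\[
\begin{aligned}
\text{minimize: } & c^T x \\
\text{subject to: } & Ax \ge b \\
& x \ge 0
\end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{aligned}
\text{maximize: } & b^T y \\
\text{subject to: } & A^T y \le c \\
& y \ge 0
\end{aligned}
\]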
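
A worked instance may make duality more concrete. The painting LP is a maximization with ≤ constraints, so its dual is a minimization with ≥ constraints: maximize c^T x subject to Ax ≤ b, x ≥ 0 pairs with minimize b^T y subject to A^T y ≥ c, y ≥ 0. The dual variable names y_1, y_2, y_3 (one per paint color) are an assumption made for this example.

```latex
\[
\begin{aligned}
\text{minimize: } & 16y_1 + 8y_2 + 5y_3 \\
\text{subject to: } & 4y_1 + y_2 + y_3 \ge 3 \\
& 2y_1 + 2y_2 + y_3 \ge 2 \\
& y_1, y_2, y_3 \ge 0
\end{aligned}
\]
% y = (1/2, 0, 1) is dual feasible:
%   4(1/2) + 0 + 1 = 3 \ge 3   and   2(1/2) + 2(0) + 1 = 2 \ge 2
% Its objective value is 16(1/2) + 8(0) + 5(1) = 13,
% which equals the primal optimum 3(3) + 2(2) = 13 at x = 3, y = 2.
```

Since every dual-feasible point bounds the primal objective from above, matching values of 13 certify that both solutions are optimal. The zero value of y_2 corresponds to the green-paint constraint, which is not tight at (3, 2).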

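MDP Solved as an LP

Issue: Turn the non-linear max into a collection of linear constraints

\[ V(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \Big] \]

\[ \forall s,a:\; V(s) \ge R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \]

MINIMIZE:

\[ \sum_s V(s) \]

Optimal action has tight constraints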

What is Game Theory? I

• Very general mathematical framework to study situations where multiple agents interact, including:
  – Popular notions of games
  – Everything up to and including multistep, multiagent, simultaneous move, partial information games
  – Example Duke CS research: aiming sensors to catch hiding enemies, assigning guards to posts
  – Can even include negotiating, posturing and uncertainty about the players and game itself
• von Neumann and Morgenstern (1944) was a major launching point for modern game theory
• Nash: Existence of equilibria in general sum games (wikipedia)

What is game theory? II

• Study of settings where multiple agents each have
  – Different preferences (utility functions),
  – Different actions
• Each agent’s utility (potentially) depends on all agents’ actions
  – What is optimal for one agent depends on what other agents do
  – Can be circular
• Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act
• Useful for acting and (potentially) predicting behavior of others
• Not necessarily descriptive

Real World Game Theory Examples

• War
• Auctions
• Animal behavior
• Networking protocols
• Peer to peer networking behavior
• Road traffic
• Mechanism design:
  – Suppose we want people to do X?
  – How to engineer situation so they will act that way?

Covered Today

• 2 player, zero sum simultaneous move games
• Example: Rock, Paper, Scissors
• Linear programming solution

Linear Programs (max formulation)

\[
\begin{aligned}
\text{maximize: } & c^T x \\
\text{subject to: } & Ax \le b \\
& x \ge 0
\end{aligned}
\]

• Note: min formulation also possible
  – Min: c^T x
  – Subject to: Ax ≥ b
• Some use equality as the canonical representation (introducing slack variables)
• LP tricks
  – Multiply by -1 to reverse inequalities
  – Can easily introduce equality constraints, or arbitrary domain constraints

Rock, Paper, Scissors Zero Sum Formulation

• In zero sum games, one player’s loss is the other’s gain
• Payoff matrix (rows: our action; columns: opponent’s action; entries: our payoff):

        R    P    S
  R     0   −1    1
  P     1    0   −1
  S    −1    1    0

• Minimax solution maximizes worst case outcome

Rock, Paper, Scissors Equations

• R, P, S = probability that we play rock, paper, or scissors respectively (R + P + S = 1)
• U is our expected utility
• Bounding our utility:
  – Opponent rock case: U ≤ P – S
  – Opponent paper case: U ≤ S – R
  – Opponent scissors case: U ≤ R – P
• Want to maximize U subject to constraints
• Solution: (1/3, 1/3, 1/3)
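
The three bounds can be evaluated directly for any candidate strategy. A small sketch (the function name `guaranteed_utility` is made up here): the best U we can guarantee is the smallest of the three right-hand sides. Since those three expressions always sum to zero, their minimum can never exceed 0, which is why the uniform strategy is optimal.

```python
from fractions import Fraction


def guaranteed_utility(r, p, s):
    """Largest U satisfying all three bounds: the worst case over opponent moves."""
    return min(p - s,   # opponent plays rock
               s - r,   # opponent plays paper
               r - p)   # opponent plays scissors


third = Fraction(1, 3)
print(guaranteed_utility(third, third, third))  # 0: the uniform mix
print(guaranteed_utility(1, 0, 0))              # -1: always rock loses to paper
print(guaranteed_utility(Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)))  # -1/4
```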

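Rock, Paper, Scissors LP Formulation

• Our variables are: x = [U, R, P, S]^T
• We want:
  – Maximize U
  – U ≤ P – S
  – U ≤ S – R
  – U ≤ R – P
  – R + P + S = 1
• How do we make this fit:

\[
\begin{aligned}
\text{maximize: } & c^T x \\
\text{subject to: } & Ax \le b \\
& x \ge 0
\end{aligned}
\quad ?
\]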

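Rock Paper Scissors LP Formulation

\[ x = [U, R, P, S]^T \]

\[
A = \begin{pmatrix}
1 & 0 & -1 & 1 \\
1 & 1 & 0 & -1 \\
1 & -1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
0 & -1 & -1 & -1
\end{pmatrix}
\]

\[ b = [0, 0, 0, 1, -1]^T \]

\[ c = [1, 0, 0, 0]^T \]

\[
\begin{aligned}
\text{maximize: } & c^T x \\
\text{subject to: } & Ax \le b \\
& x \ge 0
\end{aligned}
\]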
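
To see how each row of A encodes one constraint, the matrix form can be checked against a candidate x. This is only a feasibility check written for illustration (`is_feasible` and `objective` are made-up names), not an LP solver. The last two rows together express the equality R + P + S = 1 as a pair of inequalities.

```python
from fractions import Fraction

A = [
    [1, 0, -1, 1],    # U - P + S <= 0, i.e. U <= P - S
    [1, 1, 0, -1],    # U + R - S <= 0, i.e. U <= S - R
    [1, -1, 1, 0],    # U - R + P <= 0, i.e. U <= R - P
    [0, 1, 1, 1],     # R + P + S <= 1
    [0, -1, -1, -1],  # -(R + P + S) <= -1, i.e. R + P + S >= 1
]
b = [0, 0, 0, 1, -1]
c = [1, 0, 0, 0]


def is_feasible(x):
    """Check Ax <= b and x >= 0 for x = [U, R, P, S]."""
    rows_ok = all(sum(a * v for a, v in zip(row, x)) <= rhs
                  for row, rhs in zip(A, b))
    return rows_ok and all(v >= 0 for v in x)


def objective(x):
    """c^T x, which picks out U."""
    return sum(ci * v for ci, v in zip(c, x))


third = Fraction(1, 3)
x_star = [0, third, third, third]
print(is_feasible(x_star), objective(x_star))                # True 0
print(is_feasible([Fraction(1, 10), third, third, third]))   # False: U too large
```

One caveat worth keeping in mind: x ≥ 0 also forces U ≥ 0. That is harmless here because the value of this game is 0, but a game with a negative value would need U to be unrestricted in sign.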

Rock, Paper, Scissors Solution

• If we feed this LP to an LP solver we get:
  – R = P = S = 1/3
  – U = 0
• Solution for the other player is:
  – The same…
  – By symmetry
• This is the minimax solution
• This is also an equilibrium
  – No player has an incentive to deviate
  – (Defined more precisely later)

Tangent: Why is RPS Fun?

• OK, it’s not…
• Why might RPS be fun?
  – Try to exploit non-randomness in your friends
  – Try to be random yourself

Minimax Solutions in General

• What do we know about minimax solutions?
  – Can a suboptimal opponent trick minimax?
  – When should we abandon minimax?
• Minimax solutions for 2-player zero-sum games can always be found by solving a linear program
• The minimax solutions will also be equilibria
• For general sum games:
  – Minimax does not apply
  – Equilibria may not be unique
  – Need to search for equilibria using more computationally intensive methods

Outline

• Digression: Linear Programming
• 2 player, zero sum simultaneous move games
• Example: Rock, Paper, Scissors
• Linear programming solution
• General sum games

“Chicken”

• Two players drive cars towards each other
• If one player goes straight, that player wins
• If both go straight, they both die

          D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

(D = dodge, S = straight; entries are row player’s payoff, column player’s payoff)

Not zero-sum

Source: wikipedia

Rock-paper-scissors – Seinfeld variant

MICKEY: All right, rock beats paper! (Mickey smacks Kramer's hand for losing)
KRAMER: I thought paper covered rock.
MICKEY: Nah, rock flies right through paper.
KRAMER: What beats rock?
MICKEY: (looks at hand) Nothing beats rock.

          R         P         S
  R     0, 0     1, -1     1, -1
  P    -1, 1     0, 0     -1, 1
  S    -1, 1     1, -1     0, 0

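Dominance

• Player i’s strategy s_i strictly dominates s_i’ if
  – for any s_{-i}, u_i(s_i, s_{-i}) > u_i(s_i’, s_{-i})
• s_i weakly dominates s_i’ if
  – for any s_{-i}, u_i(s_i, s_{-i}) ≥ u_i(s_i’, s_{-i}); and
  – for some s_{-i}, u_i(s_i, s_{-i}) > u_i(s_i’, s_{-i})
• -i = “the player(s) other than i”

          R         P         S
  R     0, 0     1, -1     1, -1
  P    -1, 1     0, 0     -1, 1
  S    -1, 1     1, -1     0, 0

• Strict dominance: R strictly dominates P
• Weak dominance: R weakly dominates S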
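
The two definitions translate almost word for word into code. The sketch below applies them to the row player's payoffs in the Seinfeld variant, assuming the rows and columns are ordered rock, paper, scissors; the dictionary layout and function names are choices made for this illustration.

```python
# Row player's payoffs in the Seinfeld variant: U[own action][opponent action].
U = {
    "R": {"R": 0, "P": 1, "S": 1},
    "P": {"R": -1, "P": 0, "S": -1},
    "S": {"R": -1, "P": 1, "S": 0},
}


def strictly_dominates(u, s, t):
    """True if s does strictly better than t against every opponent strategy."""
    return all(u[s][o] > u[t][o] for o in u[s])


def weakly_dominates(u, s, t):
    """True if s is never worse than t and strictly better at least once."""
    never_worse = all(u[s][o] >= u[t][o] for o in u[s])
    sometimes_better = any(u[s][o] > u[t][o] for o in u[s])
    return never_worse and sometimes_better


print(strictly_dominates(U, "R", "P"))  # True
print(strictly_dominates(U, "R", "S"))  # False: they tie when the opponent plays P
print(weakly_dominates(U, "R", "S"))    # True
```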

Prisoner’s Dilemma

• Pair of criminals has been caught
• District attorney has evidence to convict them of a minor crime (1 year in jail); knows that they committed a major crime together (3 years in jail) but cannot prove it
• Offers them a deal:
  – If both confess to the major crime, they each get a 1 year reduction
  – If only one confesses, that one gets 3 years reduction

                    confess     don’t confess
  confess           -2, -2        0, -3
  don’t confess     -3, 0        -1, -1

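“Should I buy an SUV?”

• Purchasing + gas cost: 5 for an SUV, 3 for the other car
• Accident cost: 5 each when both drive the same kind of vehicle; when an SUV meets the other car, 2 for the SUV and 8 for the other car

                SUV          other car
  SUV         -10, -10       -7, -11
  other car   -11, -7        -8, -8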

“2/3 of the average” game

• Everyone writes down a number between 0 and 100
• Person closest to 2/3 of the average wins
• Example:
  – A says 50
  – B says 10
  – C says 90
  – Average(50, 10, 90) = 50
  – 2/3 of average = 33.33
  – A is closest (|50 - 33.33| = 16.67), so A wins
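
The example round can be recomputed in a few lines. A minimal sketch, with `two_thirds_winner` as a made-up name; ties are broken in favor of whichever player comes first in the dictionary, which is an arbitrary choice not specified by the game.

```python
def two_thirds_winner(guesses):
    """Return (target, winner) for a dict mapping player name to guess."""
    target = (2 / 3) * sum(guesses.values()) / len(guesses)
    winner = min(guesses, key=lambda name: abs(guesses[name] - target))
    return target, winner


target, winner = two_thirds_winner({"A": 50, "B": 10, "C": 90})
print(round(target, 2), winner)  # 33.33 A
```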

Iterated dominance

• Iterated dominance: remove (strictly/weakly) dominated strategy, repeat
• Iterated strict dominance on Seinfeld’s RPS:

          R         P         S
  R     0, 0     1, -1     1, -1
  P    -1, 1     0, 0     -1, 1
  S    -1, 1     1, -1     0, 0

After removing P for both players:

          R         S
  R     0, 0     1, -1
  S    -1, 1     0, 0
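
The elimination procedure can be sketched as a loop. The slide shows the first round (paper is removed for both players); the loop below keeps going until nothing is strictly dominated, which for this game leaves only rock. The payoff layout and function names are assumptions made for the example, and only dominance by pure strategies is checked.

```python
# (row payoff, column payoff) for the Seinfeld variant, keyed by (row, column).
PAYOFFS = {
    ("R", "R"): (0, 0), ("R", "P"): (1, -1), ("R", "S"): (1, -1),
    ("P", "R"): (-1, 1), ("P", "P"): (0, 0), ("P", "S"): (-1, 1),
    ("S", "R"): (-1, 1), ("S", "P"): (1, -1), ("S", "S"): (0, 0),
}


def dominated(own, other, payoff):
    """Strategies in `own` strictly dominated by another pure strategy in `own`."""
    return {t for t in own for s in own
            if s != t and all(payoff(s, o) > payoff(t, o) for o in other)}


def iterated_strict_dominance(rows, cols):
    """Repeatedly delete strictly dominated rows and columns."""
    rows, cols = set(rows), set(cols)
    while True:
        bad_rows = dominated(rows, cols, lambda r, c: PAYOFFS[(r, c)][0])
        bad_cols = dominated(cols, rows, lambda c, r: PAYOFFS[(r, c)][1])
        if not bad_rows and not bad_cols:
            return rows, cols
        rows -= bad_rows
        cols -= bad_cols


print(iterated_strict_dominance("RPS", "RPS"))  # ({'R'}, {'R'})
```

Scissors is only weakly dominated in the full game, but once paper is gone rock strictly dominates it, which matches Mickey's claim that nothing beats rock.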

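“2/3 of the average” game revisited

[Figure: number line from 0 to 100, marked at (2/3)*100 and (2/3)*(2/3)*100]

• Numbers above (2/3)*100 are dominated
• Numbers between (2/3)*(2/3)*100 and (2/3)*100 are dominated after removal of (originally) dominated strategies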

Mixed strategies

• Mixed strategy for player i = probability distribution over player i’s (pure) strategies
• E.g. 1/3, 1/3, 1/3
• Example of dominance by a mixed strategy:

  1/2    3, 0    0, 0
  1/2    0, 0    3, 0
         1, 0    1, 0
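
The arithmetic behind this example is short enough to spell out. The sketch assumes the 1/2 weights sit on the first two rows and looks only at the row player's payoffs; the variable names are made up for the illustration.

```python
from fractions import Fraction

# Row player's payoffs from the example, one list per row.
top, middle, bottom = [3, 0], [0, 3], [1, 1]
half = Fraction(1, 2)

# Expected payoff of the 1/2-1/2 mixture against each column.
mixture = [half * t + half * m for t, m in zip(top, middle)]
mixture_dominates = all(mix > low for mix, low in zip(mixture, bottom))
top_dominates = all(t > low for t, low in zip(top, bottom))

print(mixture)            # 3/2 against either column
print(mixture_dominates)  # True: 3/2 > 1 in both columns
print(top_dominates)      # False: no single pure row beats the bottom row
```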

Best Responses

• Let A be a matrix of player 1’s payoffs
• Let σ_2 be a mixed strategy for player 2
• Aσ_2 = vector of expected payoffs for each strategy for player 1
• Highest entry indicates best response for player 1
• Any mixture of ties is also BR
• Generalizes to >2 players

          D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

(σ_2 weights the columns)
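
The product Aσ_2 can be computed by hand or in a few lines of code. The sketch below uses player 1's payoffs from the “chicken” matrix with rows and columns ordered D, S; the helper names are made up for this example.

```python
from fractions import Fraction

# Player 1's payoffs in "chicken"; rows and columns are ordered D, S.
A = [[0, -1],
     [1, -5]]
ACTIONS = ["D", "S"]


def expected_payoffs(A, sigma2):
    """Compute the vector A * sigma2 without external libraries."""
    return [sum(a * q for a, q in zip(row, sigma2)) for row in A]


def best_responses(A, sigma2):
    """All pure strategies whose expected payoff is the highest entry."""
    payoffs = expected_payoffs(A, sigma2)
    top = max(payoffs)
    return [ACTIONS[i] for i, v in enumerate(payoffs) if v == top]


print(best_responses(A, [1, 0]))  # ['S']: go straight if the opponent dodges
print(best_responses(A, [0, 1]))  # ['D']: dodge if the opponent goes straight
print(best_responses(A, [Fraction(4, 5), Fraction(1, 5)]))  # ['D', 'S']: a tie
```

In the tied case every mixture of D and S is also a best response, as the bullet about ties says.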

Nash equilibrium [Nash50]

• A vector of strategies (one for each player) = a strategy profile
• Strategy profile (σ_1, σ_2, …, σ_n) is a Nash equilibrium if each σ_i is a best response to σ_{-i}
  – That is, for any i, for any σ_i’, u_i(σ_i, σ_{-i}) ≥ u_i(σ_i’, σ_{-i})
• Does not say anything about multiple agents changing their strategies at the same time
• In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash50]
• (Note - singular: equilibrium, plural: equilibria)

Equilibrium Strategies vs. Best Responses

• equilibrium strategy -> best response?
• best response -> equilibrium strategy?
• Consider Rock-Paper-Scissors
  – Is (1/3, 1/3, 1/3) a best response to (1/3, 1/3, 1/3)?
  – Is (1, 0, 0) a best response to (1/3, 1/3, 1/3)?
  – Is (1, 0, 0) a strategy for any equilibrium?

          R         P         S
  R     0, 0     -1, 1     1, -1
  P     1, -1     0, 0    -1, 1
  S    -1, 1      1, -1    0, 0

Nash equilibria of “chicken”

          D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• (D, S) and (S, D) are Nash equilibria
  – They are pure-strategy Nash equilibria: nobody randomizes
  – They are also strict Nash equilibria: changing your strategy will make you strictly worse off
• No other pure-strategy Nash equilibria
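
For a game this small, the pure-strategy equilibria can be found by checking every cell for a profitable unilateral deviation. A minimal sketch, with the payoff dictionary and `is_pure_nash` introduced only for this illustration:

```python
from itertools import product

ACTIONS = ["D", "S"]
# (row payoff, column payoff) for each (row action, column action).
PAYOFFS = {
    ("D", "D"): (0, 0), ("D", "S"): (-1, 1),
    ("S", "D"): (1, -1), ("S", "S"): (-5, -5),
}


def is_pure_nash(row, col):
    """True if neither player gains by changing only their own action."""
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(r, col)][0] for r in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, c)][1] for c in ACTIONS)
    return row_ok and col_ok


equilibria = [profile for profile in product(ACTIONS, ACTIONS)
              if is_pure_nash(*profile)]
print(equilibria)  # [('D', 'S'), ('S', 'D')]
```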

Equilibrium Selection

          D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• (D, S) and (S, D) are Nash equilibria
• Which do you play?
• What if player 1 assumes (S, D), player 2 assumes (D, S)
• Play is (S, S) = (-5, -5)!!!
• This is the equilibrium selection problem

Rock-paper-scissors

          R         P         S
  R     0, 0     -1, 1     1, -1
  P     1, -1     0, 0    -1, 1
  S    -1, 1      1, -1    0, 0

• Any pure-strategy Nash equilibria?
• It has a mixed-strategy Nash equilibrium: both players put probability 1/3 on each action

Nash equilibria of “chicken”…

          D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• Is there a Nash equilibrium that uses mixed strategies -- say, where player 1 uses a mixed strategy?
• If a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses
• So we need to make player 1 indifferent between D and S
• Player 1’s utility for playing D = -p^c_S
• Player 1’s utility for playing S = p^c_D - 5p^c_S = 1 - 6p^c_S
• So we need -p^c_S = 1 - 6p^c_S, which means p^c_S = 1/5
• Then, player 2 needs to be indifferent as well
• Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S))
  – People may die! Expected utility -1/5 for each player
  – p^c_S = probability that column player plays S
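
The indifference condition can be verified with exact fractions. In this sketch, `row_utilities` is a made-up helper that returns player 1's expected utility for each pure action, given the probability that the column player goes straight.

```python
from fractions import Fraction


def row_utilities(p_s):
    """Player 1's expected utility for D and for S when the column player
    goes straight with probability p_s (and plays D otherwise)."""
    p_d = 1 - p_s
    u_d = 0 * p_d + (-1) * p_s   # D vs D gives 0, D vs S gives -1
    u_s = 1 * p_d + (-5) * p_s   # S vs D gives 1, S vs S gives -5
    return u_d, u_s


# Indifference: -p_s = 1 - 6 * p_s, so 5 * p_s = 1.
p_s = Fraction(1, 5)
print(row_utilities(p_s))  # both equal -1/5
print(p_s * p_s)           # 1/25: chance that both go straight
```

Any other value of p_s makes one of the two actions strictly better for player 1, so player 1 would stop randomizing.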

Computational Issues

• Zero-sum games: solved efficiently as LP
• General sum games may require exponential time (in # of actions) to find a single equilibrium (no known efficient algorithm and good reasons to suspect that none exists)
• Some better news: despite bad worst-case complexity, many games can be solved quickly

Game Theory Issues

• How descriptive is game theory?
  – Some evidence that people play equilibria
  – Also, some evidence that people act irrationally
  – If it is computationally intractable to solve for equilibria of large games, seems unlikely that people are doing this
• How reasonable is (basic) game theory?
  – Are payoffs known?
  – Are situations really simultaneous move with no information about how the other player will act?
  – Are situations really single-shot? (repeated games)
  – How is equilibrium selection handled in practice?

Extensions

• Partial information
• Uncertainty about the game parameters, e.g., payoffs (Bayesian games)
• Repeated games: simple learning algorithms can converge to equilibria in some repeated games
• Multistep games with distributions over next states (game theory + MDPs = stochastic games)
• Multistep + partial information (partially observable stochastic games)
• Game theory is so general that it can encompass essentially all aspects of strategic, multiagent behavior, e.g., negotiating, threats, bluffs, coalitions, bribes, etc.