Repeated Games and Applications - Perfect Monitoring

Wouter Vergote (FUSL and CORE)

December 2009

Wouter Vergote (FUSL and CORE) Repeated Games December 2009 1 / 43

WW1: war in the trenches



WW1: truce in the trenches

How to explain this? Robert Axelrod: "The Evolution of Cooperation"


Phases of the moon


Introduction: phases of the moon ?

Collusion through bid rigging

In one bid-rigging conspiracy, the firms General Electric and Westinghouse used the "phases of the moon" to take turns and determine which amongst them would submit the "low" bid to win the contracts.

Theoretical underpinning: McAfee and McMillan (AER 1992)

Is this efficient if costs are private information and no transfers can be made?


Introduction: other areas

Why do countries wish to adhere to the rules of the WTO, even if there are no truly enforceable penalties for deviating?

Why do people voluntarily contribute to public goods?

Why do we observe informal insurance mechanisms in poor countries?

...


Canonical Stage Game

Players $i = 1, \dots, n$.

Compact action sets $A_i \subset \mathbb{R}^k$ for some $k$, with $a_i \in A_i$.

$a = (a_1, \dots, a_n) \in A = \prod_i A_i$

$\alpha_i$ is a mixed action for $i$, inducing a probability distribution over $A_i$:

$\alpha_i(a_i) \geq 0$ and $\sum_{a_i \in A_i} \alpha_i(a_i) = 1$

Set of mixed actions of player $i$: $\Delta(A_i)$; $\Delta(A) = \prod_i \Delta(A_i)$ is the set of mixed action profiles.

Payoffs are given by a continuous function $u$:

$u : \prod_i A_i \to \mathbb{R}^n$

How to extend $u$ to mixed actions?
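One standard answer to the question above, for finite action sets, is to take expectations: $u(\alpha) = \sum_{a \in A} u(a) \prod_i \alpha_i(a_i)$. The sketch below illustrates this under stated assumptions: the prisoner's dilemma payoffs `U_PD` and the function name `expected_payoff` are hypothetical and do not come from the slides.

```python
from itertools import product

def expected_payoff(u, alphas):
    """Expected payoff vector u(alpha) = sum_a u(a) * prod_i alpha_i(a_i)."""
    n = len(alphas)
    total = [0.0] * n
    # Enumerate every pure action profile a = (a_1, ..., a_n).
    for profile in product(*(list(alpha) for alpha in alphas)):
        prob = 1.0
        for i, a_i in enumerate(profile):
            prob *= alphas[i][a_i]  # independent mixing across players
        for i in range(n):
            total[i] += prob * u[profile][i]
    return total

# Hypothetical prisoner's dilemma payoffs (an assumption for illustration).
U_PD = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
        ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

uniform = {"C": 0.5, "D": 0.5}
print(expected_payoff(U_PD, [uniform, uniform]))
```

Each mixed action is represented as a dictionary from actions to probabilities, so a pure action is just the degenerate case with probability one on a single key.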


Canonical Stage Game: Payoffs

Set of generated payoffs: $F = \{v \in \mathbb{R}^n : \exists a \in A \text{ s.t. } v = u(a)\}$

Set of feasible payoffs: $F' = \mathrm{co} F$

$v$ belongs to the Pareto frontier of $F'$ if $\nexists v' \in F'$ s.t. $v'_i > v_i$ for all $i$.

$v'$ weakly dominates $v$ if $v'_i \geq v_i$ for all $i$ and $v'_j > v_j$ for some $j$.

$v \in \mathrm{co} F$ is strongly efficient if it is PO and not weakly dominated.


Canonical Stage Game: Assumptions and minmax

1 $A_i$ is either finite or a continuum action space: a compact and convex subset of $\mathbb{R}^k$ for some $k$.

2 If $A_i$ is a continuum action space, then $u$ is continuous and $u_i$ is quasiconcave in $a_i$.

Player $i$'s pure action minmax payoff is given by:

$$v_i^p \equiv \min_{a_{-i} \in A_{-i}} \max_{a_i \in A_i} u_i(a_i, a_{-i})$$

Is this payoff well defined?

A (pure action) minmax profile for player $i$ is $\hat{a}^i = (\hat{a}^i_i, \hat{a}^i_{-i})$.

How is it defined? Is it unique?
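For a finite game the pure action minmax can be found by brute force: for each opponent profile, compute player $i$'s best-response payoff, then take the smallest such value. The following is a minimal sketch; the two example games and the name `pure_minmax` are assumptions made for illustration.

```python
from itertools import product

def pure_minmax(u, actions, i):
    """Return (v_i^p, one minmax profile) for player i in a finite game."""
    others = [actions[j] for j in range(len(actions)) if j != i]
    best_value, best_profile = None, None
    for a_minus_i in product(*others):
        def full(a_i):
            # Re-insert player i's action at position i.
            a = list(a_minus_i)
            a.insert(i, a_i)
            return tuple(a)
        # Inner max: player i best-responds to a_minus_i.
        br = max(actions[i], key=lambda a_i: u[full(a_i)][i])
        value = u[full(br)][i]
        # Outer min: opponents pick the profile that hurts i the most.
        if best_value is None or value < best_value:
            best_value, best_profile = value, full(br)
    return best_value, best_profile

# Hypothetical example games (not taken from the slides).
U_PD = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
        ("D", "C"): (3, -1), ("D", "D"): (0, 0)}
U_MP = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
        ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}

print(pure_minmax(U_PD, [("C", "D"), ("C", "D")], 0))
print(pure_minmax(U_MP, [("H", "T"), ("H", "T")], 0)[0])
```

In the matching pennies example every opponent action yields the same best-response payoff, so the minmax profile is not unique there; the function simply returns the first one it finds.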


Canonical Stage Game: Individual Rationality

$v$ is weakly/strictly (pure action) individually rational if

$v_i \geq v_i^p$ / $v_i > v_i^p$ for all $i$.

Set of feasible SIR payoffs:

$F'^{p} = \{v \in \mathrm{co} F : v_i > v_i^p, \ i = 1, \dots, n\}$

Mixed action minmax payoff for player $i$:

$$\underline{v}_i \equiv \min_{\alpha_{-i} \in \prod_{j \neq i} \Delta(A_j)} \max_{a_i \in A_i} u_i(a_i, \alpha_{-i})$$

Set of feasible SIR payoffs (relative to mixtures):

$F^* = \{v \in \mathrm{co} F : v_i > \underline{v}_i, \ i = 1, \dots, n\}$
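The mixed action minmax can be strictly lower than the pure action minmax. The sketch below approximates $\underline{v}_i$ in a two-player game where the opponent has two actions, by searching a grid over the opponent's mixing probability. The matching pennies payoffs and the function name are assumptions for illustration; a grid search is only an approximation, not an exact method.

```python
def mixed_minmax_two_actions(u, actions, i, grid=1000):
    """Approximate player i's mixed minmax when the opponent has two actions."""
    j = 1 - i
    b0, b1 = actions[j]

    def payoff(a_i, a_j):
        profile = (a_i, a_j) if i == 0 else (a_j, a_i)
        return u[profile][i]

    best = None
    for k in range(grid + 1):
        q = k / grid  # probability that the opponent plays b0
        # Player i's best-response payoff against the mixture (q, 1 - q).
        br_value = max(q * payoff(a_i, b0) + (1 - q) * payoff(a_i, b1)
                       for a_i in actions[i])
        if best is None or br_value < best:
            best = br_value
    return best

# Hypothetical matching pennies payoffs (not taken from the slides).
U_MP = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
        ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}

print(mixed_minmax_two_actions(U_MP, [("H", "T"), ("H", "T")], 0))
```

In this example the pure action minmax of player 1 is 1, while mixing by the opponent pushes the minmax down to 0, so the set $F^*$ can be larger than $F'^{p}$.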


Canonical Stage Game: Public Correlation / Public Randomization

Why?

What? a probability space

Definition: A public correlation device is a probability space $([0, 1], \mathcal{B}, \lambda)$, where $\mathcal{B}$ is the Borel sigma-algebra and $\lambda$ is the Lebesgue measure.

First a realization $\omega \in [0, 1]$ is drawn, observed by all players before they choose actions: $a_i : [0, 1] \to \Delta(A_i)$

Expected payoff: take expectations over $\omega \in [0, 1]$.

$(a_1, \dots, a_n)$ induces a joint distribution over $\prod_i \Delta(A_i)$.

Evaluation of deviation: ex post (after $\omega$ is drawn).


The Repeated Game

Time $t \in \{0, 1, \dots\}$, finite or infinite.

After each period all actions are observed and there is perfect recall.

Set of period $t$ histories: $H^t \equiv A^t$ with $A^0 \equiv \{\emptyset\}$ and $A^t = \prod_{s=0}^{t-1} A$

Set of all possible histories: $H \equiv \bigcup_{t=0}^{\infty} H^t$

A pure strategy for player $i$: $\sigma_i : H \to A_i$.

Mixed strategy?

Behavior strategy for player $i$: $\sigma_i : H \to \Delta(A_i)$


The Repeated Game: continuation game

For any $h^t \in H$ the continuation game is the repeated game that begins in period $t$.

Any strategy $\sigma$ induces a continuation strategy $\sigma|_{h^t}$ where

$\sigma_i|_{h^t}(h^\tau) = \sigma_i(h^t h^\tau)$ for all $h^\tau \in A^\tau$

Note that $\sigma_i|_{h^t} : H \to A_i$ $\Rightarrow$ the subgame is strategically equivalent to the original repeated game.

Hence repeated games with perfect monitoring have a RECURSIVE structure.
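The recursive structure can be made concrete by treating a pure strategy as a function of the history and a continuation strategy as the same function with a fixed prefix. The sketch below assumes a two-player game with actions "C" and "D" and uses grim trigger as the example strategy; these choices and the function names are illustrative assumptions.

```python
def grim_trigger(history):
    """Pure strategy: cooperate until any defection has been observed."""
    return "C" if all(a == ("C", "C") for a in history) else "D"

def continuation(sigma_i, h_t):
    """Return sigma_i|h_t, where sigma_i|h_t(h_tau) = sigma_i(h_t h_tau)."""
    prefix = tuple(h_t)
    return lambda h_tau: sigma_i(prefix + tuple(h_tau))

# After a cooperative first period the continuation is grim trigger again.
after_cc = continuation(grim_trigger, [("C", "C")])
# After a defection the continuation prescribes "D" forever.
after_cd = continuation(grim_trigger, [("C", "D")])

print(after_cc(()), after_cd(()), after_cd((("C", "C"),)))
```

The continuation strategy has the same type as the original strategy (a map from histories to actions), which is the sense in which the subgame is strategically equivalent to the original game.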


Outcome Path

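An outcome path is an infinite sequence of action profiles $a \equiv (a^0, a^1, \dots) \in A^\infty$.

An outcome path is different from a history.

Denote by $\mathbf{a}^t \equiv (a^0, a^1, \dots, a^t) \in A^{t+1}$ the history corresponding to $a$.

A pure strategy $\sigma$ induces an outcome path $(a^0(\sigma), a^1(\sigma), \dots)$ as follows:

$a^0(\sigma) \equiv (\sigma_1(\emptyset), \dots, \sigma_n(\emptyset))$

$a^1(\sigma) \equiv (\sigma_1(a^0(\sigma)), \dots, \sigma_n(a^0(\sigma)))$

$a^2(\sigma) \equiv (\sigma_1(a^0(\sigma), a^1(\sigma)), \dots, \sigma_n(a^0(\sigma), a^1(\sigma)))$

...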
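The recursion above translates directly into a loop that feeds the history played so far back into each player's strategy. This is a minimal sketch; the two example strategies (tit-for-tat and always defect) are hypothetical and only serve to produce a concrete path.

```python
def outcome_path(sigma, periods):
    """First `periods` action profiles induced by the pure profile sigma."""
    history = ()
    for _ in range(periods):
        # a^t(sigma) = (sigma_1(history), ..., sigma_n(history))
        a_t = tuple(sigma_i(history) for sigma_i in sigma)
        history += (a_t,)
    return history

def tit_for_tat(history):
    """Player at index 0: start with C, then copy the opponent's last action."""
    return "C" if not history else history[-1][1]

def always_defect(history):
    return "D"

print(outcome_path((tit_for_tat, always_defect), 3))
```

Only finitely many periods can be generated, so the function returns a finite prefix of the infinite outcome path.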


Outcome Path

A behavior strategy induces a path of play:

$\sigma(\emptyset) = \alpha^0 \in \prod_i \Delta(A_i)$; for each history $a^0$ in the support of $\alpha^0$,

$\sigma(a^0) = \alpha^1 \in \prod_i \Delta(A_i)$, ...

The path of play at time $t$ specifies a probability distribution over the histories $\mathbf{a}^t$.

The underlying behavior strategy specifies for period $t$ the mixed actions for each such history $\mathbf{a}^t$, in turn inducing a probability distribution $\alpha^{t+1}(\mathbf{a}^t)$ over period $t+1$ action profiles $a^{t+1}$, and over period $t+1$ histories $\mathbf{a}^{t+1}$.


Payoffs

A pure strategy $\sigma$ induces an infinite sequence of stage-game payoffs $(u_i(a^0(\sigma)), u_i(a^1(\sigma)), u_i(a^2(\sigma)), \dots) \in \mathbb{R}^\infty$.

Payoffs are discounted using a common discount factor $\delta \in [0, 1)$.

If $\{u_i^t\}$ is a sequence of stage-game payoffs, the average discounted utility is given by

$$(1 - \delta) \sum_{t=0}^{\infty} \delta^t u_i^t$$

If $A_i$ is finite, the average discounted utility received from an action path is continuous with respect to the product topology.

Given a pure strategy profile $\sigma$ we obtain

$$U_i(\sigma) = (1 - \delta) \sum_{t=0}^{\infty} \delta^t u_i(a^t(\sigma)).$$

Normalization ensures that $U(\sigma) \in F'$.
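The effect of the $(1 - \delta)$ normalization is easiest to see numerically: a constant payoff stream of $x$ has average discounted value $x$. The sketch below truncates the infinite sum at a finite horizon, which is an approximation; the function name and the example streams are assumptions for illustration.

```python
def average_discounted_payoff(stream, delta, horizon=10_000):
    """(1 - delta) * sum_t delta^t * stream(t), truncated at `horizon`."""
    total, weight = 0.0, 1.0
    for t in range(horizon):
        total += weight * stream(t)
        weight *= delta  # weight equals delta^t at the start of period t
    return (1 - delta) * total

# A constant stream of 2 is worth (approximately) 2.
constant = average_discounted_payoff(lambda t: 2.0, delta=0.9)

# A one-time gain of 3 followed by 0 forever is worth (1 - delta) * 3.
one_shot = average_discounted_payoff(lambda t: 3.0 if t == 0 else 0.0, delta=0.9)

print(constant, one_shot)
```

Because average discounted payoffs are weighted averages of stage-game payoffs, they live in the same units as stage payoffs and stay inside the convex hull of $F$.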


Nash Equilibrium and sequential rationality

Definition: $\sigma$ is a NE if $\forall i$, $\forall \sigma'_i$:

$U_i(\sigma) \geq U_i(\sigma'_i, \sigma_{-i})$

Lemma: If $\sigma$ is a NE then $\forall i$

$U_i(\sigma) \geq v_i^p$ ($\underline{v}_i$) if $\sigma$ is in pure (mixed) strategies.

NE does not impose rational behavior off the equilibrium path. We wish to refine NE by imposing sequential rationality: equilibrium behavior in every subgame.


Subgame perfect equilibrium

Definition: $\sigma$ is a SPE if $\forall h^t \in A^t$, $\sigma|_{h^t}$ is a NE of the repeated game.

Existence?

Demanding concept: we need to check a countably infinite number of histories, and for every strategy there are infinitely many deviations.

We need to simplify:

1 One shot deviation principle: limits the number of alternative strategies to check

2 Automaton representation of strategies: allows us to organize subgames in equivalence classes


One shot deviation (OSD) principle

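Definition

An OSD for player $i$ from $\sigma_i$ is a strategy $\hat{\sigma}_i \neq \sigma_i$ such that $\exists!\, \tilde{h}^t \in A^t$ such that $\forall h^t \neq \tilde{h}^t$: $\sigma_i(h^t) = \hat{\sigma}_i(h^t)$.

Example: one shot deviation from grim trigger

Definition

Fix $\sigma_{-i}$. An OSD $\hat{\sigma}_i$ from $\sigma_i$ is PROFITABLE if, at $\tilde{h}^t$ s.t. $\sigma_i(\tilde{h}^t) \neq \hat{\sigma}_i(\tilde{h}^t)$:

$U_i(\hat{\sigma}_i|_{\tilde{h}^t}, \sigma_{-i}|_{\tilde{h}^t}) > U_i(\sigma|_{\tilde{h}^t})$

A NE can have profitable one shot deviations (example: see in class).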


Subgame perfect equilibria and the one shot deviation principle

Lemma: $\sigma$ is a SPE iff $\nexists$ profitable OSDs.

Proof: in pure strategies.

What about NE? Does the absence of profitable OSDs on the equilibrium path imply a NE? No (see example).


Automaton Representations of Strategy Profiles

Definition

An automaton is a collection $(W, w^0, f, \tau)$ where $W$ is a set of states, $w^0$ is the initial state, $f : W \to \prod_i \Delta(A_i)$ is a decision function and $\tau : W \times A \to W$ is a transition function.

The probability assigned by $f$ to action $a$ in state $w$ is $f^w(a)$, such that $\sum_{a \in A} f^w(a) = 1$.

Any automaton $(W, w^0, f, \tau)$ such that $f$ specifies a pure action at every state induces an outcome $(a^0, a^1, \dots)$ as follows:

$a^0 = \sigma(\emptyset) = f(w^0)$

$a^1 = \sigma(a^0) = f(\tau(w^0, a^0))$

$a^2 = \sigma(a^0, a^1) = f(\tau(\tau(w^0, a^0), a^1))$ ...

We extend the domain to find the strategy induced by an automaton.

Automaton representation of strategy profiles

Extend the domain of $\tau$ from $W \times A$ to $W \times H \setminus \{\emptyset\}$ by defining

$\tau(w, h^t) = \tau(\tau(w, h^{t-1}), a^{t-1})$

Now define the strategy $\sigma$ as $\sigma(\emptyset) = f(w^0)$ and $\sigma(h^t) = f(\tau(w^0, h^t))$.

The other direction: can a strategy profile be represented by an automaton?

$W = H$

$w^0 = \{\emptyset\}$

$f(h^t) = \sigma(h^t)$

$\tau(h^t, a) = h^{t+1}$ where $h^{t+1} \equiv (h^t, a)$
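As a concrete illustration of the construction above, the sketch below encodes grim trigger as a two-state automaton, extends the transition function to histories, and recovers the induced strategy profile $\sigma(h^t) = f(\tau(w^0, h^t))$. The state names `wC` and `wD` and the helper names are assumptions made for this example.

```python
# Grim trigger as an automaton (W, w0, f, tau) with pure decision function f.
W0 = "wC"
F = {"wC": ("C", "C"), "wD": ("D", "D")}

def tau(w, a):
    """Stay in the cooperative state only after mutual cooperation."""
    return "wC" if w == "wC" and a == ("C", "C") else "wD"

def tau_ext(tau, w, history):
    """Extended transition: tau(w, h^t) = tau(tau(w, h^{t-1}), a^{t-1})."""
    for a in history:
        w = tau(w, a)
    return w

def induced_strategy(f, tau, w0):
    """sigma(h^t) = f(tau(w0, h^t)), with sigma(empty history) = f(w0)."""
    return lambda history: f[tau_ext(tau, w0, history)]

sigma = induced_strategy(F, tau, W0)
print(sigma(()), sigma((("C", "C"), ("D", "C"))))
```

Two states suffice here because all histories containing a defection share the same continuation profile, which is the partition of $H$ that the automaton view exploits.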


OK, but why use automata?

We can partition $H$ into sets with identical continuation strategies.

Clear separation of "today" and "tomorrow" (continuation profiles).

Definition: A state $w' \in W$ is accessible from another state $w \in W$ if there exists $h^t$ such that $w' = \tau(w, h^t)$.

Lemma (prop 2.3.1.): The strategy profile with representing automaton $(W, w^0, f, \tau)$ is a SPE iff for all $w \in W$ accessible from $w^0$, the strategy profile induced by $(W, w, f, \tau)$ is a NE of the repeated game.

When a strategy profile $\sigma$ is described by $(W, w^0, f, \tau)$, the continuation strategy profile after history $h^t$, $\sigma|_{h^t}$, is described by $(W, \tau(w^0, h^t), f, \tau)$. If every $w \in W$ is accessible from $w^0$, then the collection of all continuation profiles is $\{(W, w, f, \tau), w \in W\}$.

Credible Continuation Payoffs

One shot deviation principle $\Rightarrow$ check that each state induces a NE in a one-shot instead of a repeated game.

Fix $(W, w^0, f, \tau)$ such that all $w \in W$ are accessible from $w^0$.

Let $V_i(w)$ be player $i$'s discounted payoff from play that begins in state $w$.

$$V_i(w) = (1 - \delta) u_i(f(w)) + \delta V_i(\tau(w, f(w)))$$

Mixed strategies payoffs if $A$ is finite:

$$V_i(w) = (1 - \delta) \sum_a u_i(a) f^w(a) + \delta \sum_a V_i(\tau(w, a)) f^w(a) \quad \text{for all } w \in W$$

System of linear equations which has a unique solution (in the space of bounded functions on $W$).
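Because the right-hand side of the value equation is a contraction with modulus $\delta$, the unique solution can be approximated by iterating the equation from any starting guess. The sketch below does this for a pure-action automaton with finitely many states; the grim trigger automaton and the prisoner's dilemma payoffs are hypothetical, and fixed-point iteration is one possible method rather than the one used in the lecture.

```python
def automaton_values(states, f, tau, u, delta, n_players, iters=2000):
    """Iterate V_i(w) = (1 - delta) u_i(f(w)) + delta V_i(tau(w, f(w)))."""
    V = {w: [0.0] * n_players for w in states}
    for _ in range(iters):
        V = {
            w: [(1 - delta) * u[f[w]][i] + delta * V[tau(w, f[w])][i]
                for i in range(n_players)]
            for w in states
        }
    return V

# Hypothetical prisoner's dilemma and grim trigger automaton.
U_PD = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
        ("D", "C"): (3, -1), ("D", "D"): (0, 0)}
F = {"wC": ("C", "C"), "wD": ("D", "D")}

def tau(w, a):
    return "wC" if w == "wC" and a == ("C", "C") else "wD"

V = automaton_values(F.keys(), F, tau, U_PD, delta=0.9, n_players=2)
print(V)
```

For automata with many states, solving the linear system directly would be an alternative to iteration; the fixed-point loop is used here only because it needs no extra libraries.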


Continuation promises

If currently in $w$ the automaton specifies pure action $f(w)$ and $i$ expects all others to follow this profile, then player $i$ expects to receive $u_i(a_i, f_{-i}(w))$ from playing $a_i$.

Then $(a_i, f_{-i}(w))$ implies a transition to state $w' = \tau(w, (a_i, f_{-i}(w)))$.

If all players follow the strategy in subsequent periods (OSD), then player $i$ expects a continuation payoff (promise) of $V_i(w')$.

For an equilibrium to be subgame perfect, continuation "promises" need to be "credible".


Credibility of continuation promises

Given $V_i(w)$ at each state $w$, player $i$ is willing to choose action $a_i$ in the support of $f_i(w)$ if for all $a'_i \in A_i$:

$$(1 - \delta) \sum_{a_{-i}} u_i(a_i, a_{-i}) f^w(a_{-i}) + \delta \sum_{a_{-i}} V_i(\tau(w, (a_i, a_{-i}))) f^w(a_{-i}) \geq (1 - \delta) \sum_{a_{-i}} u_i(a'_i, a_{-i}) f^w(a_{-i}) + \delta \sum_{a_{-i}} V_i(\tau(w, (a'_i, a_{-i}))) f^w(a_{-i})$$

Equivalently, for all $a'_i \in A_i$:

$$V_i(w) \geq (1 - \delta) \sum_{a_{-i}} u_i(a'_i, a_{-i}) f^w(a_{-i}) + \delta \sum_{a_{-i}} V_i(\tau(w, (a'_i, a_{-i}))) f^w(a_{-i})$$

Continuation promises are credible if the above inequality holds for all players and all states.


Credibility of continuation promises

Let $V(w) = (V_1(w), \dots, V_n(w))$.

Lemma

Let $\sigma$ be described by $(W, w^0, f, \tau)$. Then $\sigma$ is a SPE iff for all $w \in W$ accessible from $w^0$, $f(w)$ is a NE of the normal form game described by the payoff function $g^w : A \to \mathbb{R}^n$, where

$$g^w(a) = (1 - \delta) u(a) + \delta V(\tau(w, a))$$

Lemma: In the game with public correlation: same story.
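The lemma reduces the SPE test to a finite check: at every accessible state $w$, no player may gain in the one-shot game $g^w$ by a unilateral deviation from $f(w)$. The sketch below applies this to grim trigger in a hypothetical prisoner's dilemma (payoffs and names are assumptions); both states are accessible from the initial state, so both are checked.

```python
U_PD = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
        ("D", "C"): (3, -1), ("D", "D"): (0, 0)}
ACTIONS = [("C", "D"), ("C", "D")]
F = {"wC": ("C", "C"), "wD": ("D", "D")}  # pure decision function f

def tau(w, a):
    return "wC" if w == "wC" and a == ("C", "C") else "wD"

def values(delta, iters=2000):
    """Fixed-point iteration on V(w) = (1 - delta) u(f(w)) + delta V(tau(w, f(w)))."""
    V = {w: (0.0, 0.0) for w in F}
    for _ in range(iters):
        V = {w: tuple((1 - delta) * U_PD[F[w]][i] + delta * V[tau(w, F[w])][i]
                      for i in range(2)) for w in F}
    return V

def is_spe(delta, tol=1e-9):
    """Check that f(w) is a Nash equilibrium of g^w at every state w."""
    V = values(delta)
    for w, prescribed in F.items():
        for i in range(2):
            def g(a):
                # Player i's payoff in the one-shot game g^w.
                return (1 - delta) * U_PD[a][i] + delta * V[tau(w, a)][i]
            for deviation in ACTIONS[i]:
                a = list(prescribed)
                a[i] = deviation
                if g(tuple(a)) > g(prescribed) + tol:
                    return False
    return True

print(is_spe(0.6), is_spe(0.2))
```

With these payoffs the binding constraint is in the cooperative state, $2 \geq (1 - \delta) \cdot 3$, so grim trigger passes the check only when $\delta \geq 1/3$.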


Constructing Equilibria I - Self-Generation

Let $E(\delta) \subset \mathbb{R}^n$ denote the set of pure strategy SPE payoffs.

Abreu, Pearce and Stacchetti (Econometrica, 1990)

Self-generating sets of equilibrium payoffs

Difference:

perfect observability (see later for extension to moral hazard)

pure strategies



Definition: A pure action profile $a^*$ is enforceable on $W$ if there exists a function $\gamma : A \to W$ such that, for all $i$ and all $a_i \in A_i$:

$$(1 - \delta) u_i(a^*) + \delta \gamma_i(a^*) \geq (1 - \delta) u_i(a_i, a^*_{-i}) + \delta \gamma_i(a_i, a^*_{-i}).$$

Definition: A payoff $v \in F'$ is pure action decomposable on $W$ if there exists a pure action profile $a^*$ enforceable on $W$, through function $\gamma$, such that

$$v_i = (1 - \delta) u_i(a^*) + \delta \gamma_i(a^*).$$

$v$ is pure action decomposable on $W$ $\equiv$ $v$ is "one period" credible. We look for infinite period credibility.



Lemma: Any set of payoffs $W \subset F'$ such that every payoff in $W$ is pure action decomposable on $W$ is a set of pure-strategy SPE payoffs.

Definition: A set $W$ is pure-action self-generating if every payoff in $W$ is pure-action decomposable on $W$.

Corollary: The set $E^p$ of pure strategy SPE payoffs is the largest pure action self-generating set.



Definition: For any set $W \subset F'$, let $B_a(W)$ be the set of payoffs decomposed by $a \in A$ and continuations in $W$.

Then $E^p$ is the largest set $W$ satisfying

$$W = \bigcup_{a \in A} B_a(W)$$

Lemma: The set $E^p$ of pure strategy SPE payoffs is compact.

Importance of compactness?
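For a finite candidate set $W$, self-generation can be verified by brute force: for each $v \in W$, search over action profiles $a^*$ and continuation functions $\gamma : A \to W$ for a pair that both enforces $a^*$ and delivers $v$. The sketch below does this in a hypothetical prisoner's dilemma; the payoffs, the candidate sets and the function names are assumptions for illustration.

```python
from itertools import product

U_PD = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
        ("D", "C"): (3, -1), ("D", "D"): (0, 0)}
ACTIONS = [("C", "D"), ("C", "D")]
PROFILES = list(product(*ACTIONS))

def deviate(a, i, a_i):
    b = list(a)
    b[i] = a_i
    return tuple(b)

def decomposable(v, W, delta, tol=1e-9):
    """True if v is pure-action decomposable on the finite payoff set W."""
    W = list(W)
    for a_star in PROFILES:
        # gamma assigns a continuation payoff in W to every action profile.
        for choice in product(W, repeat=len(PROFILES)):
            gamma = dict(zip(PROFILES, choice))

            def total(a, i):
                return (1 - delta) * U_PD[a][i] + delta * gamma[a][i]

            generates = all(abs(total(a_star, i) - v[i]) <= tol for i in range(2))
            enforceable = all(
                total(a_star, i) >= total(deviate(a_star, i, a_i), i) - tol
                for i in range(2) for a_i in ACTIONS[i]
            )
            if generates and enforceable:
                return True
    return False

def self_generating(W, delta):
    return all(decomposable(v, W, delta) for v in W)

print(self_generating([(2, 2), (0, 0)], 0.6), self_generating([(2, 2)], 0.6))
```

In this example the cooperative payoff alone is not self-generating, because there is no lower continuation payoff in the set with which to punish a deviation; adding the static Nash payoff $(0, 0)$ fixes that when $\delta$ is large enough.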


Constructing Equilibria II - Simple Strategies and Penal Codes

There is a collection of pure strategy SPE profiles $\{\sigma^1, \dots, \sigma^n\}$ such that $\sigma^i$ yields the lowest possible pure strategy SPE payoff for player $i$. Why?

Definition: Given $n + 1$ outcomes $\{a(0), a(1), \dots, a(n)\}$ the associated simple strategy profile $\sigma\{a(0), a(1), \dots, a(n)\}$ is given by the automaton:

$W = \{0, 1, \dots, n\} \times \{0, 1, 2, \dots\}$, $w^0 = (0, 0)$

$f(j, t) = a^t(j)$

$$\tau((j, t), a) = \begin{cases} (i, 0) & \text{if } a_i \neq a_i^t(j) \text{ and } a_{-i} = a_{-i}^t(j) \\ (j, t + 1) & \text{otherwise.} \end{cases}$$
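The automaton in the definition can be written down almost verbatim. In the sketch below each outcome path $a(j)$ is represented as a function from the period $t$ to an action profile; the example paths (permanent cooperation, and a two-period punishment followed by a return to cooperation) are hypothetical. Players are numbered from 1 in the slides, so player $i$ corresponds to tuple index $i - 1$.

```python
def simple_strategy(paths):
    """Build (f, tau, w0) for the simple strategy profile sigma{a(0), ..., a(n)}.

    paths[j](t) returns the action profile a^t(j); states are pairs (j, t).
    """
    def f(state):
        j, t = state
        return paths[j](t)

    def tau(state, a):
        j, t = state
        prescribed = paths[j](t)
        deviators = [k for k in range(len(a)) if a[k] != prescribed[k]]
        if len(deviators) == 1:
            # Unilateral deviation by player k + 1: start that player's punishment.
            return (deviators[0] + 1, 0)
        # No deviation, or simultaneous deviations: continue along path j.
        return (j, t + 1)

    return f, tau, (0, 0)

def cooperate(t):
    return ("C", "C")

def punish(t):
    # Hypothetical punishment path: two periods of (D, D), then cooperation.
    return ("D", "D") if t < 2 else ("C", "C")

f, tau, w0 = simple_strategy([cooperate, punish, punish])
state = tau(tau(w0, ("C", "C")), ("D", "C"))  # player 1 deviates in period 1
print(state, f(state))
```

The transition depends only on who deviated, not on when or how, which is the history independence that makes simple strategies convenient.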



A simple strategy specifies an equilibrium path, $a(0)$, and a penal code, $\{a(1), \dots, a(n)\}$, describing responses to deviations.

The punishment for a deviation is independent of when it occurs and of its nature.

These profiles are used to prove the folk theorem. We can use the OSD principle to obtain necessary and sufficient conditions for a simple strategy to be a SPE.

Let the payoff to player $i$ from outcome path $(a^t, a^{t+1}, \dots)$ be

$$U_i^t(a) = (1 - \delta) \sum_{s=t}^{\infty} \delta^{s-t} u_i(a^s)$$



Lemma: The simple strategy profile $\sigma\{a(0), a(1), \dots, a(n)\}$ is a SPE iff

$$U_i^t(a(j)) \geq \max_{a_i \in A_i} (1 - \delta) u_i(a_i, a_{-i}^t(j)) + \delta U_i^0(a(i))$$

for all $i = 1, \dots, n$, $j = 0, 1, \dots, n$, and $t = 0, 1, \dots$

"Nash Reversion" and "Trigger Strategies"

$v_i^* = \min\{U_i(\sigma) : \sigma \in E^p\}$ = smallest pure strategy SPE payoff of player $i$

Definition: Let $\{a(i), i = 1, \dots, n\}$ be $n$ outcome paths satisfying $U_i^0(a(i)) = v_i^*$, $i = 1, \dots, n$. The collection of $n$ simple strategy profiles $\sigma(i) = \sigma\{a(i), a(1), \dots, a(n)\}$ is an optimal penal code if $\sigma(i) \in E^p$, $i = 1, \dots, n$.

Existence of optimal penal codes? Are the associated simple strategy profiles equilibria?



Abreu 1988

Theorem

1. Let $\{a(i)\}_{i=1}^n$ be $n$ outcome paths of pure strategy SPE $\{\hat{\sigma}(i)\}_{i=1}^n$ satisfying $U_i(\hat{\sigma}(i)) = v_i^*$, $i = 1, \dots, n$. The simple strategy profile $\sigma(i) = \sigma\{a(i), a(1), \dots, a(n)\}$ is a pure strategy SPE and hence $\{\sigma(i)\}_{i=1}^n$ is an optimal penal code.

2. The pure outcome path $a(0)$ can be supported as an outcome of a pure strategy SPE iff there exist pure outcome paths $\{a(1), \dots, a(n)\}$ such that the simple strategy profile $\sigma\{a(0), a(1), \dots, a(n)\}$ is a SPE.

Corollary

Suppose $a(0)$ is the outcome path of a pure strategy SPE. Then the simple strategy profile $\sigma\{a(0), a(1), \dots, a(n)\}$, where $a(i)$ yields the lowest possible SPE payoff $v_i^*$ to player $i$, is a SPE.


The Folk Theorem with perfect monitoring

Example

Interpretation

two players $\Rightarrow$ mutual minmax

more than two players $\Rightarrow$ counterexample

We need stronger conditions and a different approach (player specific punishments).


The General n-player minmax Folk Theorem: build-up

Definition: $v \in F'$ allows player specific asymmetric punishment if $\exists$ payoff profiles $\{v^i\}_{i=1}^n$ s.t. for all $i$, $v^i \in F'$ and $v_i > v_i^i$, and for all $j \neq i$: $v_i^j > v_i^i$. The collection $\{v^i\}_{i=1}^n$ constitutes a player specific punishment for $v$.

Lemma (prop 3.4.1 statement 1): Suppose $v \in F^p$ allows pure-action player specific asymmetric punishment in $F^p$. Then $\exists \underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$, there exists a SPE with payoffs $v$.

Lemma (prop 3.4.1 statement 2): Suppose $v \in F'^{p}$ allows player specific asymmetric punishment in $F'^{p}$. Then $\exists \underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$, there exists a SPE with payoffs $v$.


Implications

Do the above lemmas imply the Folk Theorem?

We need conditions such that all feasible and strictly individually rational payoffs allow pure strategy asymmetric punishments.

Idea: if $v$ is in the interior of $F^p$ we can construct small balls "below" it: vary one player's payoff without moving the others' payoffs.

If the action space is finite then $F^p$ is (in general) not convex.

$F'^{p}$ is convex, hence full dimensionality of $F'^{p}$ provides sufficient conditions for the folk theorem.


The General n-player minmax Folk Theorem

Theorem (Folk Theorem): Suppose $F^p$ is convex and has a non-empty interior. Then for all $v \in \{\tilde{v} \in F^p : \exists v' \in F^p, \ v'_i < \tilde{v}_i \ \forall i\}$ there exists $\underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$, there exists a SPE with payoffs $v$. (Similar statement for $F'^{p}$.)

$\{\tilde{v} \in F^p : \exists v' \in F^p, \ v'_i < \tilde{v}_i \ \forall i\}$?

non-empty interior of convex set $\equiv$ full dimensionality

proof: for any $v$ "pick a ball" around some $v' < v$

But we still need to prove the above lemma.


Proof of lemma (prop 3.4.1)

We show statement 1 in class

Assume players play profile $a$ that yields $v$.

A deviation by player $i$ causes others to minmax $i$ for a long enough period.

Different from the two player case, it is not obvious players will find it in their interest to execute the punishment (no mutual minmaxing).

We must follow a different approach: reward punishing players on the completion of a punishment. To do this, we need player specific punishments.


Repeated Games with perfect monitoring: Price wars (book section 6.1)

Stage Game

two firms $i = 1, 2$

produce a homogeneous product at no cost

choose prices $p_i$

a state $s$ determines the size of the market (willingness to pay)

Bertrand competition:

$$D_i(p_i, p_j) = \begin{cases} 0 & \text{if } p_i > p_j \\ s - p_i & \text{if } p_i < p_j \\ \dfrac{s - p_i}{2} & \text{if } p_i = p_j \end{cases}$$

Payoffs not continuous: problematic?

state $s \in \{1, 2\}$ and probability $p(1) = p(2) = \frac{1}{2}$
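The discontinuity of payoffs is easy to see numerically. The sketch below assumes the demand function reads as linear in the own price, $s - p_i$ for the low-price firm, split equally in a tie, and that profit is price times demand because production is costless; the function names are illustrative.

```python
def demand(p_i, p_j, s):
    """Demand for firm i under Bertrand competition in market state s."""
    if p_i > p_j:
        return 0.0              # the higher-priced firm sells nothing
    if p_i < p_j:
        return s - p_i          # the lower-priced firm serves the market
    return (s - p_i) / 2        # equal prices: the market is split

def profit(p_i, p_j, s):
    """Profit with zero production cost: price times quantity."""
    return p_i * demand(p_i, p_j, s)

# Undercutting a common price of 1 by a small amount in state s = 2
# roughly doubles profit, which is the discontinuity in payoffs.
print(profit(1.0, 1.0, 2), profit(0.999, 1.0, 2))
```

This jump at $p_i = p_j$ is what makes the stage game fall outside the continuity assumption of the canonical stage game.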


Repeated Games with perfect monitoring: Price wars (book section 6.1)

Repeated Game

Does the folk theorem hold?

Focus on strongly symmetric equilibrium

Which SSE is the best collusive one?

Does the folk theorem still hold when restricting attention to SSE?

For which levels of patience does full collusion break down?

Suppose $\delta = \frac{1}{2}$: in the best collusive equilibrium there are countercyclical markups (Rotemberg and Saloner 1986).

