The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private...

21
The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case ¤ Ichiro Obara Department of Economics University of Pennsylvania [email protected] First version: December, 1998 This version: April, 2000 Abstract This paper studies the repeated prisoners’ dilemma with private monitoring for arbitrary number of players. It is shown that a mixture of a grim trigger strategy and permanent defection can achieve an almost e¢cient outcome for some range of discount factors if private monitoring is almost perfect and sym- metric, and if the number of players is large. This approximate e¢cieicny result also holds when the number of players is two for any prisoners’ dilemma as long as monitoring is almost perfect. A detailed characterization of these sequential equilibria is provided. 1. Introduction This paper examines the repeated prisoners’ dilemma with arbitrary number of play- ers, where players only observe private and imperfect signals about the other players’ actions. This game belongs to the class of repeated games with private monitor- ing. While repeated games with public monitoring have been extensively analyzed in, for example, Abreu, Pearce and Stachetti [1] or Fudenberg, Levin, and Maskin [8], few things are known about repeated games with private monitoring. It is shown in Compte [5] and Kandori and Matsushima [10] that a folk theorem still holds in this class of game with communication between players, but it is di¢cult to analyze it without communication because the simple recursive structure is lost. The two player prisoner’s dilemma was already examined in Sekiguchi [16], which is the …rst paper to show that the e¢cient outcome can be achieved in some repeated prisoner’s dilemmas with almost perfect private monitoring. This paper is an exten- sion of Sekiguchi [16] in the sense that (1): a similar grim trigger strategy is employed, ¤ I am grateful to my advisers George Mailath and Andrew Postlewaite for their encouragement and valuable suggestions, and Johannes Horner, Stephen Morris, Nicolas Persico, Tadashi Sekiguchi for their helpful comments. All remaining errors are mine.

Transcript of The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private...

Page 1: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

The Repeated Prisoners’ Dilemma with PrivateMonitoring: a N-player case¤

Ichiro ObaraDepartment of EconomicsUniversity of Pennsylvania

[email protected]

First version: December, 1998This version: April, 2000

AbstractThis paper studies the repeated prisoners’ dilemma with private monitoring

for arbitrary number of players. It is shown that a mixture of a grim triggerstrategy and permanent defection can achieve an almost e¢cient outcome forsome range of discount factors if private monitoring is almost perfect and sym-metric, and if the number of players is large. This approximate e¢cieicny resultalso holds when the number of players is two for any prisoners’ dilemma as longas monitoring is almost perfect. A detailed characterization of these sequentialequilibria is provided.

1. Introduction

This paper examines the repeated prisoners’ dilemma with arbitrary number of play-ers, where players only observe private and imperfect signals about the other players’actions. This game belongs to the class of repeated games with private monitor-ing. While repeated games with public monitoring have been extensively analyzed in,for example, Abreu, Pearce and Stachetti [1] or Fudenberg, Levin, and Maskin [8],few things are known about repeated games with private monitoring. It is shown inCompte [5] and Kandori and Matsushima [10] that a folk theorem still holds in thisclass of game with communication between players, but it is di¢cult to analyze itwithout communication because the simple recursive structure is lost.

The two player prisoner’s dilemma was already examined in Sekiguchi [16], whichis the …rst paper to show that the e¢cient outcome can be achieved in some repeatedprisoner’s dilemmas with almost perfect private monitoring. This paper is an exten-sion of Sekiguchi [16] in the sense that (1): a similar grim trigger strategy is employed,

¤I am grateful to my advisers George Mailath and Andrew Postlewaite for their encouragementand valuable suggestions, and Johannes Horner, Stephen Morris, Nicolas Persico, Tadashi Sekiguchifor their helpful comments. All remaining errors are mine.

Page 2: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

(2): the e¢cient outcome is obtained for any prisoner’s dilemma with two players,(3): this e¢ciency result for the two player case is extended to the case of arbitrarynumber of players with some symmetric condition on the monitoring structure, and(4): the sequential equilibrium corresponding to this grim trigger Nash equilibrium isexplicitly constructed.

In Sekiguchi [16], the critical step of the arguments is to obtain the unique optimalaction with respect to a player’s subjective belief about the other player’s continuationstrategy. Since players randomize between a grim trigger strategy and a permanentdefection in the …rst period, each player’s continuation strategy is always one of thesetwo strategies after any history. This means that a player’s subjective belief about theother player’s strategy can be summarized in one parameter: a subjective probabilityof the permanent defection being played by the other player. In this paper, the clearcut characterization of the optimal action is provided as a function of this belief. InSekiguchi [16], it is shown that a player should start defecting if she is very con…dentthat the other player has started defecting, and a player should cooperate if she isreally con…dent that the other player is still cooperating. However, it is not clearwhat a player should do if the belief is somewhere in the middle. This paper …lls inthat gap.

Through a detailed examination of players’ incentive, we also …nd that the payo¤restriction imposed in Sekiguchi [16] is not really necessary, that is, the approximatee¢ciency result is valid for any prisoner’s dilemma game.

Although the same kind of clear characterization of the optimal action is possiblewith many players, it is not straightforward to extend this e¢ciency result to the nplayer case. The dynamics of belief is richer with more than two players. In particular,it is possible to have a belief that some player started defecting but the other playersare still cooperating. In such a case, a player might think that it is better to continuecooperating because it might keep cooperative players from starting defection. So, itis no longer clear when players should pull the trigger.

Under the assumption that the probability of any signal pro…le only depends onthe number of the total errors it contains, it is shown that the e¢ciency outcomecan be supported with the mixture of the permanent defection and a certain kind ofgrim trigger strategy, where players start defecting if they observe even one signal ofdeviation by any other player. This strategy generates an extreme belief dynamicsunder the assumption on the signal distribution, which in turn rationalizes the use ofthis strategy. As soon as a player observes any bad signal from any other player, theplayer expects that some other players also got some bad signals with high probability.Then, she becomes pessimistic enough to start defecting for herself because defectionshould prevail among all players using the same strategy at least in the next period.

A sequence of papers have re…ned the result of Sekiguchi [16] for the two playercase. Piccione [15] also achieves the e¢cient outcome for any prisoners’ dilemma withtwo players and almost perfect private monitoring. Moreover, he establishes a folktheorem for a class of prisoners’ dilemma using a strategy which allows players torandomize between cooperation and defection after every history. The strategy usedin his paper can be represented as an automaton with countably in…nite states. Elyand Välimäki [7] prove a folk theorem using a similar strategy, but their strategy

2

Page 3: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

is “simple” in the sense that it is a two states automaton. Bhaskar [3] is closest tothis paper in terms of results and strategies employed in the two player case. Heessentially shows (2) and (4), and also proves a folk theorem for a class of prisoner’sdilemma through a di¤erent line of attack from Piccione [15] or Ely and Välimäki [7]using a public randomization device.

Mailath and Morris [11] is the …rst paper to deal with the n player case in theprivate monitoring framework. They show that a subgame perfect equilibrium withpublic monitoring is robust to the introduction of private monitoring if players’ con-tinuation strategies are approximately common knowledge after every history. A folktheorem can be obtained when information is almost public and almost perfect. Al-though the stage game in this paper has a more speci…c structure, the informationstructure allowed in this paper is not nested in their information structure. Especially,private signals can be independent over players in this paper.

This paper is organized as follows. In Section 2, the model is described. In Section3, the assumptions on the information structure are presented. Section 4 discusses theoptimal action with respect to player’s beliefs and the belief dynamics generated by theequilibrium strategy proposed in this paper. A sequential equilibrium is constructedin Section 5. Section 6 gives a detailed characterization of the sequential equilibriumconstructed in Section 5. Section 6 concludes with a comparison to other relatedpapers.

2. The Model

Let N = f1; 2; :::; ng be the set of players and g be the stage game played by thoseplayers. The stage game g is as follows. Player i chooses an action ai from theaction set Ai = fC;Dg : Actions are not observable to the other players and taken

simultaneously. A n-tuple action pro…le is denoted by a 2 A =n¦

i=1Ai: A pro…le of all

player’s actions but player i0s is a¡i 2 ¦j 6=i

Aj :

Each player receives a private signal about all the other players’ actions withinthat period. Let !i = (!i;1; :::; !i;i¡1; !i;i+1;:::; !i;n) 2 fC;Dgn¡1 = ­i be a genericsignal received by player i; where !i;j stands for the signal which player i receivesabout player j0s action: A generic signal pro…le is denoted by ! = (!1; :::; !n) 2 ­.All players have the same payo¤ function u. Player i’s payo¤ u (ai; !i) depends onher own action ai and private signal !i. Other players’ actions a¤ect a player i’spayo¤ only through the distribution over the signal which player i receives. Thedistribution conditional on a is denoted by p (!ja). It is assumed that p (!ja) are fullsupport distributions, that is, p (!ja) > 0 8a8!: The space of a system of full supportdistributions fp (!ja)ga2A is denoted by P:

Since we are interested in the situation where information is almost perfect, werestrict attention to a subset of P where information is almost perfect. Information isalmost perfect when every person’s signal pro…le is equal to the actual action pro…lein that period with probability larger than 1 ¡ " for some small number ":

To sum up, the space of the information structure we mainly deal with in this

3

Page 4: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

paper is the following subset of P :

P" =

(fp (!ja)ga2A 2 <n£(n¡1)£2n

++

¯̄¯̄¯

p (!ja) > 1 ¡ " if !i = a¡i for all i;and 8a,

P!

p (!ja) = 1

)

and p" is a generic element of P":We also introduce the perfectly informative signal distribution P0 = fp0 (!ja)ga2A,

where, for any a 2 A; p0 (!ja) = 1 if !i = a¡i for all i. The whole space of theinformation structure P

SP0 is endowed with the Euclidean norm.

The stage game payo¤ only depends on the number of signals C and D a player re-ceives. Let d (!i) be the number of “D”s contained in !i:Then, u (ai; !0

i) = u (ai; !00i )

if d (!0i) = d (!00

i ) for any ai: Let u¡ai;D

be the payo¤ of player i when d (!i) = k:

The deviation gain when k defections are observed is M (k) = u¡D;Dk

¢¡u

¡C;Dk

¢>

0: The payo¤sP!

u (C;!i) p (!j (C; :::; C)) andP!

u (D;!i) p (!j (D; :::;D)) are nor-

malized to 1 and 0 respectively for all i: It is assumed that (1; ::; 1) is an e¢cient stagegame payo¤. The largest deviation gain and the smallest deviation gain are M and Mrespectively, where M = max

1·k·n¡1M (k) and M = min

1·k·n¡1M (k)

The stage game g is repeated in…nitely many times by n players, who discounttheir payo¤s with a common discount factor ± 2 (0; 1) : Time is discrete and denotedby t = 1; 2; ::::Player i’s private history is ht

i =¡¡

a1i ; !

1i

¢; :::;

¡at¡1

i ; !t¡1i

¢¢for t = 2

and h1i = ;: Let Ht

i be the set of all such history hti and Hi =

1St=1

Hti : Player i’s

strategy is a sequence of mappings ¾i = (¾i;1; ¾i;2; ::::) ; each ¾i;t being a mappingfrom Ht

i to probability measures on Ai: Discounted average payo¤ is Vi (¾ : p; ±) =

(1 ¡ ±)1P

t=1±t¡1E [u ((at

i; !ti)) j¾; p] ; where the probability measure on Ht

i is generated

by (¾; p) : The least upper bound and largest lower bound of the discounted averagepayo¤ are denoted by V and V respectively.

For this n¡player repeated prisoner’s dilemma, the grim trigger ¾C and the per-manent defection ¾D are de…ned as follows:

¾C (hti) =

½C if ht

i = ((C;C) ; :::; (C;C)) or t=1D otherwise

¾D (hti) = D for all ht

i 2 Hi

We also use ¾C or ¾D for any continuation strategy which is identical to ¾C or ¾D,that is, any continuation strategy from the period t+1 such that ¾

¡ht+k

i

¢= ¾ai

¡hk

i

¢

for k = 1; 2; ::: and ai = C or D: Moreover, any continuation strategy which isrealization equivalent to ¾C or ¾D is also denoted by ¾C or ¾D respectively1 . Thisgrim trigger strategy is the harshest one among all the variations of the grim triggerstrategies in the n player case. Players using ¾C switch to ¾D as soon as they observeany signal pro…le which is not fully cooperative. When player i is mixing ¾C and ¾D

with probability (qi; 1 ¡ qi) ; this strategy is denoted by qi¾C + (1 ¡ qi)¾D:.

1A strategy is realization equivalent to another strategy if the former generates the same outcomedistribution as the latter independent of the other players’ strategies.

4

Page 5: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

Suppose that ¾C or ¾D is chosen by all players. Let µ 2 £ = f0; 1; :::; n ¡ 1g be thenumber of players using ¾D as a continuation strategy among n players. Then a prob-ability measure q¡i (ht

i; p) 2 on the space £ = f0; 1; :::; n ¡ 1g is derived conditionalon the realization of the private history ht

i: Clearly, this measure also depends on theinitial level of mixture between ¾C and ¾D by every player, but this dependence isnot shown explicitly as it is obvious.

In the two player case, a player’s strategy can be represented as a function ofbelief, using the fact that the other player is always playing either ¾C or ¾D on ando¤ the equilibrium path. Here the space of the other player’s “types” is much larger.However, there is a convenient way of summarizing information further. We classify £into the two sets: f0g and f1; :::; n ¡ 1g ; that is, the state no one have ever switchedto ¾D and the state where there is at least one player who has already switched to¾D. Player i0s conditional subjective probability that no player has started using ¾D

is denoted by Á (q¡i) = q¡i (hti; p) (0) : The reason why we just focus on this number

is because the exact number of players who are playing permanent defection does notmake much di¤erence in what will happen in the future given everyone’s strategy. Assoon as someone starts playing ¾D, every other player starts playing ¾D with veryhigh probability from the very next period on by the assumption of almost perfectmonitoring. What is important is not how many players have switched to ¾D; butwhether anyone has switched to ¾D or not.

Let Vi (¾i; µ : p; ±) be player i’s discounted average payo¤ when µ other players areplaying ¾D and n¡µ¡1 other players are playing ¾C: This notation is justi…ed underthe assumption of the symmetry distribution, which is introduced in the next section.Finally, we need the following notations:

Vi (¾i;q¡i : p; ±) =n¡1X

µ=0

Vi (¾i; µ : p; ±)q¡i (µ)

M (q¡i; p) =n¡1X

µ=0

©Ui

¡¡D;Dµ

¢: p

¢¡ Ui

¡¡C;Dµ

¢: p

¢ªq¡i (µ)

3. Information Structure

In this section, various assumptions on the information structure are proposed anddiscussed. In the following sections, a sequential equilibrium is constructed with amixture of grim trigger strategy and permanent defection, which achieves an approx-imately e¢cient outcome for some range of discount factors. As is the case with anyequilibrium based on simple grim trigger strategies, this equilibrium satis…es the fol-lowing property; players stick to the grim trigger strategy as long as they have anoptimistic belief about the others, and they switch to the permanent defection once

2Here, q¡i is a mapping from the space of private history (and the monitoring technology p) toa space of probability measures on £: However, we also use q¡i as a point in a n ¡ 1 dimensional

simplex½fplgn¡1li=0 2 <+j

n¡1Pl=0

pl = 1

¾; where pl means the probability of the event µ = l:

5

Page 6: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

they become pessimistic and never come back. This property is satis…ed in gameswith perfect monitoring, but not so easily satis…ed in games with imperfect privatemonitoring. In order to achieve a certain level of coordination, which is necessaryfor an equilibrium with trigger strategies, we impose some assumptions on p (!ja) inaddition to the assumption that it is almost perfect.

The …rst assumption, which is maintained throughout this paper, is

Assumption 1

p (!ja) = p¡¡

!¿(i)¿(j)

¢j¡a¿(1); :::; a¿(i); :::; a¿(N)

¢¢for any permutation ¿ : N ! N:

This assumption implies the followings. First, each player’s expected payo¤ is thesame under the same situation combined with the assumption on u: Second, onlythe number of C and D played matters in terms of expected payo¤. Finally, thedynamics of belief for each player is the same with the same sort of private history.This assumption allows us to treat agents symmetrically and justi…es all the notationsintroduced in Section 2.

Although Assumption 1 is strong enough to achieve an almost e¢cient outcomefor two players, a stronger assumption is called upon to achieve similar results withmore than two players. Let #(!ja) stand for the number of errors in !: The followingassumptions is strong enough for that purpose:

Assumption 2.

p (!0ja) = p (!00ja) if #(!0ja) = #(!00ja) for any !0; !00 2 ­; a 2 A

A couple of remarks on these assumptions are in order.First, Assumption 1 is a relatively weak assumption about the symmetry of a

signal distribution and satis…ed in most of the papers in reference which analyze therepeated prisoner’s dilemma with almost perfect private monitoring. Second, whileAssumption 2 is much stronger than Assumption 1 in general, it is very close toAssumption 1 in the two player case. Consequently, this assumption is also satis…edin those papers as most of them concentrate on the two player case.

Assumption 2 means that the probability of some signal pro…le only depends onthe number of errors contained in that pro…le. For example, given that everyoneis playing C; the probability that a player receives two “D” signals while the otherplayers get correct signals is equal to the probability that two players receive one “D”while the rest of the players gets correct signals.

6

Page 7: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

For example, the following information structure satis…es Assumption 2 for generaln:

² Example: Totally Decomposable Case

p (!ja) =¦j

¦i 6=j

p (!i;j jaj) for all a 2 A and ! 2 ­; where p (!i;j jaj) = 1 ¡ " if

!i;j = aj :

Given the action by player j; the probability that player i 6= j receives the rightsignal or the wrong signal about player j’s action is the same across i 6= j. Alsonote that players’ signals are conditionally independent over players.

4. Belief Dynamics and Best Response

We represent our equilibrium strategy as a mapping from belief to actions 4fC;Dg.More explicitly, our strategy takes the following form:

½ (q¡i) =

8<:

C if q¡i 2 QC

q if q¡i 2 QI

D if q¡i 2 QD

where QC ; QI ; QD are mutually exclusive and exhaustive sets in the space of theprobability measures on £; and q means playing C with probability q and playing Dwith probability 1 ¡ q:

The steps to achieve the approximate e¢cient payo¤ is as follows. First, we give aalmost complete characterization of the unique optimal action as a function of belief.As a next step, we analyze the dynamics of belief when players are playing either ¾C

or ¾D: Since the dynamics of belief in the n¡player case has some particular featurewhich does not appear in the two player case, we analyze it in detail. The third stepis to construct a sequential equilibrium. We …rst have to …nd ½ (q¡i) which assignsthe optimal action for any possible belief on and o¤ the equilibrium path. For thispurpose, we strengthen the characterization result in the …rst step to get this function½ (q¡i) : In order to prove that ½ (q¡i) is a sequential equilibrium, we also have tocheck consistency, that is, to check if players are actually playing either ¾C or ¾D

on the equilibrium path by following ½ (q¡i). Since this sequential equilibrium isconstructed for a discount factor in the middle range, as the …nal step, we divide theoriginal repeated game to component repeated games or use a public randomizationdevice to implement the same payo¤ by reducing high ± e¤ectively.

4.1. Belief and the Best Response

Before analyzing the unique optimal action, we …rst extend one property which obvi-ously holds in the two player case to the n¡player case. In the two player case, thedi¤erence in payo¤s by ¾C and ¾D is linear and there is a unique q¤ (±; p0) where aplayer is indi¤erent between ¾C and ¾D with perfect monitoring if ± is high enough

7

Page 8: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

as shown in Figure 1. When private monitoring is almost perfect, there is q¤ (±; p")where a player is indi¤erent between ¾C and ¾D and q¤ (±; p") ! q¤ (±; p0) as " ! 0:

Put Figure 1 here.

When the number of players is more than two; the corresponding object Vi (¾C;q¡i : p0; ±)¡Vi (¾D;q¡i : p0; ±) for the n¡player case is a slightly more complex object. Evenwhen players randomize independently and symmetrically playing C with probabil-

ity q and D with probability (1 ¡ q), that is, q¡i (µ) =n¡1Pµ=0

(1 ¡ q)µ qn¡1¡µ¡n¡1

µ

¢for

µ = 0; :::; n¡ 1, it is a n ¡ 1 degree polynomial in q 2 [0; 1] : Potentially, this equationmay have n ¡ 1 solutions between 0 and 1 as shown in …gure 2. We de…ne q¤ (±; p")to be the solution which is closest to 0 for Vi (¾C ;q¡i : p0; ±)¡Vi (¾D;q¡i : p0; ±) = 0

where q¡i (µ) =n¡1Pµ=0

(1 ¡ q)µ qn¡1¡µ¡n¡1

µ

¢:

Put Figure 2 here.

When monitoring is almost perfect, Vi (¾C ;q¡i : p"; ±)¡Vi (¾D;q¡i : p"; ±) is veryclose to Vi (¾C;q¡i : p0; ±) ¡ Vi (¾D;q¡i : p0; ±) : Actually, it is easy to con…rm thatthe former converges to the latter uniformly in q as " ! 0:3 If ± > M(0)

1+M(0) ; thenVi (¾C ; 0 : p0; ±) ¡ Vi (¾D; 0 : p0; ±) > 0; which implies that there exists q¤ (±; p0) be-tween 0 and 1. The following lemma extends a useful property for the two player caseto the n¡player case.

Lemma 1. q¤ (±; p0) ! 1 as ± # M(0)1+M(0)

Proof. See Appendix

Now we show that the optimal action as a function of q¡i is still similar to theone with perfect monitoring if the dynamics of q¡i is very close to the dynamics ofq¡i with perfect monitoring. As a …rst step to show that, the following lemma showsthat ¾D is still optimal if a player knows that someone has switched to the permanentdefection and " is small.

Lemma 2. There exists a b" > 0 such that Vi (¾i;q¡i : p"; ±) is maximized by ¾D forany pb"; if q¡i (µ) = 1 for any µ 6= 0.

3Also note that convergence of Vi (¾i; qj : p") to Vi (¾i; qj : p0) is independent of the choice ofassociated sequence fp"g because of the de…nition of P".

8

Page 9: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

Proof.Take ¾D and any strategy which starts with C. The least deviation gain is

(1 ¡ ±) 4: The largest loss caused by the di¤erence in continuation payo¤s with ¾D

and the latter strategy is ±"V : Setting b" small enough guarantees that (1 ¡ ±)4 > ±"Vfor any " 2 (0;b") : Then, D must be the optimal action for any such ": Since playersare using permanent defection, q¡i (µ) = 1 for some µ 6= 0 in the next period. Thisimplies that D is the optimal action in all the following periods. ¥

Using p (¢j¢) and given the fact that all players are playing either ¾C or ¾D; weintroduce a notation for the transition probability of the number of players who haveswitched to ¾D: Let ¼ (ljm) be a probability that l players will play ¾D from thenext period when m players are playing ¾D now. In other words, this ¼ (ljm) is aprobability that l ¡ m players playing C receive the signal D when n ¡ m playersplay C and m players play D: Of course, ¼ (ljm) > 0 if l = m and ¼ (ljm) = 0 ifl < m. The following lemma, which is a sort of Maximum theorem, provides variousinformative and useful bounds on the variations of discounted average payo¤s causedby introducing small imperfectness in private monitoring.

Lemma 3.

1. infp"2P"

Vi (¾C ; 0 : p"; ±) = (1¡±)+±"V1¡±(1¡")

2. Given ± 2³

M(0)1+M(0) ; 1

´; There exists a " > 0 such that for any " 2 [0; "] ;

sup¾i;p"2P"

Vi (¾i; 0 : p"; ±) 5 1¡±+±"V1¡±(1¡")

Proof.

(1): For any " 2 (0; 1) and p" 2 P";

Vi (¾C; 0 : p"; ±) = (1 ¡ ±) + ±¼ (0j0)Vi (¾C; 0 : p"; ±) + ± (1 ¡ ¼ (0j0))V

So,

Vi (¾C ; 0 : p"; ±) = (1 ¡ ±) + ± (1 ¡ ¼ (0j0))V

1 ¡ ±¼ (0j0)= (1 ¡ ±) + ±"V

1 ¡ ± (1 ¡ ")

(2): Given ± 2³

M(0)1+M(0) ; 1

´; it is easy to check that Vi (¾C ; 0 : p0; ±) > Vi (¾D; 0 : p0; ±) :

Pick " small enough such that (i) Vi (¾C ; 0 : p"; ±) > Vi (¾D; 0 : p"; ±) for any p" and(ii) " < b": Let ¾¤

0 be the optimal strategy given that everyone is using ¾C :4 Supposethat ¾¤

0 assigns D for the …rst period. Then for any " 2 [0; "] ;

Vi (¾¤0; 0 : p") 5 (1 ¡ ±)U

¡¡D;D0

¢: p"

¢+

±

½¼ (1j1) Vi (¾¤

0; 0 : p"; ±)+nP

µ=2

¼ (µj1)Vi (¾D; µ ¡ 1 : p"; ±)

¾

4This ¾¤ exists because the strategy space is a compact space in product topology, on whichdiscounted average payo¤ functions are continuous. Of course, this ¾¤ depends on the choice of p":

9

Page 10: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

In this inequality, the second component represents what player i could get if sheknew the true continuation strategies of her opponents at each possible state. To seethat this additional information is valuable, suppose that the continuation strategyof ¾¤

0 leads to a higher expected payo¤ than Vi (¾¤0; 0 : p"; ±) or Vi (¾D; µ ¡ 1 : p"; ±) at

the corresponding states, then this contradicts the optimality of ¾¤0 or ¾D by Lemma

2. So this inequality holds.Then, for any " 2 [0; "] ;

Vi (¾¤0; 0 : p"; ±) 5

(1 ¡ ±)U (D;D0 : p") + ±nP

µ=2

¼ (kj1)Vi (¾D; µ ¡ 1 : p"; ±)

1 ¡ ±¼ (1j1)= Vi (¾D; 0 : p"; ±)

< Vi (¾C; 0 : p"; ±)

Since this contradicts the optimality of ¾¤0; ¾¤

0 has to assign C for the …rst period.Now,

Vi (¾¤0; 0 : p"; ±) 5 (1 ¡ ±) + ±¼ (0j0)Vi (¾¤

0; 0 : p"; ±) + ± (1 ¡ ¼ (0j0))V

So,

Vi (¾¤0; 0 : p"; ±) 5 (1 ¡ ±) + ± (1 ¡ ¼ (0j0))V

1 ¡ ±¼ (0j0)5 (1 ¡ ±) + ±"V

1 ¡ ± (1 ¡ ")

This implies that sup¾i;p"2P"

Vi (¾i; 0 : p"; ±) 5 1¡±+±"V1¡±(1¡") for any " 2 [0; "] : ¥

(1) means that a small departure from the perfect monitoring does not reduce thepayo¤ of ¾C much when all the other players are using a grim trigger strategy. (2)means that there is not much to be exploited by using other strategies than ¾C witha small imperfection in the private signal as long as all the other players are using agrim trigger strategy.

The following result shows that the unique optimal action is almost completelycharacterized as a function of q¡i except for an arbitrary small neighborhood andequivalent to the optimal action with perfect monitoring.

Proposition 1. Given ±; for any ´ > 0; there exists a " > 0 such that for any p";

² it is not optimal to play C for player i if q¡i satis…es Á (q¡i) 5 1¡±± M (q¡i; p0)¡

´

² it is not optimal to play D for player i if q¡i satis…es Á (q¡i) = 1¡±± M (q¡i; p0)+

´

Proof:(1): It is not optimal to play C if

10

Page 11: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

(1 ¡ ±) M (q¡i; p")

> ±

·Á (q¡i)

½(1 ¡ ") sup

¾i;Vi (¾i; 0 : p"; ±) + "V

¾+ (1 ¡ Á (q¡i)) "V

¸

By Lemma 3.2., this inequality is satis…ed for any " 2 [0; "] and any p" if

(1 ¡ ±) M (q¡i; p")

> ±

·Á (q¡i)

½(1 ¡ ")

1 ¡ ± + ±"V

1 ¡ ± (1 ¡ ")+ "V

¾+ (1 ¡ Á (q¡i)) "V

¸

LHS converges to (1 ¡ ±) M (q¡i; p0) and RHS converges to ±Á (q¡i) as " ! 0:So, if q¡i satis…es Á (q¡i) 5 1¡±

± M (q¡i; p0) ¡ ´ for any ´ > 0; then there exists a"0 (±; ´;q¡i) 2 (0; ") and a neighborhood B (q¡i) of q¡i such that C is not optimal forany p"0(±;´;q¡i) and any q0

¡i 2 B (q¡i) : This "0 (±; ´;q¡i) > 0 can be set independentof q¡i by the standard arguments because q¡i is in a compact space:

(2): It is not optimal to play D if

(1 ¡ ±) M (q¡i; p")

< ±£Á (q¡i) f(1 ¡ ")Vi (¾C ; 0 : p"; ±) + "V g + (1 ¡ Á (q¡i)) "V ¡ "V

¤

this inequality is satis…ed for " 2 (0; 1) and any p" if

(1 ¡ ±) M (q¡i; p")

< ±

·Á (q¡i)

½(1 ¡ ")

1 ¡ ± + ±"V

1 ¡ ± (1 ¡ ")+ "V

¾+ (1 ¡ Á (q¡i)) "V ¡ "V

¸

This inequality converges to Á (q¡i) = 1¡±± M (q¡i; p0) as " ! 0: So, if q¡i satis…es

Á (q¡i) = 1¡±± M (q¡i; p0) + ´ for any ´ > 0; there exists a "00 (±; ´;q¡i) such that D

is not optimal for any p" 2 P"00(±;´;q¡i) and any q0¡i around q¡i: Again, "00 (±; ´;q¡i)

can be set independent of q¡i:Finally, setting " (±; ´) = min f"0 (±; ´) ; "00 (±; ´)g completes the proof. ¥

This proposition implies that the optimal action can be completely characterizedexcept for an arbitrary small neighborhood of the manifold satisfying Á (q¡i) = 1¡±

± M(q¡i; p0) in a n¡1 dimensional simplex; where player i is indi¤erent between ¾C and¾D with perfect monitoring: Note that this unique optimal action is equivalent to theone with perfect monitoring. Since the continuation payo¤s with private monitoringconverge to the continuation payo¤ with perfect monitoring as monitoring gets accu-rate, a player prefers C(D) to D (C) in private monitoring if and only if she prefers

11

Page 12: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

C (D) to D (C) in perfect monitoring except for the region where she is almost indif-ferent between C and D with perfect monitoring. A slight noise still matters in thisregion, but we also characterize the best response for this region later.

Although this argument is essentially the path dominant argument used in Sekiguchi[16] for n = 2, it extends that argument to the n¡player case and provides a sharpercharacterization even for n = 2: In fact, this proposition is the reason why an almoste¢cient outcome is achieved for any prisoner’s dilemma in the two player case.

An immediate corollary of this proposition is that C is the unique optimal actiongiven that Á is su¢ciently close to 1, ± > M(0)

1+M(0) ; and " is small:

Corollary 1. Given ± > M(0)1+M(0) ; there exists Á > 0 and " > 0 such that for any p";

it is not optimal for player i to play D if Á 2£Á; 1

¤:

4.2. Belief Dynamics

Since the optimal action is almost characterized as a function of q¡i; we have toanalyze the dynamics of q¡i associated with grim trigger strategies. Given the optimalaction shown above, all we have to make sure for the consistency of grim triggerstrategies is that q¡i stays in the “C area” described by Proposition 1 as long asplayer i has observed fully cooperative signals from the beginning and q¡i stays inthe “D area” once player i received a bad signal or started playing defection forherself. Assumption 2 on the signal distribution is required here for the …rst time asthe following arguments show.

First, consider a history where player i observes some defection for the …rst time.If this is the …rst period, player i interprets this as a signal of ¾D rather than as anerror if " is small5 . Suppose next that this kind of history is reached after the …rstperiod. Also suppose that the number of players is three for simplicity and player 1observes 1 defection by player 2. With Assumption 2, player 1 can interpret this as a1-error event and still believe that everyone is cooperative. On the other hand, it isequally likely that player 2’s observation contained 1 error in the last period and thecurrent signal is correct. Note that there are two such events. The player for whomplayer 2 observed \D" last period can be player 1 or player 3. Since someone shouldhave already defected after all other possible histories, the probability that no onehas switched to ¾D is at most 1

3 . Obviously, this ‡exibility of interpretation increasesas the number of players increases, which makes it easier to move Á closer to 0 afterthis sort of history. Note that this upper bound of Á does not depend on the level of":

Second, consider a history where player i has already started defection. Supposethat all players but player i have been cooperative until the present. Also supposeagain that the number of players is three and i = 1 for the sake of simple exposition:

5We need to let players to randomize between ¾C and ¾D in the initial period for this argumentto work in the initial period. If players start with, say, ¾C with probability 1, no learning occursafter the initial period. This is …rst observed by Matsushima [12]. Note that this is the only reasonwhy the initial randomization is needed. The rest of arguments does not depend on this initialrandomization.

12

Page 13: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

For everyone to be still cooperative after the current period, all players but player1 should observe the wrong signal C about player 1 and the correct signal C aboutthe other players in the current period. Again, there are other events with the sameprobability, where some player switches to ¾D: For example, player 2 may observe thecorrect signal D about player 1 and the wrong signal D about player 3. This eventcontains the same number of errors. Since there are 5 such events, the probabilitythat at least one player has switched to ¾D is at least 5

6 even though it is assumed thatall players but player i have been cooperative until the current period. With positiveprobability that someone has already started defection, the posterior Á is strictly lessthan 1

6 : This argument is again independent of the level of ":6

The above arguments strongly suggest that once players start playing ¾D; thenthey continue to do so. The following proposition summarizes the above argumentsand also provide another useful property to support ¾C: Let Wk =

©q¡ijÁ (q¡i) = 1¡±

± M (q¡i; p0) ¡ kª

;

where k > 0 and 0 < 1¡±± M (n ¡ 1) ¡ k: When k is small, Wk just covers the region

where C is the unique optimal action and the unique optimal action is indeterminate.We show that Át > Á if a player plays C, observes C; and Át¡1 2 Wk.

Proposition 2. Suppose that every player plays q¾C + (1 ¡ q) ¾D with q 2 (0; 1) inthe …rst period, and (i) : Assumption 1 is satis…ed and n = 2, or (ii) : Assumption 2is satis…ed. Then for all i and t = 2; 3; :::

² For any Á0 > 0; k > 0; there exists e" such that for any " 2 (0;e")Á

¡q¡i

¡ht+n

i

¢¢= Á0 for ht+n

i =¡ht¡1

i ; (C;C) ; :::; (C;C)¢

when Á¡q¡i

¡ht¡1

i

¢¢2

Wk

² Á (q¡i (hti)) 5 1

n after histories such as

–½

hti =

¡ht¡1

i ;¡C;!t¡1

i

¢ ¢for t = 3

hti =

¡C;!t¡1

i

¢for t = 2

with !t¡1i 6= C

or

–½

hti =

¡ht¡1

i ;¡D;!t¡1

i

¢ ¢for t = 3

hti =

¡D;!t¡1

i

¢for t = 2

Proof. See appendix.

6This last argument is speci…c to the n = 3 player case. The two player case has to be treatedseparately. See Sekiguchi [16] for that case.

13

Page 14: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

5. Sequential Equilibrium with the “Grim Trigger” Strategy

As in section 2, the equilibrium strategy is represented as a mapping from the spaceof belief Q to 4fC;Dg : The following notatins are useful:

QC" = fq¡ijV (¾C ;q¡i : p"; ±) > V (¾D;q¡i : p"; ±)g

QI" = fq¡ijV (¾C ;q¡i : p"; ±) = V (¾D;q¡i : p"; ±)g

QD" = fq¡ijV (¾C ;q¡i : p"; ±) < V (¾D;q¡i : p"; ±)g

Qn =

½q¡ijÁ (q¡i) 5 1

n

¾

These are subsets of the n ¡ 1 dimensional simplex on £: The subset Qn is a setcontaining the absorbing set of the dynamics of q¡i under the grim trigger. QI

" is amanifold where player i is indi¤erent between ¾C and ¾D: In particular, q¤

¡i (±; p") 2QI

" by de…nition. Note that QC" converges to QC

0 =©q¡ijÁ (q¡i) > 1¡±

± 4 (q¡i; p0)ª

and QD" converges to QD

0 =©q¡ijÁ (q¡i) < 1¡±

± 4 (q¡i; p0)ª

as " ! 0:Now de…ne ½ as a mapping from q¡i 2 Q to 4fC;Dg in the following way:

½¤" (q¡i) =

8<:

C if q¡i 2 QC"

q¤ (±; p") if q¡i 2 QI"

D if q¡i 2 QD"

We know from Proposition 1 that this function assigns the best response action almosteverywhere except for a neighborhood of QI

" when " is small. Thanks to the lastproposition about the dynamics of belief, now we can verify that ½¤

" (q¡i) actuallyassigns the unique optimal action for any belief q¡i =2 QI

" with one weak assumption.

Proposition 3. Given ±; if Qn 2 QD0 and " is small enough, then

² C is the unique optimal action if and only if q¡i 2 QC" [ QI

"

² D is the unique optimal action if and only if q¡i 2 QD" [ QI

"

Proof:Fix ´ > 0 in Proposition 1 and set k > 0 slightly larger than ´ > 0: Now take any

q¡i such that Á (q¡i) > 1¡±± 4 (q¡i; p0) ¡ k and q¡i 2 QD

" : We prove that D is theunique optimal action in this region. If a player plays C, then Proposition 2 impliesthat the continuation strategy is ¾C if " is su¢ciently small: Since ¾C is dominated by¾D in this region, the unique optimal action should be D: Similarly, take any Á suchthat 1¡±

± 4 (q¡i; p0) + ´ > Á (q¡i) and q¡i 2 QC" : If D is played, then Proposition 2

and the assumption Qn 2 QD0 implies that the continuation strategy is ¾D as long as

" is small enough. Since ¾D is dominated by ¾C in this region, the unique optimalaction should be C:

Since any other q¡i 2 QD" satis…es Á (q¡i) 5 1¡±

± 4 (q¡i; p0) ¡ k and any otherq¡i 2 QD

" satis…es Á (q¡i) = 1¡±± 4 (q¡i; p0) + ´ satis…es q¡i 2 QC

" for small " > 0;the proof is complete.¥

14

Page 15: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

The last thing we have to make sure is that players are actually playing either ¾C

or ¾D on the equilibrium path after they initially randomize between these strategies.It is almost obvious from the arguments we used to prove the propositions.

Proposition 4. Suppose that (i) : Assumption 1 is satis…ed and n = 2, or (ii) :

Assumption 2 is satis…ed. Given ± 2³

M(0)1+M(0) ; 1

´; if Qn ½ QD

0 ; then there is a

" > 0 such that for any p"; The following proposition shows that ½¤" (q¡i) is a

symmetric sequential equilibrium, which generates the same outcome distributionas (:::; q¤ (±; p")¾C + (1 ¡ q¤ (±; p"))¾D; :::):

Proof.Since ± 2

³M(0)

1+M(0) ; 1´

; q¤ (±; p") exists in (0; 1) if " is small enough. Qn ½ QD" is

also satis…ed for small ":Suppose that player i chooses ¾C as her strategy in the …rst period. Set " > 0

small enough for Corollary 1 and Proposition 2 to hold. As long as (C;C) is theoutcome, Á > Á and C is the unique optimal action by Corollary 1 and Proposition2.

Consider a history where player i observed some D for the …rst time or a historywhere player i started playing D; After this sort of history, Át

i = Á (q¡i (hti)) is going

to stay in Qn forever by Proposition 2 as long as she continues playing D: SinceQn ½ QD

" , this continuation strategy is clearly ¾D:We also know that ½¤

" (q¡i) assigns the best response after any o¤ the equilibriumhistory. Moreover, consistency is obvious.

Taking " small such that all the above arguments go through, we con…rm that½¤

" (q¡i) is a symmetric sequential equilibrium, which generates the same outcomedistribution as

(:::; q¤ (±; p")¾C + (1 ¡ q¤ (±; p"))¾D; :::) by construction:¥

Since the probability that everyone chooses ¾C in this sequential equilibrium;

q¤ (±; p")n¡1 ; gets closer to 1 as ± gets closer to M(0)

1+M(0) by Lemma 1, an outcome

arbitrary close to the e¢cient outcome can be achieved for ± arbitrary close to M(0)1+M(0) .

For high ±; we can use a public randomization device again to reduce ± e¤ectively.It is also possible to use Ellison’s trick as in Ellison [6] to achieve an almost e¢cientoutcome although the strategy is more complex and no longer a grim trigger. . Hereis the corollary of Proposition 3 regarding to an approximately e¢cient outcome.

Corollary 2. Suppose that (i) : Assumption 1 is satis…ed and n = 2, or (ii) : As-sumption 2 is satis…ed. For any k > 0; if Qn ½ QD

0 ; then there is a " > 0 such thatfor any p"; there exists a sequential equilibrium whose symmetric equilibrium payo¤is more than 1 ¡ k:

Finally, we have to check if the assumption Qn ½ QD0 used above to prove propo-

sitions is not so strong. First of all, this is always satis…ed when n = 2 for a rangeof ± with q¤ (±; p0) 2

¡12 ; 1

¢; that is, when ± is between M(0)

1+M(0) and M(0)+M(1)1+M(0)+M(1) : The

following proposition provides su¢cient conditions for Qn ½ QD0 to hold for general

n.

15

Page 16: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

Proposition 5. :

² If M (k) =M for k = 1; :::; n¡1, then Qn ½ QD0 and QI

0 6= ; for ± 2³

M1+M ; nM

1+nM

´

² Regarding n as a parameter, take a sequence of the stage game with n = 2; 3; ::::::If there exists a lower bound M > 0 such that min

1·k·n¡1M (k) = M independent

of n; then there exists n such that for all n = n; Qn ½ QD0 (n) and QI

0 (n) 6= ;for ± 2

³M(0)

1+M(0) ;nM

1+nM

´

Proof: If the deviation gain is constant, QI0 =

©q¡ijÁ (q¡i) = 1¡±

± 4ª

So, 1n <

1¡±± 4 i¤ Qn ½ QD

0 (n) : Combining this inequality with M1+M < ± for QI

0 6= ;; ± 2³M

1+M ; nM1+nM

´is obtained:

If M is independent of n; Qn ½ QD0 (n) if 1

n < 1¡±± M: So, Qn ½ QD

0 (n) and

QI0 (n) 6= ; for ± 2

³M(0)

1+M(0) ;nM

1+nM

´for all n = n if n is chosen such that M(0)

1+M(0) < nM1+nM

The assumption of constant deviation gain is natural in the team production,where a player’s cost comes from disutility associated with the e¤ort level.

6. Conclusion

In this paper, we clarify the incentive structure in a general repeated prisoner’sdilemma with private monitoring when players are using a mixture of a grim trig-ger strategy and permanent defection, and provide the su¢cient conditions underwhich the simple grim trigger strategy supports the e¢cient outcome as a sequentialequilibrium for some range of discount factors. It is also shown that the best responseto a mixture of grim trigger strategy and permanent defection can be characterizedalmost uniquely, which makes it possible to provide the clear representation of thesequential equilibrium supporting the e¢cient outcome .

There have been two lines of research exploring the sustainability of the e¢cientoutcome or the possibility of any folk theorem in repeated prisoners’ dilemma withprivate monitoring. One direction of research is based on grim trigger strategies.Bhaskar [3] and Sekiguchi [16] belong to this literature as well as this paper. Theemphasis of these papers are on coordination of players’ actions and beliefs. Asshown in Mailath and Morris [11], the almost common knowledge of continuationstrategies is very important for subgame perfect equilibria or perfect public equilibriato be robust to the perturbation with respect to private noise because it enablesplayers to keep coordinating with others without public signals. In this paper, in facta player strongly believes that the other players are playing ¾C when she continuesplaying C; and believes that the other players are playing ¾D once she started playingD: However, note that our strategy does not exhibit such a strong coordination asrequired in Mailath and Morris [11]. While Mailath and Morris [11] requires thatplayers’ continuation strategies are almost common knowledge after any history, adiscoordination of belief or continuation strategy can occur almost surely for our

16

Page 17: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

mixing grim trigger strategy. When a player observes D for the …rst time, she is stilluncertain about the other player’s continuation strategies.

The other direction of research is based on a complete mixed strategies whichmakes the other player indi¤erent over many strategies. Ely and Välimäki [7] andPiccione [15] are among papers in this direction.7

One advantage of the …rst approach compared to the second approach is:

1. The equilibrium needs to use mixing only at the …rst period of the game, whilethe latter approach uses a behavior strategy which let players randomize atevery period after every history.

Another advantage, which is closely related to the …rst one, is as follows:

2. Since our strategy is almost pure, it is very easy to justify the use of a mixedstrategy. Especially, our strategy satis…es the re…nement criterion for repeatedgame equilibria with respect to incomplete information on players’ payo¤s,which is proposed by Bhaskar[4]. This criterion prohibits players to play dif-ferent continuation strategies after di¤erent histories if players are indi¤erentbetween these continuation strategies after all these di¤erent histories. Strate-gies in Ely and Välimäki [7] or Piccione [15] violate this criterion at the secondperiod when private signals are independent. While puri…cation is straightfor-ward for our strategy by introducing a small amount of uncertainty into stagegame payo¤s, payo¤ uncertainty has to depend on a private history in a par-ticular way to purify equilibria in Ely and Välimäki [7] or Piccione [15]. It isalso easy to adopt Nash’s population interpretation to purify our equilibrium.What we have in our mind is a pool of players who are matching with the otherplayers to play repeated games, where most of the players use grim trigger andonly a small portion of the players use permanent defection.

The case where private monitoring is not almost perfect has not been systemati-cally investigated and it remains as a important topic for future research.

7Obara [14] uses the same kind of strategy for repeated partenership games with imperfect publicmonitoring and constructs a sequential equilibrium which cannot be supported by public perfectequilibria.

17

Page 18: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

Appendix.

Proof of Lemma 1.

When ± = 4(0)1+4(0) ; q¤ (±; p0) = 1 is the solution of the equation in q:

Vi (¾C ;q¡i : p0; ±) ¡ Vi (¾D;q¡i : p0; ±) = 0

where q¡i (µ) =n¡1Pµ=0

(1 ¡ q)µ qn¡1¡µ¡n¡1

µ

¢for kµ = 0; :::; n ¡ 1::

We just need to show that @q¤(±;p0)@± j

±= 4(0)1+4(0)

< 0 using the implicit function theo-

rem. Since

Vi (¾C ;q¡i : p0; ±) ¡ Vi (¾D;q¡i : p0; ±)

= (1 ¡ ±)n¡1X

µ=0

(1 ¡ q)µ qn¡1¡µ

µn ¡ 1

µ

¶4 (µ; p0) ¡ ±qn¡1

@q¤ (±; p0)

@±j±= 4(0)

1+4(0)= ¡

@Vi(¾C ;q¤¡i:p0;±)¡Vi(¾D;q¤

¡i:p0;±)@±

@Vi(¾C ;q¤¡i:p0;±)¡Vi(¾D;q¤

¡i:p0;±)@q

j±= 4(0)

1+4(0)

= ¡ 1 + 4 (0)

(1 ¡ ±) (n ¡ 1) 4 (1) + ± (n ¡ 1)j±= 4(0)

1+4(0)

= ¡ 1

n ¡ 1

1

4 (0) + 4 (1)> 0

¥

Proof of Proposition2

case 1:Let ht

i =¡ht¡1

i ; (C;C)¢

such that Át¡1i = Á

¡q¡i

¡ht¡1

i

¢¢2 Wk:

Applying Bayes’ rule8 ,

Áti = Á (q¡i (ht

i))

=Át¡1

i P(!t¡1=Cj ht¡1i )

(1¡Át¡1i )P(!t¡1

i =Cjµt¡1 6=0;ht¡1i )+Át¡1

i P(!t¡1i =Cj ht¡1

i )

This function is increasing in Át¡1i and crosses 45± line once. Note that this

function is bounded below by '¡Át¡1

i

¢=

Át¡1i (1¡")

(1¡Át¡1i )"+Át¡1

i

: Let bÁ be the unique …xed

point of this mapping. Given that Áti = Á

¡q¡i

¡ht¡1

i

¢¢2 Wk; it is easy to see that

8All the conditional distributions implicitly depend on the level of initial mixture between ¾C ;¾D:

18

Page 19: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

'¡Át¡1

i

¢ ³and bÁ

´can be made larger than Á0 > 0 for any q¡i uniformly by choosing

" small enough . If Át¡1i < bÁ; then, as long as players continue to observe C; 'n

¡Á1

i

¢

is going to increase monotonically to bÁ: On the other hand, since Áti = '

¡Át¡1

i

¢and

Át+ni = '

¡Át+n¡1

i

¢; Át+n¡1

i is larger than 'n¡Át¡1

i

¢for any n = 0; 1; ::: If Át¡1

i = bÁ;

then Át+n¡1i = bÁ > Á0 for any n: This implies that

©Át+n

i

ª1n=0

is always above Á0:

case 2:½

hti =

¡ht¡1

i ;¡C;!t¡1

i

¢ ¢for t = 3

hti =

¡C;!t¡1

i

¢for t = 2

with !t¡1i = !0

i 6= C

Suppose that t = 3: By Bayes’ Rule,

Áti = Á (q¡i (ht

i))

=Át¡2

i P((!t¡2;!t¡1¡i ;!t¡1

i )=(C;C;!0i)jµt¡2=0)

(1¡Át¡2i )P((!t¡2

i ;!t¡1i )=(C;!0

i)jµt¡2 6=0; ht¡2i )+Át¡2

i P((!t¡2i ;!t¡1

i )=(C;!0i)jµt¡2=0)

This is bounded above by

P¡¡

!t¡2; !t¡1¡i ; !t¡1

i

¢= (C;C; !0

i) jµt¡2 = 0¢

P¡¡

!t¡2i ; !t¡1

i

¢= (C; !0

i) jµt¡2 = 0¢

5 P (# (!0ijC))

P (# (!0ijC)) + (n ¡ 1)#(!0

ijC) P (# (!0ijC))

=1

1 + (n ¡ 1)#(!0ijC)

5 1

n

where P (# (!0ijC)) is the probability of the event that #(!0

ijC) errors occur.So, once players observed a bad signal for the …rst time, the posterior Át

i jumpsdown at least below 1

n independent of the prior Át¡1i or qt¡1

¡i for t = 3: This argumentis independent of the level of ":

When t = 2; Áti decreases enough to be less than 1

n if " is very small. This isbecause players do not interpret it as an error but as a signal of ¾D at the …rstperiod.

case 3:½

hti =

¡ht¡1

i ;¡D;!t¡1

i

¢ ¢for t = 3

hti =

¡D;!t¡1

i

¢for t = 2

with !t¡1i = !00

i

we have to treat (i) n = 3 and (ii) n = 2 separately again.

(i): n = 3By Bayes’ Rule,

19

Page 20: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

Áti = Á (q¡i (ht

i))

=Át¡1

i P((!t¡1¡i ;!t¡1

i )=(C;!00i )jµt¡1=0)

(1¡Át¡1i )P(!t¡1

i =!00i jµt¡1 6=0)+Át¡1

i P(!t¡1i =!00

i jµt¡1=0)

This is bounded above by

P¡¡

!t¡1¡i ; !t¡1

i

¢= (C;!00

i ) jµt¡1 = 0¢

P¡!t¡1

i = !00i jµt¡1 = 0

¢

5 P (# (!00i jC) + n ¡ 1)

P (# (!00i jC) + n ¡ 1) + P (# (!00

i jC) + n ¡ 1)n¡1Pm=1

¡(n¡2)(n¡1)

m

¢¡n¡1m

¢

=1

1+n¡1Pm=1

¡(n¡2)(n¡1)

m

¢¡n¡1m

¢

5 1

6(This holds with equality when n = 3)

This argument is independent of "; too.

(ii): n = 2

Á¡q¡i

¡ht¡1

i ; (D;C)¢¢

=Át¡1

i (1¡2P (1)¡P (2))+(1¡Át¡1i )(P (1)+P (2))

Át¡1i (1¡P (1)¡p(2))+(1¡Át¡1

i )(P (1)+P (2))

Á¡q¡i

¡ht¡1

i ; (D;D)¢¢

=Át¡1

i P (1)+(1¡Át¡1i )(1¡P (1)¡P (2))

Át¡1i (P (1)+P (2))+(1¡Át¡1

i )(1¡P (1)¡P (2))

where P (k) is a probability that k errors occur.It can be shown that Á

¡qt

¡i

¡ht¡1

i ; (D;!i)¢¢

5 12 when " is small. See Sekiguchi

[16] for detail.With case 2 and case 3, I can conclude that Á (q¡i (ht

i)) 5 1n after any history

such as

²½

hti =

¡ht¡1

i ;¡C;!t¡1

i

¢ ¢for t = 3

hti =

¡C;!t¡1

i

¢for t = 2

with !t¡1i 6= C

or

²½

hti =

¡ht¡1

i ;¡D;!t¡1

i

¢ ¢for t = 3

hti =

¡D;!t¡1

i

¢for t = 2

¥

20

Page 21: The Repeated Prisoners’ Dilemma with Private …The Repeated Prisoners’ Dilemma with Private Monitoring: a N-player case¤ Ichiro Obara Department of Economics University of Pennsylvania

References

[1] Abreu, D. Pearce, and E. Stachetti, Toward a theory of discounted repeatedgames with imperfect monitoring, Econometrica 58 (1990), 1041-1064.

[2] V. Bhaskar, Informational constraints and overlapping generations model: folkand anti-folk theorem, Rev. of Econ. Stud 65, (1998), 135-149.

[3] V. Bhaskar, Sequential equilibria in the repeated prisoner’s dilemma with privatemonitoring, mimeo, 1999.

[4] V. Bhaskar, The robustness of repeated game equilibria to incomplete payo¤information, mimeo, 1999.

[5] O. Compte, Communication in repeated games with imperfect private monitor-ing, Econometrica 66 (1998), 597-626.

[6] G. Ellison, Cooperation in the prisoner’s dilemma with anonymous randommatching, J. Econ. Theory 61 (1994), 567-588.

[7] J. C. Ely and J. Välimäki, A robust folk theorem for the prisoner’s dilemma,mimeo, 1999.

[8] D. Fudenberg, D. Levine, and E. Maskin, The folk theorem with imperfect publicinformation, Econometrica 62 (1994), 997-1040.

[9] J. Harsanyi, Games with randomly distributed payo¤s: a new rational for mixed-strategy equilibrium points, Int. J. Game Theory 2, (1973), 1-23.

[10] M. Kandori and H. Matsushima, Private observation, communication, and col-lusion, Econometrica 66 (1998), 627-652.

[11] G. Mailath and S. Morris, Repeated games with almost-public monitoring,mimeo, 1999.

[12] H. Matsushima, On the theory of repeated game with non-observable actions,part I: Anti-Folk Theorem without communication. Econ. Letters 35 (1990),253-256.

[13] I. Obara, Repeatd prisoners’ dilemma with private monitoring: a N-player case,mimeo, 1999

[14] I. Obara, Private strategy and e¢ciency: repeated partnership games revisited,mimeo, 1999.

[15] M. Piccione, The repeated prisoner’s dilemma with imperfect private monitoring,mimeo, 1998.

[16] T. Sekiguchi, E¢ciency in repeated prisoner’s dilemma with private monitoring,J. Econ. Theory 76 (1997), 345-361.

21