Repeated Games Examples of Repeated Prisoner’s Dilemma Overfishing Transboundary pollution Cartel...


Repeated Games

Examples of Repeated Prisoner’s Dilemma

• Overfishing
• Transboundary pollution
• Cartel enforcement
• Labor union
• Public goods

The Tragedy of the Global Commons

Free-rider Problems


Repeated Games

Some Questions:

• What happens when a game is repeated?
• Can threats and promises about the future influence behavior in the present?
• Cheap talk
• Finitely repeated games: Backward induction
• Indefinitely repeated games: Trigger strategies

Repeated Games

Can threats and promises about future actions influence behavior in the present? Consider the following game, played twice:

        C      D
C      3,3    0,5
D      5,0    1,1

See Gibbons: 82-104.

Repeated Games

Draw the extensive form game:

First-round payoffs: (3,3) (0,5) (5,0) (1,1)

Total payoffs after the second round:
After (3,3): (6,6) (3,8) (8,3) (4,4)
After (0,5): (3,8) (0,10) (5,5) (1,6)
After (5,0): (8,3) (5,5) (10,0) (6,1)
After (1,1): (4,4) (1,6) (6,1) (2,2)
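The sixteen terminal payoffs of the twice-played game are just sums of two one-shot outcomes. A minimal sketch that enumerates them, assuming the payoff matrix given earlier (the names `PAYOFF` and `two_stage_payoffs` are illustrative, not from the slides):

```python
from itertools import product

# One-shot payoffs (row player, column player) from the game above.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def two_stage_payoffs():
    """Map each (first-round outcome, second-round outcome) to summed payoffs."""
    totals = {}
    for first, second in product(PAYOFF, repeat=2):
        a1, b1 = PAYOFF[first]
        a2, b2 = PAYOFF[second]
        totals[(first, second)] = (a1 + a2, b1 + b2)
    return totals

terminal = two_stage_payoffs()
for first in PAYOFF:
    # One line per first-round outcome, listing the four possible totals.
    print(first, [terminal[(first, second)] for second in PAYOFF])
```

Each printed line corresponds to one first-round branch of the extensive form.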

Repeated Games

Now, consider three repeated game strategies:

D (ALWAYS DEFECT): Defect on every move.

C (ALWAYS COOPERATE): Cooperate on every move.

T (TRIGGER): Cooperate on the first move, then cooperate after the other cooperates. If the other defects, then defect forever.

Repeated Games

If the game is played twice, the V(alue) to a player using ALWAYS DEFECT (D) against an opponent using ALWAYS DEFECT (D) is:

V (D/D) = 1 + 1 = 2, and so on . . .
V (C/C) = 3 + 3 = 6
V (T/T) = 3 + 3 = 6
V (D/C) = 5 + 5 = 10
V (D/T) = 5 + 1 = 6
V (C/D) = 0 + 0 = 0
V (C/T) = 3 + 3 = 6
V (T/D) = 0 + 1 = 1
V (T/C) = 3 + 3 = 6
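These totals can be reproduced by playing the three strategies against each other. The sketch below is one way to do that; the function names are hypothetical, and TRIGGER is modeled as reacting only to the other player's past defections, which is all these pairings require.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(my_history, their_history):
    return "D"

def always_cooperate(my_history, their_history):
    return "C"

def trigger(my_history, their_history):
    # Cooperate until the other player has defected once, then defect forever.
    return "D" if "D" in their_history else "C"

STRATEGIES = {"D": always_defect, "C": always_cooperate, "T": trigger}

def value(me, other, rounds):
    """Total payoff V(me/other) to `me` over a fixed number of rounds."""
    mine, theirs, total = [], [], 0
    for _ in range(rounds):
        a = STRATEGIES[me](mine, theirs)
        b = STRATEGIES[other](theirs, mine)
        total += PAYOFF[(a, b)][0]
        mine.append(a)
        theirs.append(b)
    return total

for me in "DCT":
    for other in "DCT":
        print(f"V({me}/{other}) = {value(me, other, 2)}")
```

Changing the last argument of `value` from 2 to 3 gives the totals for the game played three times.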

Repeated Games

And 3x:

V (D/D) = 1 + 1 + 1 = 3
V (C/C) = 3 + 3 + 3 = 9
V (T/T) = 3 + 3 + 3 = 9
V (D/C) = 5 + 5 + 5 = 15
V (D/T) = 5 + 1 + 1 = 7
V (C/D) = 0 + 0 + 0 = 0
V (C/T) = 3 + 3 + 3 = 9
V (T/D) = 0 + 1 + 1 = 2
V (T/C) = 3 + 3 + 3 = 9

Repeated Games

Time average payoffs (n = 3):

V (D/D) = (1 + 1 + 1)/3 = 1
V (C/C) = (3 + 3 + 3)/3 = 3
V (T/T) = (3 + 3 + 3)/3 = 3
V (D/C) = (5 + 5 + 5)/3 = 5
V (D/T) = (5 + 1 + 1)/3 = 7/3
V (C/D) = (0 + 0 + 0)/3 = 0
V (C/T) = (3 + 3 + 3)/3 = 3
V (T/D) = (0 + 1 + 1)/3 = 2/3
V (T/C) = (3 + 3 + 3)/3 = 3

Repeated Games

Time average payoffs over n rounds:

V (D/D) = (1 + 1 + 1 + ...)/n = 1
V (C/C) = (3 + 3 + 3 + ...)/n = 3
V (T/T) = (3 + 3 + 3 + ...)/n = 3
V (D/C) = (5 + 5 + 5 + ...)/n = 5
V (D/T) = (5 + 1 + 1 + ...)/n = 1 + 4/n, slightly above 1 (written 1+)
V (C/D) = (0 + 0 + 0 + ...)/n = 0
V (C/T) = (3 + 3 + 3 + ...)/n = 3
V (T/D) = (0 + 1 + 1 + ...)/n = 1 - 1/n, slightly below 1 (written 1-)
V (T/C) = (3 + 3 + 3 + ...)/n = 3

Repeated Games

Now draw the matrix form of this game:

1x

        C      D      T
T      3,3    0,5    3,3
C      3,3    0,5    3,3
D      5,0    1,1    5,0

Repeated Games

Time Average Payoffs

        C      D        T
T      3,3    1-,1+    3,3
C      3,3    0,5      3,3
D      5,0    1,1      1+,1-

If the game is repeated, ALWAYS DEFECT is no longer dominant.

Repeated Games

        C      D        T
T      3,3    1-,1+    3,3
C      3,3    0,5      3,3
D      5,0    1,1      1+,1-

… and TRIGGER achieves “a NE with itself.”
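Both claims (ALWAYS DEFECT is no longer dominant, and TRIGGER is a best response to itself) can be checked by computing best responses in the time-average matrix. A minimal sketch, assuming the example payoffs 5, 3, 1, 0 and a hypothetical horizon of n = 100 rounds:

```python
from fractions import Fraction

def time_average(me, other, n):
    """Time-average payoff to `me` against `other` over n rounds."""
    T, R, P, S = 5, 3, 1, 0
    if me == "D" and other == "D":
        total = P * n
    elif me == "D" and other == "C":
        total = T * n
    elif me == "D" and other == "T":
        total = T + P * (n - 1)      # exploit once, then mutual defection
    elif me == "C" and other == "D":
        total = S * n
    elif me == "T" and other == "D":
        total = S + P * (n - 1)      # suckered once, then mutual defection
    else:
        total = R * n                # every remaining pairing cooperates throughout
    return Fraction(total, n)

def best_responses(other, n):
    """Set of strategies that maximize the time-average payoff against `other`."""
    scores = {me: time_average(me, other, n) for me in "CDT"}
    top = max(scores.values())
    return {me for me, score in scores.items() if score == top}

n = 100
for other in "CDT":
    print(other, sorted(best_responses(other, n)))
```

D is the unique best response to C and to D, but not to T, so it is not dominant; T is among the best responses to T, so (T, T) is a Nash equilibrium of this three-strategy game.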

Repeated Games

Time Average Payoffs

T(emptation) > R(eward) > P(unishment) > S(ucker)

        C      D        T
T      R,R    P-,P+    R,R
C      R,R    S,T      R,R
D      T,S    P,P      P+,P-

Discounting

The discount parameter, δ, is the weight of the next payoff relative to the current payoff.

In an indefinitely repeated game, δ can also be interpreted as the likelihood of the game continuing for another round (so that the expected number of moves per game is 1/(1-δ)).

The V(alue) to someone using ALWAYS DEFECT (D) when playing with someone using TRIGGER (T) is the sum of T for the first move, δP for the second, δ²P for the third, and so on (Axelrod: 13-4):

V (D/T) = T + δP + δ²P + …
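The infinite sum has the closed form T + δP/(1-δ), since δP + δ²P + … is a geometric series. A quick numerical check, assuming the example payoffs T = 5, P = 1 and an illustrative δ = 0.9 (both the value of δ and the helper name are assumptions):

```python
def discounted_value(payoffs, delta):
    """Sum of payoffs[t] * delta**t over a finite stream of per-round payoffs."""
    return sum(p * delta ** t for t, p in enumerate(payoffs))

T, P = 5, 1          # Temptation and Punishment from the example game
delta = 0.9          # illustrative discount parameter (an assumption)

# ALWAYS DEFECT against TRIGGER: T once, then P in every later round.
# A long finite stream stands in for the infinite one.
stream = [T] + [P] * 999
approx = discounted_value(stream, delta)
closed_form = T + delta * P / (1 - delta)

print(round(approx, 6), round(closed_form, 6))
```

The truncated sum and the closed form agree to within floating-point error because the omitted tail is weighted by δ raised to the 1000th power.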

“The Shadow of the Future”

Discounting

Writing this as V (D/T) = T + δP + δ²P + ..., we have the following:

V (D/D) = P + δP + δ²P + … = P/(1-δ)
V (C/C) = R + δR + δ²R + … = R/(1-δ)
V (T/T) = R + δR + δ²R + … = R/(1-δ)
V (D/C) = T + δT + δ²T + … = T/(1-δ)
V (D/T) = T + δP + δ²P + … = T + δP/(1-δ)
V (C/D) = S + δS + δ²S + … = S/(1-δ)
V (C/T) = R + δR + δ²R + … = R/(1-δ)
V (T/D) = S + δP + δ²P + … = S + δP/(1-δ)
V (T/C) = R + δR + δ²R + … = R/(1-δ)

Discounting

Discounted Payoffs

T > R > P > S, 0 < δ < 1

Each cell lists the row player's payoff, then the column player's.

     C                    D                              T
T    R/(1-δ), R/(1-δ)     S + δP/(1-δ), T + δP/(1-δ)     R/(1-δ), R/(1-δ)
C    R/(1-δ), R/(1-δ)     S/(1-δ), T/(1-δ)               R/(1-δ), R/(1-δ)
D    T/(1-δ), S/(1-δ)     P/(1-δ), P/(1-δ)               T + δP/(1-δ), S + δP/(1-δ)

Discounting

T weakly dominates C: both earn R/(1-δ) against C and against T, but against D, TRIGGER earns S + δP/(1-δ) while ALWAYS COOPERATE earns only S/(1-δ), which is smaller because P > S.

Discounting

Now consider what happens to these values as δ varies (from 0 to 1).


V(D/D) > V(T/D) for every δ, since V(D/D) = P/(1-δ) = P + δP/(1-δ), V(T/D) = S + δP/(1-δ), and P > S. So D is a best response to D.


Discounting

Now consider what happens to these values as δ varies (from 0 to 1):

For all values of δ:
V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of δ s.t. V(D/T) = V(T/T)? Call this δ*.

V(D/T) = V(T/T)
T + δP/(1-δ) = R/(1-δ)
T - δT + δP = R
T - R = δ(T - P)
δ* = (T-R)/(T-P)

If δ < δ*, the following ordering holds:

V(D/T) > V(T/T) > V(D/D) > V(T/D)

D is dominant: GAME SOLVED

Discounting

Now consider what happens to these values as δ varies (from 0 to 1):

For all values of δ:
V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of δ s.t. V(D/T) = V(T/T)? Call this δ*.

δ* = (T-R)/(T-P)

If δ > δ*, the following ordering holds:

V(T/T) > V(D/T) > V(D/D) > V(T/D)

D is a best response to D; T is a best response to T; multiple NE.
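For the example payoffs (T, R, P, S) = (5, 3, 1, 0), the threshold works out to δ* = 1/2. The sketch below checks the equality at δ* and the two orderings on either side of it; the test values 1/4 and 3/4 and the function names are arbitrary choices for illustration.

```python
from fractions import Fraction

T, R, P, S = 5, 3, 1, 0   # payoffs of the example game, T > R > P > S

def v_dt(d):  # ALWAYS DEFECT against TRIGGER
    return T + d * P / (1 - d)

def v_tt(d):  # TRIGGER against TRIGGER
    return R / (1 - d)

def v_dd(d):  # ALWAYS DEFECT against ALWAYS DEFECT
    return P / (1 - d)

def v_td(d):  # TRIGGER against ALWAYS DEFECT
    return S + d * P / (1 - d)

delta_star = Fraction(T - R, T - P)
print("delta* =", delta_star)

low, high = Fraction(1, 4), Fraction(3, 4)   # one value on each side of delta*
print(v_dt(low) > v_tt(low) > v_dd(low) > v_td(low))
print(v_tt(high) > v_dt(high) > v_dd(high) > v_td(high))
```

Exact fractions are used so that the equality V(D/T) = V(T/T) at δ* holds without rounding error.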

Discounting

Graphically:

[Figure: V(T/T) = R/(1-δ) and V(D/T) = T + δP/(1-δ) plotted against δ; the vertical axis V marks T and R, and the horizontal axis marks δ* and 1.]

The V(alue) to a player using ALWAYS DEFECT (D) against TRIGGER (T), and the V(T/T), as a function of the discount parameter (δ).

The Folk Theorem

[Figure: the points (R,R), (T,S), (S,T), and (P,P) plotted in payoff space.]

The payoff set of the repeated PD is the convex closure of the points [(T,S); (R,R); (S,T); (P,P)].


The shaded area is the set of payoffs that Pareto-dominate the one-shot NE (P,P).


Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.


In other words, in the repeated game, if the future matters “enough,” i.e., (δ > δ*), there are zillions of equilibria!

The Folk Theorem

• The theorem tells us that in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not.

• In this context, evolution can be seen as a process that selects for repeated game strategies with efficient payoffs.

“Survival of the Fittest”


Thinking About Evolution

Fifteen months after I had begun my systematic enquiry, I happened to read for amusement ‘Malthus on Population’ . . . It at once struck me that . . . favorable variations would tend to be preserved, and unfavorable ones to be destroyed. Here then I had at last got a theory by which to work.

Charles Darwin


Thinking About Evolution

Biological Evolution: Under the pressure of natural selection, any population (capable of reproduction and variation) will evolve so as to become better adapted to its environment, i.e., will develop in the direction of increasing “fitness.”

Economic Evolution: Firms that adopt efficient “routines” will survive, expand, and multiply; whereas others will be “weeded out” (Nelson and Winter, 1982).

The Evolution of Cooperation

Under what conditions will cooperation emerge in a world of egoists without central authority?

Axelrod uses an experimental method – the indefinitely repeated PD tournament – to investigate a series of questions: Can a cooperative strategy gain a foothold in a population of rational egoists? Can it survive better than its uncooperative rivals? Can it resist invasion and eventually dominate the system?      


The Indefinitely Repeated Prisoner’s Dilemma Tournament

Axelrod (1980a,b, Journal of Conflict Resolution).

A group of scholars were invited to design strategies to play indefinitely repeated prisoner’s dilemmas in a round robin tournament.

Contestants submitted computer programs that select an action, Cooperate or Defect, in each round of the game, and each entry was matched against every other, itself, and a control, RANDOM.

The Evolution of Cooperation


The Indefinitely Repeated Prisoner’s Dilemma Tournament

Axelrod (1980a,b, Journal of Conflict Resolution).

Contestants did not know the length of the games. (The first tournament lasted 200 rounds; the second varied probabilistically with an average of 151.)

The first tournament had 14 entrants, including game theorists, mathematicians, psychologists, political scientists, and others.

Results were published and new entrants solicited. The second tournament included 62 entrants . . .

The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

TFT cooperates in the first round, and then does whatever the opponent did in the previous round.

TFT “was the simplest of all submitted programs and it turned out to be the best!” (31).

TFT was submitted by Anatol Rapoport to both tournaments, even after contestants could learn from the results of the first.
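TFT's rule fits in one line of code. The sketch below is not Axelrod's tournament code; it is a hypothetical two-strategy match using the payoffs 5, 3, 1, 0 and a 200-round game, the length of the first tournament.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Cooperate first, then copy the opponent's previous move.
    return their_history[-1] if their_history else "C"

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    """Return the total scores of both players after a fixed number of rounds."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
```

Against ALWAYS DEFECT, TFT loses only the first round and then matches its opponent, so it trails by a fixed margin; against itself it earns the full cooperative payoff in every round.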

The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

In addition, Axelrod provides a “theory of cooperation” based on his analysis of the repeated prisoner’s dilemma game.

In particular, if the “shadow of the future” looms large, then players may have an incentive to cooperate. A cooperative strategy such as TFT is “collectively stable.”

He also offers an evolutionary argument, i.e., TFT wins in an evolutionary competition in which payoffs play the role of reproductive rates.

The Evolution of Cooperation


The Indefinitely Repeated Prisoner’s Dilemma Tournament

This result has been so influential that “some authors use TIT FOR TAT as though it were a synonym for a self-enforcing, cooperative agreement” (Binmore, 1992, p. 433). And many have taken these results to have shown that TFT is the “best way to play” in IRPD.

While TFT won these, will it win every tournament? Is showing that TFT is collectively stable equivalent to predicting a winner in the computer tournaments? Is TFT evolutionarily stable?

The Evolution of Cooperation

The Evolution of Cooperation: An Evolutionary Tournament

Imagine a population of strategies matched in pairs to play repeated PD, where outcomes determine the number of offspring each leaves to the next generation.

– In each generation, each strategy is matched against every other, itself, and RANDOM.

– Between generations, the strategies reproduce, where the chance of successful reproduction (“fitness”) is determined by the payoffs (i.e., payoffs play the role of reproductive rates).

 

Then, strategies that do better than average will grow as a share of the population and those that do worse than average will eventually die out . . .


The Replicator Dynamic

Replicator Dynamics

There is a very simple way to describe this process. Let:

x(A) = the proportion of the population using strategy A in a given generation;
V(A) = strategy A’s tournament score;
V̄ = the population’s average score.

Then A’s population share in the next generation is:

x’(A) = x(A) · V(A) / V̄

Replicator Dynamics

For any finite set of strategies, the replicator dynamic will attain a fixed-point, where population shares do not change and all strategies are equally fit, i.e., V(A) = V(B), for all B.

However, the dynamic described is population-specific. For instance, if the population consists entirely of naive cooperators (ALWAYS COOPERATE), then x(A) = x’(A) = 1, and the process is at a fixed-point. To be sure, the population is in equilibrium, but only in a very weak sense. For if a single D strategy were to “invade” the population, the system would be driven away from equilibrium, and C would be driven toward extinction.
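The fixed point and the invasion can both be seen in a few lines. This sketch makes a simplifying assumption not stated in the slides: a strategy's score V(A) is taken to be its share-weighted average one-shot payoff against the current population. The names `replicator_step` and `SCORE` are illustrative.

```python
def replicator_step(shares, score):
    """One generation of x'(A) = x(A) * V(A) / V_bar.

    shares: dict strategy -> population share (summing to 1)
    score:  dict (A, B) -> payoff to A when matched against B
    """
    # Assumption: V(A) is the share-weighted average payoff against the population.
    fitness = {a: sum(shares[b] * score[(a, b)] for b in shares) for a in shares}
    average = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / average for a in shares}

# Per-round payoffs for ALWAYS COOPERATE (C) and ALWAYS DEFECT (D).
SCORE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# A population of pure cooperators is a fixed point.
pure = replicator_step({"C": 1.0, "D": 0.0}, SCORE)
print(pure)

# A small share of defectors invades and grows every generation.
shares = {"C": 0.99, "D": 0.01}
for generation in range(50):
    shares = replicator_step(shares, SCORE)
print(shares)
```

With no defectors present the shares never move, but any positive share of D earns more than the population average and grows at the expense of C.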

Simulating Evolution?

[Figure: population share (vertical axis, 0.020 to 0.140) of each strategy over 0 to 800 generations (horizontal axis). Curves are numbered by each strategy's position after the 1st generation; curve 1 is TFT.]

Source: Axelrod 1984, p. 51.

The Evolution of Cooperation: Simulating Evolution

An evolutionary model includes three components: Reproduction + Selection + Variation

[Diagram labels: Population of Strategies; Selection Mechanism; Variation Mechanism; Mutation or Learning; Reproduction; Competition; Invasion.]


The Trouble with TIT FOR TAT

TIT FOR TAT is susceptible to 2 types of perturbations:

Mutations: random Cs can invade TFT (TFT is not ESS), which in turn allows exploiters to gain a foothold.

Noise: a “mistake” between a pair of TFTs induces CD, DC cycles (“mirroring” or “echo” effect).

TIT FOR TAT never beats its opponent; it wins because it elicits reciprocal cooperation. It never exploits “naively” nice strategies.

(See Poundstone: 242-248; Casti 76-84.)


The Trouble with TIT FOR TAT

Noise in the form of random errors in implementing or perceiving an action is a common problem in real-world interactions. Such misunderstandings may lead “well-intentioned” cooperators into periods of alternating or mutual defection resulting in lower tournament scores.

TFT: C C C C D C D ….
TFT: C C C D C D C ….
(The “mistake” is the D in round 4 of the second row.)

Avg Payoff falls from R to (T+S)/2
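The echo effect is easy to reproduce. A minimal sketch, assuming a single flipped move by the second player in round 4 (the helper `play_with_mistake` is a hypothetical name) and the example payoffs 5, 3, 1, 0:

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(their_last):
    return "C" if their_last is None else their_last

def play_with_mistake(rounds, mistake_round):
    """Two TFT players; player 2's move is flipped once, in `mistake_round` (1-based)."""
    moves_1, moves_2 = [], []
    last_1 = last_2 = None
    for t in range(1, rounds + 1):
        a = tit_for_tat(last_2)
        b = tit_for_tat(last_1)
        if t == mistake_round:
            b = "D" if b == "C" else "C"   # the single implementation error
        moves_1.append(a)
        moves_2.append(b)
        last_1, last_2 = a, b
    return moves_1, moves_2

m1, m2 = play_with_mistake(rounds=7, mistake_round=4)
print("TFT:", " ".join(m1))
print("TFT:", " ".join(m2))

# Average payoff to player 1 over the two rounds of one echo cycle (rounds 5 and 6).
cycle = list(zip(m1[4:6], m2[4:6]))
echo_average = sum(PAYOFF[pair][0] for pair in cycle) / len(cycle)
print(echo_average)
```

With these payoffs the echo average is (5 + 0)/2 = 2.5, below the mutual-cooperation payoff of 3.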

The Trouble with TIT FOR TAT

Nowak and Sigmund (1993) ran an extensive series of computer-based experiments and found the simple learning rule PAVLOV outperformed TIT FOR TAT in the presence of noise.

PAVLOV (win-stay, lose-switch): Cooperate after both cooperated or both defected; otherwise defect.


The Trouble with TIT FOR TAT

PAVLOV cannot be invaded by random C; PAVLOV is an exploiter (will “fleece a sucker” once it discovers no need to fear retaliation).

A mistake between a pair of PAVLOVs causes only a single round of mutual defection followed by a return to mutual cooperation.

PAV: C C C C D C C
PAV: C C C D D C C
(The “mistake” is the D in round 4 of the second row.)
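PAVLOV's recovery from a single error can be traced the same way. This is a sketch under the same assumption of one flipped move in round 4; the function names are hypothetical.

```python
def pavlov(my_last, their_last):
    # Win-stay, lose-switch: cooperate after CC or DD, otherwise defect.
    if my_last is None:
        return "C"
    return "C" if my_last == their_last else "D"

def play_with_mistake(rounds, mistake_round):
    """Two PAVLOV players; player 2's move is flipped once, in `mistake_round` (1-based)."""
    moves_1, moves_2 = [], []
    last_1 = last_2 = None
    for t in range(1, rounds + 1):
        a = pavlov(last_1, last_2)
        b = pavlov(last_2, last_1)
        if t == mistake_round:
            b = "D" if b == "C" else "C"   # the single implementation error
        moves_1.append(a)
        moves_2.append(b)
        last_1, last_2 = a, b
    return moves_1, moves_2

p1, p2 = play_with_mistake(rounds=7, mistake_round=4)
print("PAV:", " ".join(p1))
print("PAV:", " ".join(p2))
```

After the error both players defect once, observe matching moves, and return to cooperation in the following round.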


Designing Repeated Game Strategies

Imagine a very simple decision making machine playing a repeated game. The machine has very little information at the start of the game: no knowledge of the payoffs or “priors” over the opponent’s behavior. It merely makes a choice, receives a payoff, then adapts its behavior, and so on.

The machine, though very simple, is able to implement a strategy against any possible opponent, i.e., it “knows what to do” in any possible situation of the game.

Designing Repeated Game Strategies

A repeated game strategy is a map from a history to an action. A history is all the actions in the game thus far ….

… T-3 T-2 T-1 To

C C C C D C C
C C C D D C C

History at time T

?

Designing Repeated Game Strategies

A repeated game strategy is a map from a history to an action. A history is all the actions in the game thus far ….

… T-3 T-2 T-1 To

C C C C D C C
C C C D D C D

History at time To

?

Designing Repeated Game Strategies

A repeated game strategy is a map from a history to an action. A history is all the actions in the game thus far, subject to the constraint of a finite memory:

… T-3 T-2 T-1 To

C C C C D C C
C C C D D C C

History of memory-4

?

Designing Repeated Game Strategies

TIT FOR TAT is a remarkably simple repeated game strategy. It merely requires recall of what happened in the last round (memory-1).

… T-3 T-2 T-1 To

C C C C D D C
C C C D D C D

History of memory-1

?
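A memory-1 strategy can be written down literally as a map: an opening move plus a table from last round's pair of actions to the next action. The representation below (a dict with a `"first"` key) is just one hypothetical encoding.

```python
# A memory-1 strategy: an opening move plus a table from last round's
# (own action, opponent's action) to the next action.
TIT_FOR_TAT = {
    "first": "C",
    ("C", "C"): "C", ("C", "D"): "D",
    ("D", "C"): "C", ("D", "D"): "D",
}

def next_action(strategy, history):
    """history: list of (own action, opponent's action) pairs, oldest first."""
    if not history:
        return strategy["first"]
    return strategy[history[-1]]   # memory-1: only the last round matters

history = [("C", "C"), ("C", "C"), ("C", "D")]
print(next_action(TIT_FOR_TAT, history))
```

Everything before the last round is ignored, which is what makes the strategy memory-1.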

Finite Automata

A FINITE AUTOMATON (FA) is a mathematical representation of a simple decision-making process. FA are completely described by:

• A finite set of internal states
• An initial state
• An output function
• A transition function

The output function determines an action, C or D, in each state. The transition function determines how the FA changes states in response to the inputs it receives (e.g., actions of other FA).

(Rubinstein, “Finite Automata Play the Repeated PD,” JET, 1986)
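The four components translate directly into a small data structure. The sketch below is an illustrative encoding, not Rubinstein's notation: the class name `Automaton`, the state labels "c" and "d", and the helper `run` are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Automaton:
    states: tuple            # finite set of internal states
    initial: str             # initial state
    output: dict             # output function: state -> action ("C" or "D")
    transition: dict         # transition function: (state, opponent action) -> next state

TIT_FOR_TAT = Automaton(
    states=("c", "d"), initial="c",
    output={"c": "C", "d": "D"},
    transition={("c", "C"): "c", ("c", "D"): "d",
                ("d", "C"): "c", ("d", "D"): "d"},
)

GRIM = Automaton(
    states=("c", "d"), initial="c",
    output={"c": "C", "d": "D"},
    # Once in state "d", GRIM never leaves it.
    transition={("c", "C"): "c", ("c", "D"): "d",
                ("d", "C"): "d", ("d", "D"): "d"},
)

def run(fa, opponent_actions):
    """Actions the automaton plays against a fixed sequence of opponent actions."""
    state, played = fa.initial, []
    for theirs in opponent_actions:
        played.append(fa.output[state])
        state = fa.transition[(state, theirs)]
    return played

print(run(TIT_FOR_TAT, "CDCC"))
print(run(GRIM, "CDCC"))
```

The two machines differ only in the transitions out of the defecting state, which is what separates forgiving from unforgiving punishment.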


FA will implement a strategy against any possible opponent, i.e., they “know what to do” in any possible situation of the game.

FA meet in 2-player repeated games and make a move in each round (either C or D). Depending upon the outcome of that round, they “decide” what to play on the next round, and so on.

FA are very simple, have no knowledge of the payoffs or priors over the opponent’s behavior, and no deductive ability. They simply read and react to what happens. Nonetheless, they are capable of a crude form of “learning” — they receive payoffs that reinforce certain behaviors and “punish” others.

Finite Automata

Page 63: Repeated Games Examples of Repeated Prisoner’s Dilemma Overfishing Transboundary pollution Cartel enforcement Labor union Public goods The Tragedy of the.

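Finite Automata

[Figure: state diagram of “TIT FOR TAT”, with two states, C and D, and transitions labeled by the opponent’s move (C or D).]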

Page 64

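Finite Automata

[Figure: state diagram of “TIT FOR TWO TATS”, with three states (outputs C, C, and D) and transitions labeled by the opponent’s move (C or D).]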

Page 65

Finite Automata

Some examples:

[Figure: state diagrams, with the START state marked, for “ALWAYS DEFECT”, “TIT FOR TAT”, “GRIM (TRIGGER)”, “PAVLOV”, and “M5”.]
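Four of the named examples can be written as transition tables. The definitions below follow the standard textbook versions of these strategies (including the cooperative initial state for TIT FOR TAT, GRIM, and PAVLOV) and are assumptions to that extent. M5 is omitted because its diagram cannot be recovered from the transcript.

```python
# Each automaton: (initial state, output function, transition function).
# Transitions are keyed by (current state, opponent's action).
AUTOMATA = {
    "ALWAYS DEFECT": ("d", {"d": "D"},
                      {("d", "C"): "d", ("d", "D"): "d"}),
    "TIT FOR TAT": ("c", {"c": "C", "d": "D"},
                    {("c", "C"): "c", ("c", "D"): "d",
                     ("d", "C"): "c", ("d", "D"): "d"}),
    # GRIM never leaves the defect state once the opponent has defected.
    "GRIM (TRIGGER)": ("c", {"c": "C", "d": "D"},
                       {("c", "C"): "c", ("c", "D"): "d",
                        ("d", "C"): "d", ("d", "D"): "d"}),
    # PAVLOV (win-stay, lose-shift): switch state whenever the opponent defects.
    "PAVLOV": ("c", {"c": "C", "d": "D"},
               {("c", "C"): "c", ("c", "D"): "d",
                ("d", "C"): "d", ("d", "D"): "c"}),
}


def respond(automaton, opponent_moves):
    """Return the actions an automaton plays against fixed opponent moves."""
    state, output, transition = automaton
    actions = []
    for move in opponent_moves:
        actions.append(output[state])
        state = transition[(state, move)]
    return "".join(actions)


for name, fa in AUTOMATA.items():
    print(f"{name:15s} {respond(fa, 'CDDCC')}")
# ALWAYS DEFECT   DDDDD
# TIT FOR TAT     CCDDC
# GRIM (TRIGGER)  CCDDD
# PAVLOV          CCDCC
```

Against the same opponent sequence, the strategies differ only after the first defection: TIT FOR TAT forgives once the opponent cooperates again, GRIM never does, and PAVLOV returns to cooperation after mutual defection.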

Page 66

Calculating Automata Payoffs

Time-average payoffs can be calculated because any pair of FA will achieve cycles, since each FA takes as input only the actions in the previous period (i.e., it is “Markovian”).

For example, consider the following pair of FA:

[Figure: state diagrams of “PAVLOV” and “M5”.]

Page 67

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

PAVLOV: C
M5:     D

Page 68

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

PAVLOV: C D
M5:     D C

Page 69

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

PAVLOV: C D D
M5:     D C D

Page 70

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

Payoff:  0 5 1
PAVLOV:  C D D
M5:      D C D
Payoff:  5 0 1

Page 71

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

Payoff:  0 5 1 0 5 1 0 5
PAVLOV:  C D D C D D C D
M5:      D C D D C D D C
Payoff:  5 0 1 5 0 1 5

Page 72

Calculating Automata Payoffs

[Figure: state diagrams of “PAVLOV” and “M5”.]

Payoff:  0 5 1 0 5 1 0 5   AVG=2
PAVLOV:  C D D C D D C D
M5:      D C D D C D D C
Payoff:  5 0 1 5 0 1 5     AVG=2

Each group of three rounds is one cycle.
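The cycle argument can be turned into a short procedure: follow the pair of internal states until one repeats, then average the payoffs over the repeating segment. The sketch below assumes deterministic automata and no noise. Because the M5 diagram is not recoverable from the transcript, it pairs PAVLOV with ALWAYS DEFECT instead; the function name and tuple encoding are illustrative.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each automaton: (initial state, output function, transition function).
PAVLOV = ("c", {"c": "C", "d": "D"},
          {("c", "C"): "c", ("c", "D"): "d", ("d", "C"): "d", ("d", "D"): "c"})
ALWAYS_DEFECT = ("d", {"d": "D"}, {("d", "C"): "d", ("d", "D"): "d"})


def time_average(fa1, fa2):
    """Average per-round payoffs over the cycle the pair eventually enters."""
    (s1, out1, tr1), (s2, out2, tr2) = fa1, fa2
    first_seen = {}   # joint state -> round in which it first occurred
    payoffs = []      # payoff pair for each round played so far
    while (s1, s2) not in first_seen:
        first_seen[(s1, s2)] = len(payoffs)
        a1, a2 = out1[s1], out2[s2]
        payoffs.append(PAYOFF[(a1, a2)])
        s1, s2 = tr1[(s1, a2)], tr2[(s2, a1)]
    # Rounds before the repeated joint state are transient and are dropped.
    cycle = payoffs[first_seen[(s1, s2)]:]
    n = len(cycle)
    return sum(p[0] for p in cycle) / n, sum(p[1] for p in cycle) / n


print(time_average(PAVLOV, ALWAYS_DEFECT))  # (0.5, 3.0)

# The slide's PAVLOV vs. M5 cycle pays 0, 5, 1 to each side in some order.
print(sum([0, 5, 1]) / 3)  # 2.0
```

The loop must terminate because there are only finitely many joint states, which is the reason a cycle is guaranteed.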

Page 73

Tournament Assignment

To design your strategy, access the programs through your fas Unix account. The Finite Automaton Creation Tool (fa) will prompt you to create a finite automaton to implement your strategy. Select the number of internal states, designate the initial state, and define the output and transition functions, which together determine how an automaton “behaves.” The program also allows you to specify probabilistic output and transition functions. Simple probabilistic strategies such as GENEROUS TIT FOR TAT have been shown to perform particularly well in noisy environments, because they avoid costly sequences of alternating defections that undermine sustained cooperation.
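A probabilistic transition can be illustrated with GENEROUS TIT FOR TAT, written here as a two-state automaton that forgives a defection with some probability. This is a sketch under assumptions: the `generosity` parameter, its example value of 1/3, and the function name are hypothetical and are not taken from the fa tool.

```python
import random


def generous_tit_for_tat(opponent_moves, generosity, rng):
    """Play GTFT against fixed opponent moves.

    After a defection, the automaton stays in (or returns to) the
    cooperative state with probability `generosity`.
    """
    state, actions = "c", []
    for move in opponent_moves:
        actions.append("C" if state == "c" else "D")
        if move == "C":
            state = "c"
        else:
            # Probabilistic transition: forgive or retaliate.
            state = "c" if rng.random() < generosity else "d"
    return "".join(actions)


rng = random.Random(0)  # seeded only to make the illustration repeatable
print(generous_tit_for_tat("CDDCD", 0.0, rng))  # CCDDC (plain TIT FOR TAT)
print(generous_tit_for_tat("CDDCD", 1.0, rng))  # CCCCC (always forgives)
print(generous_tit_for_tat("CDDCD", 1 / 3, rng))
```

With `generosity` set to 0 the automaton reduces to TIT FOR TAT, and with 1 it never retaliates. Intermediate values let it occasionally break out of a run of alternating defections.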

Page 74

Tournament Assignment

Creating your automaton

The program prompts the user to:

• specify the number of states in the automaton, with an upper limit of 50.

For each state, the program asks:

• “choose an action (cooperate or defect);” and
• “in response to cooperate (defect), transition to what state?”

Finally, the program asks:

• specify the initial state.

The program also allows the user to specify probabilistic outputs and transitions.
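The answers to these prompts are enough to define a deterministic automaton. The sketch below collects them into a validated structure. It is not the fa tool itself: the function names, the zero-based state numbering, and the dictionary layout are assumptions, and only the 50-state limit comes from the slide.

```python
MAX_STATES = 50  # upper limit stated on the slide


def build_automaton(actions, on_cooperate, on_defect, initial):
    """Assemble an automaton from prompt answers; states are numbered from 0."""
    n = len(actions)
    if not 1 <= n <= MAX_STATES:
        raise ValueError("number of states must be between 1 and 50")
    if len(on_cooperate) != n or len(on_defect) != n:
        raise ValueError("each state needs one transition per opponent action")
    if any(a not in ("C", "D") for a in actions):
        raise ValueError("each action must be 'C' or 'D'")
    if any(not 0 <= s < n for s in [initial, *on_cooperate, *on_defect]):
        raise ValueError("state index out of range")
    return {"actions": list(actions), "C": list(on_cooperate),
            "D": list(on_defect), "initial": initial}


def run(fa, opponent_moves):
    """Return the actions played against a fixed sequence of opponent moves."""
    state, played = fa["initial"], []
    for move in opponent_moves:
        played.append(fa["actions"][state])
        state = fa[move][state]   # transition chosen by the opponent's action
    return "".join(played)


# TIT FOR TWO TATS: defect only after two consecutive defections.
tf2t = build_automaton(actions=["C", "C", "D"],
                       on_cooperate=[0, 0, 0],
                       on_defect=[1, 2, 2],
                       initial=0)
print(run(tf2t, "DDCDC"))  # CCDCC
```

Validating the transition targets up front means a mistyped state number is reported when the automaton is built rather than partway through a match.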

Page 75

Tournament Assignment

Design a strategy to play an Evolutionary Prisoner’s Dilemma Tournament.

Entries will meet in a round robin tournament, with 1% noise (i.e., for each intended choice there is a 1% chance that the opposite choice will be implemented). Games will last at least 1000 repetitions (each generation), and after each generation, population shares will be adjusted according to the replicator dynamic, so that strategies that do better than average will grow as a share of the population whereas others will be driven to extinction. The winner or winners will be those strategies that survive after at least 10,000 generations. 
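Two mechanisms in this description can be sketched briefly: implementation noise and the replicator update of population shares. The code is illustrative only. The discrete replicator form (share times fitness divided by average fitness) is a common textbook version and may differ from the tournament software, and the payoff matrix holds noise-free long-run per-round averages for three standard strategies rather than tournament data.

```python
import random


def implement(intended, rng, noise=0.01):
    """Return the move actually played: the opposite move with probability `noise`."""
    if rng.random() < noise:
        return "D" if intended == "C" else "C"
    return intended


def replicator_step(shares, payoff):
    """One generation: each share is scaled by its fitness relative to the average."""
    n = len(shares)
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    average = sum(shares[i] * fitness[i] for i in range(n))
    return [shares[i] * fitness[i] / average for i in range(n)]


# Rows and columns: ALWAYS DEFECT, TIT FOR TAT, ALWAYS COOPERATE.
# Entry [i][j] is the long-run average payoff to i against j (assumed, no noise).
PAYOFF = [[1, 1, 5],
          [1, 3, 3],
          [0, 3, 3]]

shares = replicator_step([1 / 3, 1 / 3, 1 / 3], PAYOFF)
print([round(s, 2) for s in shares])  # [0.35, 0.35, 0.3]
```

Strategies that score above the population average gain share and the rest lose it, while the shares continue to sum to one.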

Page 76

Simulating Evolution

[Figure: population share (0.020 to 0.140) versus generations (0 to 800) for the tournament entries; each curve is labeled with a number, and curve 1 is TFT.]

No. = Position after 1st Generation

Source: Axelrod 1984, p. 51.

Page 77

Simulating Evolution

[Figure: population shares (0.00 to 0.50) versus generations, with curves labeled PAV, TFT, GRIM (TRIGGER), D, R, and C.]

Population shares for 6 RPD strategies (including RANDOM), with noise at 0.01 level.

GTFT?

Page 78

Preliminary Tournament Results

[Figure: Test007.b. Population shares (0 to 1) versus generations, after 5000 generations (as of 4/25/02); an additional label reads “Avg. Score (x10)”. Legend: defect, cooperate, grim, tit4tat, pavlov, random, ataub, brill, cgerry, daniels, daniels1, daniels2, daranow, demashk, fahl, fahl2, fugger, hardin, mm5, mm90, mtangi, nabar.]

Page 79

Preliminary Tournament Results

[Figure: Test.009. Population shares (0 to 1.2) versus generations (x50), after 5000 generations (10pm 4/27/02). Legend: defect, cooperate, grim, tit4tat, pavlov, random, ataub, bjweiss, bmartin, brill, cgerry, daniels, daniels1, daniels2, daranow, delahuer, demashk, ekent, ekent1, fahl, fahl2, nicer.]

Page 80

Preliminary Tournament Results

[Figure: Test.010. Population shares (0 to 1) versus generations (x50), after 20000 generations (7am 4/28/02). Legend: defect, cooperate, grim, tit4tat, pavlov, random, ataub, bjweiss, bmartin, brill, brill2, cgerry, daniels, daniels1, daniels2, daniels3, daranow, delahuer, demashk, dkaufman, ekent, ekent1.]

Page 81

Preliminary Tournament Results

[Figure: test.04.28. Population shares (0 to 0.9) versus generations (0 to 9950), after 10000 generations (4/28/05). Legend: defect, cooperate, grim, tit4tat, pavlov, random, gtft, jonaheli, OVERKILL, ixions-wheel, mccarthyism, zygote, Skinner, Copernicus, Kaios, Simple2, Zombiebug, chickensandwich.]

Page 82

Preliminary Tournament Results

[Figure: Test.8.09. Population shares (0 to 0.9) versus generations (0 to 18600), after 20000 generations (8/09/05). Legend: defect, cooperate, grim, tit4tat, pavlov, random, gtft, galas, mygp, stratone, twoforone4.]