Learning Dynamics for Mechanism Design
Paul J. Healy, California Institute of Technology
An Experimental Comparison of Public Goods Mechanisms
Overview
• Institution (mechanism) design – public goods
• Experiments – equilibrium, rationality, convergence
• (How) can experiments improve institution/mechanism design?
Plan of the Talk
• Introduction
• The framework– Mechanism design, existing experiments
• New experiments– Design, data, analysis
• A (better) model of behavior in mechanisms
• Comparing the model to the data
A Simple Example
• Environment– Condo owners– Preferences– Income, existing park
• Outcomes– Gardening budget / Quality of the park
• Mechanism– Proposals, votes, majority rule
• Repeated Game, Incomplete Info
The Role of Experiments
Field: e unknown ⇒ F(e) unknown
Experiment: everything fixed/induced except …
The Public Goods Environment
• n agents
• 1 private good x, 1 public good y
• Endowed with private good only: ωi
• Preferences: ui(xi,y)=vi(y)+xi
• Linear technology: cost of y units = κ·y
• Mechanisms:
– y(m) = y(m1, m2, …, mn)
– xi = ωi − ti(m), where ti(m) = ti(m1, …, mn)
– mi ∈ Mi
Five Mechanisms
• “Efficient” ⇒ g(e) ⊆ PO(e)
• Inefficient mechanisms:
– Voluntary Contribution Mech. (VCM)
– Proportional Tax Mech.
• (Outcome-) efficient mechanisms:
– Dominant strategy equilibrium: Vickrey, Clarke, Groves (VCG) (1961, ’71, ’73)
– Nash equilibrium: Groves-Ledyard (1977), Walker (1981)
The Experimental Environment
• n = 5
• Four sessions of each mechanism
• 50 periods (repetitions)
• Quadratic, quasilinear utility
• Preferences are private info
• Payoff ≈ $25 for 1.5 hours
• Computerized, anonymous
• Caltech undergrads
• Inexperienced subjects
• History window
• “What-If Scenario Analyzer”
What-If Scenario Analyzer
• An interactive payoff table
• Subjects understand how strategies → outcomes
• Used extensively by all subjects
Environment Parameters
• Loosely based on Chen & Plott ’96
• κ = 100
• ui(xi, y) = bi·y − ai·y² + xi
• Pareto optimum: yᵒ = (Σi bi − κ)/(2·Σi ai) = 4.8095

           ai    bi    ωi
Player 1    1    34   260
Player 2    8   116   140
Player 3    2    40   260
Player 4    6    68   250
Player 5    4    44   290
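The Pareto-optimal level can be checked directly from the table (a sketch; κ = 100 and the quadratic utilities are taken from the slides):

```python
# Pareto-optimal public good level for the induced preferences
# u_i(x_i, y) = b_i*y - a_i*y^2 + x_i with linear cost kappa*y.
a = [1, 8, 2, 6, 4]          # a_i column from the table
b = [34, 116, 40, 68, 44]    # b_i column from the table
kappa = 100

# Total surplus sum_i(b_i*y - a_i*y^2) - kappa*y is maximized where
# sum(b) - 2*sum(a)*y - kappa = 0.
y_opt = (sum(b) - kappa) / (2 * sum(a))
print(round(y_opt, 4))  # 4.8095
```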
Voluntary Contribution Mechanism
• Previous experiments:
– All players have a dominant strategy: mi* = 0
– Contributions decline over time
• Current experiment:
– Players 1, 3, 4, 5 have dom. strat.: mi* = 0
– Player 2’s best response: m2* = 1 − Σ_{i≠2} mi
– Nash equilibrium: (0, 1, 0, 0, 0)

Mi = [0, 6]   y(m) = Σi mi   ti(m) = κ·mi
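Player 2’s best response follows from the first-order condition; a quick check (assuming κ = 100 from the environment slides):

```python
# Player 2's VCM best response: maximize b2*y - a2*y^2 - kappa*m2 with
# y = m2 + s, where s is the sum of the other players' messages.
# FOC: b2 - 2*a2*(m2 + s) - kappa = 0  =>  m2* = (b2 - kappa)/(2*a2) - s.
a2, b2, kappa = 8, 116, 100

def br2(others_sum):
    m2 = (b2 - kappa) / (2 * a2) - others_sum
    return min(max(m2, 0.0), 6.0)     # message space M_i = [0, 6]

print(br2(0.0))  # 1.0 -> consistent with the Nash equilibrium (0,1,0,0,0)
```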
VCM Results
Nash equilibrium: (0, 1, 0, 0, 0)
[Figure: average message (4 sessions) by period for players PLR1–PLR5, with the dominant strategies and the Nash equilibrium marked.]
Proportional Tax Mechanism
• No previous experiments (?)
• Foundation of many efficient mechanisms
• Current experiment:
– No dominant strategies
– Best response: mi* = yi* − Σ_{k≠i} mk
– (y1*, …, y5*) = (7, 6, 5, 4, 3)
– Nash equilibrium: (6, 0, 0, 0, 0)

Mi = [0, 6]   y(m) = Σi mi   ti(m) = (κ/n)·y(m)
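The ideal points (7, 6, 5, 4, 3) and the corner Nash equilibrium (6, 0, 0, 0, 0) can be recovered numerically (a sketch; κ = 100 and n = 5 assumed from the environment slides; the sequential best-response loop is an illustration, not the experimental protocol):

```python
# Proportional Tax: each player pays an equal cost share (kappa/n)*y, so
# player i's ideal level solves b_i - 2*a_i*y - kappa/n = 0.
a = [1, 8, 2, 6, 4]
b = [34, 116, 40, 68, 44]
kappa, n = 100, 5

y_ideal = [(bi - kappa / n) / (2 * ai) for ai, bi in zip(a, b)]
print(y_ideal)  # [7.0, 6.0, 5.0, 4.0, 3.0]

# Sequential best response m_i = clip(y_i* - sum of others, 0, 6),
# starting from zero messages.
m = [0.0] * n
for _ in range(10):
    for i in range(n):
        others = sum(m) - m[i]
        m[i] = min(max(y_ideal[i] - others, 0.0), 6.0)
print(m)  # [6.0, 0.0, 0.0, 0.0, 0.0] -- the Nash equilibrium
```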
Prop. Tax Results
[Figure: average message by period, players PLR1–PLR5; players 1 and 2 labeled.]
Groves-Ledyard Mechanism
• Theory:
– Pareto optimal equilibrium, not Lindahl
– Supermodular if γ/n > 2ai for every i
• Previous experiments:
– Chen & Plott ’96 – higher γ ⇒ converges better
• Current experiment: γ = 100 ⇒ supermodular
– Nash equilibrium: (1.00, 1.15, 0.97, 0.86, 0.82)

y(m) = Σi mi
ti(m) = (κ/n)·y(m) + (γ/2)·[ ((n−1)/n)·(mi − μ−i)² − σ²−i ]
(μ−i, σ²−i = mean and sample variance of the other players’ messages)
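Two properties of the Groves-Ledyard taxes can be sanity-checked numerically: the supermodularity condition γ/n > 2ai, and budget balance (the penalty terms sum to zero for any message profile). A sketch, assuming σ²−i is the sample variance of the other n−1 messages:

```python
import statistics

a = [1, 8, 2, 6, 4]
kappa, gamma, n = 100, 100, 5

# Supermodularity check: gamma/n > 2*a_i for every i.
print(all(gamma / n > 2 * ai for ai in a))  # True (gamma/n = 20 > 16 = max 2*a_i)

def gl_taxes(m):
    # Equal cost share plus the Groves-Ledyard penalty for each player.
    y = sum(m)
    taxes = []
    for i in range(n):
        others = m[:i] + m[i + 1:]
        mu = statistics.mean(others)
        var = statistics.variance(others)   # sample variance, n-2 denominator
        penalty = (gamma / 2) * (((n - 1) / n) * (m[i] - mu) ** 2 - var)
        taxes.append((kappa / n) * y + penalty)
    return taxes

# At the slide's Nash equilibrium the taxes exactly cover the cost kappa*y.
m_ne = [1.00, 1.15, 0.97, 0.86, 0.82]
print(abs(sum(gl_taxes(m_ne)) - kappa * sum(m_ne)) < 1e-9)  # True
```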
Groves-Ledyard Results
[Figure: average message by period, players PLR1–PLR5.]
Walker’s Mechanism
• Theory:– Implements Lindahl Allocations
– Individually rational (nice!)
• Previous experiments:– Chen & Tang ’98 – unstable
• Current experiment:– Nash equilibrium: (12.28, -1.44, -6.78, -2.2, 2.94)
y(m) = Σi mi   ti(m) = (κ/n + m_{(i+1) mod n} − m_{(i−1) mod n})·y(m)
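The Walker taxes are budget balanced for any message profile, because each message enters the neighbor terms once with a plus and once with a minus; a quick check (assuming the usual i+1 / i−1 neighbor indexing, mod n, with κ = 100):

```python
# Walker mechanism taxes: t_i(m) = (kappa/n + m_{i+1} - m_{i-1}) * y(m).
kappa, n = 100, 5

def walker_taxes(m):
    y = sum(m)
    return [(kappa / n + m[(i + 1) % n] - m[(i - 1) % n]) * y
            for i in range(n)]

m_ne = [12.28, -1.44, -6.78, -2.2, 2.94]   # slide's Nash equilibrium
y = sum(m_ne)
print(round(y, 2))  # 4.8 -- essentially the Pareto-optimal level
# The side payments cancel around the cycle, so taxes cover the cost exactly:
print(abs(sum(walker_taxes(m_ne)) - kappa * y) < 1e-9)  # True
```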
Walker Mechanism Results
NE: (12.28, −1.44, −6.78, −2.2, 2.94)
[Figure: average message by period, players PLR1–PLR5.]
VCG Mechanism: Theory
• Truth-telling is a dominant strategy
• Pareto optimal public good level
• Not budget balanced
• Not always individually rational

Mi = { θ̂i = (âi, b̂i) }
y(θ̂) = argmax_{y≥0} [ Σj v(y | θ̂j) − κ·y ]
z−i(θ̂) = argmax_{z≥0} [ Σ_{j≠i} v(z | θ̂j) − ((n−1)/n)·κ·z ]
ti(θ̂) = (κ/n)·y(θ̂) + [ Σ_{j≠i} v(z−i(θ̂) | θ̂j) − ((n−1)/n)·κ·z−i(θ̂) ] − [ Σ_{j≠i} v(y(θ̂) | θ̂j) − ((n−1)/n)·κ·y(θ̂) ]
VCG Mechanism: Best Responses
• Truth-telling (θ̂i = θi) is a weakly dominant strategy
• There is always a continuum of best responses:

BRi(θ̂−i) = { θ̂i : y(θ̂i, θ̂−i) = y(θi, θ̂−i) }
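The continuum arises because player i’s tax depends on i’s own announcement only through the implemented level y(θ̂), so any misreport that leaves y unchanged is also a best response. A sketch with the slides’ parameters (the specific misreport value is illustrative):

```python
# cVCG outcome with quadratic announced preferences: y(theta_hat) is
# argmax_y sum_j (b_j*y - a_j*y^2) - kappa*y = (sum(b) - kappa)/(2*sum(a)).
kappa = 100
a_true = [1, 8, 2, 6, 4]
b_true = [34, 116, 40, 68, 44]

def y_level(a_hat, b_hat):
    return (sum(b_hat) - kappa) / (2 * sum(a_hat))

y_truth = y_level(a_true, b_true)

# Player 1 misreports a_1 (3.0 is an arbitrary illustrative value) and
# adjusts b_1 so the aggregate first-order condition is unchanged.
a_hat, b_hat = a_true.copy(), b_true.copy()
a_hat[0] = 3.0
b_hat[0] = b_true[0] + 2 * (a_hat[0] - a_true[0]) * y_truth

print(abs(y_level(a_hat, b_hat) - y_truth) < 1e-9)  # True: same outcome
```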
VCG Mechanism: Previous Experiments
• Attiyeh, Franciosi & Isaac ’00– Binary public good: weak dominant strategy
– Value revelation around 15%, no convergence
• Cason, Saijo, Sjostrom & Yamato ’03– Binary public good:
• 50% revelation
• Many play non-dominant Nash equilibria
– Continuous public good with single-peaked preferences:
• 81% revelation
• Subjects play the unique equilibrium
VCG Experiment Results
• Demand revelation: 50–60%
– NEVER observe the dominant strategy equilibrium
• 10/20 subjects fully reveal in 9/10 final periods
– “Fully reveal” = both parameters
• 6/20 subjects fully reveal < 10% of the time
• Outcomes very close to Pareto optimal
– Announcements may be near non-revealing best responses
Summary of Experimental Results
• VCM: convergence to dominant strategies
• Prop Tax: non-equilibrium, but near best response
• Groves-Ledyard: convergence to stable equilibrium
• Walker: no convergence to unstable equilibrium
• VCG: low revelation, but high efficiency
Goal: A simple model of behavior to explain/predict which mechanisms converge to equilibrium
Observation: Results are qualitatively similar to best response predictions
A Class of Best Response Models
• A general best response framework:
– Predictions map histories into strategies: m̂ij^t(mj^1, …, mj^{t−1}) ∈ Mj
– Agents best respond to their predictions: mi^t ∈ BRi(m̂i1^t, …, m̂in^t)
• A k-period best response model:
– m̂ij^t(mj^1, …, mj^{t−1}) = (1/k)·Σ_{s=1}^{k} mj^{t−s}
– Pure strategies only
– Convex strategy space
– Rational behavior, inconsistent predictions
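The k-period model can be simulated directly; a sketch for the VCM environment (parameters from the earlier slides, k = 5 as in the talk, arbitrary initial messages):

```python
# k-period best response dynamics in the VCM: each player best responds
# to the average of every player's last k messages.
a = [1, 8, 2, 6, 4]
b = [34, 116, 40, 68, 44]
kappa, n, k = 100, 5, 5

def br(i, others_sum):
    # Player i pushes y toward (b_i - kappa)/(2*a_i), clipped to [0, 6].
    # A negative target means contributing 0 is dominant.
    target = (b[i] - kappa) / (2 * a[i])
    return min(max(target - others_sum, 0.0), 6.0)

history = [[3.0] * n]                 # arbitrary initial messages
for t in range(50):
    recent = history[-k:]             # last (up to) k periods
    avg = [sum(h[j] for h in recent) / len(recent) for j in range(n)]
    history.append([br(i, sum(avg) - avg[i]) for i in range(n)])

print(history[-1])  # [0.0, 1.0, 0.0, 0.0, 0.0] -- the Nash equilibrium
```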
Testable Predictions of the k-Period Model
1. No strictly dominated strategies after period k
2. Same strategy k+1 times => Nash equilibrium
3. U.H.C. + convergence to m* ⇒ m* is a N.E.
3.1. Asymptotically stable points are N.E.
4. Not always stable
4.1. Global stability in supermodular games
4.2. Global stability in games with dominant diagonal
Note: Stability properties are not monotonic in k
Choosing the best k
• Which k minimizes Σt |mt^obs − mt^pred| ?
• k = 5 is the best fit

Model  2-50   3-50   4-50   5-50   6-50   7-50    8-50   9-50   10-50  11-50
k=1    1.407  1.394  1.284  1.151  1.104  1.088   1.072  1.054  1.054  1.049
k=2    -      1.240  1.135  0.991  0.967  0.949   0.932  0.922  0.913  0.910
k=3    -      -      1.097  0.963  0.940  0.925   0.904  0.888  0.883  0.875
k=4    -      -      -      0.952  0.932  0.915   0.898  0.877  0.866  0.861
k=5    -      -      -      -      0.924  0.9114  0.895  0.876  0.860  0.853
k=6    -      -      -      -      -      0.9106  0.897  0.881  0.868  0.854
k=7    -      -      -      -      -      -       0.899  0.884  0.873  0.863
k=8    -      -      -      -      -      -       -      0.884  0.874  0.864
k=9    -      -      -      -      -      -       -      -      0.879  0.870
k=10   -      -      -      -      -      -       -      -      -      0.875
(Columns give the range of periods included in the fit, e.g. 2-50.)
[Figures: individual message paths by period for Walker Session 2, Players 1–5, and Groves-Ledyard Session 1, Player 1.]
Statistical Tests: 5-B.R. vs. Equilibrium
• Null hypothesis: E[|mi^t − BRi^t|] = E[|mi^t − EQi^t|]
• Non-stationarity ⇒ period-by-period tests
• Non-normality of errors ⇒ non-parametric tests
– Permutation test with 2,000 sample permutations
• Problem: if EQi^t ≈ BRi^t then the test has little power
• Solution:
– Estimate test power as a function of (EQi^t − BRi^t)/σ
– Perform the test on the data only where power is sufficiently large.
[Figure: simulated test power – frequency of rejecting H0 against (EQ − BR)/σ from 0 to 5, with the probability that H0 is false given a rejection rising from 0 to 0.95.]
5-period B.R. vs. Nash Equilibrium
• Voluntary Contribution (strict dom. strats): EQi^t = BRi^t
• Groves-Ledyard (stable Nash equil.): EQi^t ≈ BRi^t
• Walker (unstable Nash equil.): 73/81 tests reject H0
– No apparent pattern of results across time
• Proportional Tax: 16/19 tests reject H0
• 5-period model beats any static prediction
Best Response in the cVCG Mechanism
Origin = Truth-telling dominant strategy
0-degree Line = Best response to 5-period average
The Testable Predictions
1. Weakly dominated ε-Nash equilibria are observed (67%)
– The dominant strategy equilibrium is not (0%)
– Convergence to strict dominant strategies
2,3. 6 repetitions of a strategy implies ε-equilibrium (75%)
4. Convergence with supermodularity & dom. diagonal (G-L)
[Figure: average contribution by period over the 50 periods.]
Conclusions
• Experiments reveal the importance of dynamics & stability
• Dynamic models outperform static models
• New directions for theoretical work
• Applications for “real world” implementation
• Open questions:
– Stable mechanisms implementing Lindahl*
– Efficiency/equilibrium tension in VCG
– Effect of the “What-If Scenario Analyzer”
– Better learning models
An Almost-Trivial Game
• Cycling (including equilibrium!) for k=3
• Global convergence for k=1,2,4,5,…
Efficiency Confidence Intervals – All 50 Periods
[Figure: efficiency confidence intervals by mechanism – Walker, VC, PT, GL, VCG – against the no-public-good benchmark.]