Do players of p-beauty games k-step reason?

P. Richard Hahn and Indranil Goswami

Booth School of Business, University of Chicago, Chicago, Illinois 60637-1656, U.S.A.

[email protected]

Carl Mela

T. Austin Finch Foundation Professor of Business Administration, Fuqua School of Business, Duke University

Durham, North Carolina 27708-0251, U.S.A.

Summary

This paper addresses a basic question in the field of behavioral game theory using Bayesian hierarchical modeling applied to data from a new p-beauty contest web experiment. Specifically, we assess whether subjects apply k-step reasoning as is routinely assumed in previous literature. Collecting experimental p-beauty contest data across multiple values of p per subject, we estimate a nonlinear random effects regression model to infer an upper bound on the probability that a randomly selected subject will play a strategy compatible with k-step reasoning. While some individuals appear to adopt strategies compatible with k-step reasoning, the proportion of individuals who do so appears to be less than 30%.

Our new statistical model overcomes two inferential hurdles. First, a fundamental identification problem lies at the heart of behavioral game theory data: we never observe strategic thinking itself, only its game-play manifestations. We take a partial-identification approach, estimating an upper bound on the quantity we ultimately would wish to learn (the proportion of k-step thinkers in the player population). Second, estimating this proportion amounts to judging which inferred regression curves are increasing convex functions; structuring a hierarchical model appropriate to this task requires special care to avoid overshrinking, which we handle by introducing a robust hierarchical prior.

Some key words: behavioral game theory, censored data, functional data analysis, hierarchical modeling.


1 Introduction

A p-beauty contest (Moulin, 1986; Nagel, 1995) is a simple multiplayer game that asks each

player to play a number between 0 and 100 (say), the goal being to get as close as possible

to some known multiple p of the group average. An influential line of previous research has

proposed cognitive hierarchy models to explain patterns of game play observed in beauty

contest experiments (notably Camerer et al. (2004)). Whereas the economics literature has

focused predominantly on understanding formal properties of these models and interpreting

estimates of model parameters in light of this understanding, we focus on the purely empirical

problem of inferring whether or not agents actually use such strategies.

To study this question we focus on k-step thinking strategies, which generalize cognitive

hierarchy strategies. We collect new panel data from web experiments which we use to

estimate each player’s strategy (their pattern of play) as a function of p. It can be shown

that any k-step thinking strategy must be increasing and convex in p. We therefore infer

what proportion of subjects play convex strategies, which constitutes an upper bound on

the proportion of players who play cognitive hierarchy strategies.

A key statistical challenge is how to cope with the multiple-testing issues which arise

because we estimate a separate function for each player. To address this challenge we adopt

a hierarchical Bayesian approach which introduces a shared prevalence parameter, a common

approach for inducing multiplicity adjustment (Scott and Berger, 2010) in variable selection

problems. However, testing for global features of functions (such as convexity in our case) can

be a delicate exercise (Scott et al., 2013). Our model is structured specifically to avoid

over-shrinkage, which could result in many players being mistakenly inferred to play strategies

consistent (resp. inconsistent) with k-step reasoning if many other players have strongly

convex (resp. concave) strategy curves.

Additionally, we conduct an exploratory study of how player attributes such as age,


gender, and education associate with posterior probabilities of playing a k-step compliant

strategy. We conclude with a brief discussion.

2 Beauty contests and conditional rationality

2.1 Beauty contests

The goal of each player of the p-beauty contest (Moulin, 1986; Nagel, 1995) is to play a

number b – which we refer to here as a bid – that is as close as possible to p times the whole

group’s average bid. Bids are restricted to lie within some fixed interval (L,U). In this

paper, without loss of generality, we fix L = 0 and U = 100. The player bidding closest to p

times the group’s average bid wins a fixed payoff; we refer to this ex-post optimal bid as the

target bid. Other variants of the game are possible, for example by awarding payoffs which

decrease as a function of the distance from the target bid.

We restrict our attention to p ∈ (0, 1). Because every player is trying to undercut

everyone else’s bid by the fraction p, the Nash equilibrium strategy is the zero bid. However,

experiments consistently reveal that most people do not play the zero strategy. The literature

on p-beauty contests is substantial, most of it focused on theoretical observations or

behavioral explanations which, while fascinating, are beyond the scope of this paper; we refer

readers to the summary in Chapter 5 of Camerer (2003) and the many references therein.

The beauty contest game has two properties which make it a convenient setting for

studying strategic behavior. First, it is a symmetric game, meaning that all players have

the same payoff function. Second, when the total number of players, n, is more than several

dozen, one’s own bid negligibly affects the overall group mean, so that the game is effectively

a forecasting game. From this perspective, it becomes natural to ask if the observed non-Nash

play in the p-beauty game is a result of rational agents acting on the conviction that their

opponents are acting “irrationally” so that the group average is non-zero. If a given player


does not trust that his opponents can reason their way to the Nash equilibrium strategy,

then the Nash equilibrium solution is no longer optimal or rational for that player.

Accordingly, characterizing players’ beliefs about the strategies others play may be one

route to accurately characterizing observed bidding behavior. We turn to this topic in the

next section.

2.2 A random effects k-step thinking model of strategy formation

In this section we introduce notation for handling heterogeneity in strategy formation in

beauty contest games. The heterogeneity comes from two sources: idiosyncratic beliefs

about the strategies of others and the number of iterations one proceeds, conditional on

these beliefs, towards the Nash equilibrium.

Informally, imagine a player pondering her strategy. Suppose she thinks that most people

probably do not understand the game and so will just play randomly with some mean, call

it µ0. If she believes her opponents play without error or if the payoff function is linear in

squared-distance (in the proper utility scale¹), her optimal bid will be pµ0. But what if, she

thinks, some people reason as I have just now? Those people would thus play pµ0 and her

optimal bid becomes p times a weighted average of µ0 and pµ0 provided that she has beliefs

about the relative proportions of the first two types of players. We refer to a totally random

player as a level-0 player, one who thinks a single extra step a level-1 player, and so forth.

More formally, for k ≥ 2 let individual i have a k-step strategy if her optimal bid, given

p, can be computed from the following two parameters. First, we have a scalar µ0i ∈ (0, 100)

which is her belief as to what the level-0 random players will bid. Next, we have a (k−1)-by-k

lower-triangular right stochastic matrix Ωi, which we refer to as a belief matrix. The bottom

row of Ωi represents the individual’s beliefs about the relative proportions of the various

strategy classes below her, while the above rows reflect her beliefs about the analogous beliefs of each of the corresponding strategy levels below k; the rows thus associate with strategy levels 2 through k and the columns with strategy levels 0 through k − 1.

¹ See Appendix A for further details.

Consider the case k = 3, where Ωi has only 2 rows. The bottom row is player i’s beliefs

about the relative proportion of the three classes (0, 1, 2) below her. The top row is the

beliefs she ascribes to a level-2 player about the relative proportion of level-0 and level-1

players. Note that the third element of this row is zero because a level-2 player does not

conceive of the possibility of other level-2 players²,³.

One may compute the (µ_i^0, Ω_i)-optimal bid from the following iterative formula (indexing Ω from 0):

\[
\begin{aligned}
\mu_i^1 &= p\,\mu_i^0, \\
\mu_i^h &= p \sum_{j=0}^{h-1} \omega_{h,j}\,\mu_i^j.
\end{aligned}
\tag{1}
\]

Accordingly, we denote such a player’s optimal bid as µ_i^k.

From this general structure we see that a k-step thinking strategy satisfies several easy-

to-check conditions:

i) at p = 1, if individual i is a non-random player, her optimal bid is µ_i^0,

ii) at p = 0 the optimal bid for any non-random player is 0,

iii) if individual i is a level-k player then her optimal bid lies in the interval (p^k µ_i^0, p µ_i^0) (this follows from the extreme cases of assuming that all players are level k − 1 or level-0 players, respectively), and

iv) a k-step iterated reasoning strategy is a positive linear combination of the monomial terms p^1, . . . , p^k, so is strictly increasing in p and convex on (0, 1).

² Similarly, a level-1 player believes that all opponents are level-0 players and level-0 players possess no beliefs about their opponents at all.

³ Note that in general a level-k thinker may bid higher than a level k − 1 thinker; for example a level-3 player who believes that all of his opponents are level-0 thinkers will be indistinguishable from a level-1 player. This is not so in many earlier models, which rule out this possibility by assumption.

Our statistical analysis seeks to assess the probability that these conditions are satisfied

for a randomly selected player from the population⁴.
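As an illustration only (this is not the authors' code), recursion (1) and the increasing-and-convex condition can be sketched in a few lines of Python. The function names, the list-of-rows layout for the belief matrix, and the grid-based convexity check are assumptions made for this sketch; row r of the matrix is taken to hold the beliefs ascribed to a level-(r + 2) thinker.

```python
from typing import Callable, List, Sequence


def k_step_bid(p: float, mu0: float, omega: Sequence[Sequence[float]]) -> float:
    """Optimal bid of a level-k player computed from recursion (1).

    omega is a (k-1)-by-k lower-triangular right stochastic matrix;
    row r holds the weights a level-(r+2) thinker puts on levels 0..r+1.
    """
    mu: List[float] = [mu0, p * mu0]  # level-0 mean and level-1 bid
    for r, row in enumerate(omega):
        h = r + 2  # strategy level associated with this row
        weights = row[:h]  # entries beyond column h-1 are zero by construction
        if abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("each row of the belief matrix must sum to one")
        mu.append(p * sum(w * m for w, m in zip(weights, mu)))
    return mu[-1]


def is_increasing_convex(f: Callable[[float], float],
                         grid_size: int = 101, tol: float = 1e-9) -> bool:
    """Numerically check monotonicity and convexity of f on a grid over [0, 1]."""
    ps = [i / (grid_size - 1) for i in range(grid_size)]
    ys = [f(p) for p in ps]
    increasing = all(b - a >= -tol for a, b in zip(ys, ys[1:]))
    convex = all(ys[i - 1] - 2 * ys[i] + ys[i + 1] >= -tol
                 for i in range(1, len(ys) - 1))
    return increasing and convex
```

For instance, with a hypothetical level-0 belief of 50 and a 2-by-3 belief matrix, `k_step_bid(0.5, 50.0, omega)` returns a bid inside the interval given in condition iii), and `is_increasing_convex(lambda p: k_step_bid(p, 50.0, omega))` checks condition iv) numerically.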

2.2.1 Cognitive hierarchies

Cognitive hierarchy models are a special case of the framework described above. Let γi

denote the level of thinking of player i. Then a cognitive hierarchy model assumes that

1. Ωi = Ωj whenever γi = γj.

2. For γj = γi − 1 > 2, Ωj is the leading principal submatrix of Ωi (rescaled to preserve

right stochasticity).

It is in the sense of condition 2 that such models are hierarchical: the belief matrices of lower-level

thinkers are nested inside the belief matrices of higher-level thinkers. Additionally, specific

cognitive hierarchy models may assume that µ0i is common across i, that µ0 is in fact the

actual mean of the level-0 players, and/or that players’ beliefs about the relative proportions

of thinker types are accurate.

⁴ See Appendix A for a brief remark on the interpretation of this approach as it relates to different payoff schemes and an agent’s uncertainty considerations.

For example, k = 3 under the CH-Poisson model of Camerer et al. (2004) yields

\[
\Omega =
\begin{pmatrix}
\dfrac{f(0\mid\tau=2)}{f(0\mid\tau=2)+f(1\mid\tau=2)} = \dfrac{1}{3} &
\dfrac{f(1\mid\tau=2)}{f(0\mid\tau=2)+f(1\mid\tau=2)} = \dfrac{2}{3} &
0 \\[2ex]
\dfrac{f(0\mid\tau=2)}{f(0\mid\tau=2)+f(1\mid\tau=2)+f(2\mid\tau=2)} = \dfrac{1}{5} &
\dfrac{f(1\mid\tau=2)}{f(0\mid\tau=2)+f(1\mid\tau=2)+f(2\mid\tau=2)} = \dfrac{2}{5} &
\dfrac{f(2\mid\tau=2)}{f(0\mid\tau=2)+f(1\mid\tau=2)+f(2\mid\tau=2)} = \dfrac{2}{5}
\end{pmatrix},
\]

where f(· | τ) denotes the Poisson probability mass function with parameter τ. Element (g, h) of this matrix records what a level-3 thinker believes a level-(g + 1) thinker believes is the relative proportion of level-(h − 1) thinkers in the population.
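The entries of this matrix can be checked with a short Python sketch. It assumes, as in the example, that each row is obtained by truncating the Poisson mass function at the levels below the thinker in question and renormalizing; the function names are hypothetical, and exact rational arithmetic is used only to make the fractions visible.

```python
from fractions import Fraction
from math import factorial
from typing import List


def poisson_weights(tau: int, n: int) -> List[Fraction]:
    """Unnormalized Poisson masses tau^j / j! for j = 0..n-1.

    The common factor exp(-tau) is omitted because it cancels on renormalizing.
    """
    return [Fraction(tau) ** j / factorial(j) for j in range(n)]


def ch_poisson_belief_matrix(k: int, tau: int) -> List[List[Fraction]]:
    """Belief matrix with rows for levels 2..k and columns for levels 0..k-1."""
    rows = []
    for level in range(2, k + 1):
        w = poisson_weights(tau, level)  # levels 0..level-1 lie below this thinker
        total = sum(w)
        # Pad with zeros: a thinker does not conceive of levels at or above her own.
        rows.append([x / total for x in w] + [Fraction(0)] * (k - level))
    return rows
```

Calling `ch_poisson_belief_matrix(3, 2)` reproduces the two rows displayed above, (1/3, 2/3, 0) and (1/5, 2/5, 2/5).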

As another example, in the “k − 1” model of Crawford and Iriberri (2007), Ωk is given by

a matrix with ωg,g+1 = 1 and zeros elsewhere.

3 Experiment and Model

3.1 Beauty contest panel data

Previous cognitive hierarchy models functionally constrain the entries of the belief matrices

Ωi so that each player’s bidding strategy µi uniquely defines an associated belief matrix Ωi.

Our approach avoids such assumptions. Instead we determine if µi(p) could not correspond to

any Ωi by checking if it fails two necessary (though not sufficient) conditions that a strategy

would satisfy if it arose via a k-step thinking process: that it goes through the origin and

that it is increasing and convex in p.

First, we must infer a bidding strategy µi(p) for each subject, which we do by collecting

data from subjects across multiple values of p. A preliminary data collection produced the

scatter plot shown in Figure 1. We used this preliminary study to ascertain a reasonable

class of regression functions to use for modeling the player strategies.

In the finalized version of the game, players were asked to play a beauty contest, without

immediate feedback about their performance, for six values of p ∈ {0.3, 0.4, 0.5, 0.6, 0.7, 1.0}

under two different payoff schemes. The first payoff was the usual winner-take-all game. The

second scheme awarded participants a payoff which decreased exponentially away from the

realized target bid; see Figure 2 for an illustration.

Our analysis uses data from a total of 106 individuals who demonstrated that they properly read the instructions by correctly answering an attention test and who successfully participated in all games and answered all questions in the accompanying demographic survey. Instructions for playing the game were provided via a video tutorial. The precise experimental protocol can be examined by playing the game for oneself via a web interface at the first author’s web page. The web interface plots each bid against the associated value of p so that players may visualize how their bids relate to one another prior to final submission.

[Plot: Preliminary Study; horizontal axis p, vertical axis Bids.]

Figure 1: Bids for 100 respondents over eight one-shot beauty contests. The smoothed empirical mean is shown in solid and the ex post optimal strategy is shown dashed. Note the non-Nash bidding at p = 0.

[Plot: Pay-off versus bid (target bid = 15); horizontal axis bid, vertical axis pay-off in US dollars.]

Figure 2: A depiction of the bell-shaped payoff scheme with target bid of 15. A maximum payoff of $5 is earned for a bid of 15, with a sharp decline away from that point. The black dots indicate bids of 0, 10, 17, and 20 for illustration. For utility in log-dollars, this scheme results in a squared-error utility function.

3.2 Statistical model

Our statistical analysis is based on a multi-level model, where each individual has their

own mean regression curve (their strategy). These individual strategies share a common

prior distribution with unknown (hyper)parameters. We focus on isolating two properties of

players’ bidding strategies: if they equal 0 at p = 0 and if they increase in a convex manner

as a function of p.

We begin by characterizing the observed plays of individuals conditioned on their bidding strategy; bidding data is rescaled to the unit interval for convenience. We assume that players’ bids can deviate from their underlying strategy owing to inattention or other unobserved contextual factors⁵. We use a Beta distribution with prescribed mean µ(p),

\[
\begin{aligned}
b_i \mid \mu_i, s_i &\sim \mathrm{Beta}\bigl(c_i \mu_i,\; c_i (1 - \mu_i)\bigr), \\
c_i &= s_i \times \max\bigl(1/\mu_i,\; 1/(1 - \mu_i)\bigr), \qquad s_i > 1,
\end{aligned}
\tag{2}
\]

where b_i is the observed bid of player i. Note that µ_i(p) is a function of p, so b_i(p) is as well. We use a mean and sample-size parametrization restricted to have unimodal density; higher values of c_i (and hence higher values of the parameter s_i) correspond to more sharply peaked densities (higher precision, lower variance).
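To see why the restriction s > 1 yields a unimodal density, note that both Beta shape parameters in (2) are then at least s. The following Python sketch, which is illustrative only and uses hypothetical function names, computes the shape parameters, the implied variance, and a simulated bid using the standard library.

```python
import random
from typing import Tuple


def beta_shape_params(mu: float, s: float) -> Tuple[float, float]:
    """Shape parameters (c*mu, c*(1-mu)) of the bid distribution in (2)."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    if s <= 1.0:
        raise ValueError("s must exceed 1 for a unimodal density")
    c = s * max(1.0 / mu, 1.0 / (1.0 - mu))
    return c * mu, c * (1.0 - mu)


def beta_variance(mu: float, s: float) -> float:
    """Variance mu*(1-mu)/(c+1) of a Beta with mean mu and sample size c."""
    a, b = beta_shape_params(mu, s)
    return mu * (1.0 - mu) / (a + b + 1.0)


def simulate_bid(mu: float, s: float, rng: random.Random) -> float:
    """Draw one bid on the unit interval around the strategy value mu."""
    a, b = beta_shape_params(mu, s)
    return rng.betavariate(a, b)
```

With a mean bid of 0.3, for example, `beta_shape_params(0.3, 3.0)` gives shape parameters of approximately 3 and 7, and `beta_variance(0.3, s)` decreases as s grows.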

⁵ Whether or not players entertain the possibility that their opponents can bid with error is discussed in Appendix A.

Next, we turn to representing µ_i(p) in such a way that we may easily assess if µ_i(0) = 0 and if it is convexly increasing in p, the conditions we use to probe the k-step reasoning hypothesis. To this end, we associate µ_i(p) with a parameter vector θ_i = (η_i, φ_i, ν_i, µ_i^0). In this section we suppress the subscript for clarity. The four θ parameters constitute three “knots” (ordered pairs in the unit square) with horizontal coordinates (0, η, 1) and corresponding vertical coordinates (φ, ν, µ^0). Given these knots, a cubic spline is interpolated through them such that the resulting curve µ(p) is monotonic if and only if φ ≤ ν ≤ µ^0 (specifically we use the R function splinefun() with setting method = ‘mono’). This particular interpolation method has the desirable property of mitigating “overshoot”, the amount by which the interpolated function rises above (resp. below) the largest (resp. smallest) interpolation point; other choices are available but this one behaves reasonably in our judgment. A prior on µ(p) is then induced via a prior on θ.
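For readers without R at hand, the idea of a monotone interpolating cubic through the three knots can be sketched in Python. This is a minimal sketch under stated assumptions: it implements a common variant of the Fritsch-Carlson tangent adjustment for cubic Hermite interpolation and is not claimed to reproduce R's output exactly; the function names and the example knot values are hypothetical.

```python
from bisect import bisect_right
from math import sqrt
from typing import Callable, List


def monotone_spline(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Cubic Hermite interpolant with Fritsch-Carlson-style tangent limiting.

    Assumes xs is strictly increasing and ys is monotone non-decreasing.
    """
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    d = [(ys[i + 1] - ys[i]) / h[i] for i in range(n - 1)]  # secant slopes
    # Initial tangents: end secants at the boundary, averaged secants inside.
    m = [d[0]] + [(d[i - 1] + d[i]) / 2 for i in range(1, n - 1)] + [d[-1]]
    for i in range(n - 1):
        if d[i] == 0:
            m[i] = m[i + 1] = 0.0  # flat segment: force zero tangents
        else:
            a, b = m[i] / d[i], m[i + 1] / d[i]
            r2 = a * a + b * b
            if r2 > 9:  # rescale tangents into the monotonicity region
                t = 3 / sqrt(r2)
                m[i], m[i + 1] = t * a * d[i], t * b * d[i]

    def f(x: float) -> float:
        i = min(max(bisect_right(xs, x) - 1, 0), n - 2)
        t = (x - xs[i]) / h[i]
        h00 = (1 + 2 * t) * (1 - t) ** 2
        h10 = t * (1 - t) ** 2
        h01 = t * t * (3 - 2 * t)
        h11 = t * t * (t - 1)
        return (h00 * ys[i] + h10 * h[i] * m[i]
                + h01 * ys[i + 1] + h11 * h[i] * m[i + 1])

    return f


def below_chord(eta: float, phi: float, nu: float, mu0: float) -> bool:
    """True if the middle knot (eta, nu) lies below the line from (0, phi) to (1, mu0)."""
    return nu < (mu0 - phi) * eta + phi
```

A strategy with hypothetical parameters η = 0.4, φ = 0, ν = 0.1 and µ^0 = 0.6 would be built as `monotone_spline([0.0, 0.4, 1.0], [0.0, 0.1, 0.6])`; `below_chord` then reports whether the middle knot sits under the chord joining the two end knots.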

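More specifically, we use the compositional prior

\[
p(\rho, q_0, q_1, \boldsymbol{\mu}^0, \boldsymbol{\eta}, \boldsymbol{\phi}, \boldsymbol{\nu})
= p(\rho)\, p(q_0 \mid \rho)\, p(q_1 \mid \rho)
\prod_i p(\mu_i^0)\, p(\eta_i)\, p(\phi_i \mid \rho, \mu_i^0)\, p(\nu_i \mid q, \eta_i, \phi_i, \mu_i^0),
\tag{3}
\]

with component parts

\[
\begin{aligned}
\mu_i^0 &\sim \mathrm{Beta}(3/2,\, 1), \\
\eta_i &\sim \mathrm{Beta}(5,\, 5), \\
\phi_i \mid \rho, \mu_i^0 &\sim \rho\,\delta_0 + (1 - \rho)\,\mathrm{Beta}\bigl(c\mu_i^0,\; c(1 - \mu_i^0)\bigr), \\
p(\nu_i \mid q, \eta_i, \phi_i, \mu_i^0) &= \mathrm{BiUnif}\bigl(q_{\phi=0},\; (\mu_i^0 - \phi_i)\eta_i + \phi_i\bigr), \\
\rho &\sim \mathrm{Beta}(3,\, 1), \\
q_0 &\sim \mathrm{Beta}(3,\, 1), \\
q_1 &\sim \mathrm{Beta}(1,\, 3),
\end{aligned}
\tag{4}
\]

where c = 2 max(1/µ_i^0, 1/(1 − µ_i^0)) and δ_0 denotes a point-mass distribution at 0.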

[Plot: Three example splines; horizontal axis p, vertical axis Bid. Legend entries: not through origin, non-convex; through origin, non-convex; through origin, convex (k-step compliant).]

Figure 3: Three curves are shown with associated parameters θ_dashed = (0.4, 0.3, 0.7, 0.4), θ_dotted = (0.5, 0, 0.2, 0.5) and θ_solid = (0.3, 0, 0.4, 0.7) shown as points. The dashed curve neither intersects the origin nor is convex. The dotted curve intersects the origin but is not convex. The solid curve both intersects the origin and is convex. Only the solid curve could represent a k-step thinking strategy.

The parameter ρ controls the probability that a strategy runs through the origin; our first

condition in assessing whether a strategy is consistent with k-step reasoning. Specifically

with probability ρ, φi = 0, while with probability 1 − ρ, φi is drawn from a unimodal Beta

distribution centered at the assumed level-0 mean, µ0i .

Next, we consider the second condition, which is that the strategy is convex and increasing in p. The prior over ν_i has a “bi-uniform” density, which is a step function with a single discontinuity at the point m where the vertical line p = η_i intersects the line running between (0, φ_i) and (1, µ_i^0), together with a quantile parameter q; the probability of ν_i being less than m is q, while the probability of being above m is 1 − q. Recall that, due to our spline representation, if ν_i < m then µ_i(p) will be a convex increasing function, with smaller values of ν_i inducing “more” convexity.

[Plot: Conditional prior on ν; horizontal axis ν, vertical axis Density.]

Figure 4: The prior for ν_i given (q, η_i, φ_i, µ_i^0), shown here for m = (µ_i^0 − φ_i)η_i + φ_i = 0.4 and q = 0.7. The area under the curve to the left of the mode at 0.4 is q = 0.7. We refer to this as a bi-uniform density with quantile parameter q and location parameter m.

See Figure 4 for an illustration. The intuition is that now ρq_0 is the probability of drawing a k-step compliant strategy by the composition:

\[
\begin{aligned}
\rho q_0 &= \Pr(\phi_i = 0)\,\Pr\bigl(\nu_i < (\mu_i^0 - \phi_i)\eta_i + \phi_i \mid \phi_i = 0, \mu_i^0, \eta_i\bigr) \\
&= \Pr(\text{goes through the origin})\,\Pr(\text{convex} \mid \text{goes through the origin}) \\
&= \Pr(\text{goes through the origin and is convex}).
\end{aligned}
\tag{5}
\]

With this parametrization, the priors on q0 and ρ can straightforwardly bias µ(p) towards

k-step thinking, reflecting current attitudes in that research community; see Figure 6 for

an illustration and compare to Figure 7 to observe this model’s flexibility relative to earlier

approaches. Posterior inferences concerning prevalence of k-step thinking likewise follow

directly from the posterior over these parameters.
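The prior k-step-compliance probability of 56% quoted in the text can be recovered with a small calculation. The Python sketch below assumes that ρ and q_0 are independent a priori with the Beta(3, 1) distributions listed in (4), so that the prior mean of ρq_0 is the product of the two means; the function names are hypothetical and the simulation is only a sanity check on that arithmetic.

```python
import random
from fractions import Fraction


def beta_mean(a: int, b: int) -> Fraction:
    """Mean a / (a + b) of a Beta(a, b) distribution."""
    return Fraction(a, a + b)


# Assumption: rho and q0 are independent a priori, so E[rho * q0] = E[rho] * E[q0].
PRIOR_MEAN_COMPLIANCE = beta_mean(3, 1) * beta_mean(3, 1)  # (3/4) * (3/4) = 9/16


def simulate_prior_compliance(n_draws: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the prior probability of a k-step compliant draw."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        rho = rng.betavariate(3, 1)
        q0 = rng.betavariate(3, 1)
        through_origin = rng.random() < rho  # event phi = 0
        convex = rng.random() < q0           # event nu < m, given phi = 0
        hits += through_origin and convex
    return hits / n_draws
```

Here `float(PRIOR_MEAN_COMPLIANCE)` equals 0.5625, and `simulate_prior_compliance(50000)` should return a value close to it.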

Figure 6 shows draws from the prior over µ(p); k-step-compliant strategies are shown in black, non-k-step-compliant strategies are shown in gray.

[Diagram of the dependence structure; node labels include s_i, w, y_i, µ_i^0 and q = (q_0, q_1).]

Figure 5: A diagram of the dependence structure in our random effects model of beauty contest strategies and bids. The dotted box indicates that there are n such parameter sets, each associated with an individual player and defining that player’s strategy. The solid box with rounded edges contains the parameters governing the mean. The parameter s_i controls the bid error scale. The parameters ρ, q and w are shared across all players. The circled variable y_i denotes the observed bidding data. The strategy function µ_i(p) is defined from the parameters (φ_i, η_i, µ_i^0, ν_i) via the monotone interpolation spline.

For each player we have 12 observations, from which we infer the four parameters of θ

and the noise level si, which is given a discrete prior at values of (1.2, 3, 21, 51, 101) with

shared, unknown probability vector w, which is given a Dirichlet prior with parameter vector

α = (0.1, 0.25, 0.3, 0.25, 0.1). The choice of these values gives a nice range of bidding “types”

ranging from imprecise to precise; see Figure 8.

Collecting many more observations per player is not an appealing option because players

very rapidly cease paying close attention to the game. Indeed, in our experiment, roughly

30% of the respondents were not used because they failed an instruction-reading check. The

shared hyper-parameters w, q and ρ help prevent over-fitting.

[Plot: Prior over strategies; horizontal axis p, vertical axis Bids.]

Figure 6: One hundred randomly drawn strategies from the prior; k-step-compliant strategies are shown in black and non-k-step-compliant strategies are shown in gray. This plot corresponds to a 56% a priori k-step-compliance probability.

In our actual analysis our model is complicated by the fact that we have repeated observations

for each subject under two separate payoff functions. We do not require that a player

adopts the same strategy in each case; however, we do assume that si is common across payoff

structures and also that if φi = 0 under one payoff structure, then it equals zero under the

other as well. (For ease of exposition, this elaboration is not reflected in Figure 5 or in the

computational appendix; R code is available upon request.)

Posterior analysis is conducted via Monte Carlo simulation using a Metropolis-within-

Gibbs approach; computational details are deferred to an appendix.

[Plots: two panels, “CH-Poisson strategies, τ = 3” and “Level-k strategies”; horizontal axis p, vertical axis Optimal bid.]

Figure 7: The first panel shows CH-Poisson strategies for τ = 3. Notice that beyond 4 steps of iterated reasoning the strategies lie much closer together than plausible levels of noise in the bids. Level-0 players are assumed to bid 1/2. The second panel shows level-k strategies (out to ten levels of iterated reasoning).

4 Analysis

4.1 Main findings

Our key findings follow directly from the posterior distribution over ρ, q_0 and q_1 and functions thereof. In particular, the posterior distribution of ρq_0 represents (an upper bound on) the probability that a randomly selected person applies k-step reasoning. We find the posterior mean of this probability to be 22%, with a right 95% quantile of 29%. By contrast the prior mean was 56% with a right 95% quantile of 89%; despite this favorable prior, we infer the prevalence of k-step reasoning to be quite low. Figure 9 depicts a kernel-smoothed Monte Carlo estimate of this posterior density.

[Plot: Bidding precisions; horizontal axis Bids, vertical axis Density; legend s = 1.2, 3, 21, 51, 101.]

Figure 8: Players are permitted to be one of five types of bidder varying by the amount of precision with which they bid about their mean strategy. Here the densities of the five types are shown for their corresponding values of s when the mean bid is fixed at 0.3. The parameter w is a vector of probabilities recording the prevalence of these five types; this parameter is inferred from the data.

Similarly, q0 represents the probability that a randomly selected player has a convex

strategy, given that they play through the origin (φ = 0). This rules out individuals who

do not, in one sense, “get” the point of the game. The posterior mean of this probability is

28%, with right 95% quantile of 36%.

In summary, while it is plausible that some individuals arrive at their strategies via k-step

reasoning, many more play strategies that do not correspond to any k-step derived strategy.

Finally, we are able to produce individual strategy estimates, along with posterior pointwise 95% confidence intervals as well as individual estimates of the probability that a given player is following a k-step-compliant strategy. These plots are shown in the supplementary file individualSummaries.pdf. What these plots reveal is that many players appear to fail the convexity criterion, despite exhibiting a general decreasing trend in their bids. This observation is supported by the posterior distribution of ρ (shown in Figure 11): a player will play through the origin with relatively high probability (posterior mean 82%).

[Plot: Posterior probability of k-step compliance; horizontal axis ρq_0, vertical axis Density.]

Figure 9: With 95% posterior probability, the chance that a randomly selected person plays a k-step-compliant strategy is less than 29%.

[Plot: Posterior probability of k-step compliance; horizontal axis q_0, vertical axis Density.]

Figure 10: Conditional on playing through the origin, the posterior probability of a k-step compliant strategy is higher with a mean of 27%, but still not more than 35% with 95% posterior probability.

[Plot: Posterior probability of strategy through the origin; horizontal axis ρ, vertical axis Density.]

Figure 11: More people fail k-step compliance by lack of convexity than by lack of play through the origin, as the probability of playing through the origin is relatively high as seen here.

4.2 Exploratory analysis and future work

In addition to the game play data, we collected various subject-attributes for each partic-

ipant. As a way to determine which, if any, of these attributes are predictive of playing a


k-step compliant strategy, we perform a non-linear regression of the posterior probability

of being a k-step thinker on these attributes using the tree-based method described in Gra-

macy et al. (2011). This method has the virtue of providing a readily-interpretable variable

relevance metric. Figure 12 shows the posterior summary of this metric for each of 12

variables, which are listed in Table 1.

[Plot titled "DynaTree variable relevance". Horizontal axis: variable number (1 to 12). Vertical axis: relevance.]

Figure 12: The posterior probability of k-step thinking is regressed on twelve subject features using a flexible non-linear, tree-based method (Gramacy et al., 2011) to see which features are predictive of k-step thinking. Only time-taken (variable 1) appears compellingly related.

Several observations are noteworthy. First, the probability of k-step compliance seems

to be unaffected by whether payoffs follow the winner-take-all or the distance-based

format. This is unsurprising in light of the fact that people likely do not explicitly optimize

their strategies; an implicit cognitive process is more likely and this process appears not to

differ (too much) between these two payoff schemes. Second, we observe that age and educa-

tion do not appear to be predictive of k-step compliant strategies. Finally, we observe that


Variable   Variable name

 1         aggregate time taken on games
 2         aggregate time taken on survey (including games)
 3         gender (male)
 4         gender (female)
 5         education (bachelors)
 6         education (graduate degree)
 7         education (high school graduate)
 8         education (some college)
 9         education (some high school)
10         age
11         cognitive reflection task indicator
12         payoff indicator (winner-take-all or bell-shaped)

Table 1: Variable list for exploratory regression analysis. Gender can take three levels: male, female, not reported.

the total time taken on the games is clearly predictive of playing a more convincingly k-step

compliant strategy. The cognitive reflection task indicator, which records whether the subject

correctly answered a non-trivial (grade-school level) algebra exercise, may be weakly predictive. (The

correlation between the total time on the games and the cognitive reflection task is 0.2.)

In light of our main finding that not many people appear to k-step reason, a future

line of research we plan to pursue is to use our posterior probability of k-step reasoning

as an outcome measurement and conduct a series of experiments with different payoffs and

instructions to see under what conditions people can be induced to strategize (short of

explicit tutoring).

4.3 Discussion

The statistical contributions of this paper consist in the design of a bespoke hierarchical

model tailored to a challenging structured problem. Our model incorporates two main inno-

vations.

First, while hierarchical models are widely recognized to foster “borrowing of informa-


tion”, the nature of the borrowing often remains implicit. We strove to make the nature

of the borrowing explicit and, indeed, limited by using the bi-uniform prior on ηi. We can

illustrate our approach with the following simplified example. Suppose we observe the exact

value of νi for each of eight players and assume that µ0i ηi = 0.4 and φi = 0 for all of them.

We seek to determine the probability that a future player will adopt a k-step compliant

strategy under these same conditions, based on the observed data. Specifically, we want to

infer from the observed νi the posterior probability that ν < 0.4, because a strategy is convex

in this case precisely when νi is below the line p(µ0i −φi)+φi at the point p = ηi (as dictated

by our spline interpolation). Suppose further that the data consists of two observations at

0.35 and six observations at 0.95. If we assume a Beta distribution for the shared prior, the

maximum likelihood fit is a Beta(2.19, 0.607) density, which ascribes only 7% probability

to the region less than 0.4. The bi-uniform prior with location parameter 0.4, meanwhile,

has a maximum likelihood estimate of q = 2/8. Figure 13 illustrates the difficulty: the

inferred prevalence is strongly influenced by the locations of the observed data and so will

be inconsistent for the probability that ν < 0.4 if the assumed distribution is misspecified.

The bi-uniform prior is consistent for the relevant probability by construction even if it is

misspecified. This approach is analogous to using least absolute error regression rather

than least-squares regression as a way to protect against outliers. Here we have applied this

basic insight to a complicated hierarchical model.
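The simplified example above can be checked numerically. The following is a minimal sketch, not the authors' code: it fits a Beta density to the eight toy observations by maximum likelihood and compares the implied probability below 0.4 with the count-based estimate of q. The function names, the compass-search optimizer, and the Simpson's-rule integration are illustrative choices, and the bi-uniform model is assumed here to be piecewise uniform on either side of the location parameter, so that its MLE of q is the fraction of observations below that location.

```python
import math

# Toy data from the text: two observations at 0.35 and six at 0.95.
observations = [0.35] * 2 + [0.95] * 6
threshold = 0.4  # the value of mu0 * eta in the simplified example


def beta_loglik(a, b, xs):
    """Log-likelihood of i.i.d. Beta(a, b) observations."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return sum(
        log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x) for x in xs
    )


def fit_beta_mle(xs, step=0.5, tol=1e-6):
    """Maximize the Beta log-likelihood with a simple compass (pattern) search."""
    a, b = 1.0, 1.0
    best = beta_loglik(a, b, xs)
    while step > tol:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            na, nb = a + da, b + db
            if na > 0 and nb > 0:
                value = beta_loglik(na, nb, xs)
                if value > best:
                    a, b, best, improved = na, nb, value, True
        if not improved:
            step /= 2  # shrink the pattern once no neighbor improves
    return a, b


def beta_cdf(x, a, b, n=2000):
    """Beta(a, b) CDF at x by composite Simpson's rule (n even; assumes a > 1)."""
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / n

    def kernel(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = kernel(0.0) + kernel(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * kernel(i * h)
    return total * h / 3 / math.exp(log_beta_fn)


a_hat, b_hat = fit_beta_mle(observations)
beta_prob = beta_cdf(threshold, a_hat, b_hat)
# Bi-uniform MLE of q: the fraction of observations below the location parameter.
q_hat = sum(x < threshold for x in observations) / len(observations)

print(f"Beta MLE: a = {a_hat:.3f}, b = {b_hat:.3f}")
print(f"P(nu < {threshold}) under the Beta fit: {beta_prob:.3f}")
print(f"Bi-uniform estimate of q: {q_hat:.3f}")
```

The Beta fit is pulled toward the cluster at 0.95, so the mass it places below 0.4 falls well under the observed fraction of 2/8, whereas the count-based estimate depends only on how many observations fall below the threshold.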

The intuition we wished to capture was this: imagine two curves that fit the observed

data about equally well, one of which is convex and through the origin, and one of which

is not convex and not through the origin; how should we break this tie? Roughly

speaking, the idea is to bias the choice toward whichever function type is more highly

represented in the rest of the sample. The more usual method of using a conventional shared

prior over ν, such as the Beta distribution from the above example, would dictate that the

parameters of the shared prior would be determined by the magnitude of the other samples as


[Plot titled "Borrowing via counts". Horizontal axis: ν. Vertical axis: density. Legend: bi-uniform MLE, Beta MLE.]

Figure 13: Observations are shown as solid dots. The maximum likelihood estimator of the bi-uniform model is consistent for the probability that ν < µη (or 0.4 in this depiction). Other default models, such as the Beta distribution shown, do not necessarily have this property.

well as their prevalence, meaning that our estimate of the prevalence of k-step type strategies

may be affected appreciably by the immaterial fact that some players deviate severely from

convex strategies.

Second, estimation of “strategic thinking” is fundamentally an unidentified problem: we

can only see the game play itself, not the thought process underpinning it. (Introspective

reporting of thought processes is widely regarded as unreliable.) As such, inferences of

interest — in this case the proportion of players who use k-step thinking — depend crucially

on untestable assumptions.

Rather than making specific assumptions, fitting a model, and interpreting our estimates

(as do, for example, Brown et al. (2013) and Hossain and Morgan (2013)), we take

a partial identification approach, estimating an identified bound on an unidentified quantity.

Although relatively common in econometrics, this tactic is presently atypical in Bayesian

statistics; see Manski (2008) for a book-length treatment and Tamer (2010) for a review of


this general approach in non-Bayesian contexts. In our beauty contest analysis, we are able

to obtain a posterior distribution of an upper bound for the prevalence of k-step thinking.

Finding this upper bound to be quite low, we argue that we should re-evaluate the prevailing

wisdom (Camerer (2003), chapter 5) that the beauty contest is the k-step reasoning gold

standard.

The precise numerical estimates will vary subject to particular modeling decisions, such

as the shape of the error distribution on bids. But our viewpoint is that such modeling

choices are relatively benign compared to the stronger assumptions adopted by previous

methods, which are needed to establish full identifiability of the model. We opt to model the

observed phenomenon as accurately as possible and then to interpret the patterns we infer

after the fact (see Hill (2011) for a similar-in-spirit approach in a causal inference setting).

Because of this approach, had the probability estimates of k-step compliance turned out to

be very high, we still would not have proved k-step reasoning; we would only have found

an uninformative upper bound. In this respect our model, while Bayesian in execution, has

a falsificationist flavor, reflecting a philosophical outlook espoused recently in Gelman and

Shalizi (2013).

A Bayesian approach to partial identification thus yields approximate answers to the

right question (how many people could be k-step reasoning), whereas previous approaches

offered precise answers to the wrong question (e.g., what is the average level of thinking,

presuming that people exactly play according to, say, the CH-Poisson model).


References

A. L. Brown, C. F. Camerer, and D. Lovallo. Estimating structural models of equilibrium

and cognitive hierarchy thinking in the field: The case of withheld movie critic reviews.

Management Science, 59(3):733–747, 2013.

C. Camerer. Behavioral Game Theory. Princeton University Press, 2003.

C. F. Camerer, T.-H. Ho, and J.-K. Chong. A cognitive hierarchy model of games. Quarterly

Journal of Economics, 119(3):861–898, 2004. doi: 10.1162/0033553041502225. URL

http://www.mitpressjournals.org/doi/abs/10.1162/0033553041502225.

V. P. Crawford and N. Iriberri. Level-k auctions: Can a non-equilibrium model of strategic

thinking explain the winner’s curse and overbidding in private-value auctions? Economet-

rica, 75:1721–1770, 2007.

A. Gelman and C. R. Shalizi. Philosophy and the practice of Bayesian statistics. British

Journal of Mathematical and Statistical Psychology, 66(1):8–38, February 2013.

R. Gramacy, M. Taddy, and N. Polson. Dynamic trees for learning and design. Journal of

the American Statistical Association, 106(493):109–123, 2011.

J. L. Hill. Bayesian nonparametric modeling for causal inference. Journal of Computational

and Graphical Statistics, 20(1):217–240, 2011.

T. Hossain and J. Morgan. When do markets tip? A cognitive hierarchy approach. Marketing

Science, 32(3):431–453, 2013.

C. Manski. Identification for Prediction and Decision. Harvard University Press, 2008.

H. Moulin. Game Theory for Social Sciences. New York University Press, 1986.


R. Nagel. Unraveling in guessing games: An experimental study. American Economic

Review, 85(5):1313–26, December 1995. URL

http://ideas.repec.org/a/aea/aecrev/v85y1995i5p1313-26.html.

J. G. Scott and J. O. Berger. Bayes and empirical-Bayes multiplicity adjustment in the

variable-selection problem. Annals of Statistics, 38(5):2587–2619, 2010.

J. G. Scott, T. S. Shively, and S. G. Walker. Nonparametric Bayesian testing for monotonic-

ity. Technical report, University of Texas at Austin, April 2013.

E. Tamer. Partial identification in econometrics. Annual Review of Economics, 2:167–195,

2010.

A Best-responding for different payoffs and uncertainties

As presented here, a k-step strategy is a (conditional) best response in a winner-take-all

beauty contest if there is no bidding error and no agent uncertainty about the elements of

the belief matrix Ωi. If one wishes to incorporate such elaborations, more nuance is needed.

In particular, the optimal bid under winner-take-all is determined as p times the modal group

bid (the value believed by the agent to be the most likely group average bid). Permitting

agents to account for optimization error (where one’s opponents optimize their strategies

imperfectly) would force one to redefine the relationship given in (1) to take into account the

expected utility maximization. With unimodal error distributions, it ought to be possible to

reach similar conclusions to (i)-(iv).

With a modified payoff function based on the squared-distance from the target bid,

the situation becomes much tidier analytically, because (1) is preserved under expectations

and the optimal bid is simply p times the expected bid, where the expectation is over all

relevant uncertainties. In the case of either payoff function (we collect data under both), our


approach will be to test for conditions (i)-(iv), with the understanding that the interpretation

of the results will hinge on what assumptions one is willing to make about how agents handle

potential uncertainties. In our experiment we assume linearity of utility in log-dollars, leading

to the bell-shaped payoff scheme described above.
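The statement that the optimal bid under a squared-distance payoff is p times the expected bid can be seen from a one-line first-order condition. The following is a sketch under simplifying assumptions that are ours, not the paper's: b denotes the agent's bid, X̄ denotes the uncertain group average bid (treated as not depending on b), and expected utility is taken to be decreasing and linear in the squared distance between b and the target pX̄.

```latex
\begin{align*}
b^\ast &= \arg\min_{b}\; \mathbb{E}\big[(b - p\bar{X})^2\big], \\
\frac{d}{db}\,\mathbb{E}\big[(b - p\bar{X})^2\big]
  &= 2\big(b - p\,\mathbb{E}[\bar{X}]\big) = 0
  \quad\Longrightarrow\quad b^\ast = p\,\mathbb{E}[\bar{X}].
\end{align*}
```

Because the objective is a convex quadratic in b, the stationary point is the unique minimizer, and only the mean of X̄ matters, which is why no modal-bid argument is needed in this case.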

As a matter of realism, we feel that agents are likely to employ satisficing heuristics in

any event; assuming that agents posit no optimization error and no uncertainty concerning

Ωi strikes us as unobjectionable in light of this.

B Computational details

Posterior sampling is performed via a Metropolis-with-Gibbs approach. We sequentially

sample each parameter, given the current values of the remaining parameters, according to the

Metropolis-Hastings acceptance probabilities. A challenge with this approach is to devise

suitable proposal densities. We utilize a random-walk approach. As many of the model

parameters are restricted to the unit interval, we introduce a latent variable representation

using the “wrapping function”:

g(x) = x − ⌊x⌋ + 1(x < 0),

where ⌊x⌋ denotes the integer part of x and 1(·) is the indicator function. This function maps numbers in the unit interval to

themselves, while numbers outside the unit interval get mapped back to their fractional part,

in the case of positive numbers, or one minus their fractional part, in the case of negative

numbers. So 0.8 gets mapped to itself; 1.3 gets mapped to 0.3; -2.4 gets mapped to 0.6.

For a parameter, such as µ0, restricted to the unit interval, we conduct a random walk over

a latent parameter µ̃ on the whole real line and define µ0 ≡ g(µ̃). The µ̃ parameters are unidentified,

but this is irrelevant as we report inferences and define priors on the original, identified scale.
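The wrapping function and the latent random walk can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the names wrap and random_walk_step and the proposal scale are hypothetical, and the integer part is taken to mean truncation toward zero, which is the reading that reproduces the three worked values in the text.

```python
import math
import random


def wrap(x):
    """Wrapping function g(x) = x - int_part(x) + 1(x < 0).

    The integer part is truncation toward zero, so 1.3 -> 0.3 and -2.4 -> 0.6.
    """
    return x - math.trunc(x) + (1.0 if x < 0 else 0.0)


def random_walk_step(latent, scale, rng):
    """Perturb the latent value on the real line; return (latent, wrapped value)."""
    proposal = latent + rng.gauss(0.0, scale)
    return proposal, wrap(proposal)


rng = random.Random(0)
latent = 0.8
for _ in range(5):
    # The latent value may drift anywhere; the wrapped value stays in [0, 1].
    latent, mu0 = random_walk_step(latent, 0.5, rng)
    print(f"latent = {latent:+.3f}  ->  mu0 = {mu0:.3f}")
```

The likelihood and prior would be evaluated at the wrapped value, while the walk itself never needs to be truncated or reflected at the boundaries.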

For the φi parameters, which can be exactly zero with positive probability, we also introduce

a binary latent variable zi and define φi = zi g(φ̃i). We conduct the random walk on

the real line with φ̃i and transform to φi for likelihood evaluations.

One sweep of the sampler then proceeds as follows.

1. For each i, sample (θi | –) according to a Metropolis ratio using likelihood and prior

given in (2) and (4) respectively. Proposal density evaluations can be avoided by using

a symmetric random walk over the elements of θi, in our case an independent normal

distribution centered at the current values of ηi, φi, νi and µ0i .

2. For each i, sample (zi | –) using a straightforward application of Bayes rule.

3. For each i, sample (si | –) using a straightforward application of Bayes rule.

4. Sample (w | –) from a Dirichlet distribution with parameter α∗ = α + κ where κ

records counts of how many observations are currently assigned to each level of s.

5. Sample (ρ | –) as a Beta(γ, β) random variable. Let n1 = Σi zi and n0 = Σi (1 − zi).

Then γ = 3 + n0 and β = 1 + n1.

6. Sample (q0 | –) as a Beta(γ, β) random variable. Let nq0 be the number of players for

whom both φi = 0 and νi < µ0i ηi. Then γ = 3 + nq0 and β = 1 + n0 − nq0.

7. Sample (q1 | –) as a Beta(γ, β) random variable. Let nq1 be the number of players for

whom both φi ≠ 0 and νi < µ0i ηi. Then γ = 3 + nq1 and β = 1 + n1 − nq1.
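Steps 5 to 7 are conjugate Beta updates driven by counts, which the following sketch makes concrete. It is not the authors' code: the function names are hypothetical, the Beta(3, 1) prior is read off from the constants in the updates, and for q1 the sketch counts failures among the n1 players with φi ≠ 0, which is our assumption about the intended update.

```python
import random


def beta_update_params(z, convex, prior=(3.0, 1.0)):
    """Return the Beta parameters for rho, q0 and q1 given the current latent state.

    z[i] is 0 if phi_i == 0 (strategy through the origin) and 1 otherwise.
    convex[i] is True if nu_i < mu0_i * eta_i in the current draw.
    """
    gamma0, beta0 = prior
    n1 = sum(z)
    n0 = len(z) - n1
    nq0 = sum(1 for zi, ci in zip(z, convex) if zi == 0 and ci)
    nq1 = sum(1 for zi, ci in zip(z, convex) if zi == 1 and ci)
    return {
        "rho": (gamma0 + n0, beta0 + n1),
        "q0": (gamma0 + nq0, beta0 + n0 - nq0),
        # Assumption: failures for q1 are counted among the n1 players with phi_i != 0.
        "q1": (gamma0 + nq1, beta0 + n1 - nq1),
    }


def sample_mixing_probabilities(z, convex, rng):
    """Draw rho, q0 and q1 from their Beta full conditionals."""
    params = beta_update_params(z, convex)
    return {name: rng.betavariate(g, b) for name, (g, b) in params.items()}


# Made-up latent state for five players, for illustration only.
z = [0, 0, 0, 1, 1]
convex = [True, False, False, True, False]
print(beta_update_params(z, convex))
print(sample_mixing_probabilities(z, convex, random.Random(0)))
```

Because the updates depend on the data only through these counts, the severity with which a player misses convexity has no direct influence on the draws of q0 and q1.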

C Note on individual summary plots

Associated with this paper is a 106-page supplementary file individualSummaries.pdf.

Each page consists of two posterior summary plots for an individual subject, one corresponding

to the six bids from the winner-take-all version of the game and one corresponding to the six bids

from the bell-shaped payoff version. Each plot displays the bids themselves as points in the plane,

with the bid value on the vertical axis and the value of p in the game on the horizontal axis.


The posterior mean strategy is shown as a red curve, and a 95% posterior credible interval

for this curve is shown in gray. Each plot is labeled in the title with a posterior probability

of k-step compliance, which is computed as the fraction of the posterior samples in which

the strategy curve for that subject is increasing and convex and goes through the origin.
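The per-subject probability described above is a simple Monte Carlo fraction. The sketch below is illustrative only: the function names and the toy draws are made up, each draw is assumed to be a tuple (φ, ν, µ0, η) for one subject, convexity is checked with the criterion from the discussion section (ν below the line p(µ0 − φ) + φ at p = η), and monotonicity is assumed to be guaranteed by the parameterization rather than checked.

```python
def is_kstep_compliant(phi, nu, mu0, eta):
    """True if one posterior draw is through the origin and convex."""
    chord_at_eta = eta * (mu0 - phi) + phi  # the line p*(mu0 - phi) + phi at p = eta
    return phi == 0 and nu < chord_at_eta


def compliance_probability(draws):
    """Fraction of posterior draws (phi, nu, mu0, eta) that are k-step compliant."""
    flags = [is_kstep_compliant(*draw) for draw in draws]
    return sum(flags) / len(flags)


# Made-up draws for a single subject; not data from the experiment.
toy_draws = [
    (0.0, 0.2, 0.8, 0.5),  # through the origin, below the chord: compliant
    (0.0, 0.5, 0.8, 0.5),  # through the origin, above the chord: not convex
    (0.1, 0.2, 0.8, 0.5),  # misses the origin
    (0.0, 0.3, 0.8, 0.5),  # compliant
]
print(compliance_probability(toy_draws))
```

The exact comparison phi == 0 is meaningful here because the sampler sets φi to exactly zero whenever zi = 0.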

The plots themselves are instructive, both for observing how individuals bid and

how bidding behavior varies between individuals, and for observing how the model is

interpreting the data. We observe in particular that players with “orderly” bids that fail

convexity are judged to have lower probability of k-step compliance than players who bid

more haphazardly; this reflects the fact that for the haphazard player we cannot be certain

that she does not intend to play convex increasing and through the origin and just does so

poorly. As a special case of this, there are numerous players who appear to play a perfectly

linear strategy. If such a pattern of bids approximately coincides with an intersection at

the origin, then the subject is inferred to have a 50% probability of being k-step compliant.

However, if the linear trend does not pass through the origin, the posterior probability is very

nearly zero.
