
Skewness and Preferences for Non-Instrumental Information∗

PRELIMINARY, PLEASE DO NOT CITE

Yusufcan Masatlioglu†

University of Michigan

Yesim Orhun‡

University of Michigan

Collin Raymond§

University of Oxford

November 1, 2015

Abstract

We test individuals’ preferences for skewed information structures in situations where informa-

tion has no instrumental value. We find individuals exhibit a strong preference for positively

skewed information structures as well as Blackwell more informative information structures. We

show that our results allow for testing of a variety of models. A model based on the framework

of Kreps and Porteus (1978) is most consistent with the data we observe.

∗We thank the Michigan Institute for Teaching and Research in Economics, the Ross School of Business Faculty Grant Fund, and the University of Oxford John Fell Fund for support. Our errors are ours alone.

†Email: [email protected]

‡Email: [email protected]

§Email: [email protected].


1 Introduction

Information affects the beliefs we hold, what we choose, how happy our choices make us, the

precision and impact of opinions we perpetuate, and more generally, how we make sense of and

interact with the world around us. Neoclassical theory posits that people value information if and

only if the information is instrumental for decision-making. According to this view, people would

pay neither to acquire nor to avoid information that would not change their

decision-making.

People, however, often have strong preferences regarding information about an uncertain and

unavoidable outcome even when information has no instrumental value: they try to avoid situations

that make them feel anxious about a future event and cultivate instances of hope in the face of

uncertainty, even when doing so is costly. For example, anxious patients with potential symptoms

of a disease may put off taking a diagnostic test, even if it means delaying possible treatment, and

hopeful voters park themselves in front of the TV on election night, even though it costs them a

good night’s sleep.

Psychologists have long recognized that people regulate their anticipatory emotions regarding

an uncertain outcome in the future, such as hope, anxiety and suspense, by managing their beliefs

about the outcome. While people cannot directly choose their beliefs, they can choose the sources of

information they are exposed to that shape those beliefs. Therefore, these regulatory psychological

objectives shape what people want to learn, and when they want to learn it, even in the absence

of their ability to condition their actions on that information. In order to explain preferences

over non-instrumental information across a variety of contexts, both theory and empirical work

in economics recently embraced the idea that utility can also depend on beliefs.1 These models

lead to very different behavior over information acquisition and consumption than the objective of

instrumentality alone would generate.

The bulk of the work trying to understand preferences for non-instrumental information in

economics has focused on two particular types of preferences: preferences for early resolution of

information (as in Kreps and Porteus, 1978), or preferences for one-shot resolution of information

(as in Dillenberger, 2010 or Koszegi and Rabin, 2009). Thus, as discussed in a recent survey by

Golman, Hagmann and Loewenstein (2015), much of the literature has focused on how individuals

attempt to avoid (or seek out) information.

More generally, however, individuals may care not only about whether they observe any

information at all, but also about what kind of information they observe. Thus, they may seek out certain

types of information in order to manipulate their posterior beliefs. This leads us to an important,

1For applied work, see Koszegi, 2006; Caplin and Leahy, 2004; Mullainathan and Shleifer, 2005; Gentzkow and Shapiro, 2010; Caplin and Eliaz, 2003; and Oster, Shoulson and Dorsey, 2011. For theoretical work, see Kreps and Porteus, 1978; Dillenberger, 2010; and Koszegi and Rabin, 2009.


but relatively neglected, type of informational preference: the preference for skewed information.

Given equal priors over possible outcomes, we define a preference for positively skewed information

as a preference for a signal with a high false negative rate and a low false positive rate to a signal

with the symmetric and opposite features, where the false positive and false negative rates have

been exchanged.2 Consider information structures that give binary signals regarding whether the

outcome is going to result in a desired state or an undesired state. Negatively skewed information

structures eliminate more uncertainty about the undesired outcome conditional on generating a bad

signal, and positively skewed information structures eliminate more uncertainty about the desired outcome

conditional on generating a good signal.

In many real world situations we find ourselves having to trade off false negatives versus false

positives. Consider the voters who are sitting in front of the TV at midnight on November 6, 2012.

They can hear projections of the results on Fox or on MSNBC. Each source of information is known

to be somewhat biased: the stations bend the discussion of the available results towards the possible

victory of the party their consumer base supports. Hence, a Democrat has the choice between either

watching Fox, which generates more false negatives, or watching MSNBC, which generates more

false positives for her desired outcome. Clearly, watching either station will not change the election

results. However, voters may have intrinsic preferences over the trade-off between false positives

and false negatives, depending on whether they would rather be overly optimistic in the process

and disappointed at the end, or overly pessimistic in the process and surprised at the end. Despite

the fact that these types of preferences are ubiquitous and intuitive, little has been done to study

them, or ascertain how individuals might manifest them.

Recently, a few theoretical papers have attempted to specifically model non-instrumental pref-

erences for skewed information. These include Dillenberger and Segal (2015), Szech and Schweitzer

(2014), Caplin and Eliaz (2003), Eliaz and Spiegler (2006), Eliaz and Schotter (2010), and

Mullainathan and Shleifer (2005), where priors affect the preference for skew. However, these

applications disagree about what kind of preferences one should expect individuals to exhibit. For example,

Szech and Schweitzer (2014), Caplin and Eliaz (2003) and Dillenberger and Segal (2014) predict

preferences for positive skew, while Eliaz and Schotter (2010) and Eliaz and Spiegler (2006) focus

on the case of preferences for negatively skewed information.

One can draw on competing intuitions for each of these predictions about preferences. If,

conditional on observing any given signal, an individual prefers to have a higher posterior belief,

then they should prefer positively skewed signals. However, if, in contrast, an individual prefers

to see more signals that are associated with increasing their posterior, rather than decreasing, then

they should prefer negatively skewed signals. Thus, it becomes important to empirically test what

2The positive/negative nomenclature is based on the fact that the two signals generate distributions of posteriors that have the same expectation and variance (first and second moments), but the former has a positive third moment, while the latter has a negative third moment.


kind of preferences individuals actually exhibit.

Our main motivation for investigating informational preferences for skewness is their prevalence

in many economic decisions individuals face. However, an important ancillary benefit is that

characterization of these preferences also allows us to distinguish between belief-based utility models

in a manner that is not possible by examining preferences for early or one-shot resolution of infor-

mation. A variety of theoretical models have been developed to try to understand and characterize

belief-based utility that generate an intrinsic preference for information. These include models

of: dynamic reference dependence, such as Koszegi and Rabin (2009); anticipatory utility such as

Brunnermeier and Parker (2005) and Caplin and Leahy (2001); and models of preferences for the

timing of the resolution of information such as Kreps and Porteus (1978), Dillenberger (2010) and

Dillenberger and Segal (2014). These models are designed to accommodate behavior that neoclassical

models cannot, and as such have become widely used in the literature. Distinguishing between

different models of information preferences is an important step in incorporating belief-based utility

in policy-making, as it would help understand what policies can best improve welfare in contexts

where there is a conflict between utility from beliefs and utility from material payoffs.

However, current experimental evidence studying preferences for early or one-shot resolution of

information cannot distinguish between most existing models of belief-based utility, because these

models are designed to generate these exact preferences. On the other hand, preferences for skewed

information were not among the original motivating factors in the development of many models

of belief-based utility. Thus, our experiment allows us to test an important, but “out-of-sample”

prediction — whether individuals prefer to learn more about whether they will receive the better

outcome or the worse outcome. Importantly, different classes of models generate very different

predictions for skewness preferences. These diverging ancillary predictions allow us to distinguish

competing models of belief-based utility based on our experimental results.3

We report results from a lab experiment that elicits preferences for information in an environ-

ment where the information, by construction, cannot influence actions. The future outcome of

interest has two states that are ordered in terms of payoffs: whether the lottery ticket each partici-

pant was given is one of the winning tickets that were drawn at the start of the experiment, or not.

The likelihood of having one of the winning tickets is 50%. Participants make choices between two

information structures across five questions. All information structures can generate one of two

possible signals: good and bad. However, the information structures vary in how much and what

kind of uncertainty they resolve. The participants observe the signal generated by one

of the information structures they choose, learn the posterior likelihood that their ticket is one of

the winning tickets, and sit with that information for at least half an hour before the winning ticket

numbers are revealed.

3Although preferences for skewed distributions of outcomes have been discussed extensively in economics, the discussion has not been extended for the most part to preferences for skewed information structures.


Our experimental design addresses three important challenges in identifying preferences for

non-instrumental information. First, it ensures that information cannot influence actions in the

experiment or elsewhere: subjects are in the lab answering unrelated questions after observing

the signal generated by the information structure they chose, and before learning the outcome for

sure. Therefore, preferences for information are entirely driven by belief-based utility. Second, it

ensures that the observed preferences are for information that impacts subjects’ beliefs about future

outcomes and their belief-utility, rather than for information that shapes their self-perceptions,

impacts their ego-utility or confidence regarding making a choice: subjects are randomly assigned

to the high payoff condition, which is entirely unrelated to any characteristics of them, and do

not have any control or agency regarding the outcome. Third, it reduces subjects’ information-processing

costs to ensure that preferences reflect utility and not cognitive processing constraints:

as part of the experiment, we explain to subjects with what probability they will receive any

given signal, and what posteriors they should have after observing any given signal. Thus, choices

we observe are not the result of individuals incorrectly updating, or being confused by what the

information is telling them.

Our three main contributions are as follows. First, we demonstrate the existence of strong

preferences regarding the skewness of information. Our results indicate that individuals exhibit a

strong preference for positively skewed information structures relative to those that are negatively

skewed. These preferences are demonstrated at both the between-person and within-person level and

are robust to the type of information structures subjects compare. Individuals seem to prefer ruling

out more uncertainty about the desired outcome (and tolerating uncertainty about the undesired

outcome) compared to ruling out more uncertainty about the undesired outcome (and tolerating

uncertainty about the desired outcome).

Second, we discuss how preferences for skewed information relate to other preferences for in-

formation that are discussed in the literature. Preferences for third moments (preferences for

skewness) have been extensively discussed for choices over risky outcomes (e.g. lotteries) and have

been related to behavior such as precautionary savings. In fact, there is a natural analogy between

studying preferences over lotteries and intrinsic preferences over information structures. The first,

second and third derivatives of the Bernoulli utility function over wealth are known to determine preferences

with respect to first-order stochastic dominance, mean-preserving spreads and mean-variance-preserving

shifts (Rothschild and Stiglitz, 1970; Menezes et al., 1980). The first and second derivatives of a

utility function defined over information structures are likewise known to relate to monotonicity and

preferences for early versus late resolution of information, respectively (Kreps and Porteus, 1978;

Grant, Kajii and Polak, 1998). Moreover, as Grant, Kajii and Polak (1998) show, there is a natural

analogy between risk aversion and a preference for late resolution of information, linking the

orderings associated with risk and the orderings associated with information. Our within-person


design allows for an empirical investigation of how preferences for skewness relate to preferences

regarding early versus late and one-shot versus gradual resolution of uncertainty.

Third, we use our results as a testing ground for belief-based utility models. We find that our data

most strongly support the class of preferences introduced by Kreps and Porteus (1978) and

extended by Grant, Kajii and Polak (1998). The preferences for skewness that subjects exhibit

are inconsistent with the predictions of many other models used to generate non-instrumental

preferences for information, including Koszegi and Rabin (2009). Moreover, other models that

are able to generate preferences for skewness in accordance with what we observe (such as

dynamic extensions of rank-dependent utility, Gul’s (1991) model of disappointment aversion and

Dillenberger and Segal’s (2015) model of skewness) cannot accommodate the fact that we find no evidence

for a preference for no information over very little information.

The rest of the paper is structured as follows. Section 2 presents a simple theoretical framework

that allows us to analyze preferences over information structures. It focuses on situations

where there are two possible outcomes and two possible signals, a simple environment that

still allows us to demonstrate important results, and it summarizes the existing literature.

Section 3 discusses what important classes of models imply for preferences over information structures

in our environment, implications that we are able to test. Section 4 outlines

the experimental design. Section 5 presents the results of our experiment. Section 6 discusses these

results and relates them to the theoretical predictions of Section 3. Section 7 concludes.

2 Framework and Literature Review

2.1 General Framework and Preliminaries

We focus on individuals’ preferences for information where all probabilities are objectively known,

rather than subjective. In order to capture preferences for information, our theory focuses on an

idealized situation where there are three periods (0, 1 and 2). In Period 0 individuals have a

prior probability distribution over states that will be realized and determine payoffs in Period 2.

In Period 1 they receive a signal, which might cause them to update their prior to a posterior.

In Period 2 the states are revealed and individuals receive their payoff. Importantly, individuals

cannot take any actions; thus all preferences for information must come from intrinsic, rather than

instrumental, motivations.

Formally, we imagine there is a finite number N of indexed states $\omega_i$. Each state corresponds

to a different payoff for the individual. Moreover, there are M signals, indexed by $s_j$. An information

structure I is an N by M matrix such that the entries in each row sum to 1. The (i, j)-th entry of

the matrix, denoted $I_{ij}$, gives the probability that signal $s_j$ is realized if the state is $\omega_i$. Given a

prior distribution f over states, if the individual uses Bayes’ rule then the posterior probability of


state ωi conditional on observing signal sj is given by:

$$\psi_j(\omega_i) = \frac{f(\omega_i)\, I_{ij}}{\sum_k f(\omega_k)\, I_{kj}}$$

We will suppose that individuals have preferences over information structures given the prior f,

denoted by $\succsim_f$. Formally, within the economics literature, these are typically modeled as preferences

over two-stage compound lotteries, that is, lotteries over lotteries. Each signal $s_j$ induces a lottery over

outcomes, namely the posterior distribution $\psi_j$. This is the lottery that individuals face in Period 1

after receiving information. In Period 0, the individual faces a lottery over these possible lotteries:

signal $s_j$ is received with probability $\sum_i f(\omega_i) I_{ij}$. There is a natural bijection between

prior-information structure pairs and two-stage compound lotteries. Because our focus is on information,

we will write preferences, and utility functionals, over the space of prior-information structure pairs.

However, formal results will use the induced preferences in the space of two-stage compound lotteries

(for example, assumptions on convexity or concavity of the utility function will be stated using the

space of two-stage compound lotteries).4
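The mapping from a prior-information structure pair to a two-stage compound lottery can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `signal_probs_and_posteriors` and the three-state example are assumptions made here, not part of the paper, and exact rational arithmetic is used so that the law of iterated expectations can be checked without rounding.

```python
from fractions import Fraction


def signal_probs_and_posteriors(prior, info):
    """Return (signal_probs, posteriors) for a prior and an information structure.

    prior[i]   = f(omega_i), the prior probability of state i.
    info[i][j] = I_ij, the probability of signal j in state i (rows sum to 1).
    posteriors[j][i] = psi_j(omega_i); None if signal j has probability zero.
    """
    n_states, n_signals = len(info), len(info[0])
    # First stage of the compound lottery: probability of each signal.
    signal_probs = [
        sum(prior[i] * info[i][j] for i in range(n_states))
        for j in range(n_signals)
    ]
    # Second stage: the posterior over states after each signal (Bayes' rule).
    posteriors = []
    for j in range(n_signals):
        if signal_probs[j] == 0:
            posteriors.append(None)  # Bayes' rule is undefined for this signal
        else:
            posteriors.append(
                [prior[i] * info[i][j] / signal_probs[j] for i in range(n_states)]
            )
    return signal_probs, posteriors


# Hypothetical example (not from the paper): three states, two signals.
prior = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
info = [
    [Fraction(3, 4), Fraction(1, 4)],
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
signal_probs, posteriors = signal_probs_and_posteriors(prior, info)
```

Averaging the posteriors with the signal probabilities as weights recovers the prior, which is the sense in which every information structure induces the same reduced-form lottery over outcomes.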

In order to derive predictions applicable to our particular experimental setting, we will focus

on situations where there are two outcomes, high and low, with utility values u(H) and u(L) (in

this subsection we normalize these utilities to 1 and 0 respectively), and so only two states and two signals (i.e.,

M = N = 2). We will define the information structures we will consider, as well as important

orderings on the set of information structures, which we will use in deriving testable predictions

regarding behavior from important classes of models.

Given the two outcomes, we denote the prior probability on the high outcome as f . The

decision-maker has access to a set of binary signal structures: the realizations are G (good) or B

(bad). A good (bad) signal is a signal that increases (decreases) the beliefs about the outcome

being high compared to the prior. I is then a two by two matrix which can be characterized by

only two numbers (since the entries in each row must sum to 1). We do so by setting $p = I_{11}$ and

$q = I_{22}$. Thus, our information structures are characterized as points $(p, q)$ in $\mathbb{R}^2$. The probability

of a good signal conditional on a high outcome is p = p(G|H), and the probability of a bad signal conditional

on low outcome is q = p(B|L). Using Bayes’ Rule, after observing a good signal the posterior for

a high outcome is

$$\psi_G = \frac{fp}{fp + (1-f)(1-q)}. \tag{1}$$

After observing a bad signal the posterior is

$$\psi_B = \frac{f(1-p)}{f(1-p) + (1-f)q}. \tag{2}$$

4For an extended discussion of these issues, please see the Appendix.


Observing a good signal occurs with probability fp+ (1− q)(1− f) and observing a bad signal

occurs with probability f(1− p) + q(1− f).
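As a minimal sketch of equations (1) and (2) and the two signal probabilities, the following Python function evaluates them for a given prior and structure (p, q). The function name `binary_posteriors` and the example structure (4/5, 3/5) are hypothetical choices made for illustration only.

```python
from fractions import Fraction


def binary_posteriors(f, p, q):
    """Signal probabilities and posteriors on the high outcome.

    f = prior on high, p = P(G | H), q = P(B | L).
    Returns (prob_good, psi_good, prob_bad, psi_bad); a posterior is None
    when its signal has probability zero.
    """
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good if prob_good > 0 else None      # equation (1)
    psi_bad = f * (1 - p) / prob_bad if prob_bad > 0 else None   # equation (2)
    return prob_good, psi_good, prob_bad, psi_bad


# Hypothetical structure (p, q) = (4/5, 3/5) at the prior f = 1/2.
prob_good, psi_good, prob_bad, psi_bad = binary_posteriors(
    Fraction(1, 2), Fraction(4, 5), Fraction(3, 5)
)
```

In this example the good signal arrives with probability 3/5 and moves the posterior to 2/3, while the bad signal arrives with probability 2/5 and moves it to 1/4; the probability-weighted average of the two posteriors is the prior, 1/2.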

In our binary-binary world, we can represent any possible signal structure as a point in (p, q)

space, with the horizontal axis measuring p. For the rest of the paper we assume that

all information structures (which could in principle lie anywhere in the unit square) lie either above the line

p + q = 1 or at the point (.5, .5). We denote this set by $S := \{(p, q) \in [0,1]^2 \mid p + q > 1\} \cup \{(.5, .5)\}$. We

do so for two reasons. First, all points in S have a natural interpretation: a good signal is good

news (a bad signal is bad news). Lemma 1 formalizes this.

Lemma 1 For any (p, q) ∈ S, observing a good signal increases the posterior on high outcome

relative to the prior, and observing a bad signal decreases the posterior on high outcome relative to

the prior.

Moreover, this set of signals is a minimal set that still allows us to capture all possible posterior

distributions, as shown by Lemma 2.

Lemma 2 For any signal structure (p′, q′) ∈ [0, 1] × [0, 1], there exists a (p, q) ∈ S that generates

the same posterior distribution. However, for any T ⊂ S there exists a (p′, q′) ∈ S such that there

is no element of T that generates the same posterior distribution as (p′, q′).
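One way to see the first part of Lemma 2 is through relabeling, sketched below in Python. This is an illustration of one possible argument under stated assumptions, not the authors' proof: a structure below the line p + q = 1 is mapped into S by swapping the names of the two signals, and a structure on the line is uninformative and therefore equivalent to (.5, .5). The helper names `posterior_lottery`, `canonical` and `signals_are_ordered` are hypothetical, and the prior is assumed to lie strictly between 0 and 1.

```python
from fractions import Fraction


def posterior_lottery(f, p, q):
    """Map each posterior on the high outcome to its probability.

    Signals with probability zero are dropped, and signals that lead to
    the same posterior are merged.
    """
    lottery = {}
    branches = (
        (f * p + (1 - f) * (1 - q), f * p),          # good signal
        (f * (1 - p) + (1 - f) * q, f * (1 - p)),    # bad signal
    )
    for prob, numerator in branches:
        if prob > 0:
            posterior = numerator / prob
            lottery[posterior] = lottery.get(posterior, 0) + prob
    return lottery


def canonical(p, q):
    """Return a point of S that induces the same posterior lottery as (p, q)."""
    if p + q > 1:
        return p, q                                   # already in S
    if p + q == 1:
        return Fraction(1, 2), Fraction(1, 2)         # uninformative structure
    return 1 - p, 1 - q                               # swap the signal labels


def signals_are_ordered(f, p, q):
    """Check the Lemma 1 property: psi_G >= f >= psi_B (assumes 0 < f < 1)."""
    psi_good = f * p / (f * p + (1 - f) * (1 - q))
    psi_bad = f * (1 - p) / (f * (1 - p) + (1 - f) * q)
    return psi_good >= f >= psi_bad
```

For instance, with a prior of 3/10 the structure (1/5, 3/10) lies below the line and is mapped to (4/5, 7/10), which produces the same posteriors with the same probabilities. The sketch does not address the minimality claim in the second part of the lemma.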

Given this restriction we can consider examples of information structures. An information

structure that resolves all information as early as possible is one in which p = q = 1. In this

case, a good signal implies that the outcome is high for sure. Similarly, a bad signal indicates

a bad outcome for sure. An information structure which conveys no information at all will have

p = q = .5. In this case, the posterior after either signal is equal to the prior. With these examples

in mind we now turn to focus on some interesting orderings over prior-information structure pairs.

2.2 Types of Informational Preferences

Preference for earlier versus later: Most of the theoretical models for non-instrumental

information focus on accommodating preferences for early versus late resolution of information.

There has been a fair amount of work in this domain in both the decision theory literature (e.g.,

Kreps and Porteus, 1978; Epstein and Zin, 1989; Grant, Kajii, and Polak, 1998) and the behavioral

literature (e.g., Koszegi and Rabin, 2009 and Koszegi, 2009). These models characterize the condi-

tions under which a preference for early resolution of uncertainty can be observed, albeit proposing

distinct processes.

A preference for early or late resolution is tightly linked to the most well-known ordering

of information structures in economics, Blackwell’s ordering. Blackwell’s ordering was originally

designed to be used in situations where the individual’s payoffs in Period 2 depend on both the

state and an action taken by individuals in Period 1. Information structure (p′, q′) Blackwell


dominates structure (p, q) if, for any set of actions and utility functions over action and state

pairs, an individual with expected utility preferences receives a higher expected utility from (p′, q′)

compared to (p, q).

However, as Kreps and Porteus (1978) and Grant, Kajii and Polak (1998) demonstrate, there

is a meaningful mapping between Blackwell’s ordering and information preferences even when

information is non-instrumental (i.e., individuals cannot take any action based on it). In particular,

one information structure resolves more uncertainty earlier than another if the first information

structure is Blackwell more informative than the second. Note that lotteries may be earlier or later

than each other, even if all uncertainty is not resolved early (Period 1) or late (Period 2). Lemma

3 formalizes the intuitive concept of “informativeness” of the signal structure in our setting by

defining when one signal structure is Blackwell dominated by another.

Lemma 3 (p′, q′) Blackwell dominates (is Blackwell more informative than) (p, q) if and only if

$$p' \geq \max\left\{ \frac{p}{1-q}\,(1-q'),\ 1 - q'\,\frac{1-p}{q} \right\}.$$

If (p′, q′) is Blackwell more informative than (p, q) then the posteriors under (p′, q′) are a mean

preserving spread of the posteriors under (p, q) — a result that follows from the law of iterated

expectations. Clearly, this will be true if p′ > p and q′ > q, but Lemma 3 shows it can also

be true under less stringent conditions. Figure 1 illustrates the set of all signals (p′, q′) that are

Blackwell more informative than the signal (p, q) = (.6, .6). Importantly, the definition of Blackwell

more informative is prior-independent, and so preferences for early over late resolution (or

vice-versa) must obey the constraints imposed by Lemma 3 regardless of the prior.

[Figure 1 plots the unit square in (p, q) space, marks the signal (0.6, 0.6), and labels the region “Blackwell More Informative Set”.]

Figure 1: Signals that are Blackwell More Informative than (.6,.6)
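The condition in Lemma 3 is straightforward to check numerically. The sketch below uses a cross-multiplied form of the two inequalities, which is algebraically equivalent to the lemma whenever 0 < q < 1 and avoids division by zero when q = 1. The function name `blackwell_dominates` and the tolerance parameter are assumptions introduced here for illustration.

```python
def blackwell_dominates(p_new, q_new, p, q, tol=1e-12):
    """True if (p_new, q_new) Blackwell dominates (p, q), both taken from S.

    Cross-multiplied form of Lemma 3:
      p_new >= p / (1 - q) * (1 - q_new)   <=>  p_new * (1 - q) >= p * (1 - q_new)
      p_new >= 1 - q_new * (1 - p) / q     <=>  (1 - p_new) * q <= (1 - p) * q_new
    The small tolerance guards against floating-point rounding at the boundary.
    """
    good_signal_ok = p_new * (1 - q) >= p * (1 - q_new) - tol
    bad_signal_ok = (1 - p_new) * q <= (1 - p) * q_new + tol
    return good_signal_ok and bad_signal_ok


# Full early resolution dominates (.6, .6), which in turn dominates no information.
full_over_partial = blackwell_dominates(1.0, 1.0, 0.6, 0.6)
partial_over_none = blackwell_dominates(0.6, 0.6, 0.5, 0.5)

# Dominance does not require p' > p and q' > q: here q' = .55 < .6.
lower_q_still_dominates = blackwell_dominates(0.7, 0.55, 0.6, 0.6)

# The two extreme skewed structures are not ranked by Blackwell's ordering.
skewed_ranked = (
    blackwell_dominates(0.5, 1.0, 1.0, 0.5) or blackwell_dominates(1.0, 0.5, 0.5, 1.0)
)
```

The last check illustrates why Blackwell's ordering alone is silent about skewness: (.5, 1) and (1, .5) are not comparable under it.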

Preference for one-shot versus gradual: Another ordering over information structures


that is discussed in the literature is a preference for one-shot resolution of uncertainty. Building

on Palacios-Huerta (1999), Dillenberger (2010) provides a characterization of a preference for one-shot

resolution of uncertainty. Dillenberger describes an individual who prefers full early resolution

(p = q = 1) or full late resolution (p = q = .5) over any other information structure, given any

prior f. This phenomenon is closely linked to the notion of a preference for clumping, introduced

by Koszegi and Rabin (2009). Ely, Frankel and Kamenica (2013) model preferences for gradual

resolution of information and their application in a mechanism design setting. In our experiment we

focus on looking for a local preference for clumping, in which an individual prefers an information structure

that reveals nothing in Period 1 to all “nearby” information structures, which convey only a little

information in Period 1.

Preference for positive versus negative skewness: The last ordering we want to discuss,

and the novel contribution of this paper, is testing preferences for skewed information — whether in

Period 1, individuals prefer to learn more about whether they will receive the better outcome (H) or

the worse outcome (L). In the theoretical literature there are several notions related to preferences

for skewness (including third-order stochastic dominance, third-degree risk order, mean variance

preserving probability transformations, the central third moment and the Dillenberger and Segal

(2015) notion of skewness). Given our experimental design, with a fixed prior and “symmetric”

signal structures, these notions coincide, and so for simplicity we focus on providing intuition via

mean-variance preserving probability transformations, which, as Menezes et al. (1980) have shown, are

equivalent to the third-degree risk order.

For example, suppose we fix f = .5. Consider structures (.5, 1) and (1, .5), which are positively

and negatively skewed respectively. The positively skewed information structure provides a 25%

chance to resolve all uncertainty in favor of the better outcome (giving a posterior of 1), while

delivering worse-than-before news 75% of the time (delivering a posterior of 1/3). The negatively

skewed information structure provides a 25% chance to resolve all uncertainty in favor of the worse

outcome (giving a posterior of 0), while delivering better-than-before news 75% of the time (and giving

a posterior of 2/3).

These examples of skewed information structures have a particular feature that they also allow

full resolution of uncertainty in Period 1 in some cases. Theories that predict a preference for

these “extreme” skewed information structures also predict the same preference for interior cases

where information is always resolved gradually. Thus, we are interested in whether people prefer

information structures that, given equal priors, are more accurate at predicting the worse outcome

than those that are more accurate at predicting the better outcome. Formally, suppose a ≥ b. A

preference for positively skewed information holds if the decision maker prefers (b, a) to

(a, b). Given equal priors, the expectation of the posterior distribution is the same for both

information structures (by the Law of Iterated Expectations). Moreover, the variances of the posterior


distributions are the same. But, (b, a) has a posterior distribution with a positive third moment,

while (a, b) has a posterior distribution with a negative third moment (and in fact, the two third

moments have the same absolute value).5
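The moment comparison between (b, a) and (a, b) can be verified directly. The following Python sketch computes the mean, variance and third central moment of the posterior on the high outcome; the function name `posterior_moments` is a hypothetical helper, and exact fractions are used so that equalities hold without rounding.

```python
from fractions import Fraction


def posterior_moments(f, p, q):
    """Mean, variance and third central moment of the posterior on the high outcome."""
    # Each branch is (probability of the signal, f times the signal's likelihood under H).
    branches = [
        (f * p + (1 - f) * (1 - q), f * p),          # good signal
        (f * (1 - p) + (1 - f) * q, f * (1 - p)),    # bad signal
    ]
    dist = [(prob, num / prob) for prob, num in branches if prob > 0]
    mean = sum(prob * post for prob, post in dist)
    variance = sum(prob * (post - mean) ** 2 for prob, post in dist)
    third = sum(prob * (post - mean) ** 3 for prob, post in dist)
    return mean, variance, third


half = Fraction(1, 2)
positively_skewed = posterior_moments(half, half, Fraction(1))   # (b, a) = (.5, 1)
negatively_skewed = posterior_moments(half, Fraction(1), half)   # (a, b) = (1, .5)
```

With f = .5, both structures give a posterior mean of 1/2 and a variance of 1/12, while the third central moments are 1/36 for (.5, 1) and -1/36 for (1, .5). The same pattern holds for interior pairs such as (3/5, 4/5) and (4/5, 3/5).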

3 Theoretical Predictions

There are a variety of theoretical models that predict preferences over information structures. In

this section we will derive testable predictions of important classes of models. We will then take

these predictions to the data, which can help us understand the best way to model the preferences

we observe. This is important because, for example, although many models can predict a preference

for skewness, they will predict it in conjunction with other phenomena.

In order to facilitate comparisons of the predictions, we will frame the assumptions we make

as restrictions on functional forms. For some of the predictions, there are equivalent formulations

which can be made directly on the preferences. We will also not describe the models themselves in

detail in this section. The interested reader can see the Appendix for a more detailed description

of the models, assumptions and axioms. In all predictions, in order to match the design of the

experiment, we assume the prior is 0.5, as well as the binary state, binary signal framework discussed

in the previous section. Thus we frame our predictions in terms of observed preferences, $\succsim_{.5}$.

Traditional economics would assume that individuals do not have non-instrumental preferences

over information; Segal (1990) describes these individuals as satisfying an axiom called Reduction

of Compound Lotteries. When re-framed in our domain, information structures, this axiom simply

says that an individual should not care about what information structure they face, i.e., $(p, q) \sim_f (p', q')$.

Prediction 1 Fixing f , the decision maker should be indifferent between all information structures.

Of course, it is easy to imagine that individuals are not indifferent between all information

structures even when information has no instrumental value. Thus, the literature has considered

various weakenings of the Reduction of Compound Lotteries assumption. One assumption, intro-

duced by Segal (1990), is that rather than being indifferent between all lotteries that have the same

reduced form probabilities over final outcomes, individuals are only indifferent between full early

resolving lotteries (i.e., p = q = 1) and full late resolving lotteries (i.e., p = q = .5) that have the same

reduced form probabilities over final outcomes (i.e. the same f). Segal describes these individuals

as satisfying the Time Neutrality axiom, which says $(1, 1) \sim_f (.5, .5)$.

Prediction 2 Fixing f , the decision-maker should be indifferent between (1, 1) and (.5, .5).

5Of course, if priors vary, then although the posterior distributions of (b, a) and (a, b) will always have the same mean, they will not necessarily have the same variance.


Because time neutrality imposes a type of stationarity on preferences, it has been widely

used in the literature, including in papers such as Dillenberger (2010). In contrast, a large number of

papers, beginning with Kreps and Porteus (1978), have discussed the importance of preferences

that do not satisfy time neutrality. In particular they focus on individuals who have a preference

for early resolution of information. Individuals have a preference for early resolution of uncertainty

if, given two lotteries which generate the same reduced form probabilities over the same outcomes,

they always prefer a compound lottery which is more Blackwell informative in the first stage. Grant,

Kajii and Polak (1998) show that given mild differentiability assumptions on the utility function

V that represents the preferences, a preference for more (less) Blackwell informative signals is

equivalent to the local utility function of V being convex (concave).6

Prediction 3 Let ≿_f be represented by V, where V is Gateaux differentiable. Then the local utility

function of V is everywhere convex (concave) if and only if the decision-maker prefers Blackwell

more (less) informative information structures.

The intuition for this result (drawing on Machina, 1982) is that individuals like increases in the

spread of their posterior if and only if their local utility functions are convex (here the posterior

distribution acts like a monetary lottery). Similarly, we know in the case of monetary lotteries

that if the derivative of the local utility function is convex, then the individual prefers positively

skewed lotteries. This intuition naturally maps into compound lotteries, as the next prediction

demonstrates.7,8

Prediction 4 Let ≿_{.5} be represented by V, where V is Gateaux differentiable. If the local utility

function of V is thrice differentiable and has a convex (concave) derivative everywhere, then

(x, y) ≿_{.5} (≾_{.5}) (y, x) whenever x ≤ y.

Segal (1990) proposed a class of preferences, called “Recursive Preferences”. In this class,

decision-makers evaluate situations with information revelation through a folding-back procedure that

uses two functionals V1 and V2, which represent utility in Periods 1 and 2, respectively.

We next turn to the predictions of some specific models that assume recursivity and

can address preferences for skewness.9 The first model to implicitly use recursive preferences to

address non-instrumental preference for information was introduced by Kreps and Porteus (1978).

6 Recall that convexity, concavity, as well as the local utility functions, are defined in the space of two-stage compound lotteries induced by the prior-information structure pair.

7 Again, the function is defined in the space of induced two-stage compound lotteries.

8 Our requirement on the differentiability of the utility functional and the local utility functions is stronger than it actually needs to be in both this and the previous prediction. Using the techniques of Cerreia-Vioglio, Maccheroni and Marinacci (2015) we can relax all differentiability assumptions.

9 Eliaz and Spiegler (2005) discuss an impossibility result related to preferences for skewness and preferences for full-early resolution. However, their exact results rely on existential quantifiers that are impossible to violate and calibrate.

They assume both V1 and V2 have expected utility representations. Given this specification, we

now provide a stronger version of Predictions 3 and 4.10

Prediction 5 Suppose preferences have a recursive representation (V1, V2) such that V1 and V2

have expected utility representations with Bernoulli utilities u1 and u2. Then u1 ◦ u2^{-1} is convex

(concave) if and only if the decision-maker prefers Blackwell more (less) informative information structures.

Moreover, if the derivative of u1 ◦ u2^{-1} is convex (concave), then (x, y) ≿_{.5} (≾_{.5}) (y, x) whenever

x ≤ y.
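
As a numerical illustration of Prediction 5, the following minimal sketch evaluates a few information structures under a Kreps-Porteus functional. Several ingredients are assumptions made here for illustration and are not taken from the text: the parametrization p = P(good signal | high outcome) and q = P(bad signal | low outcome), the normalization u2(low) = 0 and u2(high) = 1 (so that second-stage expected utility equals the posterior), the exponential choice for u1 ◦ u2^{-1}, and the function names.

```python
import math


def posterior_lottery(p, q, prior=0.5):
    """Return [(signal probability, posterior of the high outcome), ...].

    Assumed parametrization (not restated in this section):
    p = P(good signal | high outcome), q = P(bad signal | low outcome).
    """
    pr_good = prior * p + (1 - prior) * (1 - q)
    pr_bad = 1 - pr_good
    lottery = []
    if pr_good > 0:
        lottery.append((pr_good, prior * p / pr_good))
    if pr_bad > 0:
        lottery.append((pr_bad, prior * (1 - p) / pr_bad))
    return lottery


def kp_value(p, q, phi, prior=0.5):
    """Kreps-Porteus value under the normalization u2(low)=0, u2(high)=1.

    With that normalization the second-stage expected utility equals the
    posterior, so the first-stage value is E[phi(posterior)], where
    phi stands for u1 o u2^{-1}.
    """
    return sum(w * phi(b) for w, b in posterior_lottery(p, q, prior))


phi = math.exp  # hypothetical phi: convex, with a convex derivative

early, late = kp_value(1, 1, phi), kp_value(0.5, 0.5, phi)
pos_skew, neg_skew = kp_value(0.5, 1, phi), kp_value(1, 0.5, phi)
print(early > late)         # the more informative structure has higher value
print(pos_skew > neg_skew)  # the positively skewed structure has higher value
```

Replacing `phi` with a concave function whose derivative is concave would reverse both comparisons, in line with the parenthetical cases of the prediction.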

There are also other models, used in a variety of applications, which make other particular

functional form assumptions. Two well-known classes of models used in dynamic applications are

the recursive extensions of Gul’s (1991) model of disappointment aversion and rank dependent

utility.11 Both of these models can accommodate a preference for right-skewed information or for

left-skewed information.12 However, they also generate additional predictions regarding behavior

that can separate them from the basic Kreps-Porteus model. Our next prediction is one such

behavior: it states that if an individual’s preferences fall within either of these classes and are

consistent with the empirical evidence on the Allais paradox and first-order risk aversion, then they

must exhibit local preference for late resolution.13


Prediction 6 Suppose preferences have a recursive representation (V1, V2) such that V1 and V2

are both in Gul’s class of disappointment aversion functionals (or rank-dependent utility) and the

decision-maker is disappointment averse (has a strictly convex weighting function). Then there

exists an ε′ > 0 such that for all ε < ε′, (.5, .5) ≻_{.5} (.5 + ε, .5 + ε).

One paper directly addressing preferences for skewed information is Dillenberger and Segal

(2015). They provide sufficient conditions such that, fixing a prior, if an individual prefers full late

resolution (.5, .5) over all more informative structures, (p, q) where p ≥ .5 and q ≥ .5, then they

must also prefer (.5, .5) over all negatively skewed structures. However, these individuals prefer

some positively skewed structures over (.5, .5). We refer the interested reader to their paper for a

full description of the conditions.14

10 Caplin and Leahy’s (2001) model of psychological expected utility nests Kreps and Porteus’ (1978) specification in our framework.

11 One might ask why not directly test the assumption of recursivity with our data. We cannot do so because, fixing a prior belief about outcomes, recursivity itself has no testable implications regarding preferences over information. Thus we must test recursivity in conjunction with other assumptions about the structure of the functionals.

12 This is almost trivially true since they nest the Kreps-Porteus functional form just discussed.

13 As inspection of the proofs will demonstrate, these are stronger than necessary conditions.

14 Dillenberger and Segal (2015) provide a different definition of skewness. In our particular domain, their definition, as well as all other notions of skewness coincide, because of the binary nature of the outcome and priors being equal to 50%.

Prediction 7 Suppose preferences belong to the class defined by Dillenberger and Segal (2015).

Then if (.5, .5) ≿_{.5} (x, x) for all x ≥ .5, then for all x ≤ y, (.5, .5) ≿_{.5} (y, x). However, it is possible

that (.5, .5) ≾_{.5} (x, y).

We also want to consider three important models of preferences which do not satisfy recursivity,

but which are used in many applications to generate preferences over information. Brunnermeier

and Parker (2005) introduce a well known model of optimal expectations. In their model individuals

trade off having (distorted) optimistic beliefs today against possibly taking incorrect actions in the

future based on those incorrect beliefs. Of course, in our environment there are no actions to

take, so individuals should be indifferent between all structures. We refer to their functional form

as BP.

Prediction 8 Suppose preferences are represented by a BP functional form. Then the decision-maker

should be indifferent between all information structures.

A second important class of non-recursive preferences are those of Koszegi and Rabin (2009).

We refer to their functional form as KR. These preferences, although flexible enough to capture

preferences for early versus late resolution of information, and a preference for clumping, have

strong predictions regarding preferences for skewness. This is because their functional form imbeds

strong symmetry assumptions regarding the payoffs over beliefs.

Similarly, the non-recursive preferences used in Ely, Frankel, and Kamenica (2013), referred to

here as EFK, also imbed strong symmetry assumptions. Thus, despite the fact that they are meant

to capture a preference for gradual resolution of information, the opposite intuition of Koszegi and

Rabin (2009), we obtain the same prediction.

Prediction 9 Suppose preferences are represented by a KR or EFK functional form. Then (x, y) ∼_{.5} (y, x).

4 Experimental Design and Procedures

This section provides a brief primer on the experimental design (for a more detailed description

see Appendix C). Subjects received a raffle ticket upon entering a room which gave them a 50%

chance of winning $10 in addition to their show-up fee and a 50% chance of winning no

additional money. They were told that the winning ticket numbers would be announced at the

end of the 60 minute study and that they could choose the kind and amount of information they

would like to receive in the middle of the study by making choices among information options. The

particular setting we chose focuses on information regarding an outcome that the subjects have

no control or agency over, thus ruling out information preferences linked to ego-utility, confidence,

etc. (see Hoffman, 2011; Eil and Rao, 2011; Mobius et al., 2011 for examples where information of

interest is about the action or characteristics of the subject, and Eliaz and Schotter, 2010 for the

case where confidence matters due to agency). Subjects were also told that they would receive a

clue generated by the information option they chose at the end of Part 1, and that this information

would not change whether they actually won the money or not, nor would it help them elsewhere

in the experiment, nor have any impact on their earnings. Finally, the subjects were informed that

they would sit with this information until the outcome of the lottery was announced at the end of the

experiment while answering hypothetical questions in Part 2 of the study. They were told that Part

2 of the study would not affect their payments or the lottery outcome in any way. As such, the

second part of the study served mainly as a filler task as they waited to learn the lottery outcome.

This design choice introduced a considerable delay between the time of information acquisition and

the time of uncertainty resolution, while keeping the subjects in a controlled environment to rule

out any potential instrumental use of the information acquired. In other words, subjects could

not engage in any actions, such as purchasing goods, based on the information about their future

earnings.

In the experiment we specifically test preferences between pairs of information structures.15 We

also focus on a particular prior, where f = .5. The specific pairs we focus on are:

• (1,1) and (.5, .5), which we denote as full early resolution and full late resolution respectively.

In the former case, the decision-maker learns everything after observing the signal, and in the

latter nothing.

• (.5, 1) and (1, .5), which we denote as extremely positively skewed information and extremely

negatively skewed information respectively. In the former case, the decision-maker knows for

sure that the low outcome will happen whenever they observe a bad signal, but does not know

for sure if the high outcome will happen after observing a good signal. In the latter case the

decision-maker knows for sure that the high outcome will happen whenever they observe a

good signal, but does not know for sure if the low outcome will happen after observing a bad

signal.

• (.3, .9) and (.9, .3), which we denote as slightly positively skewed information and slightly

negatively skewed information. In the former case, given equal priors, the decision-maker

is much more sure that the low outcome will happen whenever they observe a bad signal

compared to their belief that the high outcome will happen if they observe a good signal.

15 Because our domain appears similar to a standard consumption domain, it would be possible to give consumers a “budget” constraint and have them choose their favorite signal within the budget constraint. We do not do so because we believe this would make it harder for the subjects to understand the posterior distribution induced by any given signal and the probabilities with which a given signal is realized. Given pairwise choices we can ensure that the subjects understand, as well as possible, these attributes.

In the latter case, given equal priors, the decision-maker is much more sure that the high

outcome will happen whenever they observe a good signal compared to their belief that the

low outcome will happen if they observe a bad signal.

• (.6, .9) and (.9, .6), which we denote as moderately positively skewed information and moderately

negatively skewed information. The intuition is as for (.3, .9) and (.9, .3), respectively.

• (.5, .5) and (.55, .55), which are both symmetric signal structures with no skewness. The

former option does not provide any information in Period 1 and the latter option conveys

only a little information in Period 1. Therefore we denote them as full late resolution and

a little early resolution, respectively.

• (.76, .76) and (.3, .9), which are a symmetric signal structure with no skewness and a positively

skewed information structure, respectively. Note that (.76, .76) is just barely Blackwell more

informative than (.3, .9).

• (.67, .67) and (.1, .95), which are a symmetric signal structure with no skewness and a

positively skewed information structure, respectively. Note that (.67, .67) is just barely

Blackwell more informative than (.1, .95).

• (.66, .66) and (.5, 1), which are a symmetric signal structure with no skewness and a positively

skewed information structure, respectively. Note that (.66, .66) is just barely Blackwell less

informative than (.5,1).

• (.55, .55) and (.3, .9), which are a symmetric signal structure with no skewness and a positively

skewed information structure, respectively. Note that (.55, .55) is just barely Blackwell less

informative than (.3, .9).
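
The skewness labels in the list above can be checked numerically with a minimal sketch that computes the distribution of posteriors induced by a structure (p, q) and its third central moment. The parametrization used here, p = P(good signal | high outcome) and q = P(bad signal | low outcome), is an assumption: it is not restated in this section, and it is chosen because it makes structures with p < q have a positive third moment, in line with the labels. The function names are hypothetical.

```python
def posterior_distribution(p, q, prior=0.5):
    """Return [(signal probability, posterior of the high outcome), ...].

    Assumption (not restated in this section): p = P(good signal | high
    outcome) and q = P(bad signal | low outcome).
    """
    pr_good = prior * p + (1 - prior) * (1 - q)
    pr_bad = 1 - pr_good
    dist = []
    if pr_good > 0:
        dist.append((pr_good, prior * p / pr_good))
    if pr_bad > 0:
        dist.append((pr_bad, prior * (1 - p) / pr_bad))
    return dist


def central_moment(dist, k):
    """k-th central moment of a discrete distribution of posteriors."""
    mean = sum(w * b for w, b in dist)
    return sum(w * (b - mean) ** k for w, b in dist)


# Pairs from the list above: (labelled positively skewed, labelled negatively skewed).
pairs = [((0.5, 1.0), (1.0, 0.5)),
         ((0.3, 0.9), (0.9, 0.3)),
         ((0.6, 0.9), (0.9, 0.6))]

for pos, neg in pairs:
    m3_pos = central_moment(posterior_distribution(*pos), 3)
    m3_neg = central_moment(posterior_distribution(*neg), 3)
    print(pos, neg, m3_pos > 0, m3_neg < 0)
```

Under the same assumption, the symmetric structures in the list, such as (.76, .76), produce a third central moment of zero, and every structure leaves the mean posterior at the prior of 0.5.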

Figure 2 and Figure 3 show the relationship between all the information structures we consider

in our setting where the prior is 0.5. The p value is the horizontal axis in each figure, while the

q value is the vertical axis. Figure 2 demonstrates the types of skewed information structures we

are considering, along with full early and full late resolution information structure. In our choices

to assess preferences across skewed information structures, we will compare a skewed information

structure to the information structure that is its reflection across the line (p = q); e.g., we will

compare slightly negative skew to slightly positive skew. The reflection across the line (p = q) gives

us the most natural pair with the opposite skew, while keeping the Blackwell informativeness of

the signals constant.
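
One way to see why the reflection is a natural counterpart is the following short derivation. It is a sketch under an assumed parametrization, p = P(good signal | high outcome) and q = P(bad signal | low outcome), with prior 1/2; the symbols \pi_g and \pi_b (posteriors of the high outcome after a good and a bad signal) are introduced here for illustration only.

```latex
% Structure (p, q), prior 1/2:
\begin{align*}
\Pr(g) &= \tfrac{1}{2}(1 + p - q), & \pi_g &= \frac{p}{1 + p - q},\\
\Pr(b) &= \tfrac{1}{2}(1 - p + q), & \pi_b &= \frac{1 - p}{1 - p + q}.
\end{align*}
% Reflected structure (q, p), denoted with primes:
\begin{align*}
\Pr{}'(b) &= \tfrac{1}{2}(1 + p - q) = \Pr(g), & \pi'_b &= \frac{1 - q}{1 + p - q} = 1 - \pi_g,\\
\Pr{}'(g) &= \tfrac{1}{2}(1 - p + q) = \Pr(b), & \pi'_g &= \frac{q}{1 - p + q} = 1 - \pi_b.
\end{align*}
```

Under these assumptions the posterior distribution of (q, p) is the mirror image of that of (p, q) around 1/2. The two distributions therefore share the same mean and variance, while their third central moments have opposite signs.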

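[Figure: information structures plotted with p on the horizontal axis and q on the vertical axis. Labelled points: Full Early, Full Late, and the Extreme, Medium, and Slightly skewed structures, each in a Positively Skewed and a Negatively Skewed version.]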

Figure 2: Information Structures, Skewness

In contrast, Figure 3 focuses on showing the relationship in terms of Blackwell dominance

between different signals. Given our discussion above, points that are higher on the 45-degree (i.e.

p = q) line Blackwell dominate points that are lower. More interestingly, we can also use Lemma

2 to compare information structures with skew to symmetric information structures that generate

(given equal priors) a posterior distribution with a 0 third moment — information structures

along the line p = q. In our experiment we will compare a skewed information structure to a

symmetric information structure that is either just Blackwell more or just Blackwell less informative,

which helps us better distinguish between models that capture non-instrumental preferences for

information. Moreover, it allows us to assess to what extent preferences for skewness may alter

preferences for Blackwell dominance.

The lines drawn through a positively skewed structure demarcate the set of structures either

Blackwell more informative or less informative. One line passes through (.5, 1). Points are below

this line if and only if they are Blackwell dominated by (.5, 1). In contrast, two lines pass through

(.3, .9). First consider the line that is steeper (i.e. has a slope that is larger in absolute value).

Points are below this line if and only if they are Blackwell dominated by (.3, .9). Next consider the

line that is flatter. Points are above this line if and only if they Blackwell dominate (.3, .9).

In a similar manner, the line drawn through (.1, .95) demarcates the set of points that Blackwell

dominate (.1, .95).
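
The Blackwell comparisons used in the experiment can be verified with a minimal sketch based on the standard garbling characterization: structure b is dominated by structure a if b's signal can be generated by randomly relabelling a's signal. The parametrization p = P(good signal | high outcome), q = P(bad signal | low outcome) and the function name are assumptions made for this illustration; the sketch is not a restatement of the lemma referenced above.

```python
def blackwell_dominates(a, b, tol=1e-9):
    """True if structure a = (p_a, q_a) Blackwell dominates b = (p_b, q_b).

    Looks for garbling probabilities
        m_g = P(report good | a's signal is good)
        m_b = P(report good | a's signal is bad)
    in [0, 1] that reproduce b's signal probabilities in both states:
        p_b     = p_a * m_g + (1 - p_a) * m_b        (high outcome)
        1 - q_b = (1 - q_a) * m_g + q_a * m_b        (low outcome)
    """
    (pa, qa), (pb, qb) = a, b
    det = pa * qa - (1 - pa) * (1 - qa)  # equals pa + qa - 1
    if abs(det) < tol:
        # a is uninformative, so it dominates only uninformative structures.
        return abs(pb + qb - 1) < tol
    # Solve the 2x2 linear system by Cramer's rule.
    m_g = (pb * qa - (1 - pa) * (1 - qb)) / det
    m_b = (pa * (1 - qb) - (1 - qa) * pb) / det
    return all(-tol <= m <= 1 + tol for m in (m_g, m_b))


# Comparisons described in the experimental design.
print(blackwell_dominates((0.76, 0.76), (0.3, 0.9)))
print(blackwell_dominates((0.67, 0.67), (0.1, 0.95)))
print(blackwell_dominates((0.5, 1.0), (0.66, 0.66)))
print(blackwell_dominates((0.3, 0.9), (0.55, 0.55)))
```

Because the garbling may also swap the two signal labels, the check does not depend on which signal is called good.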

Figure 3: Information Structures, Blackwell Ordering

There were two conditions. In each condition subjects faced a series of five pair-wise choices

between information structures. They were told that one of the questions would be chosen, at ran-

dom, with equal probability assigned to all questions, after they had made all pairwise choices.16, 17

In order to verify the robustness of observed preferences, Conditions 1 and 2 varied the order of

the options presented in Q1, Q2, Q5b, and counterbalanced the order in which Q3 and Q5a were

presented, and also asked different Q4a and Q4b.18

Table 1 details the order of questions and options presented in Condition 1 and Condition 2.

Q1 elicited preferences regarding full early resolution of uncertainty (indicated by (1,1)) versus

full late resolution of uncertainty (indicated by (.5, .5)). Q2 elicited preferences between the

extreme positively and negatively skewed signal structures, while Q3 and Q5 presented the slightly

and moderately skewed signal structures. Note that half the time, Q5 instead presented a choice

between full late resolution and another symmetric signal structure that is more informative,

(.55, .55). Finally, Q4 tested for a trade-off between Blackwell informativeness

and skewness. If subjects preferred late resolution, in both conditions they were asked to choose

between a skewed signal and a symmetric signal that was Blackwell less informative than that skewed

16 One may be concerned that we did not elicit willingness to pay for information structures. In order to verify that individuals were in fact willing to pay for their information structures we ran a separate experiment. We found that individuals are willing to pay for their favored information structure; we discuss these results as a robustness check in the next section, and the design of the experiment in Appendix D.

17 Implementation using a random-lottery incentive system is quite common in the literature. However, it has been criticized (Holt, 1986). If subjects treat each choice in isolation, then this incentive system introduces no distortions. Experimental evidence on this, although mixed, has been generally supportive; Starmer and Sugden (1991) and Cubitt et al. (1998) are supportive, while Harrison et al. (2013) find distortions. Wakker (2007) provides a useful summary of these issues.

18 We did not test richer question order randomization because the video instructions explaining each question built on each other and starting with Q1-Q2 made most sense because they were the simplest to explain.

signal. Similarly, if they preferred full early resolution, they were asked to choose between a skewed

signal and a symmetric signal that was Blackwell more informative. Therefore choices presented in

Q4 were aimed at testing whether the preferences of subjects who exhibit a preference for Blackwell

informativeness over symmetric signal structures also respect that same Blackwell ordering when

comparing positively-skewed structures to symmetric structures.

Figure 4: Q2, Condition 1

Each of the information structures was presented as an option from which the computer would

draw a ball according to whether the subject won or lost the lottery. The subjects could

not see which box the computer was drawing a ball from, but could observe the color of the ball.

Figure 4 depicts the options presented in Q2 of Condition 1. Subjects watched an instructional

video before each question that presented the two options, explained the percentage of the instances

a red or a black ball would be drawn from each option, and displayed the posterior probability of

winning or losing associated with observing a red or a black ball from each option. After this

instructional video, subjects completed comprehension questions that checked their understanding

before proceeding to making their choices. The information presented by the video was repeated

on the page that described each option and asked for their choice.

After the subjects made choices in all five questions, the computer randomly picked one question

among the five to be carried out. The subjects saw the chosen question and their choice in that

question was repeated on their screen. Then, the computer displayed the color of the ball drawn

from their preferred option and the posterior probability of the subject having won the lottery based

on the color of the ball, and asked the subject to enter this probability to confirm that they were

paying attention. After their information choice in one randomly chosen question was carried out

in this manner, subjects were asked to answer several qualitative questions regarding their choice.

            Condition 1                Condition 2
      Option 1      Option 2     Option 1      Option 2     conditional on
Q1    (1, 1)        (.5, .5)     (.5, .5)      (1, 1)       -
Q2    (1, .5)       (.5, 1)      (.5, 1)       (1, .5)      -
Q3    (.9, .3)      (.3, .9)     (.6, .9)      (.9, .6)     -
Q4    (.76, .76)    (.3, .9)     (.67, .67)    (.1, .95)    if (1, 1) ≻ (.5, .5)
      (.55, .55)    (.3, .9)     (.66, .66)    (.5, 1)      if (1, 1) ≺ (.5, .5)
Q5    (.9, .6)      (.6, .9)     (.9, .3)      (.3, .9)     random
      (.55, .55)    (.5, .5)     (.5, .5)      (.55, .55)   random

Table 1: The order of questions and options in Condition 1 and 2

In the remaining time before the outcome of the lottery was to be revealed, subjects were also

asked a series of hypothetical questions across 5 blocks in Part 2 of the study.19 Each block featured

10 questions, asking whether individuals preferred to take Option A or Option B. In blocks 1-3,

Option B was receiving some amount of money for sure, beginning with $2 and increasing in $2

increments to $20. In block 1, Option A was a gamble that was structured as follows: “a

ball will be drawn from a box with 50 white and 50 blue balls. If a blue ball is drawn you receive

$30, otherwise nothing.” In block 2, Option A was a gamble that was structured as follows: “a

ball will be drawn from a box with white and blue balls (the respective numbers were not specified).

If a blue ball is drawn you receive $30, otherwise nothing.” In block 3,

Option A was a gamble that was structured as follows: “a ticket will be drawn from an urn that

features 101 tickets labeled from 0 to 100. The number on the ticket determines how many blue

balls will be in a box of 100 blue and white balls. Next, a ball will be drawn from the box. If a blue

ball is drawn you receive $30, otherwise nothing.” In block 4, Option A allowed the individual to

receive $30 for sure. Option B was a gamble that paid $x with an 80% chance and $0 with a 20% chance, where x varied from

$34 to $74 in $4 increments. In block 5, Option A was a gamble that gave the individual a

25% chance of $30 and a 75% chance of $0. Option B was a gamble that paid $x with a 20% chance

and $0 with an 80% chance, where x again varied from $34 to $74 in $4 increments.
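
The section does not spell out how the Part 2 blocks are analyzed. As described, blocks 4 and 5 have the structure of a common-ratio comparison: the winning probabilities in block 5 are those of block 4 multiplied by 0.25. The sketch below is an illustration and not the authors' procedure. It shows that an expected-utility maximizer makes the same choice in both blocks at every x, so a subject who switches at different values of x in the two blocks departs from expected utility. The utility function `u` and the function names are hypothetical.

```python
import math


def prefers_gamble_block4(x, u):
    """Block 4: $30 for sure versus an 80% chance of $x (else $0)."""
    return 0.80 * u(x) + 0.20 * u(0) > u(30)


def prefers_gamble_block5(x, u):
    """Block 5: 25% chance of $30 versus a 20% chance of $x (else $0)."""
    return 0.20 * u(x) + 0.80 * u(0) > 0.25 * u(30) + 0.75 * u(0)


u = math.sqrt  # hypothetical Bernoulli utility with u(0) = 0

# x values as described in the text: $34 to $74 in $4 increments.
for x in range(34, 75, 4):
    print(x, prefers_gamble_block4(x, u), prefers_gamble_block5(x, u))
```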

Our design allows us to directly relate the theoretical predictions to the questions. Prediction

1 can be tested by all questions, as it says that the decision-maker should always be indifferent.

Prediction 2 is specifically tested in Q1. Prediction 3 is tested by looking at preferences for earlier

19 These questions were hypothetical in order to ensure that subjects could not use the information to adjust their responses to questions that would result in actual monetary rewards.

or later resolution, which are elicited in Q1, Q4 and the second possible Q5 question. Prediction 4

is tested by looking at preferences for skewness, and so is tested by looking at Q2, Q3 and the first

possible Q5 question. Prediction 5 is tested by looking at both preferences for skewness, Q2, Q3

and the first possible Q5 question, as well as preferences for avoiding information entirely, which

is tested in Q1, Q4 and the second possible Q5 question. Prediction 6 is tested both by looking at

preferences for extreme skewness and whether preferences are monotone in the Blackwell ordering,

so Q1, Q2 and Q4.20 Prediction 7 is tested by looking for preferences for skewness Q2, Q3 and the

first question of Q5, and a preference for full late resolution locally, so the second question of Q5.

Like Prediction 1, Prediction 8 is tested by all questions. Predictions 9 and 10 specifically rely on

Q2, Q3, and the first question of Q5. The following Table highlights these relationships.

5 Data and Results

Table 2 summarizes choices across the information structures tested by Q1-Q5. The first set of

results describes preferences over information structures that are symmetric, but vary in terms of

Blackwell informativeness. This first set of results indicates that individuals generally prefer full

early resolution relative to full late resolution. Moreover, individuals prefer even learning a little

bit earlier rather than full late resolution in just as large a proportion. Therefore, the results do not

support a general preference for one-shot resolution of uncertainty.

The second set of results relates to preferences for negatively versus positively skewed infor-

mation structures. We observe that most individuals prefer the positively skewed information

structure relative to the negatively skewed information structure. Moreover, the proportion of indi-

viduals who prefer the positively skewed structure does not seem to vary with the type of structure

(i.e. slightly, moderately, or extremely skewed). In addition, the preference for positively skewed

information seems as prevalent in the population as the preference for early resolution.

The third set of results, concerning choices between symmetric and skewed information options,

requires more interpretation due to the conditional nature of the experimental design. Recall

that individuals only compared (.76, .76) to (.3, .9) if they previously indicated they preferred full

early resolution of information to full late resolution (this was the same for comparing (.1, .95) to

(.67, .67)). Individuals compared (.3, .9) to (.55, .55) if they made the opposite choice regarding

the timing of full resolution of information (similarly for comparing (.66, .66) to (.5, 1)). Thus, we

can interpret these choices as asking, conditional on individuals seeming to exhibit a preference

regarding Blackwell informativeness over symmetric signal structures, do we observe their pref-

erences respecting that same Blackwell ordering when comparing positively-skewed structures to

symmetric structures. Because most individuals prefer positively-skewed structures, and because

20 We do not directly compare skewed information to full early resolution, but rely on the fact we find monotonicity in the Blackwell ordering elsewhere.

Early vs. Late
    (1, 1) ≻ (.5, .5)          77%***   (196/250)
    (.55, .55) ≻ (.5, .5)      78%***   (91/121)
Pos. Skewed vs. Neg. Skewed
    (.5, 1) ≻ (1, .5)          75%***   (193/250)
    (.6, .9) ≻ (.9, .6)        72%***   (144/196)
    (.3, .9) ≻ (.9, .3)        84%***   (149/183)
Symmetric vs. Skewed
    (.76, .76) ≻ (.3, .9)      71%***   (65/92)
    (.3, .9) ≻ (.55, .55)      67%*     (18/27)
    (.67, .67) ≻ (.1, .95)     64%***   (67/104)
    (.66, .66) ≻ (.5, 1)       56%      (15/27)

≻ represents “chosen over”. ***, **, and * imply that the proportion is significantly different from 0.5 at the 0.01, 0.05, and 0.1 level, respectively.

Table 2: Percentage of choices

the comparisons are to symmetric structures that are just barely more or less Blackwell informative,

these questions also test whether preferences for skewness can dominate preferences for Blackwell

informativeness.
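
The text does not state which test underlies the significance markers in Table 2. The following minimal sketch assumes a two-sided normal-approximation test of a proportion against 0.5, which is one common choice; both the test and the function name are assumptions made for illustration.

```python
import math


def two_sided_p_value(successes, n, null=0.5):
    """Two-sided p-value for H0: proportion == null (normal approximation)."""
    z = (successes - n * null) / math.sqrt(n * null * (1 - null))
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


# Counts taken from Table 2.
for label, k, n in [("(1,1) over (.5,.5)", 196, 250),
                    ("(.3,.9) over (.55,.55)", 18, 27),
                    ("(.66,.66) over (.5,1)", 15, 27)]:
    print(label, round(two_sided_p_value(k, n), 4))
```

Under this assumed test, 196 of 250 is significant at the 0.01 level, 18 of 27 falls between the 0.05 and 0.1 thresholds, and 15 of 27 is not significant at the 0.1 level. An exact binomial test could give different answers for the small cells.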

From the comparisons of (.76, .76) to (.3, .9) and (.1, .95) to (.67, .67) we see that most (around

two thirds) of the individuals that exhibit a preference for early resolution over symmetric structures

also prefer Blackwell more informative signals to skewed signals. Looking at the choice between

(.3, .9) and (.55, .55) we observe that even though all these individuals preferred full late resolution to full

early resolution, most individuals still preferred positively-skewed information that was Blackwell

more informative to a less informative symmetric signal structure. Looking at the choice between

(.66, .66) and (.5, 1) we see that almost equal proportions of individuals choose either option. Thus,

the preferences of individuals who preferred full late to full early resolution of uncertainty when

comparing symmetric information structures do not respect the same ordering induced by

Blackwell dominance when comparing a positively skewed structure to a symmetric structure.

In addition to choosing an information structure, subjects were also asked by how much they

preferred their chosen option over the unchosen option. Although these questions were not incen-

tivized, we believe they give a sense of the relative strength of preference within individuals across

the chosen options. We test whether reversals of preference regarding Blackwell ordering among the

individuals who preferred full late to full early resolution of uncertainty are more likely among those

with weak preferences for late resolution. Indeed, we find that the weaker individuals’ preference for

late resolution, the more likely that they prefer positively-skewed information to a less informative

symmetric signal (p-value = .009, logistic regression of conditional Q4 choice on preference strength

in Q1). Those who always prefer less information have rated their preference for full late resolution

to be on average 8.8 out of a 10 point scale, whereas those who prefer the more informative skewed

signal have rated their preference for full late resolution to be 6.5 on average. Therefore, it seems

that at least some of the individuals who do not seem to have a consistent preference regarding

Blackwell informativeness of signals may have weaker preferences for late resolution of uncertainty

to begin with.21

                                  Chosen Option
                                  First     Second
Early vs. Late
    (1, 1) vs (.5, .5)            9.23      7.37
    (.55, .55) vs (.5, .5)        7.08      5.67
Pos. Skewed vs. Neg. Skewed
    (.5, 1) vs (1, .5)            8.24      7.19
    (.6, .9) vs (.9, .6)          7.13      6.48
    (.3, .9) vs (.9, .3)          7.54      6.76
Symmetric vs. Skewed
    (.76, .76) vs (.3, .9)        7.42      7.93
    (.3, .9) vs (.55, .55)        7.11      8.44
    (.67, .67) vs (.1, .95)       7.67      7.24
    (.66, .66) vs (.5, 1)         7.87      6.58

Table 3: Intensity of Choices

Table 3 summarizes the preference strength data conditional on a particular chosen option. This

data supports the previous results — the option that was chosen by a majority of subjects also

was more strongly preferred by the subjects that chose it, compared to the strength of preference

reported by individuals who preferred the less-chosen option. The one question where this is not

true is the choice between (.3, .9) and (.55, .55), where most individuals preferred (.3, .9) but

individuals who chose (.55, .55) exhibit a stronger preference for their chosen option.

Even though the existing theoretical models do not provide any clues regarding the relationship

between preferences for early versus late resolution of uncertainty and preferences for skewness

given a prior, the within-person nature of our experiment’s design allows us to investigate whether

21 Differences in strength of preference do not explain whether individuals that exhibit a preference for early resolution over symmetric structures also prefer Blackwell more informative signals to skewed signals, most probably because only a minority of subjects fail to do so.

the preference for the positively skewed information option is stronger among subjects who prefer late

resolution of uncertainty compared to subjects who prefer an early resolution.

                   Extreme                   Medium                    Slight
               Pos.     Neg.             Pos.     Neg.             Pos.     Neg.
               (.5,1)   (1,.5)   Total   (.6,.9)  (.9,.6)  Total   (.3,.9)  (.9,.3)  Total
Early (1,1)    123      73       196     113      41       154     117      28       145
Late (.5,.5)    44      10        54      31      11        42      32       6        38
Total          167      83       250     144      52       196     149      34       183

Table 4: Early or Late vs Skewed

Table 4 cross-tabulates these within-person choice patterns. For example, out of the 196 subjects

who prefer early resolution of uncertainty, 123 choose the extreme positively skewed information

option and the remaining 73 choose the extreme negatively skewed information option. Similarly,

out of the 154 (145) subjects who prefer early resolution of uncertainty who also were asked to

indicate a choice between information options with medium (slight) skew, 113 (117) choose the

positively skewed information option. We see that subjects who have a preference for early res-

olution of uncertainty are relatively less likely to choose the extremely positively skewed signal,

compared to those who prefer late resolution (p-value = .012, logistic regression of Q1 choice on

Q3 choice). However, such a relationship does not exist between medium or slight positive skewness

and late resolution preferences. Therefore, the evidence is intriguing, but inconclusive.
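As a rough check on the extreme-skew comparison in Table 4, the sketch below recomputes the share choosing the positively skewed option in each group, together with a Wald test on the log odds ratio, which is what a logistic regression with a single binary regressor reduces to. The function names are illustrative, and the test is an assumed stand-in for the regression reported above rather than the authors' actual code.

```python
import math

# Extreme-skew panel of Table 4: counts of subjects choosing the positively
# (.5,1) or negatively (1,.5) skewed option, split by early/late preference.
EXTREME = {"early": {"pos": 123, "neg": 73}, "late": {"pos": 44, "neg": 10}}


def share_positive(cell):
    """Fraction of a group choosing the positively skewed option."""
    return cell["pos"] / (cell["pos"] + cell["neg"])


def wald_log_odds_ratio(table):
    """Wald z-statistic and two-sided p-value for the log odds ratio.

    With one binary regressor this corresponds to the slope test of a
    logistic regression (an assumption; the paper does not spell out
    its exact procedure).
    """
    a, b = table["early"]["pos"], table["early"]["neg"]
    c, d = table["late"]["pos"], table["late"]["neg"]
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


early_share = share_positive(EXTREME["early"])
late_share = share_positive(EXTREME["late"])
z_stat, p_val = wald_log_odds_ratio(EXTREME)
print(f"early: {early_share:.3f}, late: {late_share:.3f}, p = {p_val:.3f}")
```

Under these assumptions the shares are roughly 63% for subjects who prefer early resolution and 81% for those who prefer late resolution, in line with the direction described above.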

Table 5 cross-tabulates within-person choice patterns in the questions that present positively

and negatively skewed information structures. Overall, we see that most individuals are consistent

in their preferences. In particular, those who prefer one positively skewed signal are very likely

to prefer another positively skewed signal. Overall, 73% of the individuals who made choices

among {(1,.5), (.5,1)} and {(.9,.3), (.3,.9)}, 65% of the individuals who made choices among {(1,.5),

(.5,1)} and {(.9,.6), (.6,.9)}, and 76% of the individuals who made choices among {(.9,.3), (.3,.9)} and {(.9,.6), (.6,.9)} display the same ordering when comparing negatively and positively skewed

structures. We also find some evidence that people who have weak preferences regarding their

choice in one of the questions are more likely to switch their preference ordering in another question.

However, the consistency across questions is considerably high, especially given the complexity of

the experiment.

One plausible concern regarding the design of the experiment is that we did not directly elicit

a willingness to pay for individuals’ preferred information structure. We did this to avoid further

complicating an already complex elicitation procedure. A second plausible concern is that in having

individuals make several pairwise choices, we elicited preferences different from what they would express if they were making a single pairwise choice (despite the fact that only one of the pairwise choices would be implemented).

                   Extreme                             Medium                              Medium
               Pos.     Neg.                       Pos.     Neg.                       Pos.     Neg.
               (.5,1)   (1,.5)   Total             (.6,.9)  (.9,.6)  Total             (.6,.9)  (.9,.6)  Total
Pos. (.3,.9)   107      42       149     (.3,.9)   85       22       107     (.5,1)    101      26       127
Neg. (.9,.3)    17      17        34     (.9,.3)    9       13        22     (1,.5)     43      26        69
Total          124      59       183               94       35       129               144      52       196

Table 5: Relationships: Skewness Preferences

In order to ensure our results are robust to these concerns, we ran an additional experiment.

The experiment was very similar: two outcomes, with a 50% prior on the high outcome. However,

individuals made only a single pairwise choice between two information structures. Once subjects

made a choice, and before they saw the signal, they were asked to indicate the amount they would

be willing to accept in order to change their choice. This amount measures the utility difference

between the two informational options. Their choices were then carried out using a DGM procedure, and they saw a signal from the information structure that their willingness-to-accept answers indicated.

For more details of the implementation, please see Appendix D.


(.5, .1) ≻ (1, .5)     64%      (35/55)
(.7, .9) ≻ (.9, .7)    85%∗∗∗   (57/67)

≻ represents “chosen over”. ∗∗∗ implies the proportion is significantly different from 0.5 at the 0.01 level.

Table 6: Percentage of choices: Single pairwise choice

Thus we find results that are qualitatively similar to the results in the main study, leading us to believe that our findings are not significantly affected by concerns regarding a lack of payment or the multiple pairwise choices. Moreover, individuals exhibited a strict willingness to pay: fewer than 10% of individuals who chose the positively skewed signal required 0 cents to change their signal. Thus almost all individuals who exhibited a preference for positive skew also demonstrated a strictly positive willingness to accept in order to change to the negatively skewed signal.
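The significance markers in Table 6 can be checked against the raw counts. The sketch below uses an exact two-sided binomial test against a null proportion of 0.5; the choice of test and the function name `binomial_two_sided_p` are assumptions made for illustration, since the text does not state which test was used.

```python
from math import comb


def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed count under the null success probability p0.
    """
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# Counts taken from Table 6: (number choosing the first option, sample size).
TABLE_6 = {"(.5, .1) over (1, .5)": (35, 55), "(.7, .9) over (.9, .7)": (57, 67)}

p_values = {label: binomial_two_sided_p(k, n) for label, (k, n) in TABLE_6.items()}
for label, p in p_values.items():
    k, n = TABLE_6[label]
    print(f"{label}: {k / n:.0%} chosen, exact two-sided p = {p:.4f}")
```

Under this test only the second comparison clears the 0.01 threshold, which matches the placement of the significance stars in the table.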


6 Discussion

6.1 Our Results

Our results are useful in three ways. First, they allow us to demonstrate that individuals exhibit

consistent patterns of preference over skewed information structures. We find a strong preference

for positively skewed information. People choose right skewed information structures over left

skewed information structures and indicate that this is a relatively strong preference. Moreover,

this preference is robust, and exhibited through the comparison of several types of positively skewed

information structures to negatively skewed structures.

There are two concerns we want to address regarding the interpretation of the data. First, we

might be concerned that individuals are so used to having information be instrumentally valuable

that they simply apply the instrumental heuristic in non-instrumental settings. This would be a

good reason for being concerned that individuals choose more informative over less informative

signals. However, there are many simple instrumental settings where choosing the left-skewed

information structure gives a higher expected payoff than choosing the right skewed structure, and

so it is hard to conceive that this preference is purely heuristic in nature. More generally, for any

convex Kreps-Porteus functional, there exists an action set such that the functional can represent

the preferences of a standard EU decision-maker who receives information and then must take a single

action out of that set.

Second, as we have observed before, positively skewed structures, compared to negatively skewed

structures, always generate a higher posterior probability of winning conditional on observing either

signal. Thus we might be concerned that people are only focusing on these posterior probabilities. The

real concern is that this is not a true preference but simply a boundedly rational way of evaluating

the signals. However, our data shows that individuals do not simply want to maximize the posterior

probability of winning, conditional on observing a signal. We find that individuals prefer Blackwell

more informative symmetric signals to right skewed signals. The former have the same posterior

probability of winning conditional on a red ball, but a lower posterior probability of winning

conditional on a black ball.
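To make the posterior comparison concrete, here is a small sketch that applies Bayes' rule to a binary prior and an information structure (p, q). It assumes p is the probability of the good signal when the outcome is high and q is the probability of the bad signal when the outcome is low, which is the reading suggested by the posterior expression in the appendix; the function name `posteriors` is illustrative.

```python
def posteriors(f, p, q):
    """Posterior probability of the high outcome after each signal.

    f: prior probability of the high outcome.
    p: P(good signal | high outcome)   (assumed convention).
    q: P(bad signal  | low outcome)    (assumed convention).
    Returns {signal: (probability of signal, posterior of high outcome)}.
    """
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = 1 - prob_good
    post_good = f * p / prob_good if prob_good > 0 else None
    post_bad = f * (1 - p) / prob_bad if prob_bad > 0 else None
    return {"good": (prob_good, post_good), "bad": (prob_bad, post_bad)}


prior = 0.5
positive_skew = posteriors(prior, 0.5, 1.0)  # the (.5, 1) structure
negative_skew = posteriors(prior, 1.0, 0.5)  # the (1, .5) structure

for name, result in [("(.5, 1)", positive_skew), ("(1, .5)", negative_skew)]:
    for signal, (prob, post) in result.items():
        print(f"{name} {signal}: P(signal) = {prob:.2f}, posterior = {post:.3f}")
```

With a 50% prior, the (.5, 1) structure yields posteriors of 1 and 1/3, against 2/3 and 0 for (1, .5), so the positively skewed structure gives the higher posterior after either signal.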

Second, our results allow us to understand the relationship between preferences for skewed

information and related preferences. Consistent with previous studies, we find a preference for full

early resolution of information relative to late resolution of information. Moreover, we find that

individuals prefer Blackwell more informative signals to those that exhibit positive skew but are less

Blackwell informative. We find no evidence for preference for one-shot resolution of information.

Our results regarding skewness are consistent with a positive third derivative in the utility function

defined over compound lotteries (i.e. information structures); moreover, our results regarding

preference for early resolution are consistent with a convex utility function defined over compound lotteries.

Third, we can use our findings to test the various assumptions and models discussed in Section 3.

Because individuals exhibit preferences over different information structures with the same prior,

they violate the reduction of compound lottery assumption, as discussed in Prediction 1. Such

behavior is also inconsistent with the predictions of Prediction 8. Moreover, looking at Prediction

2, we find that individuals do not satisfy time neutrality — individuals, both at an aggregate and

individual level, are not indifferent between (1,1) and (0,0).

Turning to the predictions of specific models, Hypotheses 9 and ?? tell us that if preferences

are in either Koszegi and Rabin’s or Ely, Frankel and Kamenica’s class, then for f = .5 individuals

should be indifferent between (p, q) and (q, p). In fact, at both aggregate and individual levels we

do not find such an indifference — people prefer the positively skewed structure.

A variety of models have the possibility of predicting preferences for skewed information, even

when f = .5. These include Gul’s model of Disappointment Aversion and Rank Dependent Utility.

However, Prediction 6 indicates that those preferences should also prefer (.5, .5) to (p+ ε, q+ ε) for

a ε close enough to .5. We find no evidence for this type of preference for clumping.22

Our data fails to be consistent with the approach of Dillenberger and Segal (2015) for much the

same reason. Prediction 7 requires that individuals prefer (.5, .5) to any other more informative

symmetric signal. We find that this is not the case, as most individuals prefer (.55, .55) to (.5, .5).

Our subjects also fail to exhibit either a strong type I error preference or a strong type II error

preference, and so fall outside of the scope of Eliaz and Spiegler (2006). This is because we find that

individuals’ preferences seem to respect Blackwell’s ordering, and so no less than fully informative signal structure would be preferred to (1, 1).

In contrast, our data is generally consistent with the traditional model of Kreps and Porteus

(1978). Considering Prediction 3, we find that individuals behave in a way that is consistent with

the local utility function being convex in the model of Kreps and Porteus (1978). Moreover, given

Prediction 4 we know that if the derivative of the local utility function is also convex, then we can

rationalize the preference for positive skew.

In order to provide more context for these results, consider the Epstein-Zin parameterization of the Kreps-Porteus model. Then $V_1(l) = \sum_{x \in l} u_1(x)\, l(x) = \sum_{x \in l} x^{\rho}\, l(x)$ while $V_2(l) = \sum_{x \in l} u_2(x)\, l(x) = \sum_{x \in l} x^{\alpha}\, l(x)$. In this case the local utility function is convex if and only if $u_1(u_2^{-1}(x))$ is convex, or $x^{\rho/\alpha}$ is convex, which is the same as ρ ≥ α. Similarly, the derivative of the local utility function is convex if and only if the derivative of $u_1(u_2^{-1}(x))$ is convex. Given ρ ≥ α this is equivalent to ρ ≥ 2α. Thus, individuals must have a strong preference for early resolution.

22 One potential objection is that perhaps we did not set ε close enough to 0. In fact, simple calibrations show the power of our test, where ε = .05. Suppose the preferences represented by V1 and V2 both have disappointment aversion representations (ui, βi). Moreover, suppose, as is plausible for our small stakes, that ui is linear. In this case, if an individual prefers (.55, .55) over (.5, .5), then for any plausible value of β2 (i.e. 0 ≤ β2 ≤ 100) β1 must be less than .01, or in other words, people must be ‘almost’ expected utility over gambles that resolve now, a fact inconsistent with the risk aversion we actually observe over small stakes. Similar calibrations can be done with rank-dependent utility.

We can relate our restrictions to the larger literature estimating Kreps-Porteus and Epstein-Zin

preferences. Epstein-Zin preferences are used widely in macroeconomics and have been estimated

from a variety of data. We can ask how well our restrictions actually match the estimates obtained from an entirely different domain. In fact, recent estimates are consistent with the restrictions our observed preferences for skewness place on the data (i.e. the convexity of the first derivative of $u_1(u_2^{-1}(\cdot))$). For example, both Brown and Kim’s (2014) estimation, which relies on lab experiments testing risk preferences, intertemporal elasticity, and preference for early resolution, and Binsbergen et al.’s (2012) results, which use macroeconomic data, find that ρ ≥ 2α (much greater in fact).
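The restriction ρ ≥ 2α can be traced through a short calculation. The following is a sketch under the assumptions that α > 0 and outcomes are positive; the shorthand $k = \rho/\alpha$ and the name $h$ are introduced here purely for illustration.

```latex
% Sketch: k = \rho/\alpha and h are shorthand introduced for this illustration.
\begin{align*}
h(x) &= u_1\bigl(u_2^{-1}(x)\bigr) = x^{k}, \qquad k = \rho/\alpha, \quad x > 0,\\
h''(x) &= k(k-1)\,x^{k-2} \ge 0 \iff k \ge 1 \quad (\text{for } k > 0),\\
h'''(x) &= k(k-1)(k-2)\,x^{k-3} \ge 0 \iff k \ge 2 \text{ or } k = 1 \quad (\text{given } k \ge 1).
\end{align*}
```

So convexity of $h$ corresponds to ρ ≥ α, and convexity of $h'$ adds the requirement ρ ≥ 2α, apart from the knife-edge linear case ρ = α.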

6.2 Literature Review

Here we review the literature that relates to our results. We discuss in detail both theoretical and

empirical approaches that touch on preferences for information, looking at preferences over skewness,

preferences for early versus late resolution, and preferences for one shot versus gradual resolution

separately. We discuss the literature for compound lotteries separately; even though preferences

over compound lotteries and information structures are theoretically linked, the decision framing

is quite different across the two domains.

Preferences for skewed information: There has been little empirical study of preferences for

skewness. In the first work looking at this subject we are aware of, Boiney (1993) finds a preference

for positively skewed compound lotteries, but in the context of ambiguity, rather than objective

lotteries. Treatment 4 of Eliaz and Schotter (2010) provides the only experimental investigation

of preferences over positively versus negatively skewed information structures that we are aware

of. In particular, in a two-stage compound lottery context where one option dominates the other

in all states of the world, they test whether individuals are willing to pay to obtain information

about the degree to which the dominating option is superior, even though this information will

not affect their ultimate choice. Overall, they find that individuals are willing to pay to learn

whether the option they are about to choose is strongly or mildly dominating the option they are

rejecting. In Treatment 4, they find that individuals prefer a negatively skewed signal over one that

is positively skewed, as well as over one that is uninformative. They argue that individuals demand

non-instrumental information in this setting in order to feel more confident about choosing the

dominating option. As a result, they predict and show, that individuals would not have demand

for any non-instrumental information after the choice is made. Relatedly, they also hypothesize

that demand for non-instrumental information would not exist if participants did not have a choice

to make. Note that we study demand for non-instrumental information in a context where people


neither make choices nor have any other form of agency in determining the outcome of interest. As

a result, their results are not applicable in our setting. In addition to this fundamental difference

in the proposed process, their design also has important differences that make it hard to generalize

their results to our domain. Most notably, there isn’t a delay between the receipt of information and

the full resolution of information in their experiment. As a result, regulating anticipatory emotions

is not likely to play a role in shaping preferences over information in their data. Therefore, the

focus of the present paper and that of Eliaz and Schotter (2010) are quite different.

Preferences for early versus late resolution: The literature beginning with Kreps and

Porteus (1978) has spawned a great deal of empirical testing of preferences over timing of resolution.

This literature has broadly found support for individuals preferring earlier to later resolution.

To organize our discussion of the empirical tests of these preferences, we distinguish three

branches of the literature. The first branch is motivated by macro-economic applications of the

Kreps-Porteus model, and in particular the Epstein-Zin parameterization. These models allow for

the separation of risk preferences from inter-temporal elasticity and so can accommodate a wider

range of data than traditional models. As such, the Epstein-Zin model was widely adopted by

macroeconomists to estimate both a risk preference parameter and an intertemporal elasticity using either survey data or financial decision-making data. However, these papers do not directly test

for preferences over information. These preferences are indirectly inferred from the estimates of risk

preferences and inter-temporal elasticity. This literature offers mixed results concerning preferences

for early versus late resolution of uncertainty. For example, the data in the early investigations

provided by Epstein and Zin (1991) indicate a preference for late resolution of uncertainty. However,

more recent papers, such as Binsbergen et al. (2012), find a strong preference for early

resolution of uncertainty.

The second branch of the literature directly tests preferences over information structures, as

we do in this paper. However, most of this literature either asks hypothetical questions and/or

studies demand for information in contexts where information may be instrumentally valuable, for

example by providing planning benefits. For example, Arai (1997) explores whether individuals

have a preference for early or late resolution based on non-incentivized choices in a setting where

the information concerns future income/consumption. He finds an overall preference for early

resolution. Ahlbrecht and Weber (1997) study whether individuals’ choices are consistent with the

Kreps-Porteus model. They find that a plurality of individuals always prefer full early resolution

to full late resolution (we say plurality because they asked for subjects’ preferences for full early

versus full late multiple times). However, they find that individuals do not satisfy some ancillary

predictions of the Kreps-Porteus model when preferences are recursive and satisfy independence in

each period. Thus, they caution against interpreting their evidence as simply support for Kreps-

Porteus.


Chew and Ho (1994) test preferences for early versus late resolution using hypothetical real-

world scenarios. They find that most individuals prefer early resolution. Similarly, and again using

hypothetical choices, Lovallo and Kahneman (2000) find that subjects prefer early resolution of

information. Moreover, moving from gains to losses strengthens the preference for early resolution

of uncertainty; and, at least in the domain of gains, a negatively skewed prior (which is quite

distinct from a skewed information structure) induces a greater interest in speeding up resolution

for gains compared to positively skewed gambles. Ganguly and Tasoff (2014) find that individuals’

preference for earlier information grows with the size of the gain they are facing. Similarly, larger

losses lead to a preference for delaying information.

While most of the empirical data seems to suggest a preference for early versus late resolution of

uncertainty, Kocher, Krawczyk and Van Winden (2014) provide evidence of heterogeneity in preferences. Using lottery tickets in the lab, they show that although many participants prefer lottery tickets for an immediate drawing rather than one for the subsequent day, a substantial fraction actually prefers delayed resolution.23 Moreover, Von Gaudecker et al. (2011), using survey evidence from

a representative sample in Holland, find that the median subject is essentially indifferent between

early and late resolution. They also demonstrate a large degree of heterogeneity in preferences.

It is possible to think of the first two branches of the literature as complementary because

the Kreps-Porteus and Epstein-Zin models predict a tight connection between preferences over the

resolution of uncertainty and the relative values of the risk parameter and elasticity of intertemporal

substitution. Brown and Kim (2014) directly test this connection by eliciting risk preferences,

discount rates, elasticity of intertemporal substitution and preferences for resolution of uncertainty.

They find a majority (approximately 60 percent) of the subjects have a preference for earlier

resolution of uncertainty. Only a negligible proportion of subjects prefer late resolution.

The third vein of the literature considers the effect of delaying (or speeding up) resolution of

uncertainty on actions of players that are indicative of their information preferences. In a real-stakes investment task, van Winden, Krawczyk and Hopfensitz (2011) find that subjects invest more

in a risky investment if resolution is sooner. Erev and Haruvy (2010) find that subjects value a

delayed chance at winning a prize more than an immediate chance.

Although the literature has found general support for early resolution of uncertainty, there is

also substantial heterogeneity within the subject population of a given study, and across studies.

One reason for the heterogeneity across studies may be differences in framing.

Preferences for one-shot versus gradual resolution: Individuals may prefer to learn all

information at once (one-shot resolution) or to learn the information gradually, in increments over

time. This preference is distinct from the preference for late versus early resolution of uncertainty,

as it concerns lumping versus spreading out information.

23Note, however, that in their context there is some planning benefit to receiving the lottery early.


Using incentivized choices, although allowing for the possibility of instrumental value of in-

formation, Zimmerman (2013) finds no evidence that subjects are averse to gradual resolution of

information. On the other hand, when examining these preferences for potential outcomes in the loss domain (electric shocks), Falk and Zimmerman (2014) do find such an aversion. Other papers examine

actions regarding financial investments that may reflect preferences regarding gradual resolution

of uncertainty. For example, Karlsson, Loewenstein and Seppi (2009) find that individuals check

portfolios more often if they hold a high prior than a low prior. Bellemare, Krause, Kroger, and

Zhang (2005) demonstrate that if individuals receive information more often about their risky in-

vestment, they tend to invest less in that option and favor a safe investment where such information

is essentially eliminated.

Preferences over compound lotteries: Some tests of the theory frame the choices as com-

pound lotteries, rather than information structures. Halevy (2007) finds that individuals prefer

one-shot lotteries to compound lotteries. Abdellaoui, Klibanoff and Placido (2015) elicit willing-

ness to pay for compound lotteries and one-shot lotteries and also find individuals to prefer one-shot

lotteries to compound lotteries. These results can be interpreted as individuals having preferences

for one-shot over gradual resolution of uncertainty. However, it isn’t clear whether subjects view a

one-stage lottery as an early resolving lottery or a late resolving lottery, therefore making it difficult

to fit these results into the information framework and draw such conclusions. Miao and Zhong

(2012) explicitly address this concern, allowing individuals to compare two stage lotteries, but

where either the first or second stage is degenerate. Doing this, they can distinguish a preference

for early versus late resolution. Interestingly, they find that individuals prefer compound lottery

structures that feature full late resolution to most other compound lottery structures — even to

those that induce full early resolution. They also find violations of independence of ≿1.

7 Conclusion

Our results indicate individuals have strong preferences for right skewed information structures.

However, even without changing the number of outcomes or the number of signals, much work re-

mains to explore how these preferences may vary as the payoff differential across outcomes changes

or the prior probability of the high outcome changes. Moreover, there are different notions of

‘skewness’ in the literature. The most well known one, introduced by Menezes et al. (1980), involves

mean-variance preserving changes in the distribution that change the third moment (without chang-

ing the first two). Dillenberger and Segal (2013) introduce a different definition of skewed compound

lotteries. A third definition would define preferences over p and q rather than over the induced

compound lotteries. All three of these definitions coincide in our current setting. Future work can

disentangle which of these notions is most appropriate for evaluating preferences over information


structures.


Appendix A: Formal Definitions

This section will provide formal definitions for the theoretical discussion in the paper. In order to

link our discussion more closely to the existing literature, this Appendix will work with two-stage

compound lotteries, the set of which are equivalent to the set of prior, information structure pairs.

Formally, consider an interval [w, b] = X ⊂ R of money. Let ∆X be the set of all simple lotteries on X. A lottery F ∈ ∆X is a function from X to [0, 1] such that $\sum_{x \in X} F(x) = 1$ and the number of prizes with non-zero support is finite. F(x) represents the probability assigned to the outcome x in lottery F. For any lotteries F, G we let αF + (1−α)G be the lottery that yields x with probability αF(x) + (1−α)G(x). Denote by δx the degenerate lottery that yields x with probability 1. Next, denote ∆(∆X) as the set of simple lotteries over ∆X. For P, Q ∈ ∆(∆X) denote R = αP + (1−α)Q as the lottery that yields simple (one-stage) lottery F with probability αP(F) + (1−α)Q(F). Denote by DF the degenerate, in the first stage, compound lottery that yields F with certainty. ≿ is a weak order over ∆(∆X) which represents the decision-maker’s preferences over lotteries and is continuous (in the weak topology). Moreover, we will define a reduction function that maps compound lotteries to reduced one-stage lotteries: $\phi(Q) = \sum_{F \in \Delta X} Q(F)\,F$.

Now consider the set of prior-information structure pairs, such that the prior f has support on

[w, b]. The main text discusses how to map a prior-information structure pair into a (unique) two

stage compound lottery. We now show that any given two-stage compound lottery maps into a

unique prior-information structure pair. Given a two-stage lottery P with support p1, ..., pn we first can find f, the prior:

$$\phi(P)(\omega_i) = f(\omega_i).$$

To identify I, observe that we have a set of equations

$$p_j(\omega_i) = \psi_j(\omega_i) = \frac{f(\omega_i)\, I_{ij}}{\sum_k f(\omega_k)\, I_{kj}},$$

along with restrictions on the elements of I discussed in the main text (and with a known f). These form a set of equations that generates a unique solution I.
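For the binary case, the inversion described above can be sketched in a few lines. The example assumes two states (high, low) and two signals, represents a compound lottery as a list of (signal probability, posterior on the high state) pairs, and uses the convention that p is the probability of the first signal in the high state and q the probability of the second signal in the low state; all names are illustrative.

```python
def recover_prior_and_structure(compound):
    """Recover (f, p, q) from a binary two-stage compound lottery.

    compound: [(prob_signal_1, posterior_1), (prob_signal_2, posterior_2)],
    where each posterior is the probability of the high state.
    Assumes 0 < f < 1 so that both conditional probabilities are defined.
    """
    (m1, post1), (m2, post2) = compound
    # Reduction: the prior is the probability-weighted average posterior.
    f = m1 * post1 + m2 * post2
    # Bayes' rule inverted:
    # P(signal | state) = P(signal) * P(state | signal) / P(state).
    p = m1 * post1 / f              # P(signal 1 | high)
    q = m2 * (1 - post2) / (1 - f)  # P(signal 2 | low)
    return f, p, q


# Compound lottery induced by the (.5, 1) structure under a 50% prior:
# posterior 1 with probability .25, posterior 1/3 with probability .75.
f, p, q = recover_prior_and_structure([(0.25, 1.0), (0.75, 1 / 3)])
print(f"f = {f:.2f}, p = {p:.2f}, q = {q:.2f}")
```

In this example the procedure returns the 50% prior and the (.5, 1) structure that generated the compound lottery.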

We can now turn to discussing the formal properties and models related to our predictions,

using the framework of compound lotteries. First, reduction of compound lotteries implies that

individuals only care about the reduced one-stage lotteries that they face:

Reduction of Compound Lotteries: For all P,Q ∈ ∆(∆X) if φ(P ) = φ(Q) then

P ∼ Q.

In deriving additional predictions, it will be useful to formally define early and late resolving

lotteries:

• Γ = {DF |F ∈ ∆X}, the set of degenerate lotteries in ∆(∆X)

• Λ = {Q ∈ ∆(∆X)|Q(F ) > 0⇒ F = δx for some x ∈ X}, the set of compound lotteries whose

outcomes are degenerate in ∆X .


Early resolving lotteries have all uncertainty resolved in the first stage and so the second stage

lotteries are degenerate. These are equivalent to situations where the information structure reveals

all information in Period 1; thus, posteriors after observing the signal are degenerate. In contrast,

late resolving lotteries have all uncertainty resolved in the second stage and so their first stage is

degenerate. These are equivalent to situations where the information structure reveals no informa-

tion in Period 1. Thus, posteriors after receiving information are exactly the same as the priors

before receiving information. We define the restriction of ≿ to the subsets Γ and Λ as ≿Γ and ≿Λ.

Given these definitions, we can now state Time Neutrality.

Time Neutrality: If P ∈ Γ and Q ∈ Λ and φ(P ) = φ(Q) then P ∼ Q.

Grant, Kajii and Polak (1998) formally define a preference for early resolution of information

in the setting of compound lotteries as:

Definition: ≿ displays a preference for early resolution of uncertainty if for all Q, P ∈ ∆(∆X), where $Q = \sum_{i=1}^{N} F_i q_i$ and $P = \sum_{i=1}^{j-1} F_i q_i + G_1 \beta q_j + G_2 (1-\beta) q_j + \sum_{i=j+1}^{N} F_i q_i$ with β ∈ [0, 1], if $F_j = \beta G_1 + (1-\beta) G_2$ then P ≿ Q.

Grant, Kajii and Polak (1998) define a notion of “elementary linear bifurcations” which is

equivalent to a binary relation over compound lotteries. They show that one compound lottery is

an elementary linear bifurcation of another if and only if the former Blackwell dominates the latter.

Given a function V on the set of probability measures ∆(X), for each µ ∈ ∆(X) we say that V is Gateaux differentiable at µ in ∆(X) if there is a measurable function v(·; µ) on X such that for any ν in ∆(X) and any α ∈ (0, 1):

$$V(\alpha\nu + (1-\alpha)\mu) - V(\mu) = \alpha \int v(\zeta;\mu)\,[\nu(d\zeta) - \mu(d\zeta)] + o(\alpha)$$

where o(α) is a function with the property that o(α)/α → 0 as α → 0. v(·; µ) is the Gateaux derivative of V at µ. V is Gateaux differentiable if V is Gateaux differentiable at all µ. We call v(·; µ) the local utility function at µ.

We next turn to discussing recursivity.

Recursivity: For all F, G ∈ ∆X, all Q ∈ ∆(∆X) and α ∈ (0, 1), DF ≿ DG if and only if αDF + (1−α)Q ≿ αDG + (1−α)Q.

As discussed in the text recursivity is useful because decision-makers with recursive prefer-

ences evaluate compound lotteries using a folding-back procedure — preferences over two stage

lotteries can be evaluated using preferences over one stage lotteries. Decision-makers replace the

second stage of any given compound lottery by the certainty equivalent generated by ≿Γ. The resulting lottery is evaluated using ≿Λ. For example, suppose that ≿Γ and ≿Λ both satisfy Independence, and denote the Bernoulli utility functions used to evaluate them as $u_\Gamma$ and $u_\Lambda$, respectively. In order to calculate the value of P (Figure ??) the decision-maker first evaluates the possible second stage lotteries separately. Thus, she evaluates F according to $u_\Gamma$ and finds the certainty equivalent $u_\Gamma^{-1}(0.75\,u_\Gamma(1) + 0.25\,u_\Gamma(0))$. She also evaluates F′ according to $u_\Gamma$ and finds the certainty equivalent $u_\Gamma^{-1}(0.25\,u_\Gamma(1) + 0.75\,u_\Gamma(0))$. In order to evaluate P, she substitutes for F and F′ their respective certainty equivalents. This generates a one-stage lottery that with probability 0.5 gives outcome $u_\Gamma^{-1}(0.75\,u_\Gamma(1) + 0.25\,u_\Gamma(0))$ and with probability 0.5 gives outcome $u_\Gamma^{-1}(0.25\,u_\Gamma(1) + 0.75\,u_\Gamma(0))$. She then evaluates this lottery using $u_\Lambda$.
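The folding-back procedure in this example can be written out as a short sketch. The power utilities and their exponents below are assumptions chosen for illustration (they are not taken from the paper), and the function names are hypothetical.

```python
def certainty_equivalent(lottery, u, u_inv):
    """Certainty equivalent of a one-stage lottery [(outcome, prob), ...]."""
    return u_inv(sum(prob * u(x) for x, prob in lottery))


def fold_back(compound, u_gamma, u_gamma_inv, u_lambda):
    """Value of a two-stage lottery [(second_stage_lottery, prob), ...].

    Each second-stage lottery is replaced by its certainty equivalent under
    u_gamma; the resulting one-stage lottery is evaluated with u_lambda.
    """
    return sum(
        prob * u_lambda(certainty_equivalent(lottery, u_gamma, u_gamma_inv))
        for lottery, prob in compound
    )


# Illustrative (assumed) power utilities; the exponents are not from the paper.
ALPHA, RHO = 0.5, 1.0
u_gamma = lambda x: x**ALPHA
u_gamma_inv = lambda y: y ** (1 / ALPHA)
u_lambda = lambda x: x**RHO

F = [(1.0, 0.75), (0.0, 0.25)]        # high outcome with probability 0.75
F_prime = [(1.0, 0.25), (0.0, 0.75)]  # high outcome with probability 0.25
P = [(F, 0.5), (F_prime, 0.5)]

early = [([(1.0, 1.0)], 0.5), ([(0.0, 1.0)], 0.5)]  # fully revealing first stage
late = [([(1.0, 0.5), (0.0, 0.5)], 1.0)]            # uninformative first stage

value_P = fold_back(P, u_gamma, u_gamma_inv, u_lambda)
value_early = fold_back(early, u_gamma, u_gamma_inv, u_lambda)
value_late = fold_back(late, u_gamma, u_gamma_inv, u_lambda)
print(value_early, value_P, value_late)
```

With these assumed exponents the fully revealing structure is valued most and the uninformative one least, with P in between, which is the ordering a preference for early resolution implies.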

We say that a preference over two stage lotteries has a recursive representation (V1, V2) if the

preference can be represented with a functional V , such that V is derived using V1 and V2 in the

folding-back procedure described above. More formally, let F denote a one stage lottery, which gives

outcome xi with probability f(xi), which we can represent as F = (x1, f(x1); . . . ; xn, f(xn)). For a two-stage lottery P, let p(Fi) denote the probability of receiving lottery Fi in the second stage, and represent P as (F1, p(F1); . . . ; Fn, p(Fn)). Last, let CE2(F) denote the certainty equivalent of F using ≿Γ.

Definition 1 Suppose preferences over two-stage lotteries can be represented by V. We say preferences have a recursive representation (V1, V2), where V1 and V2 are utility functions over one-stage lotteries, if and only if for all P = (F1, p(F1); . . . ; Fn, p(Fn)), it is the case that V(P) = V1(CE2(F1), p(F1); . . . ; CE2(Fn), p(Fn)).

Independence within Γ and Λ is defined as per standard for any one-stage lottery.

We now sketch out some of the functional forms that are relevant for our predictions. If

preferences satisfy recursivity, we can represent them using VΛ and VΓ. Because Vi for i ∈ {Λ, Γ} is defined over two-stage lotteries that are isomorphic to one stage lotteries, we can simply define Vi using one stage lotteries. If Period 1 and Period 2 preferences both satisfy Independence then $V_i(F) = \sum_x u_i(x) F(x)$ for i ∈ {Λ, Γ}. We refer to these as the Kreps-Porteus class of preferences.

If preferences in both periods are in Gul's class of disappointment aversion, then

\[
V_i(F) = \sum_x u(x)F(x) + \beta \sum_{x \le u^{-1}(V_G(F))} \big(u(x) - V_G(F)\big)F(x)
\]

where u is a function mapping from wealth to the reals, and β is a scalar. Individuals are disap-

pointment averse if and only if β ≥ 0. If preferences in both periods are in the rank dependent

class then

\[
V_i(F) = \sum_x u(x)\left[ w\Big(\sum_{y \ge x} F(y)\Big) - w\Big(\sum_{y > x} F(y)\Big) \right]
\]

where u is a function mapping from wealth to the reals, and w is a function mapping from [0, 1] to

[0, 1], such that w(0) = 0, w(1) = 1 and w is strictly increasing. Individuals are pessimistic if and

only if w is convex.

Because Dillenberger and Segal’s (2015) conditions on preferences are quite subtle (beyond the

assumption of recursivity) we refer interested readers to their discussion. Moreover, since Brunner-

meier and Parker’s (2005) model predicts that all information structures should be indifferent to

one another, we direct the interested reader to their paper for the details of their functional form.

We next summarize Koszegi and Rabin's functional form. Let $\eta$ denote a gain-loss functional, $\kappa$ a scalar weight on expected utility, and $\nu$ a scalar weight on first period gain-loss utility. Given a distribution $h$ over the payoff across states and any percentile $p \in (0, 1)$, let $u(\omega_h(p))$ denote the utility of the payoff level at percentile $p$. Then the functional form is:24

\[
\kappa E_f(u(\omega_i)) + \nu \sum_i f(\omega_i) I_{ij} \int_0^1 \eta\big(u(\omega_{\psi_j}(p)) - u(\omega_f(p))\big)\,dp + \sum_i f(\omega_i) I_{ij}\, \psi_j(\omega_i) \int_0^1 \eta\big(u(\omega_i(p)) - u(\omega_{\psi_j}(p))\big)\,dp
\]

Because this is complicated, we will define the function for our simple binary-binary setup. Normalizing the Bernoulli utility of the high and low outcomes to 1 and 0, the total utility of an information structure is:

\[
\begin{aligned}
&\kappa(1f + (1-f)0) \\
&+ \nu\left(\eta(1-0)\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)(fp+(1-q)(1-f))\right) \\
&+ \nu\left(\eta(0-1)\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)(f(1-p)+q(1-f))\right) \\
&+ (fp+(1-f)(1-q))\left(\frac{fp}{fp+(1-f)(1-q)}\,\eta(1-0)\,\frac{(1-q)(1-f)}{fp+(1-q)(1-f)}\right) \\
&+ (fp+(1-f)(1-q))\left(\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)\eta(0-1)\,\frac{fp}{fp+(1-q)(1-f)}\right) \\
&+ ((1-p)f+q(1-f))\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\,\eta(1-0)\,\frac{q(1-f)}{f(1-p)+q(1-f)}\right) \\
&+ ((1-p)f+q(1-f))\left(\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)\eta(0-1)\,\frac{f(1-p)}{(1-p)f+q(1-f)}\right)
\end{aligned}
\]

The last functional form we consider is Ely, Frankel and Kamenica (2014). They have two

models, both of which deliver the same predictions regarding skewness. We accommodate the more general forms of their models and allow for individuals to care both about an expected utility portion of their beliefs, as well as suspense or surprise. We also allow individuals to weight suspense and surprise differently across periods. We denote $\vartheta$ as a function that turns suspense and surprise into utils. As before we also have a scalar weight on expected utility $\kappa$ and scalar weight on first period gain-loss utility of $\nu$.

24 Denoting beliefs in Period 0 as $f$ (our prior) and the beliefs in Period 1 (after receiving signal $s_j$) as $\psi_j$.

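We first consider their model of suspense, where utility is given by:

\[
\kappa E_f(u(\omega_i)) + \nu\,\vartheta\Big(\sum_j f(\omega_i) I_{ij} \sum_i \big(\psi_j(\omega_i) - f(\omega_i)\big)^2\Big) + \sum_j f(\omega_i) I_{ij}\, \vartheta\Big(\sum_i \psi_j(\omega_i) \sum_i \big(I - \psi_j(\omega_i)\big)^2\Big)
\]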

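Simplifying in our binary-binary setup, we get:

\[
\begin{aligned}
&\kappa(1f + (1-f)0) \\
&+ \nu\vartheta\left((fp+(1-q)(1-f))\,2\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)^2 + (f(1-p)+q(1-f))\,2\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)^2\right) \\
&+ (fp+(1-f)(1-q))\,\vartheta\left(\frac{fp}{fp+(1-f)(1-q)}\,2\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)^2 + \left(1-\frac{fp}{fp+(1-f)(1-q)}\right)2\left(\frac{fp}{fp+(1-f)(1-q)}\right)^2\right) \\
&+ ((1-p)f+q(1-f))\,\vartheta\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\,2\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2 + \left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)2\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2\right)
\end{aligned}
\]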

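EFK also provide a model of surprise:

\[
\kappa E_f(u(\omega_i)) + \nu \sum_j f(\omega_i) I_{ij}\, \vartheta\Big(\sum_i \big(\psi_j(\omega_i) - f(\omega_i)\big)^2\Big) + \sum_j f(\omega_i) I_{ij} \sum_i \psi_j(\omega_i)\, \vartheta\Big(\sum_i \big(I - \psi_j(\omega_i)\big)^2\Big)
\]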

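In our binary-binary setting this becomes:

\[
\begin{aligned}
&\kappa(1f + (1-f)0) \\
&+ \nu(fp+(1-q)(1-f))\,\vartheta\left(2\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)^2\right) \\
&+ \nu(f(1-p)+q(1-f))\,\vartheta\left(2\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)^2\right) \\
&+ (fp+(1-f)(1-q))\,\frac{fp}{fp+(1-f)(1-q)}\,\vartheta\left(2\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)^2\right) \\
&+ (fp+(1-f)(1-q))\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)\vartheta\left(2\left(\frac{fp}{fp+(1-f)(1-q)}\right)^2\right) \\
&+ ((1-p)f+q(1-f))\,\frac{f(1-p)}{(1-p)f+q(1-f)}\,\vartheta\left(2\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2\right) \\
&+ ((1-p)f+q(1-f))\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)\vartheta\left(2\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2\right)
\end{aligned}
\]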

Appendix B: Proofs

Before proving the results in the text, we prove a useful lemma.

Lemma A Suppose f = .5. Then if x < y the posterior distribution induced by (y, x) has more

downside risk, in the sense of Menezes, Geiss and Tressler (1980) than that induced by (x, y).

Proof We prove that, given x < y and a prior of .5, the posterior distribution induced by (x, y) is a mean-variance preserving transformation of that induced by (y, x). In other words, the two posterior distributions have the same mean and variance, but the former has greater skew than the latter, so the posterior distribution induced by the latter has more downside risk than the former. We denote the cdf of the posterior distribution induced by the former as F and of the latter as G.

First, observe that the two distributions induce the same mean posterior, by law of iterated

expectations, which is simply the prior.

Second, we will show that $\int_0^x \int_0^z (G(y) - F(y))\,dy\,dz > 0$ for $x < 1$ and $\int_0^1 \int_0^z (G(y) - F(y))\,dy\,dz = 0$.

Observe that $G$ is equal to 0 on the range $\left[0, \frac{1-y}{1-y+x}\right)$, and then $.5(x + 1 - y)$ on the range $\left[\frac{1-y}{1-y+x}, \frac{y}{y+1-x}\right)$ and 1 on the range $\left[\frac{y}{y+1-x}, 1\right]$. Observe that $F$ is equal to 0 on the range $\left[0, \frac{1-x}{1+y-x}\right)$, and then $.5(1 - x + y)$ on the range $\left[\frac{1-x}{1+y-x}, \frac{x}{1+x-y}\right)$ and 1 on the range $\left[\frac{x}{1+x-y}, 1\right]$.

We denote region A as $\left[0, \frac{1-y}{1-y+x}\right)$; B as $\left[\frac{1-y}{1-y+x}, \frac{1-x}{1+y-x}\right)$; C as $\left[\frac{1-x}{1+y-x}, \frac{y}{y+1-x}\right)$; D as $\left[\frac{y}{y+1-x}, \frac{x}{1+x-y}\right)$; and E as $\left[\frac{x}{1+x-y}, 1\right]$.

Thus $G(y) - F(y)$ is 0 on A, then $.5(1 + x - y)$ on B, then $x - y$ on C, then $.5(1 + x - y)$ again on D, and then 0 on E.

This implies that $\int_0^z (G(y) - F(y))\,dy$ is 0 for $z \in A$. For $z \in B$ it is $.5[1+x-y]\frac{1-x}{1-x+y} - .5(1-y)$. For $z \in C$ it is $(z - .5)(x - y)$. For $z \in D$ it is $.5[1 + x - y]z - .5x$. For $z \in E$ it is 0.

We will now divide C into two separate intervals: $C_1 = \left[\frac{1-x}{1+y-x}, .5\right)$ and $C_2 = \left[.5, \frac{y}{y+1-x}\right)$.

Observe that $\int_0^z (G(y) - F(y))\,dy$ is weakly greater than 0 when $z$ is in A, B and $C_1$. Similarly, $\int_0^z (G(y) - F(y))\,dy$ is weakly less than 0 when $z$ is in $C_2$, D and E.

Thus, we simply need to verify that $\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz > 0$ and show that $\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz = -\int_{.5}^1 \int_0^z (G(y) - F(y))\,dy\,dz$, and we will have shown both parts.

Observe that

\[
\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz = \frac{1}{8}\left[1 - \frac{2y}{1+x-y}\right]\left[\frac{(1+x-y)(1-x)}{1-x+y} - (1-y)\right].
\]

Moreover

\[
\int_{.5}^1 \int_0^z (G(y) - F(y))\,dy\,dz = \frac{1}{8}\left[\frac{2x}{1+x-y} - 1\right][x - y]\left[\frac{2y}{1+y-x} - 1\right].
\]

Routine algebra shows that the first is then equal to $\frac{1}{8}\left[\frac{-1-x-y}{1+x-y}\right]\left[\frac{y^2-y-x^2+x}{1-x+y}\right]$ and the second is equal to $\frac{1}{8}\left[\frac{-1-x-y}{1+x-y}\right]\left[\frac{-y^2+y+x^2-x}{1-x+y}\right]$.

Thus, by Menezes, Geiss and Tressler (1980) the posterior distribution G has more downside risk than the posterior distribution F. □
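As a numerical illustration of Lemma A, the sketch below computes the mean, variance and third central moment of the posterior on the high outcome for a prior of .5. The function name `posterior_moments` and the values x = .6, y = .9 are assumptions chosen only for this example; the posterior formulas are the standard Bayesian ones used throughout the appendix.

```python
def posterior_moments(p, q, f=0.5):
    """Mean, variance and third central moment of the posterior on the high outcome.

    p: probability of a good signal given the high outcome.
    q: probability of a bad signal given the low outcome.
    f: prior probability of the high outcome.
    """
    prob_good = f * p + (1 - f) * (1 - q)   # probability of observing the good signal
    prob_bad = 1 - prob_good                # probability of observing the bad signal
    psi_good = f * p / prob_good            # posterior after the good signal
    psi_bad = f * (1 - p) / prob_bad        # posterior after the bad signal
    mean = prob_good * psi_good + prob_bad * psi_bad
    variance = prob_good * (psi_good - mean) ** 2 + prob_bad * (psi_bad - mean) ** 2
    third = prob_good * (psi_good - mean) ** 3 + prob_bad * (psi_bad - mean) ** 3
    return mean, variance, third

# Hypothetical example with x < y.
x, y = 0.6, 0.9
mean_xy, var_xy, third_xy = posterior_moments(x, y)
mean_yx, var_yx, third_yx = posterior_moments(y, x)
```

In this example the two structures share the same mean and variance of posteriors, while (x, y) has the larger third central moment, which is the sense in which it is the more positively skewed structure.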

Lemma 1 For any (p, q) ∈ S, observing a good signal increases the posterior on high outcome

relative to the prior, and observing a bad signal decreases the posterior on high outcome relative to

the prior.

Proof We will prove each part of the Lemma in turn. First we prove the first part. Recall that

for a given prior 0 < f < 1 on a high payoff and information structure (p, q), the posterior for the

high payoff given the good signal is

\[
\psi_F = \frac{fp}{fp + (1-f)(1-q)}.
\]

Now $\psi_F > f$ if and only if

\[
\frac{fp}{fp + (1-f)(1-q)} > f,
\]

which holds if and only if

\[
(1-f)p > (1-f) - (1-f)q,
\]

which is the same as

\[
p + q > 1.
\]

An analogous series of steps establishes the result for the posterior after observing a bad signal,

\[
\psi_R = \frac{f(1-p)}{f(1-p) + (1-f)q}.
\]

□
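A minimal sketch of the Bayesian updating behind Lemma 1 follows. The function name `posteriors` and the numerical values are hypothetical; the formulas are the expressions for $\psi_F$ and $\psi_R$ given in the proof.

```python
def posteriors(f, p, q):
    """Return (prob_good, psi_good, prob_bad, psi_bad) for prior f and structure (p, q).

    p: probability of a good signal given the high outcome.
    q: probability of a bad signal given the low outcome.
    """
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good        # posterior on the high outcome after a good signal
    psi_bad = f * (1 - p) / prob_bad    # posterior on the high outcome after a bad signal
    return prob_good, psi_good, prob_bad, psi_bad

# Hypothetical example with p + q > 1: the good signal raises the posterior
# above the prior and the bad signal lowers it.
prior = 0.3
prob_good, psi_good, prob_bad, psi_bad = posteriors(prior, 0.7, 0.6)
```

The expected posterior, `prob_good * psi_good + prob_bad * psi_bad`, equals the prior, as the law of iterated expectations requires.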

Lemma 2 For any signal structure (p′, q′) ∈ [0, 1] × [0, 1], there exists a (p, q) ∈ S that generates

the same posterior distribution. However, for any T ⊂ S there exists a (p′, q′) ∈ S such that there

is no element of T that generates the same posterior distribution as (p′, q′).

Proof Assume that $p + q < 1$ (observe that all signal structures on $p + q = 1$ give the same posterior distribution). In this case, denote $p' = 1 - p$ and $q' = 1 - q$. We will work with likelihood ratios rather than posterior beliefs. Under $(p, q)$, likelihood ratio $\frac{p}{1-q}$ occurs with probability $fp + (1-f)(1-q)$ and likelihood ratio $\frac{1-p}{q}$ occurs with probability $f(1-p) + (1-f)q$.

Under $(p', q')$ likelihood ratio $\frac{1-p'}{q'} = \frac{p}{1-q}$ occurs with probability $f(1-p') + (1-f)q' = fp + (1-f)(1-q)$. Likelihood ratio $\frac{p'}{1-q'} = \frac{1-p}{q}$ occurs with probability $fp' + (1-f)(1-q') = f(1-p) + (1-f)q$. Therefore $(p', q')$ generates the same posterior distribution as $(p, q)$. Moreover, $p' + q' = (1-p) + (1-q) = 2 - p - q \ge 1$ since $p + q \le 1$. So therefore, instead of considering some $(p, q)$ we can always instead consider the corresponding $p' = 1 - p$, $q' = 1 - q$. This proves the first part.

To prove the second part observe that in order for two signal structures $(p, q)$ and $(p', q')$ to generate the same posteriors (so that for both signal structures, a fair weather prediction increases the posterior relative to the prior and a rainy weather prediction decreases it) it must be the case that $\frac{p'}{1-q'} = \frac{p}{1-q}$ and $\frac{1-p'}{q'} = \frac{1-p}{q}$.

Therefore $p' - p'q = p - pq'$ and $q - p'q = q' - pq'$, which is equivalent to $q = \frac{-p + pq' + p'}{p'}$ and $q = \frac{q' - pq'}{1 - p'}$. Simplifying, we have $\frac{-p + pq' + p'}{p'} = \frac{q' - pq'}{1 - p'}$, or $p'q' - pq'p' = -p + pq' + p' + pp' - pp'q' - p'^2$.

This holds if and only if $p'q' = -p + pq' + p' + pp' - p'^2$, or $p(1 - q' - p') = -p'q' + p' - p'^2 = p'(1 - q' - p')$. This equality is true if and only if $p = p'$ or $q' + p' = 1$. □

Lemma 3 $(p', q')$ Blackwell dominates (is Blackwell more informative than) $(p, q)$ if and only if $p' \ge \max\left\{\frac{p}{1-q}(1 - q'),\ 1 - q'\frac{1-p}{q}\right\}$.

Proof Recall that one signal structure (p′, q′) is Blackwell more informative than another (p, q)

if and only if the distribution of posteriors induced by (p′, q′) is a mean preserving spread of the

distribution induced by (p, q). By the law of iterated expectations, the expected posterior under

(p′, q′) and (p, q) must be the same — the prior. Because there are only 2 signals (and so 2 posteriors)

as well as only 2 states, the problem reduces to showing that the posteriors under (p′, q′) are more

extreme (in the sense that they are farther from the prior) than the posteriors under (p, q). In

order to simplify the proofs, we will show an equivalent result — that the likelihood ratios under

(p′, q′) are more extreme (farther from 1) than the likelihood ratios under (p, q).

The likelihood ratios after observing a fair signal under $(p', q')$ and $(p, q)$ are (respectively) $\frac{p'}{1-q'}$ and $\frac{p}{1-q}$, while the likelihood ratios after observing a rainy signal are $\frac{1-p'}{q'}$ and $\frac{1-p}{q}$.

In order for the ratios under $(p', q')$ to be farther from 1 than those under $(p, q)$, we need $\frac{p'}{1-q'} \ge \frac{p}{1-q}$ and $\frac{1-p'}{q'} \le \frac{1-p}{q}$. This is equivalent to $p' \ge \frac{p}{1-q} - \frac{p}{1-q}q'$ and $p' \ge 1 - q'\frac{1-p}{q}$. □
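The condition in Lemma 3 is easy to check mechanically. The helper below is a sketch under the assumption that $0 < q < 1$ (so that neither denominator vanishes); the name `blackwell_dominates` and the numerical structures are hypothetical.

```python
def blackwell_dominates(p_new, q_new, p, q):
    """Lemma 3 condition: does (p_new, q_new) Blackwell dominate (p, q)?

    Assumes 0 < q < 1 so that both ratios are well defined.
    """
    bound_from_good_signal = p / (1 - q) * (1 - q_new)
    bound_from_bad_signal = 1 - q_new * (1 - p) / q
    return p_new >= max(bound_from_good_signal, bound_from_bad_signal)

# Hypothetical examples.
more_informative = blackwell_dominates(0.9, 0.9, 0.6, 0.6)
less_informative = blackwell_dominates(0.6, 0.6, 0.9, 0.9)
full_information = blackwell_dominates(1.0, 1.0, 0.7, 0.8)
```

Because the Blackwell order is only partial, the function can return `False` in both directions for a pair of structures; two skewed structures such as (.6, .9) and (.9, .6) are an example.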

Prediction 1 Fixing f , the decision maker should be indifferent between all information structures.

Proof Under reduction of compound lotteries this is true by definition. □

Prediction 2 Fixing f, the decision-maker should be indifferent between (1, 1) and (.5, .5).

Proof Under time neutrality this is true by definition. □

Prediction 3 Let $\succsim_f$ be represented by V, where V is Gateaux differentiable. Then the local utility function of V is everywhere convex (concave) if and only if the decision-maker prefers Blackwell more (less) informative information structures.

Proof This is proved by Grant, Kajii and Polak (1998). □

Prediction 4 Let $\succsim_{.5}$ be represented by V, where V is Gateaux differentiable. If the local utility function of V is thrice differentiable and has a convex (concave) derivative everywhere, then $(x, y) \succsim_{.5} (\precsim_{.5})\ (y, x)$ whenever $x \le y$.

Proof Assume that all local utility functions are thrice differentiable and have a positive third derivative. Denote the local utility function $v(\cdot\,; \mu)$. Fix two lotteries $Z_1$ and $Z_0$ induced by information structures with a prior of .5, where $Z_0$ generates a posterior distribution which has more downside risk than $Z_1$. We simply need to show that $W(Z_1) - W(Z_0) \ge 0$. Let $Z(\alpha) = \alpha Z_1 + (1 - \alpha)Z_0$. By Grant, Kajii and Polak (pg 255), $\frac{d}{d\alpha}W(Z(\alpha))\big|_{\alpha=\beta}$ exists for any $\beta$ in (0, 1) and is equal to $\int v(\mu; Z(\beta))(Z_1(d\mu) - Z_0(d\mu))$. Observe that this is simply the expected value of $v$ under $Z_1$ less the expected value of $v$ under $Z_0$. By Theorem 2 of Menezes, Geiss and Tressler (1980) this is positive for any $\beta \in (0, 1)$. Integrating with respect to $\beta$ yields $W(Z(1)) - W(Z(0)) \ge 0$, which gives the required result since $W(Z(1)) = W(Z_1)$ and $W(Z_0) = W(Z(0))$. □


have expected utility representations with Bernoulli utilities $u_1$ and $u_2$. Then $u_1 \circ u_2^{-1}$ is convex (concave) if and only if the decision-maker prefers Blackwell more (less) informative information structures. Moreover, if the derivative of $u_1 \circ u_2^{-1}$ is convex (concave), then $(x, y) \succsim_{.5} (\precsim_{.5})\ (y, x)$ whenever $x \le y$.

Proof The first relationship, between Blackwell informativeness and convexity/concavity, is proved in Grant, Kajii and Polak (1998).

For the next part, denoting $\tau = u_1(u_2^{-1})$, the utility of $(p, q)$ is simply:

\[
\tau\left(\frac{fp}{fp+(1-f)(1-q)}\right)(fp + (1-f)(1-q)) + \tau\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)((1-p)f + q(1-f)).
\]

Observe that this implies the individual is an EU maximizer over a utility function defined over their posteriors. We know that, given $x < y$ and a prior of .5, $(y, x)$ has more downside risk than $(x, y)$. Thus by Theorem 2 of Menezes, Geiss and Tressler (1980) if the third derivative of $\tau$ is positive then $(x, y)$ must be preferred to $(y, x)$. □
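The comparison in this proof can be illustrated numerically. The sketch below evaluates the expression for the utility of $(p, q)$ given in the proof; the choice $\tau(z) = z^3$ (convex, with a positive third derivative) and the structures (.6, .9) and (.9, .6) are hypothetical and serve only as an example.

```python
def kp_value(tau, p, q, f=0.5):
    """Expected value of tau over posteriors, as in the utility of (p, q) in the proof."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good
    psi_bad = f * (1 - p) / prob_bad
    return tau(psi_good) * prob_good + tau(psi_bad) * prob_bad

# Hypothetical tau with a positive third derivative.
tau = lambda z: z ** 3

x, y = 0.6, 0.9
value_positive_skew = kp_value(tau, x, y)   # structure (x, y)
value_negative_skew = kp_value(tau, y, x)   # structure (y, x)
```

With this assumed $\tau$, the positively skewed structure (x, y) receives the higher value; replacing $\tau$ by its negative reverses the ranking, in line with the concave case of the statement.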


are both in Gul's class of disappointment aversion functionals (or rank-dependent utility) and the decision-maker is disappointment averse (has a strictly convex weighting function). Then there exists an $\varepsilon' > 0$ such that for all $\varepsilon < \varepsilon'$, $(.5, .5) \succ_{.5} (.5 + \varepsilon, .5 + \varepsilon)$.

Proof Recall that we have representations in period 1 and period 2 which are $(u_1, \beta_1)$ and $(u_2, \beta_2)$.

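If we get a good signal, the utility is

\[
\frac{\frac{fp}{fp+(1-f)(1-q)} + 0}{1 + \beta_2\frac{(1-f)(1-q)}{fp+(1-f)(1-q)}} = \frac{fp}{fp + (1+\beta_2)(1-q)(1-f)}.
\]

If we get a bad signal, the utility is $\frac{(1-p)f}{(1-p)f + (1+\beta_2)q(1-f)}$. Iterating forward to period 1, and denoting $\tau = u_1(u_2^{-1})$, we have

\[
\frac{\tau\left(\frac{fp}{fp+(1+\beta_2)(1-q)(1-f)}\right)(fp + (1-q)(1-f)) + \tau\left(\frac{(1-p)f}{(1-p)f+(1+\beta_2)(1-f)q}\right)(1+\beta_1)((1-p)f + (1-f)q)}{1 + \beta_1((1-p)f + (1-f)q)}
\]

Taking the value at $f = .5$, setting $p = q$:

\[
\frac{\tau\left(\frac{p}{p+(1+\beta_2)(1-p)}\right)(p + (1-p)).5 + \tau\left(\frac{1-p}{(1-p)+(1+\beta_2)p}\right)(1+\beta_1)((1-p) + p).5}{1 + .5\beta_1((1-p) + p)}
= \frac{\tau\left(\frac{p}{1+\beta_2-\beta_2 p}\right) + \tau\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)}{2 + \beta_1}
\]

As $p$ goes to .5, this converges to

\[
\tau\left(\frac{.5}{1 + .5\beta_2}\right)
\]

Clearly the $\frac{1}{2+\beta_1}$ is irrelevant for the derivative, so take the derivative of $\tau\left(\frac{p}{1+\beta_2-\beta_2 p}\right) + \tau\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)$ with respect to $p$:

\[
\tau'\left(\frac{p}{1+\beta_2-\beta_2 p}\right)\frac{1 + \beta_2 - \beta_2 p + \beta_2 p}{(1+\beta_2-\beta_2 p)^2} + \tau'\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)\frac{-1 - \beta_2 p - \beta_2 + \beta_2 p}{(1+\beta_2 p)^2}
\]

or

\[
\tau'\left(\frac{p}{1+\beta_2-\beta_2 p}\right)\frac{1+\beta_2}{(1+\beta_2-\beta_2 p)^2} - \tau'\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)\frac{1+\beta_2}{(1+\beta_2 p)^2}
\]

and then taking the limit of the derivative as $p \to .5$ gives:

\[
\tau'\left(\frac{.5}{1+\beta_2-.5\beta_2}\right)\frac{1+\beta_2}{(1+\beta_2-.5\beta_2)^2} - \tau'\left(\frac{.5}{1+.5\beta_2}\right)(1+\beta_1)\frac{1+\beta_2}{(1+.5\beta_2)^2}
\]

or

\[
\tau'\left(\frac{.5}{1+.5\beta_2}\right)\frac{1+\beta_2}{(1+.5\beta_2)^2} - \tau'\left(\frac{.5}{1+.5\beta_2}\right)(1+\beta_1)\frac{1+\beta_2}{(1+.5\beta_2)^2} < 0
\]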

Next we prove the result for RDU. If we get a good signal, utility is $w_2\left(\frac{fp}{fp+(1-f)(1-q)}\right) + 0$. If we get a bad signal, the utility is $w_2\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)$. Iterating forward one period, utility is then:

\[
\tau\left(w_2\left(\frac{fp}{fp+(1-f)(1-q)}\right)\right)w_1(fp + (1-f)(1-q)) + \tau\left(w_2\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)\right)\big(1 - w_1(fp + (1-f)(1-q))\big)
\]

Observe that this function is continuous and differentiable in the neighborhood of $p = q = .5$. Substituting in $f = .5$:

\[
\tau\left(w_2\left(\frac{p}{p+(1-q)}\right)\right)w_1(.5(p + (1-q))) + \tau\left(w_2\left(\frac{1-p}{(1-p)+q}\right)\right)\big(1 - w_1(.5(p + (1-q)))\big)
\]

And substituting in $p = q$ gives $\tau(w_2(p))w_1(.5) + \tau(w_2(1-p))(1 - w_1(.5))$.

The derivative of this with respect to $p$ is then $\tau'(w_2(p))w_2'(p)w_1(.5) - \tau'(w_2(1-p))w_2'(1-p) + \tau'(w_2(1-p))w_2'(1-p)w_1(.5)$. Taking the limit of this as $p$ goes to .5 gives $\tau'(w_2(.5))w_2'(.5)[2w_1(.5) - 1]$. When $w$ is convex, this must be negative, since $w_1(.5) \le .5$. □
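The local result for RDU can be checked on a simple case. The sketch below evaluates the expression $\tau(w_2(p))w_1(.5) + \tau(w_2(1-p))(1 - w_1(.5))$ from the proof; the convex weighting function $w(z) = z^2$ in both periods and the identity $\tau$ are hypothetical choices made for illustration.

```python
def rdu_symmetric_value(p, w1, w2, tau=lambda z: z):
    """Value of the symmetric structure (p, p) at prior .5 under recursive RDU."""
    return tau(w2(p)) * w1(0.5) + tau(w2(1 - p)) * (1 - w1(0.5))

# Hypothetical convex (pessimistic) weighting function, used in both periods.
w = lambda z: z ** 2

value_no_information = rdu_symmetric_value(0.50, w, w)
value_small_information = rdu_symmetric_value(0.55, w, w)
```

Under these assumptions a small amount of symmetric information lowers the value relative to (.5, .5), which matches the sign of the derivative in the proof. The statement is only local, so nothing is implied here for structures far from (.5, .5).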

Prediction 7 Suppose preferences belong to the class defined by Dillenberger and Segal (2015). Then if $(.5, .5) \succsim_{.5} (x, x)$ for all $x \ge .5$, then for all $x \le y$, $(.5, .5) \succsim_{.5} (y, x)$. However, it is possible that $(.5, .5) \precsim_{.5} (x, y)$.

Proof See Dillenberger and Segal (2015) for the proof. □

Prediction 8 Suppose preferences can be represented by a BP functional form. Then the decision-maker should be indifferent between all information structures.

Proof This is proved by construction. □

Prediction 9 Suppose preferences are represented by a KR or EFK functional form. Then $(x, y) \sim_{.5} (y, x)$.

Proof We discussed KR's functional form previously. It is:

\[
\begin{aligned}
&\kappa(1f + (1-f)0) \\
&+ \nu\left(\eta(1-0)\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)(fp+(1-q)(1-f))\right) \\
&+ \nu\left(\eta(0-1)\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)(f(1-p)+q(1-f))\right) \\
&+ (fp+(1-f)(1-q))\left(\frac{fp}{fp+(1-f)(1-q)}\,\eta(1-0)\,\frac{(1-q)(1-f)}{fp+(1-q)(1-f)}\right) \\
&+ (fp+(1-f)(1-q))\left(\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)\eta(0-1)\,\frac{fp}{fp+(1-q)(1-f)}\right) \\
&+ ((1-p)f+q(1-f))\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\,\eta(1-0)\,\frac{q(1-f)}{f(1-p)+q(1-f)}\right) \\
&+ ((1-p)f+q(1-f))\left(\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)\eta(0-1)\,\frac{f(1-p)}{(1-p)f+q(1-f)}\right)
\end{aligned}
\]

Dropping the expected utility part, since it depends only on $f$, this becomes

\[
\begin{aligned}
&\nu\eta(1)\big(fp - f(fp + (1-q)(1-f))\big) + \nu\eta(-1)\big(-f(1-p) + f(f(1-p) + q(1-f))\big) \\
&\quad + (\eta(1) + \eta(-1))\left(\frac{f(1-f)p(1-q)}{fp+(1-f)(1-q)} + \frac{f(1-p)q(1-f)}{f(1-p)+(1-f)q}\right) \\
&= \nu(\eta(1) + \eta(-1))f(1-f)(p + q - 1) \\
&\quad + f(1-f)(\eta(1) + \eta(-1))\left(\frac{p(1-q)}{fp+(1-f)(1-q)} + \frac{(1-p)q}{f(1-p)+(1-f)q}\right) \\
&= (\eta(1) + \eta(-1))f(1-f)\left(\frac{p(1-q)}{fp+(1-f)(1-q)} + \frac{(1-p)q}{f(1-p)+(1-f)q} + \nu(p + q - 1)\right)
\end{aligned}
\]

The results are immediate from this. Observe that the first two terms inside the parentheses under structure $(x, y)$ are $\frac{x(1-y)}{xf+(1-f)(1-y)} + \frac{(1-x)y}{(1-x)f+(1-f)y}$ and under $(y, x)$ are $\frac{y(1-x)}{yf+(1-f)(1-x)} + \frac{(1-y)x}{(1-y)f+(1-f)x}$, while the remaining term $\nu(p + q - 1)$ equals $\nu(x + y - 1)$ under both. If $f = .5$ these are equal.

Moreover, if we substitute in $q = k - p$ and $f = .5$ then we get, on the inside, $\frac{2p(1-q)}{p+(1-q)} + \frac{2(1-p)q}{(1-p)+q} + \nu(p + q - 1)$ or $\frac{2p(1-k+p)}{p+1-k+p} + \frac{2(1-p)(k-p)}{1-p+k-p} + \nu(p + k - p - 1)$ or $\frac{2p(1-k+p)}{2p+1-k} + \frac{2(1-p)(k-p)}{1-2p+k} + \nu(k - 1)$. The FOC should follow. □
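The algebra above can be checked numerically. The sketch below codes the binary-binary KR utility term by term and compares it with the simplified expression; `eta_gain` and `eta_loss` stand for $\eta(1)$ and $\eta(-1)$, and all function names and parameter values are hypothetical choices for illustration.

```python
import math

def signal_stats(f, p, q):
    """Signal probabilities and posteriors for prior f and structure (p, q)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    return prob_good, f * p / prob_good, prob_bad, f * (1 - p) / prob_bad

def kr_full(f, p, q, eta_gain, eta_loss, nu, kappa):
    """Binary-binary KR utility, written term by term."""
    pg, psi_g, pb, psi_b = signal_stats(f, p, q)
    expected_utility = kappa * f
    first_period = nu * (eta_gain * (psi_g - f) * pg + eta_loss * (f - psi_b) * pb)
    second_period = (
        pg * (psi_g * eta_gain * (1 - psi_g) + (1 - psi_g) * eta_loss * psi_g)
        + pb * (psi_b * eta_gain * (1 - psi_b) + (1 - psi_b) * eta_loss * psi_b)
    )
    return expected_utility + first_period + second_period

def kr_simplified(f, p, q, eta_gain, eta_loss, nu):
    """Simplified form after dropping the expected utility part."""
    pg, _, pb, _ = signal_stats(f, p, q)
    inside = p * (1 - q) / pg + (1 - p) * q / pb + nu * (p + q - 1)
    return (eta_gain + eta_loss) * f * (1 - f) * inside

# Hypothetical parameters: losses loom larger than gains.
params = dict(eta_gain=1.0, eta_loss=-2.0, nu=0.5)
kr_xy = kr_simplified(0.5, 0.6, 0.9, **params)
kr_yx = kr_simplified(0.5, 0.9, 0.6, **params)
```

At a prior of .5 the two skewed structures receive the same value, so gain-loss utility of this form does not by itself generate a preference over skewness.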

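We next turn to the EFK functional forms. Using their model of suspense, and letting $f = .5$ we get:

\[
\begin{aligned}
&.5\kappa \\
&+ \nu\vartheta\left((p+(1-q))\left(\frac{p}{p+(1-q)} - .5\right)^2 + ((1-p)+q)\left(.5 - \frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ .5(p+(1-q))\,\vartheta\left(\frac{p}{p+(1-q)}\,2\left(1-\frac{p}{p+(1-q)}\right)^2 + \left(1-\frac{p}{p+(1-q)}\right)2\left(\frac{p}{p+(1-q)}\right)^2\right) \\
&+ .5((1-p)+q)\,\vartheta\left(\frac{1-p}{(1-p)+q}\,2\left(1-\frac{1-p}{(1-p)+q}\right)^2 + \left(1-\frac{1-p}{(1-p)+q}\right)2\left(\frac{1-p}{(1-p)+q}\right)^2\right)
\end{aligned}
\]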

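Simplifying this gives:

\[
\begin{aligned}
&.5\kappa \\
&+ \nu\vartheta\left(\frac{p^2}{p+(1-q)} - p + .25(p+(1-q)) + \frac{(1-p)^2}{(1-p)+q} - (1-p) + .25((1-p)+q)\right) \\
&+ .5(p+(1-q))\,\vartheta\left(\frac{2p}{p+(1-q)} - 4\left(\frac{p}{p+(1-q)}\right)^2 + 2\left(\frac{p}{p+(1-q)}\right)^3 + 2\left(\frac{p}{p+(1-q)}\right)^2 - 2\left(\frac{p}{p+(1-q)}\right)^3\right) \\
&+ .5((1-p)+q)\,\vartheta\left(\frac{2(1-p)}{(1-p)+q} - 4\left(\frac{1-p}{(1-p)+q}\right)^2 + 2\left(\frac{1-p}{(1-p)+q}\right)^3 + 2\left(\frac{1-p}{(1-p)+q}\right)^2 - 2\left(\frac{1-p}{(1-p)+q}\right)^3\right) \\
={}& .5\kappa \\
&+ \nu\vartheta\left(\frac{p^2}{p+(1-q)} + \frac{(1-p)^2}{(1-p)+q} - .5\right) \\
&+ .5(p+(1-q))\,\vartheta\left(\frac{2p}{p+(1-q)}\left(1-\frac{p}{p+(1-q)}\right)\right) \\
&+ .5((1-p)+q)\,\vartheta\left(\frac{2(1-p)}{(1-p)+q}\left(1-\frac{1-p}{(1-p)+q}\right)\right)
\end{aligned}
\]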

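Under structure $(x, y)$ this becomes:

\[
\begin{aligned}
&.5\kappa \\
&+ \nu\vartheta\left(\frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5\right) \\
&+ .5(x+(1-y))\,\vartheta\left(\frac{2x}{x+(1-y)}\left(1-\frac{x}{x+(1-y)}\right)\right) \\
&+ .5((1-x)+y)\,\vartheta\left(\frac{2(1-x)}{(1-x)+y}\left(1-\frac{1-x}{(1-x)+y}\right)\right) \\
={}& .5\kappa \\
&+ \nu\vartheta\left(\frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5\right) \\
&+ .5(x+(1-y))\,\vartheta\left(\frac{2x}{x+(1-y)}\left(\frac{1-y}{x+(1-y)}\right)\right) \\
&+ .5((1-x)+y)\,\vartheta\left(\frac{2(1-x)}{(1-x)+y}\left(\frac{y}{(1-x)+y}\right)\right)
\end{aligned}
\]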

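Under structure $(y, x)$ the utility becomes:

\[
\begin{aligned}
&.5\kappa \\
&+ \nu\vartheta\left(\frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5\right) \\
&+ .5(y+(1-x))\,\vartheta\left(\frac{2y}{y+(1-x)}\left(1-\frac{y}{y+(1-x)}\right)\right) \\
&+ .5((1-y)+x)\,\vartheta\left(\frac{2(1-y)}{(1-y)+x}\left(1-\frac{1-y}{(1-y)+x}\right)\right) \\
={}& .5\kappa \\
&+ \nu\vartheta\left(\frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5\right) \\
&+ .5(y+(1-x))\,\vartheta\left(\frac{2y}{y+(1-x)}\left(\frac{1-x}{y+(1-x)}\right)\right) \\
&+ .5((1-y)+x)\,\vartheta\left(\frac{2(1-y)}{(1-y)+x}\left(\frac{x}{(1-y)+x}\right)\right)
\end{aligned}
\]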

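Subtracting the second from the first gives:

\[
\begin{aligned}
&.5\kappa - .5\kappa \\
&+ \nu\vartheta\left(\frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5\right) - \nu\vartheta\left(\frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5\right) \\
&+ .5(x+(1-y))\,\vartheta\left(\frac{2x}{x+(1-y)}\left(\frac{1-y}{x+(1-y)}\right)\right) - .5((1-y)+x)\,\vartheta\left(\frac{2(1-y)}{(1-y)+x}\left(\frac{x}{(1-y)+x}\right)\right) \\
&+ .5((1-x)+y)\,\vartheta\left(\frac{2(1-x)}{(1-x)+y}\left(\frac{y}{(1-x)+y}\right)\right) - .5(y+(1-x))\,\vartheta\left(\frac{2y}{y+(1-x)}\left(\frac{1-x}{y+(1-x)}\right)\right) \\
={}& \nu\left[\vartheta\left(\frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5\right) - \vartheta\left(\frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5\right)\right]
\end{aligned}
\]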

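Simplifying the above expression gives

\[
\begin{aligned}
&\nu\left[\vartheta\left(\frac{x^2 - x^3 + x^2y + x - 2x^2 + x^3 + 1 - 2x + x^2 - y + 2xy - yx^2}{x - x^2 + yx + 1 - x + y - y + yx - y^2} - .5\right)\right. \\
&\quad\left. - \vartheta\left(\frac{y^2 - y^3 + xy^2 + y - 2y^2 + y^3 + 1 - 2y + y^2 - x + 2yx - xy^2}{y - y^2 + yx + 1 - y + x - x + yx - x^2} - .5\right)\right] \\
={}& \nu\left[\vartheta\left(\frac{-x + 1 - y + 2xy}{(1+x-y)(1+y-x)} - .5\right) - \vartheta\left(\frac{1 - y - x + 2yx}{(1+y-x)(1+x-y)} - .5\right)\right] \\
={}& \nu\left[\vartheta\left(\frac{1 - x - y + 2xy}{(1+x-y)(1+y-x)} - .5\right) - \vartheta\left(\frac{1 - x - y + 2yx}{(1+y-x)(1+x-y)} - .5\right)\right] \\
={}& 0
\end{aligned}
\]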
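The symmetry of the suspense model at a prior of .5 can also be verified numerically. The sketch below codes the binary-binary suspense utility given earlier in the appendix; the concave transformation $\vartheta = \sqrt{\cdot}$ and the parameter values are hypothetical, chosen so that the check is not trivially linear.

```python
import math

def suspense_utility(f, p, q, theta, nu, kappa):
    """Binary-binary EFK suspense utility for prior f and structure (p, q)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good
    psi_bad = f * (1 - p) / prob_bad
    # First period: expected squared movement of beliefs, summed over the two states.
    first = nu * theta(2 * prob_good * (psi_good - f) ** 2
                       + 2 * prob_bad * (f - psi_bad) ** 2)

    def remaining(psi):
        # Expected squared belief movement when the outcome is finally revealed.
        return 2 * psi * (1 - psi) ** 2 + 2 * (1 - psi) * psi ** 2

    second = prob_good * theta(remaining(psi_good)) + prob_bad * theta(remaining(psi_bad))
    return kappa * f + first + second

# Hypothetical parameters.
suspense_xy = suspense_utility(0.5, 0.6, 0.9, math.sqrt, nu=0.7, kappa=1.0)
suspense_yx = suspense_utility(0.5, 0.9, 0.6, math.sqrt, nu=0.7, kappa=1.0)
```

The two values coincide up to floating-point error, in line with the derivation; the check does not impose symmetry for priors other than .5.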

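We next derive the result for EFK's model of surprise. Again, taking the functional form discussed above, and substituting in $f = .5$, we obtain:

\[
\begin{aligned}
&\kappa .5 + \nu .5(p+(1-q))\,\vartheta\left(2\left(\frac{p}{p+(1-q)} - .5\right)^2\right) \\
&+ \nu .5((1-p)+q)\,\vartheta\left(2\left(.5 - \frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ .5(p+(1-q))\,\frac{p}{p+(1-q)}\,\vartheta\left(2\left(1-\frac{p}{p+(1-q)}\right)^2\right) \\
&+ .5(p+(1-q))\left(1-\frac{p}{p+(1-q)}\right)\vartheta\left(2\left(\frac{p}{p+(1-q)}\right)^2\right) \\
&+ .5((1-p)+q)\,\frac{1-p}{(1-p)+q}\,\vartheta\left(2\left(1-\frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ .5((1-p)+q)\left(1-\frac{1-p}{(1-p)+q}\right)\vartheta\left(2\left(\frac{1-p}{(1-p)+q}\right)^2\right)
\end{aligned}
\]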

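which is equivalent to

\[
\begin{aligned}
.5\Bigg[&\kappa + \nu(p+(1-q))\,\vartheta\left(2\left(\frac{p}{p+(1-q)} - .5\right)^2\right) \\
&+ \nu((1-p)+q)\,\vartheta\left(2\left(.5 - \frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ p\,\vartheta\left(2\left(\frac{1-q}{p+(1-q)}\right)^2\right) + (1-q)\,\vartheta\left(2\left(\frac{p}{p+(1-q)}\right)^2\right) \\
&+ (1-p)\,\vartheta\left(2\left(\frac{q}{(1-p)+q}\right)^2\right) + q\,\vartheta\left(2\left(\frac{1-p}{(1-p)+q}\right)^2\right)\Bigg]
\end{aligned}
\]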

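Under structure $(x, y)$ this becomes:

\[
\begin{aligned}
.5\Bigg[&\kappa + \nu(x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) \\
&+ \nu((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) \\
&+ x\,\vartheta\left(2\left(\frac{1-y}{x+(1-y)}\right)^2\right) + (1-y)\,\vartheta\left(2\left(\frac{x}{x+(1-y)}\right)^2\right) \\
&+ (1-x)\,\vartheta\left(2\left(\frac{y}{(1-x)+y}\right)^2\right) + y\,\vartheta\left(2\left(\frac{1-x}{(1-x)+y}\right)^2\right)\Bigg]
\end{aligned}
\]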

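Under structure $(y, x)$ the utility becomes:

\[
\begin{aligned}
.5\Bigg[&\kappa + \nu(y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) \\
&+ \nu((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ y\,\vartheta\left(2\left(\frac{1-x}{y+(1-x)}\right)^2\right) + (1-x)\,\vartheta\left(2\left(\frac{y}{y+(1-x)}\right)^2\right) \\
&+ (1-y)\,\vartheta\left(2\left(\frac{x}{(1-y)+x}\right)^2\right) + x\,\vartheta\left(2\left(\frac{1-y}{(1-y)+x}\right)^2\right)\Bigg]
\end{aligned}
\]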

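Subtracting the second from the first gives:

\[
\begin{aligned}
.5\Bigg[&\kappa + \nu(x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) - \kappa - \nu(y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) \\
&+ \nu((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) - \nu((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ x\,\vartheta\left(2\left(\frac{1-y}{x+(1-y)}\right)^2\right) - x\,\vartheta\left(2\left(\frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ (1-y)\,\vartheta\left(2\left(\frac{x}{x+(1-y)}\right)^2\right) - (1-y)\,\vartheta\left(2\left(\frac{x}{(1-y)+x}\right)^2\right) \\
&+ (1-x)\,\vartheta\left(2\left(\frac{y}{(1-x)+y}\right)^2\right) - (1-x)\,\vartheta\left(2\left(\frac{y}{y+(1-x)}\right)^2\right) \\
&+ y\,\vartheta\left(2\left(\frac{1-x}{(1-x)+y}\right)^2\right) - y\,\vartheta\left(2\left(\frac{1-x}{y+(1-x)}\right)^2\right)\Bigg]
\end{aligned}
\]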

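We can simplify this to:

\[
\begin{aligned}
&\nu .5\Bigg[(x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) - (y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) \\
&\qquad + ((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) - ((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right)\Bigg] \\
={}& \nu .5\Bigg[(x+(1-y))\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{x+(1-y)}\right)^2\right) - (y+(1-x))\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{y+(1-x)}\right)^2\right) \\
&\qquad + ((1-x)+y)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{(1-x)+y}\right)^2\right) - ((1-y)+x)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{(1-y)+x}\right)^2\right)\Bigg] \\
={}& \nu .5\Bigg[(1+x-y)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{x+(1-y)}\right)^2\right) - (1+y-x)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{y+(1-x)}\right)^2\right) \\
&\qquad + (1+y-x)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{(1-x)+y}\right)^2\right) - (1+x-y)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{(1-y)+x}\right)^2\right)\Bigg] \\
={}& 0
\end{aligned}
\]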
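The same kind of check applies to the surprise model. The sketch below codes the binary-binary surprise utility given earlier in the appendix; the transformation $\vartheta = \sqrt{\cdot}$ and the parameter values are hypothetical.

```python
import math

def surprise_utility(f, p, q, theta, nu, kappa):
    """Binary-binary EFK surprise utility for prior f and structure (p, q)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good
    psi_bad = f * (1 - p) / prob_bad
    # First period: surprise from moving from the prior to each posterior.
    first = nu * (prob_good * theta(2 * (psi_good - f) ** 2)
                  + prob_bad * theta(2 * (f - psi_bad) ** 2))

    def resolution(psi):
        # Expected surprise when the outcome is revealed, starting from belief psi.
        return psi * theta(2 * (1 - psi) ** 2) + (1 - psi) * theta(2 * psi ** 2)

    second = prob_good * resolution(psi_good) + prob_bad * resolution(psi_bad)
    return kappa * f + first + second

# Hypothetical parameters.
surprise_xy = surprise_utility(0.5, 0.6, 0.9, math.sqrt, nu=0.7, kappa=1.0)
surprise_yx = surprise_utility(0.5, 0.9, 0.6, math.sqrt, nu=0.7, kappa=1.0)
```

As in the derivation, the two structures receive the same value at a prior of .5, because every term under (x, y) has a matching term under (y, x).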

Appendix C: Experimental Protocol

The experimenters gave participants a ticket from a raffle ticket roll in the sequence with which

they entered the lab. The ticket assignment was simple and public, making it transparent to all

participants that each participant had equal chances in the lottery coming up. The participants

read the instructions displayed on their screens and waited for the experimenter to begin the study.

The instructions on the screen informed participants that the study was 75 minutes long and had

two parts and that they would receive $7 for participating in the study. These instructions also

informed the participants that they would be participating in a lottery with the ticket they received

as they entered the room. With 50% chance, they would earn an additional $10, and with 50%

chance they would not earn any additional money.

The experimenter told the participants to put on their headphones in order to listen to the

instructions that will be given on the next page. The instructional video explained that whether


a particular ticket wins or loses the lottery is determined by the last digit of the ticket number

and the outcome of a 10-sided die throw. They learned that the experimenter would roll the die

and cover it with a cup after seeing the die outcome. They were told that if the die outcome is an

odd (even) number and the last digit of the ticket the participant is holding is also odd (even), the

participant would win $10. And, if the last digit of the ticket and the die outcome fail to match in

this way, the participant does not win any money. Importantly, the instructions emphasized that

none of the participants would learn the outcome of the die and thus whether they won or lost,

before the experiment was over. They were told that one of the participants would be invited to

lift the cup and read the number on the die out loud at the end of the experiment for everyone to

learn the outcome. The participants were also told that they would enter their ticket number and

the experimenter would supply a code to be entered so that the computer program would know

whether they won or lost, right from the beginning of the experiment. Then, participants were

given hypothetical examples of this process and understood how the computer would be able to

know more than they did and would be able to generate clues if needed. The instructions also

explained that the first part of the study was expected to take around half the allotted time

for the experiment and the participants would be answering five questions, each with two clue-

generating options, about their preferences about what kind of clues they would like to get about

whether their ticket won or lost. They were told that one of these five questions would be chosen

at random to be carried out at the end of the first part and that they would observe the clue

generated by the option they chose in that question at that time. The participants understood that

they would sit with that clue for the rest of the experiment until they were able to learn whether

they won or lost the lottery at the end of the 75 minutes. The participants were told that they

would be answering questions unrelated to the lottery in the second part and that these questions

did not have any informational or monetary value associated with them.
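The winning rule described above can be written as a short check. This is only a sketch: the function name `ticket_wins` is hypothetical, and the assumption that the 10-sided die shows the faces 1 through 10 is ours, since the protocol does not state the face labels.

```python
def ticket_wins(ticket_last_digit, die_outcome):
    """A ticket wins when its last digit and the die outcome have the same parity."""
    return ticket_last_digit % 2 == die_outcome % 2

# Assumed die faces: 1 through 10 (five odd, five even).
DIE_FACES = range(1, 11)

# For every possible last digit, count the die outcomes for which the ticket wins.
winning_faces_by_digit = {
    digit: sum(ticket_wins(digit, face) for face in DIE_FACES)
    for digit in range(10)
}
```

Under this assumption every ticket wins on exactly five of the ten faces, which corresponds to the 50% chance stated in the instructions.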

After listening to these instructions, the participants were asked if they had any difficulties with

the video or audio components of the program. Only 1 person did, and his/her microphone was

adjusted immediately.

When the instructional video was over, participants were asked to enter the last digit of their

ticket number and the experimenter rolled a 10-sided die on the table publicly and covered it with

a cup so that the outcome was not visible to the participants. The experimenter informed the participants, "I've rolled the die. At this point, the outcome of the lottery for everyone is determined. I will now look at the outcome and give you a code to enter, so that the computer knows what the outcome was," and gave one of the following codes: sugar, milk, cake, candy, coffee, butter; where

sugar, cake or coffee informed the computer that the die outcome was an odd number. We used

more than one code and changed it around across sessions to prevent participants from learning

the codes across sessions.


After entering the code and the last digit of their ticket numbers, the participants worked on the

study on their own. They first answered some comprehension questions regarding the instructions

they received. The program instructed them if they answered any question incorrectly.

On the next page, they were asked to rate their happiness in order to elicit an initial baseline

happiness measure. The question asked “Please indicate how happy/unhappy you are feeling in the

current moment by sliding the scale. -100 means you are feeling ‘very unhappy’, 100 means you are

feeling ‘very happy’, 0 means you are feeling ‘neutral’.” After this question, they were informed

that they were proceeding to part 1 of the study where they would be making choices about the

kind and amount of information they would like to get about whether their ticket won or lost the

lottery.

Before each question in part 1, they listened to video instructions by using their headphones and

answered comprehension questions. The program instructed them of the correct answers if they

made any mistakes in the comprehension questions, before they could proceed to making choices.

After making a choice and indicating their preference strength for the option they chose on the

following page, they proceeded to the next video that explained the following question, until all

five questions were answered.

The videos for each question were all structured in the following manner: 1) The two options in

the question were presented, and the text indicating the contents of each box in the options were

read. 2) For each option, the box from which the ball would be drawn if the participant won the

lottery was highlighted, followed by the box from which the ball would be drawn if the participant

lost the lottery. 3) The percentage of the instances a red or a black ball would be drawn from

Option 1 was indicated and explained, 4) The meaning (posterior probability of winning or losing)

associated with observing a red or a black ball from Option 1 was defined and explained, 5) steps

3 and 4 were repeated for Option 2, 6) Option 1 and Option 2 were displayed next to one another

and a summary of the information regarding the likelihood of observing each ball color and the

posterior probability of winning associated with each color was included below each option. This

final comparison visual is the same graphic as the one that the participants saw when they were

making a choice between the two options. The video instructions did not provide any information beyond that already included on their screens at the time of making a choice; however, we believe that watching the video instructions before making a choice forced participants to pay closer attention to this information and gave them a better understanding of how the posterior probabilities were calculated.

An example video can be found at: YOUTUBE LINK

After watching the video and completing the comprehension questions, the participants arrived

at a page that displayed the two options graphically and explained each verbally. All questions are included at the end of this document.


After answering all five questions, one question was randomly chosen for each participant to be

carried out and the program randomly drew a ball from the option the participant chose in that

question. The program displayed the two options in the chosen question along with the participant's choice in that question on the screen. It also indicated whether the ball drawn from the option

the participant chose was red or black. Given the color of the ball drawn from the option, and the

information about the posteriors included in the graphics of the option, the participant was asked

to enter the probability that s/he won the lottery (which s/he could simply read from the graphic

if s/he paid attention).

On the next page, the participants were asked to rate their happiness in that moment using the

same scale as before. On the following pages, they were also asked to rate how optimistic/pessimistic

they feel about winning the lottery, to note whether they had any questions or confusions about

part 1 and to provide a short explanation for the reason behind their choices in the first three

information-preference questions in part 1.

In the second part of the experiment, they were asked hypothetical questions that each presented

two options to elicit their risk preferences, ambiguity aversion, ability to reduce compound lotteries

and attitude differences towards common ratios.

At the end, when all participants were done (or when time was running out), one participant

was invited to lift the cup and announce the die outcome. All participants were asked to indicate

this outcome and whether they won or lost the lottery as a result on their screens. On the next

page, right after learning the outcome of the lottery, the participants were asked once again to rate

their happiness in that moment.

The experimenters went to each participant's stall to pay him or her in private. The experimenter

checked the ticket number, paid the participant in cash, and asked him or her to fill out the receipt

form and answer one more question on the last page of the study. The experimenter then advanced the

participant's program to that last page. On the last page, after receiving the cash, the participants were asked

once again to rate their happiness in that moment.

Appendix D: Experimental Design — Single Pairwise Comparison

With initial funding from the Ross School of Business and the University of Oxford, we conducted

a real-choice study in May 2014. We conducted a 1x2 between-subjects experiment (2 sets of

information structures). Each participant is paid $7 for participating in the hour-long experiment.

At the beginning of the experiment, each participant is given a red ticket with a 5-digit number

on it and is informed of his or her chance of winning (50%). If they win, they get an additional

$10. A die is thrown, which determines the winning ticket numbers, but is covered with a cup

so that participants cannot see the outcome of the die throw. So, the outcome is determined at

the beginning of the experiment, but each participant remains uncertain whether s/he won.²⁵ The

participants are told that the cup will be lifted at the end of the experiment and that all participants

will be able to see the outcome of the die throw at that time.
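
The winning rule given in the execution details (a ticket wins when its parity matches the parity of the die throw) can be checked with a short sketch. The function names are hypothetical, and the code assumes a fair six-sided die, which the text does not state explicitly.

```python
def wins(ticket_number: int, die_outcome: int) -> bool:
    """A ticket wins when its parity matches the parity of the die throw."""
    return ticket_number % 2 == die_outcome % 2


def win_probability(ticket_number: int) -> float:
    """Chance of winning for a ticket, assuming a fair six-sided die."""
    die_faces = range(1, 7)  # assumption: faces 1..6, equally likely
    return sum(wins(ticket_number, face) for face in die_faces) / 6
```

Under these assumptions every ticket, whether even or odd, has the same chance of winning, which is consistent with the 50% chance announced to participants.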

After receiving these instructions, the participants first completed a 10-minute training session

that gave them experience in using the willingness-to-accept protocol. After this seemingly unrelated

task, we explained two signal structures (Option 1 and Option 2, detailed below) and asked them

to make a choice between the two. Once they had made a choice, and before they saw the signal, they

were asked to indicate the amount they would be willing to accept (WTA) in order to change their

choice. This amount measures the utility difference between the two informational options. Then

their choices were carried out using a DGM procedure and they saw a signal from the information

structure their WTA answers indicated. In the last 30 minutes of the study, they made choices

in hypothetical risk scenarios. At the end of the study, the die outcome was announced and the

holders of the winning tickets were paid an additional $10.
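
The text names the mechanism only as a "DGM procedure" and does not spell out its steps. The following is a minimal sketch under the assumption that it works like a random-offer elicitation in the style of Becker, DeGroot and Marschak: a random offer is drawn, and the participant switches to the other option and receives the offer only if the offer is at least the stated WTA. The function name, the uniform offer distribution, and the offer range are all hypothetical.

```python
import random


def resolve_wta(chosen, other, stated_wta, max_offer=1.0, rng=random):
    """Return (option_whose_signal_is_shown, payment_for_switching).

    Assumption: the offer is uniform on [0, max_offer]; the actual
    distribution and range used in the experiment are not given.
    """
    offer = rng.uniform(0.0, max_offer)
    if offer >= stated_wta:
        # The offer covers the stated amount: switch and receive the offer.
        return other, offer
    # The offer is too low: keep the original choice, no extra payment.
    return chosen, 0.0
```

Because the payment equals the random offer rather than the stated amount, a participant in such a mechanism cannot raise the payment by overstating the WTA, which is the usual reason for using this kind of elicitation.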

There were two sets of information structures, presented across sessions (between-subjects

design). Set 1 operationalized the information structure in Figure 4 (also presented as Q2 in the

main experiment), and Set 2 operationalized the information structure in Figure 5.

Figure 5: Set 2

In both sets, Option 1 (negatively skewed structure) is more accurate at predicting the worse

outcome (not winning) and Option 2 (positively skewed structure) is more accurate at predicting

the better outcome (winning). Across all priors, Option 1 is more likely to give slightly good news

(red ball) and Option 2 is more likely to give slightly bad news (black ball) compared to the prior.

While the likelihood of getting bad news (seeing a black ball) is higher in Option 2, conditional

on the color of the ball (red or black), the posteriors induced by Option 2 are higher than the

posteriors induced by Option 1 for the same ball color.
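
The properties described above can be illustrated with a small numerical sketch. The signal probabilities below are hypothetical values chosen only to reproduce the qualitative pattern in the text; they are not the values used in Figure 4 or Figure 5. The sketch applies Bayes' rule to obtain the posterior after each ball color and uses the third central moment of the posterior distribution as the measure of skewness.

```python
def posterior_distribution(prior, p_red_given_win, p_red_given_lose):
    """Map each ball color to (probability of color, posterior of winning)."""
    p_red = prior * p_red_given_win + (1 - prior) * p_red_given_lose
    p_black = 1 - p_red
    post_red = prior * p_red_given_win / p_red            # Bayes' rule
    post_black = prior * (1 - p_red_given_win) / p_black  # Bayes' rule
    return {"red": (p_red, post_red), "black": (p_black, post_black)}


def mean_posterior(dist):
    """Expected posterior; equals the prior for any signal structure."""
    return sum(p * post for p, post in dist.values())


def third_central_moment(dist):
    """Sign gives the direction of skewness of the posterior distribution."""
    mean = mean_posterior(dist)
    return sum(p * (post - mean) ** 3 for p, post in dist.values())


# Hypothetical signal structures with a 50% prior (illustrative only).
option_1 = posterior_distribution(0.5, p_red_given_win=1.0, p_red_given_lose=0.5)
option_2 = posterior_distribution(0.5, p_red_given_win=0.5, p_red_given_lose=0.0)
```

With these illustrative numbers, a black ball in Option 1 rules out winning and a red ball in Option 2 guarantees it. Option 1 shows red more often, Option 2 shows black more often, and for each color the posterior under Option 2 is higher. The third central moment is negative for Option 1 and positive for Option 2, while the expected posterior equals the prior in both cases.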

²⁵ Execution details: a person wins if the ticket number is even (odd) and the die throw outcome is even (odd).
