Skewness and Preferences for Non-Instrumental Information∗
PRELIMINARY, PLEASE DO NOT CITE
Yusufcan Masatlioglu†
University of Michigan
Yesim Orhun‡
University of Michigan
Collin Raymond§
University of Oxford
November 1, 2015
Abstract
We test individuals’ preferences for skewed information structures in situations where informa-
tion has no instrumental value. We find individuals exhibit a strong preference for positively
skewed information structures as well as Blackwell more informative information structures. We
show that our results allow for testing of a variety of models. A model based on the framework
of Kreps and Porteus (1978) is most consistent with the data we observe.
∗We thank the Michigan Institute for Teaching and Research in Economics, the Ross School of Business Faculty Grant Fund, and the University of Oxford John Fell Fund for support. Our errors are ours alone.
†Email: [email protected]
‡Email: [email protected]
§Email: [email protected].
1 Introduction
Information affects the beliefs we hold, what we choose, how happy our choices make us, the
precision and impact of opinions we perpetuate, and more generally, how we make sense of and
interact with the world around us. Neoclassical theory posits that people value information if and
only if the information is instrumental for decision-making. According to this view, people would
pay neither to acquire nor to avoid information that would not change their
decision-making.
People, however, often have strong preferences regarding information about an uncertain and
unavoidable outcome even when information has no instrumental value: they try to avoid situations
that make them feel anxious about a future event and cultivate instances of hope in the face of
uncertainty, even when doing so is costly. For example, anxious patients with potential symptoms
of a disease may put off taking a diagnostic test, even if it means delaying possible treatment, and
hopeful voters park themselves in front of the TV on election night, even though it costs them a
good night’s sleep.
Psychologists have long recognized that people regulate their anticipatory emotions regarding
an uncertain outcome in the future, such as hope, anxiety and suspense, by managing their beliefs
about the outcome. While people cannot directly choose their beliefs, they can choose the sources of
information they are exposed to that shape those beliefs. Therefore, these regulatory psychological
objectives shape what people want to learn, and when they want to learn it, even in the absence
of their ability to condition their actions on that information. In order to explain preferences
over non-instrumental information across a variety of contexts, both theoretical and empirical work
in economics have recently embraced the idea that utility can also depend on beliefs.1 These models
lead to very different behavior over information acquisition and consumption than the objective of
instrumentality alone would generate.
The bulk of the work trying to understand preferences for non-instrumental information in
economics has focused on two particular types of preferences: preferences for early resolution of
information (as in Kreps and Porteus, 1978), or preferences for one-shot resolution of information
(as in Dillenberger, 2010 or Koszegi and Rabin, 2009). Thus, as discussed in a recent survey by
Golman, Hagmann and Loewenstein (2015), much of the literature has focused on how individuals
attempt to avoid (or seek out) information.
More generally, however, individuals may care not only about whether they observe any
information at all, but also about what kind of information they observe. Thus, they may seek out certain
types of information in order to manipulate their posterior beliefs. This leads us to an important,
1 For applied work, see Koszegi, 2006, Caplin and Leahy, 2004, Mullainathan and Shleifer, 2005, Gentzkow and Shapiro, 2010, Caplin and Eliaz, 2003, and Oster, Shoulson and Dorsey, 2011; for the theoretical work, see Kreps and Porteus, 1978, Dillenberger, 2010, and Koszegi and Rabin, 2009.
but relatively neglected, type of informational preference: the preference for skewed information.
Given equal priors over possible outcomes, we define a preference for positively skewed information
as a preference for a signal with a high false negative rate and a low false positive rate over a signal
with the symmetric and opposite features, where the false positive and false negative rates have
been exchanged.2 Consider information structures that give binary signals regarding whether the
outcome is going to result in a desired state or an undesired state. Negatively skewed information
structures eliminate more uncertainty about the undesired outcome conditional on generating a bad
signal, and positively skewed information structures eliminate more uncertainty about the desired outcome
conditional on generating a good signal.
In many real world situations we find ourselves having to trade off false negatives versus false
positives. Consider the voters who are sitting in front of the TV at midnight on November 6, 2012.
They can hear projections of the results on Fox or on MSNBC. Each source of information is known
to be somewhat biased: the stations bend the discussion of the available results towards the possible
victory of the party their consumer base supports. Hence, a Democrat has the choice between either
watching Fox, which generates more false negatives, or watching MSNBC, which generates more
false positives for her desired outcome. Clearly, watching either station will not change the election
results. However, voters may have intrinsic preferences over the trade-off between false positives
and false negatives, depending on whether they would rather be overly optimistic in the process
and disappointed at the end, or overly pessimistic in the process and surprised at the end. Despite
the fact that these types of preferences are ubiquitous and intuitive, little has been done to study
them, or ascertain how individuals might manifest them.
Recently, a few theoretical papers have attempted to specifically model non-instrumental
preferences for skewed information. These include Dillenberger and Segal (2015), Szech and Schweitzer
(2014), Caplin and Eliaz (2003), Eliaz and Spiegler (2006), Eliaz and Schotter (2010), and
Mullainathan and Shleifer (2005), where priors affect the preference for skew. However, these
applications disagree about what kind of preferences one should expect individuals to exhibit. For example,
Szech and Schweitzer (2014), Caplin and Eliaz (2003) and Dillenberger and Segal (2014) predict
preferences for positive skew, while Eliaz and Schotter (2010) and Eliaz and Spiegler (2006) focus
on the case of preferences for negatively skewed information.
One can draw on competing intuitions for each of these predictions about preferences. If,
conditional on observing any given signal, an individual prefers to have a higher posterior belief,
then they should prefer positively skewed signals. If, in contrast, an individual prefers
to see more signals that are associated with increasing their posterior, rather than decreasing it, then
they should prefer negatively skewed signals. Thus, it becomes important to empirically test what
2 The positive/negative nomenclature is based on the fact that the two signals generate distributions of posteriors that have the same expectation and variance (first and second moments), but the former has a positive third moment, while the latter has a negative third moment.
kind of preferences individuals actually exhibit.
Our main motivation for investigating informational preferences for skewness is their
prevalence in many economic decisions individuals face. However, an important ancillary benefit is that
characterization of these preferences also allows us to distinguish between belief-based utility models
in a manner that is not possible by examining preferences for early or one-shot resolution of
information. A variety of theoretical models have been developed to try to understand and characterize
belief-based utility that generate an intrinsic preference for information. These include models
of: dynamic reference dependence, such as Koszegi and Rabin (2009); anticipatory utility such as
Brunnermeier and Parker (2005) and Caplin and Leahy (2001); and models of preferences for the
timing of the resolution of information such as Kreps and Porteus (1978), Dillenberger (2010) and
Dillenberger and Segal (2014). These models are designed to accommodate behavior that
neoclassical models cannot, and as such have become widely used in the literature. Distinguishing between
different models of information preferences is an important step in incorporating belief-based utility
in policy-making, as it would help understand what policies can best improve welfare in contexts
where there is a conflict between utility from beliefs and utility from material payoffs.
However, current experimental evidence studying preferences for early or one-shot resolution of
information cannot distinguish between most existing models of belief-based utility, because these
models are designed to generate these exact preferences. On the other hand, preferences for skewed
information were not one of the original motivating factors in the development of many models
of belief-based utility. Thus, our experiment allows us to test an important, but “out-of-sample”,
prediction: whether individuals prefer to learn more about whether they will receive the better
outcome or the worse outcome. Importantly, different classes of models generate very different
predictions for skewness preferences. These diverging ancillary predictions allow us to distinguish
competing models of belief-based utility based on our experimental results.3
We report results from a lab experiment that elicits preferences for information in an environ-
ment where the information, by construction, cannot influence actions. The future outcome of
interest has two states that are ordered in terms of payoffs: whether the lottery ticket each partici-
pant was given is one of the winning tickets that were drawn at the start of the experiment, or not.
The likelihood of having one of the winning tickets is 50%. Participants make choices between two
information structures across five questions. All information structures can generate one of two
possible signals: good and bad. However, the information structures vary in how much and what
kind of uncertainty they resolve. The participants observe the signal generated by one
of the information structures they choose, learn the posterior likelihood that their ticket is one of
the winning tickets, and sit with that information for at least half an hour before the winning ticket
numbers are revealed.
3 Although preferences for skewed distributions of outcomes have been discussed extensively in economics, the discussion has not been extended for the most part to preferences for skewed information structures.
Our experimental design addresses three important challenges in identifying preferences for
non-instrumental information. First, it ensures that information cannot influence actions in the
experiment or elsewhere: subjects are in the lab answering unrelated questions after observing
the signal generated by the information structure they chose, and before learning the outcome for
sure. Therefore, preferences for information are entirely driven by belief-based utility. Second, it
ensures that the observed preferences are for information that impacts subjects’ beliefs about future
outcomes and their belief-utility, rather than for information that shapes their self-perceptions or
impacts their ego-utility or confidence regarding making a choice: subjects are randomly assigned
to the high payoff condition, which is entirely unrelated to any of their characteristics, and do
not have any control or agency regarding the outcome. Third, it reduces subjects’ information-processing
costs to ensure that preferences reflect utility and not cognitive processing constraints:
as part of the experiment we explain to subjects with what probability they will receive any
given signal, and what posteriors they should have after observing any given signal. Thus, choices
we observe are not the result of individuals incorrectly updating, or being confused by what the
information is telling them.
Our three main contributions are as follows. First, we demonstrate the existence of strong
preferences regarding the skewness of information. Our results indicate that individuals exhibit a
strong preference for positively skewed information structures relative to those that are negatively
skewed. These preferences are demonstrated at both the between-person and within-person levels and
are robust to the type of information structures subjects compare. Individuals seem to prefer ruling
out more uncertainty about the desired outcome (and tolerating uncertainty about the undesired
outcome) compared to ruling out more uncertainty about the undesired outcome (and tolerating
uncertainty about the desired outcome).
Second, we discuss how preferences for skewed information relate to other preferences for in-
formation that are discussed in the literature. Preferences for third moments (preferences for
skewness) have been extensively discussed for choices over risky outcomes (e.g. lotteries) and have
been related to behavior such as precautionary savings. In fact, there is a natural analogy between
studying preferences over lotteries and intrinsic preferences over information structures. The first,
second and third derivatives of the Bernoulli utility function over wealth are known to define
preferences over first order stochastic dominance, mean-preserving spreads and mean-variance preserving
shifts (Rothschild and Stiglitz, 1970, Menezes et al, 1980). The first and second derivatives of a
utility function defined over information structures are known to relate again to monotonicity and
preferences for early versus late resolution of information respectively (Kreps and Porteus 1978,
Grant, Kajii and Polak, 1998). Moreover, as Grant, Kajii and Polak (1998) show, there is a
natural analogy between risk aversion and a preference for late resolution of information, linking the
orderings associated with risk and the orderings associated with information. Our within-person
design allows for an empirical investigation of how preferences for skewness relate to preferences
regarding early versus late and one-shot versus gradual resolution of uncertainty.
Third, we use our results as a testing ground for models of belief-based utility. We find that our data
provide the strongest support for the class of preferences introduced by Kreps and Porteus (1978), and
extended by Grant, Kajii and Polak (1998). We find that subjects exhibit preferences for skewness that
are inconsistent with the predictions of many other models used to generate non-instrumental
preferences for information, including Koszegi and Rabin (2009). Moreover, other models that
are able to generate preferences for skewness in accordance with what we observe (such as
dynamic extensions of rank-dependent utility, Gul’s (1991) model of disappointment aversion and
Dillenberger and Segal’s (2015) model of skewness) cannot accommodate the fact that we find no evidence
for a preference for no information over very little information.
The rest of the paper is structured as follows. Section 2 presents a simple theoretical framework
that allows us to analyze preferences over information structures, and summarizes the existing literature.
It focuses on situations where there are two possible outcomes and two possible signals, a simple
environment that still allows us to demonstrate important results.
Section 3 discusses the implications of important classes of models for preferences over
information structures, which we will be able to test in our environment. Section 4 outlines
the experimental design. Section 5 presents the results of our experiment. Section 6 discusses these
results and relates them to the theoretical predictions of Section 3. Section 7 concludes.
2 Framework and Literature Review
2.1 General Framework and Preliminaries
We focus on individuals’ preferences for information where all probabilities are objectively known,
rather than subjective. In order to capture preferences for information, our theory focuses on an
idealized situation where there are three periods (0, 1 and 2). In Period 0 individuals have a
prior probability distribution over states that will be realized and determine payoffs in Period 2.
In Period 1 they receive a signal, which might cause them to update their prior to a posterior.
In Period 2 the states are revealed and individuals receive their payoff. Importantly, individuals
cannot take any actions; thus all preferences for information must come from intrinsic, rather than
instrumental, motivations.
Formally, we imagine there are a finite number N of indexed states $\omega_i$. Each state corresponds
to a different payoff for the individual. Moreover, there are M signals indexed by $s_j$. An information
structure I is an N by M matrix, such that the entries in each row sum to 1. The (i, j)-th entry of
the matrix, denoted $I_{ij}$, gives the probability that signal $s_j$ is realized if the state is $\omega_i$. Given a
prior distribution f over states, if the individual utilizes Bayes’ rule then the posterior probability of
state $\omega_i$ conditional on observing signal $s_j$ is given by:
\[
\psi_j(\omega_i) = \frac{f(\omega_i)\, I_{ij}}{\sum_k f(\omega_k)\, I_{kj}}
\]
We will suppose that individuals have preferences over information structures given the prior f,
denoted by $\succsim_f$. Formally, within the economics literature, these are typically modeled as preferences
over two-stage compound lotteries, that is, lotteries over lotteries. Each signal $s_j$ induces a lottery over
outcomes: the posterior distribution $\psi_j$. This is the lottery that individuals face in period 1
after receiving information. In period 0, the individual faces a lottery over these possible lotteries:
signal $s_j$ is received with probability $\sum_i f(\omega_i) I_{ij}$. There is a natural bijection between
prior-information structure pairs and two-stage compound lotteries. Because our focus is on information,
we will write preferences, and utility functionals, over the space of prior-information structure pairs.
However, formal results will use the induced preferences in the space of two-stage compound lotteries
(for example, assumptions on convexity or concavity of the utility function will be stated using the
space of two-stage compound lotteries).4
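As a minimal illustration of this mapping (a sketch, not part of the formal analysis), the following Python code converts a prior and an information structure into the induced two-stage compound lottery and then reduces it back to a single-stage lottery. The function names two_stage_lottery and reduce_lottery, and the three-state, two-signal numbers, are illustrative assumptions of ours.

```python
from fractions import Fraction


def two_stage_lottery(prior, info):
    """Map a prior and an N-by-M information structure to a two-stage lottery.

    prior: list of N state probabilities.
    info:  N rows of M entries; info[i][j] is the probability of signal j in state i.
    Returns a list of (signal probability, posterior over states) pairs.
    """
    n_states, n_signals = len(info), len(info[0])
    lottery = []
    for j in range(n_signals):
        # Probability of signal j: sum over states of prior times likelihood.
        p_signal = sum(prior[i] * info[i][j] for i in range(n_states))
        if p_signal == 0:
            continue  # a signal that never occurs induces no posterior
        # Bayes' rule for the posterior over states after signal j.
        posterior = [prior[i] * info[i][j] / p_signal for i in range(n_states)]
        lottery.append((p_signal, posterior))
    return lottery


def reduce_lottery(lottery):
    """Collapse the two stages into a single distribution over states."""
    n_states = len(lottery[0][1])
    return [sum(p * post[i] for p, post in lottery) for i in range(n_states)]


# Hypothetical example: three states, two signals.
prior = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
info = [
    [Fraction(1), Fraction(0)],
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(0), Fraction(1)],
]
lottery = two_stage_lottery(prior, info)
for p_signal, posterior in lottery:
    print(p_signal, posterior)
print(reduce_lottery(lottery) == prior)
```

Reducing the compound lottery returns the prior: information changes when uncertainty resolves, not the overall probabilities of the states.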
In order to derive predictions applicable to our particular experimental setting, we will focus
on situations where there are two outcomes, high and low, with utility values u(H), u(L) (in
this subsection we normalize these utilities to 1 and 0 respectively), and so only two states; we also
restrict attention to two signals (i.e., M = N = 2). We will define the information structures we will consider, as well as important
orderings on the set of information structures, which we will use in deriving testable predictions
regarding behavior from important classes of models.
Given the two outcomes, we denote the prior probability on the high outcome as f . The
decision-maker has access to a set of binary signal structures: the realizations are G (good) or B
(bad). A good (bad) signal is a signal that increases (decreases) the beliefs about the outcome
being high compared to the prior. I is then a two by two matrix which can be characterized by
only two numbers (since the entries in each row must sum to 1). We do so by setting $p = I_{11}$ and
$q = I_{22}$. Thus, our information structures are characterized as points in $\mathbb{R}^2$: (p, q). The probability
of a good signal conditional on a high outcome is p = Pr(G|H) and the probability of a bad signal conditional
on a low outcome is q = Pr(B|L). Using Bayes’ Rule, after observing a good signal the posterior for
a high outcome is
\[
\psi_G = \frac{fp}{fp + (1-f)(1-q)}. \tag{1}
\]
After observing a bad signal the posterior is
\[
\psi_B = \frac{f(1-p)}{f(1-p) + (1-f)q}. \tag{2}
\]
4 For an extended discussion of these issues, please see the Appendix.
Observing a good signal occurs with probability fp+ (1− q)(1− f) and observing a bad signal
occurs with probability f(1− p) + q(1− f).
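Equations (1) and (2), together with the two signal probabilities, can be sketched in a few lines of Python. This is an illustration only: the function name binary_posteriors and the example values f = 1/2, p = 4/5, q = 3/5 are assumptions of ours and are not taken from the experiment.

```python
from fractions import Fraction


def binary_posteriors(f, p, q):
    """Return (P(good), psi_G, P(bad), psi_B) for prior f and structure (p, q).

    Assumes both signals occur with positive probability.
    """
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    psi_good = f * p / prob_good       # equation (1)
    psi_bad = f * (1 - p) / prob_bad   # equation (2)
    return prob_good, psi_good, prob_bad, psi_bad


f = Fraction(1, 2)
prob_good, psi_good, prob_bad, psi_bad = binary_posteriors(f, Fraction(4, 5), Fraction(3, 5))
print(prob_good, psi_good, prob_bad, psi_bad)
# The posteriors average back to the prior.
print(prob_good * psi_good + prob_bad * psi_bad == f)
```

In this example the good signal raises the posterior above the prior and the bad signal lowers it, while the probability-weighted average of the two posteriors equals f.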
In our binary-binary world, we can represent any possible signal structure as a point in (p, q)
space, with the horizontal axis being the p-value. Although information structures could lie anywhere
in the unit square, for the rest of the paper we restrict attention to those that lie above the line
p + q = 1, along with the point (.5, .5). We denote this set by S := {(p, q) | p + q > 1} ∪ {(.5, .5)}. We
do so for two reasons. First, all points in S have a natural interpretation: a good signal is good
news (a bad signal is bad news). Lemma 1 formalizes this.
Lemma 1 For any (p, q) ∈ S, observing a good signal increases the posterior on high outcome
relative to the prior, and observing a bad signal decreases the posterior on high outcome relative to
the prior.
Moreover, this set of signals is a minimal set that still allows us to capture all possible posterior
distributions, as shown by Lemma 2.
Lemma 2 For any signal structure (p′, q′) ∈ [0, 1] × [0, 1], there exists a (p, q) ∈ S that generates
the same posterior distribution. However, for any T ⊂ S there exists a (p′, q′) ∈ S such that there
is no element of T that generates the same posterior distribution as (p′, q′).
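One way to build intuition for the first part of Lemma 2 is by relabeling signals. The sketch below rests on a construction that is our own assumption and is offered as intuition rather than as the proof: if p′ + q′ < 1, swap the labels of the two signals, and if p′ + q′ = 1, the signal is independent of the state and so is equivalent to (.5, .5). The function names are hypothetical.

```python
from fractions import Fraction


def posterior_distribution(f, p, q):
    """Distribution of the posterior on the high outcome, as {posterior: probability}."""
    dist = {}
    for joint_high, joint_low in ((f * p, (1 - f) * (1 - q)),   # good signal
                                  (f * (1 - p), (1 - f) * q)):  # bad signal
        prob = joint_high + joint_low
        if prob > 0:
            post = joint_high / prob
            dist[post] = dist.get(post, 0) + prob
    return dist


def canonical(p, q):
    """Return a structure in S with the same posterior distribution as (p, q)."""
    if p + q > 1:
        return (p, q)                              # already in S
    if p + q == 1:
        return (Fraction(1, 2), Fraction(1, 2))    # uninformative signal
    return (1 - p, 1 - q)                          # swap the signal labels


f = Fraction(1, 2)
outside = (Fraction(1, 5), Fraction(2, 5))         # p + q < 1
boundary = (Fraction(3, 10), Fraction(7, 10))      # p + q = 1
print(canonical(*outside), canonical(*boundary))
print(posterior_distribution(f, *outside) == posterior_distribution(f, *canonical(*outside)))
```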
Given this restriction we can consider examples of information structures. An information
structure that resolves all uncertainty as early as possible is one in which p = q = 1. In this
case, a good signal implies that the outcome is high for sure. Similarly, a bad signal indicates
a low outcome for sure. An information structure which conveys no information at all will have
p = q = .5. In this case, the posterior after either signal is equal to the prior. With these examples
in mind we now turn to focus on some interesting orderings over prior-information structure pairs.
2.2 Types of Informational Preferences
Preference for earlier versus later: Most of the theoretical models for non-instrumental
information focus on accommodating preferences for early versus late resolution of information.
There has been a fair amount of work in this domain in both the decision theory literature (e.g.,
Kreps and Porteus, 1978; Epstein and Zin, 1989; Grant, Kajii, and Polak, 1998) and the behavioral
literature (e.g., Koszegi and Rabin, 2009 and Koszegi, 2009). These models characterize the
conditions under which a preference for early resolution of uncertainty can be observed, albeit proposing
distinct processes.
A preference for early or late resolution is tightly linked to the most well-known ordering
of information structures in economics, Blackwell’s ordering. Blackwell’s ordering was originally
designed to be used in situations where the individual’s payoffs in Period 2 depend on both the
state and an action taken by individuals in Period 1. Information structure (p′, q′) Blackwell
dominates structure (p, q) if, for any set of actions and utility functions over action and state
pairs, an individual with expected utility preferences receives a higher expected utility from (p′, q′)
compared to (p, q).
However, as Kreps and Porteus (1978) and Grant, Kajii and Polak (1998) demonstrate, there
is a meaningful mapping between Blackwell’s ordering and information preferences even when
information is non-instrumental (i.e., individuals cannot take any action based on it). In particular,
one information structure resolves more uncertainty earlier than another if the first information
structure is Blackwell more informative than the second. Note that lotteries may be earlier or later
than each other, even if all uncertainty is not resolved early (Period 1) or late (Period 2). Lemma
3 formalizes the intuitive concept of “informativeness” of the signal structure in our setting by
defining when one signal structure is Blackwell dominated by another.
Lemma 3 (p′, q′) Blackwell dominates (is Blackwell more informative than) (p, q) if and only if
\[
p' \geq \max\left\{ \frac{p}{1-q}\,(1-q'),\; 1 - q'\,\frac{1-p}{q} \right\}.
\]
If (p′, q′) is Blackwell more informative than (p, q) then the posteriors under (p′, q′) are a
mean-preserving spread of the posteriors under (p, q) (a result that follows from the law of iterated
expectations). Clearly, this will be true if p′ > p and q′ > q, but Lemma 3 shows it can also
be true under less stringent conditions. Figure 1 illustrates the set of all signals (p′, q′) that are
Blackwell more informative than the signal (p, q) = (.6, .6). Importantly, the definition of Blackwell
more informative is prior independent, and so preferences that favor early over late resolution (or
vice-versa) must obey the constraints imposed by Lemma 3 regardless of the prior.
[Figure omitted: plot with axis ticks at 0.6 and 1 and a region labeled “Blackwell More Informative Set”.]
Figure 1: Signals that are Blackwell More Informative than (.6,.6)
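The condition in Lemma 3 is easy to check numerically. The following Python sketch is illustrative only: the function name blackwell_dominates is hypothetical, and the two inequalities are the Lemma 3 conditions multiplied through by (1 − q) and by q, a rearrangement we assume in order to avoid dividing by zero when q = 1.

```python
from fractions import Fraction


def blackwell_dominates(p_new, q_new, p, q):
    """Return True if (p_new, q_new) Blackwell dominates (p, q).

    Both structures are assumed to lie in S.
    """
    # p_new >= p (1 - q_new) / (1 - q), multiplied through by (1 - q).
    good_condition = p_new * (1 - q) >= p * (1 - q_new)
    # p_new >= 1 - q_new (1 - p) / q, rearranged and multiplied through by q.
    bad_condition = (1 - p_new) * q <= (1 - p) * q_new
    return good_condition and bad_condition


base = (Fraction(3, 5), Fraction(3, 5))  # the point (.6, .6)
candidates = {
    "(1, 1)": (Fraction(1), Fraction(1)),
    "(.7, .7)": (Fraction(7, 10), Fraction(7, 10)),
    "(.9, .5)": (Fraction(9, 10), Fraction(1, 2)),
    "(.7, .5)": (Fraction(7, 10), Fraction(1, 2)),
}
results = {name: blackwell_dominates(*c, *base) for name, c in candidates.items()}
print(results)
```

The candidate (.9, .5) dominates (.6, .6) even though its q is lower, which is an instance of the less stringent conditions mentioned above; (.7, .5) does not.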
Preference for one-shot versus gradual: Another ordering over information structures
that is discussed in the literature is a preference for one-shot resolution of uncertainty. Building
on Palacios-Huerta (1999), Dillenberger (2010) provides a characterization of a preference for
one-shot resolution of uncertainty. Dillenberger describes an individual who prefers full early resolution
(p = q = 1) or full late resolution (p = q = .5) over any other information structure, given any
prior f. This phenomenon is closely linked to the notion of a preference for clumping, introduced
by Koszegi and Rabin (2009). Ely, Frankel and Kamenica (2013) model preferences for gradual
resolution of information and their application in a mechanism design setting. In our experiment we
focus on looking for a local preference for clumping, that is, whether an individual prefers an information structure
that reveals nothing in Period 1 to all “nearby” information structures, which convey only a little
information in Period 1.
Preference for positive versus negative skewness: The last ordering we want to discuss,
and the novel contribution of this paper, is testing preferences for skewed information — whether in
Period 1, individuals prefer to learn more about whether they will receive the better outcome (H) or
the worse outcome (L). In the theoretical literature there are several notions related to preferences
for skewness (including third-order stochastic dominance, third-degree risk order, mean variance
preserving probability transformations, the central third moment and the Dillenberger and Segal
(2015) notion of skewness). Given our experimental design, with a fixed prior and “symmetric”
signal structures, these notions coincide, and so for simplicity we focus on providing intuition via
mean-variance preserving probability transformations, which, as Menezes et al (1980) have shown, are
equivalent to the third-degree risk order.
For example, suppose we fix f = .5. Consider structures (.5, 1) and (1, .5), which are positively
and negatively skewed respectively. The positively skewed information structure provides a 25%
chance to resolve all uncertainty in favor of the better outcome (giving a posterior of 1), while
delivering worse-than-before news 75% of the time (delivering a posterior of 1/3). The negatively
skewed information structure provides a 25% chance to resolve all uncertainty in favor of the worse
outcome (giving a posterior of 0), while delivering better-than-before news 75% of the time (and giving
a posterior of 2/3).
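The numbers in this example, and the sign of the third moment that gives the two structures their names, can be reproduced with a short Python sketch. The helper names posterior_distribution and central_moments are hypothetical and the code is only an illustration of the example above.

```python
from fractions import Fraction


def posterior_distribution(f, p, q):
    """Return [(probability, posterior on high)] for the good and bad signals."""
    out = []
    for joint_high, joint_low in ((f * p, (1 - f) * (1 - q)),   # good signal
                                  (f * (1 - p), (1 - f) * q)):  # bad signal
        prob = joint_high + joint_low
        if prob > 0:
            out.append((prob, joint_high / prob))
    return out


def central_moments(dist):
    """Return the mean, variance and third central moment of a distribution."""
    mean = sum(pr * x for pr, x in dist)
    variance = sum(pr * (x - mean) ** 2 for pr, x in dist)
    third = sum(pr * (x - mean) ** 3 for pr, x in dist)
    return mean, variance, third


half, one = Fraction(1, 2), Fraction(1)
positive = central_moments(posterior_distribution(half, half, one))  # structure (.5, 1)
negative = central_moments(posterior_distribution(half, one, half))  # structure (1, .5)
print(positive)
print(negative)
```

Both structures share the same mean and variance of the posterior; only the sign of the third central moment differs.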
These examples of skewed information structures have a particular feature that they also allow
full resolution of uncertainty in Period 1 in some cases. Theories that predict a preference for
these “extreme” skewed information structures also predict the same preference for interior cases
where information is always resolved gradually. Thus, we are interested in whether people prefer
information structures that, given equal priors, are more accurate at predicting the worse outcome
to those that are more accurate at predicting the better outcome. Formally, suppose a ≥ b. A
preference for positively skewed information occurs if the decision maker prefers (b, a) to
(a, b). Given equal priors, the expectation of the posterior distribution is the same for both
information structures (by the Law of Iterated Expectations). Moreover, the variances of the posterior
distributions are the same. But (b, a) has a posterior distribution with a positive third moment,
while (a, b) has a posterior distribution with a negative third moment (and in fact, the two third
moments have the same absolute value).5
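The reason the moments line up in this way can be seen from a short calculation, sketched here under the equal prior f = 1/2 using equations (1) and (2). The primes on the posteriors under (b, a) and the symbols ψ and ψ′ for the random posteriors are notation we introduce for this sketch only.

```latex
% Requires amsmath and amssymb. Assumes f = 1/2 and that (a, b), (b, a) lie in S.
\begin{align*}
\text{Under } (a,b):\quad
  \psi_G &= \frac{a}{1+a-b} \ \text{ w.p. } \tfrac{1+a-b}{2}, &
  \psi_B &= \frac{1-a}{1-a+b} \ \text{ w.p. } \tfrac{1-a+b}{2}. \\
\text{Under } (b,a):\quad
  \psi'_G &= \frac{b}{1-a+b} = 1-\psi_B \ \text{ w.p. } \tfrac{1-a+b}{2}, &
  \psi'_B &= \frac{1-b}{1+a-b} = 1-\psi_G \ \text{ w.p. } \tfrac{1+a-b}{2}.
\end{align*}
% So the posterior under (b, a) is distributed as one minus the posterior under (a, b).
% Since both have mean 1/2, for k = 1, 2, 3:
\[
\mathbb{E}\!\left[\left(\psi' - \tfrac{1}{2}\right)^{k}\right]
  = (-1)^{k}\, \mathbb{E}\!\left[\left(\psi - \tfrac{1}{2}\right)^{k}\right].
\]
```

The two posterior distributions are mirror images around 1/2, so the even central moments coincide and the third central moments are equal in absolute value with opposite signs.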
3 Theoretical Predictions
There are a variety of theoretical models that predict preferences over information structures. In
this section we will derive testable predictions of important classes of models. We can then take
these predictions to the data, which can help us understand the best way to model the preferences
we observe. This is important because, for example, although many models can predict a preference
for skewness, they will predict it in conjunction with other phenomena.
In order to facilitate comparisons of the predictions, we will frame the assumptions we make
as restrictions on functional forms. For some of the predictions, there are equivalent formulations
which can be made directly on the preferences. We will also not describe the models themselves in
detail in this section. The interested reader can see the Appendix for a more detailed description
of the models, assumptions and axioms. In all predictions, in order to match the design of the
experiment, we assume the prior is 0.5, as well as the binary state, binary signal framework discussed
in the previous section. Thus we frame our predictions in terms of observed preferences, $\succsim_{.5}$.
Traditional economics would assume that individuals do not have non-instrumental preferences
over information; Segal (1990) describes these individuals as satisfying an axiom called Reduction
of Compound Lotteries. When re-framed in our domain, information structures, this axiom simply
says that an individual should not care about what information structure they face, i.e., $(p, q) \sim_f (p', q')$.
Prediction 1 Fixing f , the decision maker should be indifferent between all information structures.
Of course, it is easy to imagine that individuals are not indifferent between all information
structures even when information has no instrumental value. Thus, the literature has considered
various weakenings of the Reduction of Compound Lotteries assumption. One assumption, intro-
duced by Segal (1990), is that rather than being indifferent between all lotteries that have the same
reduced form probabilities over final outcomes, individuals are only indifferent between full early
resolving lotteries (i.e., p = q = 1) and full late resolving lotteries (i.e., p = q = .5) that have the same
reduced form probabilities over final outcomes (i.e. the same f). Segal describes these individuals
as satisfying the Time Neutrality axiom, which says $(1, 1) \sim_f (.5, .5)$.
Prediction 2 Fixing f , the decision-maker should be indifferent between (1, 1) and (.5, .5).
5 Of course, if priors vary, then although the posterior distributions of (b, a) and (a, b) will always have the same mean, they will not necessarily have the same variance.
Because time neutrality imposes a type of stationarity on preferences, it has been widely
used in the literature, including papers such as Dillenberger (2010). In contrast, a large number of
papers, beginning with Kreps and Porteus (1978), have discussed the importance of preferences
that do not satisfy time neutrality. In particular they focus on individuals who have a preference
for early resolution of information. Individuals have a preference for early resolution of uncertainty
if, given two lotteries which generate the same reduced form probabilities over the same outcomes,
they always prefer a compound lottery which is Blackwell more informative in the first stage. Grant,
Kajii and Polak (1998) show that given mild differentiability assumptions on the utility function
V that represents the preferences, a preference for more (less) Blackwell informative signals is
equivalent to the local utility function of V being convex (concave).6
Prediction 3 Let ≿f be represented by V, where V is Gateaux differentiable. Then the local utility
function of V is everywhere convex (concave) if and only if the decision-maker prefers Blackwell
more (less) informative information structures.
The intuition for this result (drawing on Machina, 1982) is that individuals like increases in the
spread of their posterior if and only if their local utility functions are convex (here the posterior
distribution acts like a monetary lottery). Similarly, we know in the case of monetary lotteries
that if the derivative of the local utility function is convex, then the individual prefers positively
skewed lotteries. This intuition naturally maps into compound lotteries, as the next prediction
demonstrates.7,8
Prediction 4 Let ≿.5 be represented by V, where V is Gateaux differentiable. If the local utility
function of V is thrice differentiable and has a convex (concave) derivative everywhere, then
(x, y) ≿.5 (≾.5) (y, x) whenever x ≤ y.
Segal (1990) proposed a class of preferences called “Recursive Preferences”. In this class,
decision-makers evaluate situations with information revelation using a folding-back procedure with
two functionals V1 and V2, which represent the utility at Periods 1 and 2, respectively.
We next turn to the predictions of some specific models that assume recursivity and
can address preferences for skewness.9 The first model to implicitly use recursive preferences to
address non-instrumental preference for information was introduced by Kreps and Porteus (1978).
6 Recall that convexity, concavity, as well as the local utility functions, are defined in the space of two-stage compound lotteries induced by the prior-information structure pair.
7 Again, the function is defined in the space of induced two-stage compound lotteries.
8 Our requirement on the differentiability of the utility functional and the local utility functions is stronger than it actually needs to be in both this and the previous prediction. Using the techniques of Cerreia-Vioglio, Maccheroni and Marinacci (2015) we can relax all differentiability assumptions.
9 Eliaz and Spiegler (2005) discuss an impossibility result related to preferences for skewness and preferences for full-early resolution. However, their exact results rely on existential quantifiers that are impossible to violate and calibrate.
They assume both V1 and V2 have expected utility representations. Given this specification, we
now provide a stronger version of Predictions 3 and 4.10
Prediction 5 Suppose preferences have a recursive representation (V1, V2) such that V1 and V2
have expected utility representations with Bernoulli utilities u1 and u2. Then u1 ◦ u2⁻¹ is convex
(concave) if and only if the decision-maker prefers Blackwell more (less) informative information structures.
Moreover, if the derivative of u1 ◦ u2⁻¹ is convex (concave), then (x, y) ≿.5 (≾.5) (y, x) whenever
x ≤ y.
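Prediction 5 can be illustrated numerically. With a binary outcome, and normalizing u2 so that the high outcome has utility 1 and the low outcome utility 0 (an assumption made here only for the sketch), the second-stage certainty equivalent of a posterior π is u2⁻¹(π), so the Kreps-Porteus value of an information structure is the expectation of φ(π) with φ = u1 ◦ u2⁻¹. The Python sketch below uses the hypothetical choice φ(x) = x³, which is convex on [0, 1] and has a convex derivative, and the same assumed convention as before: p = Pr(good signal | high outcome), q = Pr(bad signal | low outcome).

```python
def posteriors(p, q, f=0.5):
    """[(signal probability, posterior on the high outcome), ...].

    Assumed convention (illustrative only): p = Pr(good signal | high outcome),
    q = Pr(bad signal | low outcome), f = prior probability of the high outcome.
    """
    pr_good = f * p + (1 - f) * (1 - q)
    pr_bad = 1 - pr_good
    dist = []
    if pr_good > 0:
        dist.append((pr_good, f * p / pr_good))
    if pr_bad > 0:
        dist.append((pr_bad, f * (1 - p) / pr_bad))
    return dist


def phi(x):
    """Hypothetical u1 composed with the inverse of u2: convex, with convex derivative."""
    return x ** 3


def kp_value(p, q, f=0.5):
    """Folding back: expected value of phi over the first-stage posteriors."""
    return sum(weight * phi(post) for weight, post in posteriors(p, q, f))


# More informative symmetric structures are valued more highly ...
print(kp_value(1, 1), kp_value(0.55, 0.55), kp_value(0.5, 0.5))
# ... and (x, y) with x <= y is valued above its reflection (y, x).
print(kp_value(0.5, 1), kp_value(1, 0.5))
print(kp_value(0.3, 0.9), kp_value(0.9, 0.3))
```

Replacing phi with a concave function that has a concave derivative (for example, x − x³) reverses both rankings, in line with the parenthetical cases of the prediction.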
There are also other models, used in a variety of applications, which make other particular
functional form assumptions. Two well-known classes of models used in dynamic applications are
the recursive extensions of Gul’s (1991) model of disappointment aversion and rank dependent
utility.11 Both of these models can accommodate a preference for right-skewed information or for
left-skewed information.12 However, they also generate additional predictions regarding behavior
that can separate them from the basic Kreps-Porteus model. Our next prediction is one such
behavior: it states that if an individual’s preferences fall within either of these classes and are
consistent with the empirical evidence on the Allais paradox and first-order risk aversion, then they
must exhibit local preference for late resolution.13
Prediction 6 Suppose preferences have a recursive representation (V1, V2) such that V1 and V2
are both in Gul’s class of disappointment aversion functionals (or rank-dependent utility) and the
decision-maker is disappointment averse (has a strictly convex weighting function). Then there
exists an ε′ > 0 such that for all 0 < ε < ε′, (.5, .5) ≻.5 (.5 + ε, .5 + ε).
One paper directly addressing preferences for skewed information is Dillenberger and Segal
(2015). They provide sufficient conditions such that, fixing a prior, if an individual prefers full late
resolution (.5, .5) over all more informative structures, (p, q) where p ≥ .5 and q ≥ .5, then they
must also prefer (.5, .5) over all negatively skewed structures. However, these individuals prefer
some positively skewed structures over (.5, .5). We refer the interested reader to their paper for a
full description of the conditions.14
10 Caplin and Leahy’s (2001) model of psychological expected utility nests Kreps and Porteus’ (1978) specification in our framework.
11 One might ask why not directly test the assumption of recursivity with our data. We cannot do so because, fixing a prior belief about outcomes, recursivity itself has no testable implications regarding preferences over information. Thus we must test recursivity in conjunction with other assumptions about the structure of the functionals.
12 This is almost trivially true since they nest the Kreps-Porteus functional form just discussed.
13 As inspection of the proofs will demonstrate, these are stronger than necessary conditions.
14 Dillenberger and Segal (2015) provide a different definition of skewness. In our particular domain, their definition, as well as all other notions of skewness, coincide, because of the binary nature of the outcome and priors being equal to 50%.
Prediction 7 Suppose preferences belong to the class defined by Dillenberger and Segal (2015).
If (.5, .5) ≿.5 (x, x) for all x ≥ .5, then for all x ≤ y, (.5, .5) ≿.5 (y, x). However, it is possible
that (.5, .5) ≾.5 (x, y).
We also want to consider three important models of preferences which do not satisfy recursivity,
but which are used in many applications to generate preferences over information. Brunnermeier
and Parker (2005) introduce a well-known model of optimal expectations. In their model individuals
trade off having (distorted) optimistic beliefs today with possibly taking incorrect actions in the
future based on those incorrect beliefs. Of course, in our environment there are no actions to
take, so individuals should be indifferent between all structures. We refer to their functional form
as BP.
Prediction 8 Suppose preferences are represented by a BP functional form. Then the decision-maker
should be indifferent between all information structures.
A second important class of non-recursive preferences are those of Koszegi and Rabin (2009).
We refer to their functional form as KR. These preferences, although flexible enough to capture
preferences for early versus late resolution of information, and a preference for clumping, have
strong predictions regarding preferences for skewness. This is because their functional form embeds
strong symmetry assumptions regarding the payoffs over beliefs.
Similarly, the non-recursive preferences used in Ely, Frankel, and Kamenica (2013), referred to
here as EFK, also embed strong symmetry assumptions. Thus, despite the fact that they are meant
to capture a preference for gradual resolution of information, the opposite of the intuition in Koszegi and
Rabin (2009), we obtain the same prediction.
Prediction 9 Suppose preferences are represented by a KR or EFK functional form. Then (x, y) ∼.5 (y, x).
4 Experimental Design and Procedures
This section provides a brief primer on the experimental design (for a more detailed description
see Appendix C). Subjects received a raffle ticket upon entering a room which gave them a 50%
chance of winning $10 in addition to their show-up fee and a 50% chance of winning no
additional money. They were told that the winning ticket numbers would be announced at the
end of the 60-minute study and that they could choose the kind and amount of information they
would like to receive in the middle of the study by making choices among information options. The
particular setting we chose focuses on information regarding an outcome that the subjects have
no control or agency over, thus ruling out information preferences linked to ego-utility, confidence,
etc. (see Hoffman, 2011; Eil and Rao, 2011; Mobius et al., 2011 for examples where information of
interest is about the action or characteristics of the subject, and Eliaz and Schotter, 2010 for the
case where confidence matters due to agency). Subjects were also told that they would receive a
clue generated by the information option they chose at the end of Part 1, and that this information
would not change whether they actually won the money or not, nor would it help them elsewhere
in the experiment, nor have any impact on their earnings. Finally, the subjects were informed that
they would sit with this information until the outcome of the lottery was announced at the end of the
experiment while answering hypothetical questions in Part 2 of the study. They were told that Part
2 of the study would not affect their payments or the lottery outcome in any way. As such, the
second part of the study served mainly as a filler task as they waited to learn the lottery outcome.
This design choice introduced a considerable delay between the time of information acquisition and
the time of uncertainty resolution, while keeping the subjects in a controlled environment to rule
out any potential instrumental use of the information acquired. In other words, subjects could
not engage in any actions, such as purchasing goods, based on the information about their future
earnings.
In the experiment we specifically test preferences between pairs of information structures.15 We
also focus on a particular prior, where f = .5. The specific pairs we focus on are:
• (1,1) and (.5, .5), which we denote as full early resolution and full late resolution respectively.
In the former case, the decision-maker learns everything after observing the signal, and in the
latter nothing.
• (.5, 1) and (1, .5), which we denote as extremely positively skewed information and extremely
negatively skewed information respectively. In the former case, the decision-maker knows for
sure that the low outcome will happen whenever they observe a bad signal, but does not know
for sure if the high outcome will happen after observing a good signal. In the latter case the
decision-maker knows for sure that the high outcome will happen whenever they observe a
good signal, but does not know for sure if the low outcome will happen after observing a bad
signal.
• (.3, .9) and (.9, .3), which we denote as slightly positively skewed information and slightly
negatively skewed information. In the former case, given equal priors, the decision-maker
is much more sure that the low outcome will happen whenever they observe a bad signal
than that the high outcome will happen when they observe a good signal.
15 Because our domain appears similar to a standard consumption domain, it would be possible to give consumers a “budget” constraint and have them choose their favorite signal within the budget constraint. We do not do so because we believe this would make it harder for the subjects to understand the posterior distribution induced by any given signal and the probabilities with which a given signal is realized. Given pairwise choices we can ensure that the subjects understand, as well as possible, these attributes.
In the latter case, given equal priors, the decision-maker is much more sure that the high
outcome will happen whenever they observe a good signal than that the
low outcome will happen when they observe a bad signal.
• (.6, .9) and (.9, .6), which we denote as moderately positively skewed information and
moderately negatively skewed information. The intuition is as for (.3, .9) and (.9, .3) respectively.
• (.5, .5) and (.55, .55), which are both symmetric signal structures with no skewness. The
former option does not provide any information in Period 1 and the latter option conveys
only a little information in Period 1. Therefore we denote them as full late resolution and
a little early resolution, respectively.
• (.76, .76) and (.3, .9), which are a symmetric signal structure with no skewness and a positively
skewed information structure, respectively. Note that (.76, .76) is just barely Blackwell more
informative than (.3, .9).
• (.67, .67) and (.1, .95), which are a symmetric signal structure with no skewness and a
positively skewed information structure, respectively. Note that (.67, .67) is just barely
Blackwell more informative than (.1, .95).
• (.66, .66) and (.5, 1), which are a symmetric signal structure with no skewness and a positively
skewed information structure, respectively. Note that (.66, .66) is just barely Blackwell less
informative than (.5,1).
• (.55, .55) and (.3, .9), which are a symmetric signal structure with no skewness and a positively
skewed information structure, respectively. Note that (.55, .55) is just barely Blackwell less
informative than (.3, .9).
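The “just barely” Blackwell comparisons in the last four pairs can be checked numerically. The following Python sketch relies on a standard characterization for a binary outcome: one structure is Blackwell more informative than another exactly when its distribution of posteriors is a mean-preserving spread of the other's, which can be tested by comparing E[max(π − t, 0)] at every posterior value t in either support. The convention p = Pr(good signal | high outcome), q = Pr(bad signal | low outcome) and all function names are assumptions made for illustration; the Blackwell ranking itself does not depend on which signal is labelled good.

```python
def posteriors(p, q, f=0.5):
    """[(signal probability, posterior on the high outcome), ...].

    Assumed convention (illustrative only): p = Pr(good signal | high outcome),
    q = Pr(bad signal | low outcome), f = prior probability of the high outcome.
    """
    pr_good = f * p + (1 - f) * (1 - q)
    pr_bad = 1 - pr_good
    dist = []
    if pr_good > 0:
        dist.append((pr_good, f * p / pr_good))
    if pr_bad > 0:
        dist.append((pr_bad, f * (1 - p) / pr_bad))
    return dist


def call_value(dist, t):
    """E[max(posterior - t, 0)], a convex test function of the posterior."""
    return sum(weight * max(post - t, 0.0) for weight, post in dist)


def blackwell_dominates(a, b, f=0.5, tol=1e-12):
    """True if structure a is Blackwell more informative than structure b.

    Both call-value curves are piecewise linear in t, so it is enough to
    compare them at the kinks, i.e. at every posterior in either support.
    """
    dist_a, dist_b = posteriors(*a, f), posteriors(*b, f)
    kinks = [post for _, post in dist_a + dist_b]
    return all(call_value(dist_a, t) >= call_value(dist_b, t) - tol for t in kinks)


pairs = [((0.76, 0.76), (0.3, 0.9)), ((0.67, 0.67), (0.1, 0.95)),
         ((0.66, 0.66), (0.5, 1)), ((0.55, 0.55), (0.3, 0.9))]
for symmetric, skewed in pairs:
    print(symmetric, skewed,
          "symmetric dominates:", blackwell_dominates(symmetric, skewed),
          "skewed dominates:", blackwell_dominates(skewed, symmetric))
```

Under these assumptions the symmetric structure dominates in the first two pairs and is dominated in the last two, while nearby symmetric structures such as (.74, .74) and (.57, .57) are not ranked against (.3, .9) at all, which is the sense in which the chosen comparisons are close to the boundary.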
Figure 2 and Figure 3 show the relationship between all the information structures we consider
in our setting where the prior is 0.5. The p value is the horizontal axis in each figure, while the
q value is the vertical axis. Figure 2 demonstrates the types of skewed information structures we
are considering, along with the full early and full late resolution information structures. In our choices
to assess preferences across skewed information structures, we will compare a skewed information
structure to the information structure that is its reflection across the line p = q (e.g., we will
compare slightly negative skew to slightly positive skew). The reflection across the line p = q gives
us the most natural pair with the opposite skew, while keeping the Blackwell informativeness of
the signals constant.
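The effect of reflecting a structure across the line p = q can be sketched in Python. Under equal priors the reflection maps each posterior π to 1 − π with the same probability, so the mean and variance of the posterior distribution are unchanged and the third central moment changes sign. The convention p = Pr(good signal | high outcome), q = Pr(bad signal | low outcome) is an assumption made for illustration (the formal definitions appear earlier in the paper); under it the structures labelled positively skewed have a positive third central moment.

```python
def posteriors(p, q, f=0.5):
    """[(signal probability, posterior on the high outcome), ...].

    Assumed convention (illustrative only): p = Pr(good signal | high outcome),
    q = Pr(bad signal | low outcome), f = prior probability of the high outcome.
    """
    pr_good = f * p + (1 - f) * (1 - q)
    pr_bad = 1 - pr_good
    dist = []
    if pr_good > 0:
        dist.append((pr_good, f * p / pr_good))
    if pr_bad > 0:
        dist.append((pr_bad, f * (1 - p) / pr_bad))
    return dist


def central_moment(p, q, k, f=0.5):
    """k-th central moment of the distribution of posteriors."""
    dist = posteriors(p, q, f)
    mean = sum(weight * post for weight, post in dist)
    return sum(weight * (post - mean) ** k for weight, post in dist)


for p, q in [(0.5, 1), (0.6, 0.9), (0.3, 0.9)]:
    for structure in [(p, q), (q, p)]:
        print(structure,
              "variance:", round(central_moment(*structure, 2), 6),
              "third moment:", round(central_moment(*structure, 3), 6))
```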
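[Figure 2 omitted: plot of the information structures in (p, q) space, with p on the horizontal axis and q on the vertical axis, marking Full Early, Full Late, and the Slightly, Medium, and Extreme positively and negatively skewed structures.]
Figure 2: Information Structures, Skewness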
In contrast, Figure 3 focuses on showing the relationship in terms of Blackwell dominance
between different signals. Given our discussion above, points that are higher on the 45-degree (i.e.
p = q) line Blackwell dominate points that are lower. More interestingly, we can also use Lemma
2 to compare information structures with skew to symmetric information structures that generate
(given equal priors) a posterior distribution with a zero third moment, that is, information structures
along the line p = q. In our experiment we will compare a skewed information structure to a
symmetric information structure that is either just Blackwell more or just Blackwell less informative,
which helps us better distinguish between models that capture non-instrumental preferences for
information. Moreover, it allows us to assess to what extent preferences for skewness may alter
preferences for Blackwell dominance.
The lines drawn through a positively skewed structure demarcate the set of structures either
Blackwell more informative or less informative. One line passes through (.5, 1). Points are below
this line if and only if they are Blackwell dominated by (.5, 1). In contrast, two lines pass through
(.3, .9). First consider the line that is steeper (i.e. has a slope that is larger in absolute value).
Points are below this line if and only if they are Blackwell dominated by (.3, .9). Next consider the
line that is flatter. Points are above this line if and only if they Blackwell dominate (.3, .9).
In a similar manner, the line drawn through (.1, .95) demarcates the set of points that Blackwell
dominate (.1, .95).
Figure 3: Information Structures, Blackwell Ordering
There were two conditions. In each condition subjects faced a series of five pair-wise choices
between information structures. They were told that one of the questions would be chosen, at ran-
dom, with equal probability assigned to all questions, after they had made all pairwise choices.16, 17
In order to verify the robustness of observed preferences, Conditions 1 and 2 varied the order of
the options presented in Q1, Q2, Q5b, and counterbalanced the order in which Q3 and Q5a were
presented, and also asked different Q4a and Q4b.18
Table 1 details the order of questions and options presented in Condition 1 and Condition 2.
Q1 elicited preferences regarding full early resolution of uncertainty (indicated by (1,1)) versus
full late resolution of uncertainty (indicated by (.5, .5)). Q2 elicited preferences between the
extreme positively and negatively skewed signal structures, while Q3 and Q5 presented the slightly
and moderately skewed signal structures. Note that half the time, Q5 instead presented a choice
between full late resolution and another symmetric signal structure that is slightly more
informative, (.55, .55). Finally, Q4 tested for a trade-off between Blackwell informativeness
and skewness. If subjects preferred late resolution, in both conditions they were asked to choose
between a skewed signal and a symmetric signal that was Blackwell less informative than that skewed
16 One may be concerned that we did not elicit willingness to pay for information structures. In order to verify that individuals were in fact willing to pay for their information structures we ran a separate experiment. We found that individuals are willing to pay for their favored information structure; we discuss these results as a robustness check in the next section, and the design of the experiment in Appendix D.
17 Implementation using a random-lottery incentive system is quite common in the literature. However, it has been criticized (Holt, 1986). If subjects treat each choice in isolation, then this incentive system introduces no distortions. Experimental evidence on this, although mixed, has been generally supportive; Starmer and Sugden (1991) and Cubitt et al. (1998) are supportive, while Harrison et al. (2013) finds distortions. Wakker (2007) provides a useful summary of these issues.
18 We did not test richer question order randomization because the video instructions explaining each question built on each other and starting with Q1-Q2 made most sense because they were the simplest to explain.
signal. Similarly, if they preferred full early resolution, they were asked to choose between a skewed
signal and a symmetric signal that was Blackwell more informative. Therefore choices presented in
Q4 were aimed at testing whether the preferences of subjects who exhibit a preference for Blackwell
informativeness over symmetric signal structures also respect that same Blackwell ordering when
comparing positively-skewed structures to symmetric structures.
Figure 4: Q2, Condition 1
Each of the information structures was presented as an option from which the computer would
draw a ball according to whether the subject won or lost the lottery. The subjects could
not see which box the computer was drawing a ball from, but could observe the color of the ball.
Figure 4 depicts the options presented in Q2 of Condition 1. Subjects watched an instructional
video before each question that presented the two options, explained the percentage of the instances
a red or a black ball would be drawn from each option, and displayed the posterior probability of
winning or losing associated with observing a red or a black ball from each option. After this
instructional video, subjects completed comprehension questions that checked their understanding
before proceeding to making their choices. The information presented by the video was repeated
on the page that described each option and asked for their choice.
After the subjects made choices in all five questions, the computer randomly picked one question
among the five to be carried out. The subjects saw the chosen question and their choice in that
question was repeated on their screen. Then, the computer displayed the color of the ball drawn
from their preferred option and the posterior probability of the subject having won the lottery based
on the color of the ball, and asked the subject to enter this probability to confirm that they were
paying attention. After their information choice in one randomly chosen question was carried out
in this manner, subjects were asked to answer several qualitative questions regarding their choice.
              Condition 1                 Condition 2
        Option 1      Option 2      Option 1      Option 2      conditional on
Q1      (1, 1)        (.5, .5)      (.5, .5)      (1, 1)        -
Q2      (1, .5)       (.5, 1)       (.5, 1)       (1, .5)       -
Q3      (.9, .3)      (.3, .9)      (.6, .9)      (.9, .6)      -
Q4      (.76, .76)    (.3, .9)      (.67, .67)    (.1, .95)     if (1, 1) ≻ (.5, .5)
        (.55, .55)    (.3, .9)      (.66, .66)    (.5, 1)       if (1, 1) ≺ (.5, .5)
Q5      (.9, .6)      (.6, .9)      (.9, .3)      (.3, .9)      random
        (.55, .55)    (.5, .5)      (.5, .5)      (.55, .55)    random
Table 1: The order of questions and options in Condition 1 and 2
In the remaining time before the outcome of the lottery was to be revealed, subjects were also
asked a series of hypothetical questions across 5 blocks in Part 2 of the study.19 Each block featured
10 questions, asking whether individuals preferred to take Option A or Option B. In blocks 1-3,
Option B was receiving some amount of money for sure, beginning with $2 and increasing in $2
increments to $20. In block 1, Option A was a gamble that was structured as follows: “a
ball will be drawn from a box with 50 white and 50 blue balls. If a blue ball is drawn you receive
$30, otherwise nothing.” In block 2, Option A was a gamble that was structured as follows: “a
ball will be drawn from a box with white and blue balls (the respective numbers were not specified).
If a blue ball is drawn you receive $30, otherwise nothing.” In block 3,
Option A was a gamble that was structured as follows: “a ticket will be drawn from an urn that
features 101 tickets labeled from 0 to 100. The number on the ticket determines how many blue
balls will be in a box of 100 blue and white balls. Next, a ball will be drawn from the box. If a blue
ball is drawn you receive $30, otherwise nothing.” In block 4, Option A allowed the individual to
receive $30 for sure. Option B was a gamble that paid x with an 80% chance and $0 with a 20% chance;
x varied from $34 to $74 in $4 increments. In block 5, Option A was a gamble which gave the
individual a 25% chance of $30 and a 75% chance of $0. Option B was a gamble that paid x with a
20% chance and $0 with an 80% chance; x varied from $34 to $74 in $4 increments.
Our design allows us to directly relate the theoretical predictions to the questions. Prediction
1 can be tested by all questions, as it says that the decision-maker should always be indifferent.
Prediction 2 is specifically tested in Q1. Prediction 3 is tested by looking at preferences for earlier
19 These questions were hypothetical in order to ensure that subjects could not use the information to adjust their responses to questions that would result in actual monetary rewards.
or later resolution, which are elicited in Q1, Q4 and the second possible Q5 question. Prediction 4
is tested by looking at preferences for skewness, and so is tested by looking at Q2, Q3 and the first
possible Q5 question. Prediction 5 is tested by looking at both preferences for skewness, Q2, Q3
and the first possible Q5 question, as well as preferences for avoiding information entirely, which
is tested in Q1, Q4 and the second possible Q5 question. Prediction 6 is tested both by looking at
preferences for extreme skewness and whether preferences are monotone in the Blackwell ordering,
so Q1, Q2 and Q4.20 Prediction 7 is tested by looking for preferences for skewness (Q2, Q3 and the
first question of Q5) and a preference for full late resolution locally (the second question of Q5).
Like Prediction 1, Prediction 8 is tested by all questions. Predictions 9 and 10 specifically rely on
Q2, Q3, and the first question of Q5. The following Table highlights these relationships.
5 Data and Results
Table 2 summarizes choices across the information structures tested by Q1-Q5. The first set of
results describes preferences over information structures that are symmetric, but vary in terms of
Blackwell informativeness. This first set of results indicates that individuals generally prefer full
early resolution relative to full late resolution. Moreover, individuals prefer learning even a little
bit earlier over full late resolution in just as large a proportion. Therefore, the results do not
support a general preference for one-shot resolution of uncertainty.
The second set of results relates to preferences for negatively versus positively skewed infor-
mation structures. We observe that most individuals prefer the positively skewed information
structure relative to the negatively skewed information structure. Moreover, the proportion of indi-
viduals who prefer the positively skewed structure does not seem to vary with the type of structure
(i.e. slightly, moderately, or extremely skewed). In addition, the preference for positively skewed
information seems as prevalent in the population as the preference for early resolution.
The third set of results, concerning choices between symmetric and skewed information options,
requires more interpretation due to the conditional nature of the experimental design. Recall
that individuals only compared (.76, .76) to (.3, .9) if they previously indicated they preferred full
early resolution of information to full late resolution (this was the same for comparing (.1, .95) to
(.67, .67)). Individuals compared (.3, .9) to (.55, .55) if they made the opposite choice regarding
the timing of full resolution of information (similarly for comparing (.66, .66) to (.5, 1)). Thus, we
can interpret these choices as asking, conditional on individuals seeming to exhibit a preference
regarding Blackwell informativeness over symmetric signal structures, do we observe their pref-
erences respecting that same Blackwell ordering when comparing positively-skewed structures to
symmetric structures. Because most individuals prefer positively-skewed structures, and because
20 We do not directly compare skewed information to full early resolution, but rely on the fact we find monotonicity in the Blackwell ordering elsewhere.
Early vs. Late
    (1, 1) ≻ (.5, .5)           77%∗∗∗   (196/250)
    (.55, .55) ≻ (.5, .5)       78%∗∗∗   (91/121)
Pos. Skewed vs. Neg. Skewed
    (.5, 1) ≻ (1, .5)           75%∗∗∗   (193/250)
    (.6, .9) ≻ (.9, .6)         72%∗∗∗   (144/196)
    (.3, .9) ≻ (.9, .3)         84%∗∗∗   (149/183)
Symmetric vs. Skewed
    (.76, .76) ≻ (.3, .9)       71%∗∗∗   (65/92)
    (.3, .9) ≻ (.55, .55)       67%∗     (18/27)
    (.67, .67) ≻ (.1, .95)      64%∗∗∗   (67/104)
    (.66, .66) ≻ (.5, 1)        56%      (15/27)
≻ represents “chosen over”. ∗∗∗, ∗∗, and ∗ imply that the proportion is significantly different from 0.5 at the 0.01, 0.05, and 0.1 levels, respectively.
Table 2: Percentage of choices
the comparisons are to symmetric structures that are just barely more or less Blackwell informative,
these questions also test whether preferences for skewness can dominate preferences for Blackwell
informativeness.
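As an illustration of how a choice proportion can be compared against 0.5, the Python sketch below computes an exact two-sided binomial p-value from the counts reported in Table 2. The paper does not state which test underlies the significance stars, so this is one standard option rather than a reproduction of the reported results, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, n):
    """Exact two-sided p-value for the null that the true proportion is 0.5.

    Under a symmetric null the two tails are mirror images, so the p-value
    is twice the probability of a count at least as far from n / 2.
    """
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Counts (chosen, total) taken from Table 2.
table2_counts = {
    "(1, 1) over (.5, .5)": (196, 250),
    "(.3, .9) over (.55, .55)": (18, 27),
    "(.66, .66) over (.5, 1)": (15, 27),
}
for label, (chosen, total) in table2_counts.items():
    print(label, chosen / total, binomial_two_sided_p(chosen, total))
```

With only 27 observations in the late-resolution subsample, a one-sided test or a normal approximation can give a different significance level than the exact two-sided test, so small-sample rows should be read with that in mind.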
From the comparisons of (.76, .76) to (.3, .9) and (.1, .95) to (.67, .67) we see that most (around
two thirds) of the individuals that exhibit a preference for early resolution over symmetric structures
also prefer Blackwell more informative signals to skewed signals. Looking at the choice between
(.3, .9) and (.55, .55), we observe that even though all these individuals preferred full late resolution to full
early resolution, most still preferred positively-skewed information that was Blackwell
more informative to a less informative symmetric signal structure. Looking at the choice between
(.66, .66) and (.5, 1), we see that almost equal proportions of individuals choose either option. Thus,
the preferences of individuals who preferred full late to full early resolution of uncertainty when
comparing symmetric information structures do not respect the same ordering induced by
Blackwell dominance when comparing a positively skewed structure to a symmetric structure.
In addition to choosing an information structure, subjects were also asked by how much they
preferred their chosen option over the unchosen option. Although these questions were not incen-
tivized, we believe they give a sense of the relative strength of preference within individuals across
the chosen options. We test whether reversals of preference regarding Blackwell ordering among the
individuals who preferred full late to full early resolution of uncertainty are more likely among those
with weak preferences for late resolution. Indeed, we find that the weaker individuals’ preference for
late resolution, the more likely that they prefer positively-skewed information to a less informative
symmetric signal (p-value = .009, logistic regression of conditional Q4 choice on preference strength
in Q1). Those who always prefer less information rated their preference for full late resolution
at 8.8 on average on a 10-point scale, whereas those who prefer the more informative skewed
signal rated their preference for full late resolution at 6.5 on average. Therefore, it seems
that at least some of the individuals who do not seem to have a consistent preference regarding
Blackwell informativeness of signals may have weaker preferences for late resolution of uncertainty
to begin with.21
                                  Chosen Option
                                  First    Second
Early vs. Late
    (1, 1) vs (.5, .5)            9.23     7.37
    (.55, .55) vs (.5, .5)        7.08     5.67
Pos. Skewed vs. Neg. Skewed
    (.5, 1) vs (1, .5)            8.24     7.19
    (.6, .9) vs (.9, .6)          7.13     6.48
    (.3, .9) vs (.9, .3)          7.54     6.76
Symmetric vs. Skewed
    (.76, .76) vs (.3, .9)        7.42     7.93
    (.3, .9) vs (.55, .55)        7.11     8.44
    (.67, .67) vs (.1, .95)       7.67     7.24
    (.66, .66) vs (.5, 1)         7.87     6.58
Table 3: Intensity of Choices
Table 3 summarizes the preference strength data conditional on a particular chosen option. This
data supports the previous results — the option that was chosen by a majority of subjects also
was more strongly preferred by the subjects that chose it, compared to the strength of preference
reported by individuals who preferred the less-chosen option. The one question where this is not
true is the choice between (.3, .9) and (.55, .55), where most individuals preferred (.3, .9) but
individuals who chose (.55, .55) exhibit a stronger preference for their chosen option.
Even though the existing theoretical models do not provide any clues regarding the relationship
between preferences for early versus late resolution of uncertainty and preferences for skewness
given a prior, the within-person nature of our experiment’s design allows us to investigate whether
21 Differences in strength of preference do not explain whether individuals that exhibit a preference for early resolution over symmetric structures also prefer Blackwell more informative signals to skewed signals, most probably because only a minority of subjects fail to do so.
the preference for the positively skewed information option is stronger among subjects who prefer late
resolution of uncertainty compared to subjects who prefer early resolution.
                      Extreme                  Medium                   Slight
                 Pos.    Neg.             Pos.     Neg.            Pos.     Neg.
                 (.5,1)  (1,.5)  Total    (.6,.9)  (.9,.6)  Total  (.3,.9)  (.9,.3)  Total
Early (1,1)      123     73      196      113      41       154    117      28       145
Late (.5,.5)     44      10      54       31       11       42     32       6        38
Total            167     83      250      144      52       196    149      34       183
Table 4: Early or Late vs Skewed
Table 4 cross-tabulates these within-person choice patterns. For example, out of the 196 subjects
who prefer early resolution of uncertainty, 123 choose the extreme positively skewed information
option and the remaining 73 choose the extreme negatively skewed information option. Similarly,
out of the 154 (145) subjects who prefer early resolution of uncertainty who also were asked to
indicate a choice between information options with medium (slight) skew, 113 (117) choose the
positively skewed information option. We see that subjects who prefer early resolution of
uncertainty are relatively less likely to choose the extremely positively skewed signal than
those who prefer late resolution (p-value = .012, logistic regression of Q1 choice onto
Q3 choice). However, no such relationship exists between medium or slight positive skewness
and late resolution preferences. The evidence is therefore intriguing but inconclusive.
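The comparison in the "Extreme" block of Table 4 can be approximated from the reported counts alone. The sketch below is not the original estimation code. It assumes the logistic regression has a single binary regressor (early versus late), in which case the slope equals the log odds ratio of the 2x2 table and its Wald standard error is the square root of the summed reciprocal cell counts. The function name `wald_log_odds_ratio` is hypothetical.

```python
import math

def wald_log_odds_ratio(a, b, c, d):
    """Wald test for the log odds ratio of the 2x2 table [[a, b], [c, d]].

    Returns (log odds ratio, standard error, z statistic, two-sided p-value).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p_value

# Counts from the "Extreme" block of Table 4 (positive vs. negative skew).
early_pos, early_neg = 123, 73
late_pos, late_neg = 44, 10

share_early = early_pos / (early_pos + early_neg)  # share choosing positive skew
share_late = late_pos / (late_pos + late_neg)
log_or, se, z, p_value = wald_log_odds_ratio(early_pos, early_neg, late_pos, late_neg)

print(f"share choosing positive skew: early={share_early:.3f}, late={share_late:.3f}")
print(f"log odds ratio={log_or:.3f}, z={z:.2f}, p={p_value:.3f}")
```

Under these assumptions the two-sided p-value comes out close to the reported .012, which suggests the reported regression is of this simple form, though the paper does not state its exact specification.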
Table 5 cross-tabulates within-person choice patterns in the questions that present positively
and negatively skewed information structures. Overall, we see that most individuals are consistent
in their preferences. In particular, those who prefer one positively skewed signal are very likely
to prefer another positively skewed signal. Overall, 73% of the individuals who made choices
among {(1,.5), (.5,1)} and {(.9,.3), (.3,.9)}, 65% of the individuals who made choices among {(1,.5),
(.5,1)} and {(.9,.6), (.6,.9)}, and 76% of the individuals who made choices among {(.9,.3), (.3,.9)}
and {(.9,.6), (.6,.9)} display the same ordering when comparing negatively and positively skewed
structures. We also find some evidence that people who have weak preferences regarding their
choice in one of the questions are more likely to switch their preference ordering in another question.
However, the consistency across questions is considerably high, especially given the complexity of
the experiment.
One plausible concern regarding the design of the experiment is that we did not directly elicit
a willingness to pay for individuals’ preferred information structure. We did this to avoid further
complicating an already complex elicitation procedure. A second plausible concern is that in having
individuals make several pairwise choices, we elicited preferences different from what they would
Columns: Extreme
                Pos. (.5,1)   Neg. (1,.5)   Total
Pos. (.3,.9)    107           42            149
Neg. (.9,.3)    17            17            34
Total           124           59            183

Columns: Medium
                Pos. (.6,.9)  Neg. (.9,.6)  Total
Pos. (.3,.9)    85            22            107
Neg. (.9,.3)    9             13            22
Total           94            35            129

Columns: Medium
                Pos. (.6,.9)  Neg. (.9,.6)  Total
Pos. (.5,1)     101           26            127
Neg. (1,.5)     43            26            69
Total           144           52            196
Table 5: Relationships: Skewness Preferences
express if they were making a single pairwise choice (despite the fact that only one of the pairwise
choices would be implemented).
In order to ensure our results are robust to these concerns, we ran an additional experiment.
The experiment was very similar: two outcomes, with a 50% prior on the high outcome. However,
individuals made only a single pairwise choice between two information structures. Once subjects
made a choice, and before they saw the signal, they were asked to indicate the amount they would
be willing to accept in order to change their choice. This amount measures the utility difference
between the two informational options. Their choices were then carried out using a DGM procedure,
and they saw a signal from the information structure indicated by their willingness-to-accept answers.
For more details of the implementation, please see Appendix D.
Pos. Skewed vs. Neg. Skewed
(.5, 1) ≻ (1, .5)       64% (35/55)
(.7, .9) ≻ (.9, .7)     85%∗∗∗ (57/67)
≻ represents “chosen over”. ∗∗∗ implies the proportion is significantly different from 0.5 at the 0.01 level.
Table 6: Percentage of choices: Single pairwise choice
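The significance markers in Table 6 can be sanity-checked from the reported counts. The text does not say which test was used, so the sketch below assumes an exact two-sided binomial test against a null proportion of 0.5. The helper name `binom_two_sided_p` is hypothetical.

```python
from math import comb

def binom_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k under the null proportion p0.
    """
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-12)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= threshold))

# Counts reported in Table 6: (chose positive skew, total subjects).
extreme = (35, 55)   # (.5, 1) chosen over (1, .5)
moderate = (57, 67)  # (.7, .9) chosen over (.9, .7)

for label, (k, n) in [("extreme", extreme), ("moderate", moderate)]:
    print(f"{label}: share={k / n:.2f}, two-sided p={binom_two_sided_p(k, n):.4g}")
```

Under this assumed test, only the second comparison is significant at the 0.01 level, in line with the stars in the table.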
Thus we find results that are qualitatively similar to the results in the main study, leading us to
believe that our findings are not significantly affected by the lack of payment or
the multiple pairwise choices. Moreover, individuals exhibited a strict willingness to pay: fewer
than 10% of individuals who chose the positively skewed signal required 0 cents to change their
signal. Thus almost all individuals who exhibited a preference for positive skew also demonstrated
a strictly positive willingness to accept in order to change to the negatively skewed signal.
6 Discussion
6.1 Our Results
Our results are useful in three ways. First, they allow us to demonstrate that individuals exhibit
consistent patterns of preference over skewed information structures. We find a strong preference
for positively skewed information. People choose right skewed information structures over left
skewed information structures and indicate that this is a relatively strong preference. Moreover,
this preference is robust, and exhibited through the comparison of several types of positively skewed
information structures to negatively skewed structures.
There are two concerns we want to address regarding the interpretation of the data. First, we
might be concerned that individuals are so used to having information be instrumentally valuable
that they simply apply the instrumental heuristic in non-instrumental settings. This would be a
good reason for being concerned that individuals choose more informative over less informative
signals. However, there are many simple instrumental settings where choosing the left-skewed
information structure gives a higher expected payoff than choosing the right skewed structure, and
so it is hard to conceive that this preference is purely heuristic in nature. More generally, for any
convex Kreps-Porteus functional, there exists an action set such that the functional can represent
the preferences of a standard EU decision-maker who receives information and then must take a single
action out of that set.
Second, as we have observed before, positively skewed structures, compared to negatively skewed
structures, always generate a higher posterior probability of winning conditional on observing either
signal. Thus we might be concerned that people are only focusing on these posterior probabilities. The
real concern is that this is not a true preference but simply a boundedly rational way of evaluating
the signals. However, our data shows that individuals do not simply want to maximize the posterior
probability of winning, conditional on observing a signal. We find that individuals prefer Blackwell
more informative symmetric signals to right skewed signals. The former have the same posterior
probability of winning conditional on a red ball, but a lower posterior probability of winning
conditional on a black ball.
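The posterior comparisons in this argument can be reproduced with Bayes' rule. The sketch below assumes the convention suggested by the appendix formulas: for a structure (p, q), p is the probability of the favorable signal given the high outcome, q is the probability of the unfavorable signal given the low outcome, and f is the prior on the high outcome. The function name `posteriors` is hypothetical.

```python
def posteriors(f, p, q):
    """Posterior probability of the high outcome after each signal.

    Assumed convention: p = P(good signal | high), q = P(bad signal | low),
    f = prior probability of the high outcome.
    Returns (P(high | good signal), P(high | bad signal)).
    """
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    post_good = f * p / prob_good if prob_good > 0 else float("nan")
    post_bad = f * (1 - p) / prob_bad if prob_bad > 0 else float("nan")
    return post_good, post_bad

prior = 0.5

# Positively vs. negatively skewed pairs used in the experiment.
for pos, neg in [((.5, 1), (1, .5)), ((.6, .9), (.9, .6)), ((.3, .9), (.9, .3))]:
    print(pos, posteriors(prior, *pos), "vs", neg, posteriors(prior, *neg))

# A symmetric structure against a positively skewed one.
print((.76, .76), posteriors(prior, .76, .76), "vs", (.3, .9), posteriors(prior, .3, .9))
```

With a prior of 0.5, each positively skewed structure yields a higher posterior after either signal than its mirror image. The symmetric structure (.76, .76) roughly matches (.3, .9) after the favorable signal but yields a lower posterior after the unfavorable one.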
Second, our results allow us to understand the relationship between preferences for skewed
information and related preferences. Consistent with previous studies, we find a preference for full
early resolution of information relative to late resolution of information. Moreover, we find that
individuals prefer Blackwell more informative signals to those that exhibit positive skew but are less
Blackwell informative. We find no evidence for preference for one-shot resolution of information.
Our results regarding skewness are consistent with a positive third derivative in the utility function
defined over compound lotteries (i.e. information structures); moreover, our results regarding
preference for early resolution are consistent with a convex utility function defined over compound
lotteries.
Third, we can use our findings to test the various assumptions and models discussed in Section 3.
Because individuals exhibit preferences over different information structures with the same prior,
they violate the reduction of compound lottery assumption, as discussed in Prediction 1. Such
behavior is also inconsistent with the predictions of Prediction 8. Moreover, looking at Prediction
2, we find that individuals do not satisfy time neutrality — individuals, both at an aggregate and
individual level, are not indifferent between (1,1) and (0,0).
Turning to the predictions of specific models, Hypotheses 9 and ?? tell us that if preferences
are in either Koszegi and Rabin’s or Ely, Frankel and Kamenica’s class, then for f = .5 individuals
should be indifferent between (p, q) and (q, p). In fact, at both aggregate and individual levels we
do not find such an indifference — people prefer the positively skewed structure.
A variety of models have the possibility of predicting preferences for skewed information, even
when f = .5. These include Gul’s model of Disappointment Aversion and Rank Dependent Utility.
However, Prediction 6 indicates that those preferences should also prefer (.5, .5) to (p+ ε, q+ ε) for
a ε close enough to .5. We find no evidence for this type of preference for clumping.22
Our data fails to be consistent with the approach of Dillenberger and Segal (2015) for much the
same reason. Prediction 7 requires that individuals prefer (.5, .5) to any other more informative
symmetric signal. We find that this is not the case, as most individuals prefer (.55, .55) to (.5, .5).
Our subjects also fail to exhibit either a strong type I error preference or a strong type II error
preference, and so fall outside of the scope of Eliaz and Spiegler (2006). This is because we find that
individuals’ preferences seem to respect Blackwell’s ordering, and so no signal structure that is
less than fully informative would be preferred to (1, 1).
In contrast, our data is generally consistent with the traditional model of Kreps and Porteus
(1978). Considering Prediction 3, we find that individuals behave in a way that is consistent with
the local utility function being convex in the model of Kreps and Porteus (1978). Moreover, given
Prediction 4 we know that if the derivative of the local utility function is also convex, then we can
rationalize the preference for positive skew.
In order to provide more context for these results, consider the Epstein-Zin parameterization
of the Kreps-Porteus model. Then $V_1(l) = \sum_{x \in l} u_1(x) l(x) = \sum_{x \in l} x^{\rho} l(x)$ while
$V_2(l) = \sum_{x \in l} u_2(x) l(x) = \sum_{x \in l} x^{\alpha} l(x)$. In this case the local utility function is convex if and only if
$u_1(u_2^{-1}(x))$ is convex, that is, $x^{\rho/\alpha}$ is convex, which is the same as $\rho \geq \alpha$. Similarly, the derivative of
22 One potential objection is that perhaps we did not set ε close enough to 0. In fact, simple calibrations show the power of our test, where ε = .05. Suppose V1 and V2 both have disappointment aversion representations (ui, βi). Moreover, suppose, as is plausible for our small stakes, that ui is linear. In this case, if an individual prefers (.55, .55) over (.5, .5), then for any plausible value of β2 (i.e. 0 ≤ β2 ≤ 100) β1 must be less than .01, or in other words, people must be ‘almost’ expected utility maximizers over gambles that resolve now, a fact inconsistent with the risk aversion we actually observe over small stakes. Similar calibrations can be done with rank-dependent utility.
the local utility function is convex if and only if the derivative of $u_1(u_2^{-1}(x))$ is convex. Given $\rho \geq \alpha$,
this is equivalent to $\rho \geq 2\alpha$. Thus, individuals must have a strong preference for early resolution.
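The role of the threshold ρ = 2α can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis. Prizes are normalized to 0 and 1, so a posterior π has second-stage certainty equivalent π^(1/α) and first-stage utility π^(ρ/α). The (p, q) convention is the one suggested by the appendix formulas (p is the probability of the favorable signal given the high outcome, q that of the unfavorable signal given the low outcome). The function name `kp_value` and the parameter values are hypothetical.

```python
def kp_value(f, p, q, rho, alpha):
    """Kreps-Porteus value of information structure (p, q) with prior f.

    Assumes prizes normalized to 0 and 1 and power utilities x**rho
    (first stage) and x**alpha (second stage), so each posterior pi
    contributes P(signal) * pi**(rho / alpha).
    """
    exponent = rho / alpha
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = 1 - prob_good
    value = 0.0
    if prob_good > 0:
        value += prob_good * (f * p / prob_good) ** exponent
    if prob_bad > 0:
        value += prob_bad * (f * (1 - p) / prob_bad) ** exponent
    return value

f, alpha = 0.5, 1.0
for rho in (1.5, 2.0, 3.0):
    pos = kp_value(f, .5, 1, rho, alpha)   # positively skewed (.5, 1)
    neg = kp_value(f, 1, .5, rho, alpha)   # negatively skewed (1, .5)
    print(f"rho={rho}: positive skew={pos:.4f}, negative skew={neg:.4f}")
```

In this example the positively skewed structure is strictly preferred when ρ exceeds 2α, the two are equally valued at ρ = 2α, and the ranking reverses for ρ between α and 2α.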
We can relate our restrictions to the larger literature estimating Kreps-Porteus and Epstein-Zin
preferences. Epstein-Zin preferences are used widely in macroeconomics and have been estimated
from a variety of data. We can ask how well our restrictions match the estimates
obtained from an entirely different domain. In fact, recent estimates are consistent with the
restrictions our observed preferences for skewness place on the data (i.e. the convexity of the first
derivative of $u_1(u_2^{-1}(x))$). For example, both Brown and Kim’s (2014) estimation, which relies on lab
experiments testing risk preferences, intertemporal elasticity, and preference for early resolution,
and Binsbergen et al.’s (2012) results, which use macroeconomic data, find that ρ ≥ 2α (much
greater in fact).
6.2 Literature Review
Here we review the literature that relates to our results. We discuss in detail both theoretical and
empirical approaches that touch on preferences for information, looking at preferences over skewness,
preferences for early versus late resolution, and preferences for one shot versus gradual resolution
separately. We discuss the literature for compound lotteries separately; even though preferences
over compound lotteries and information structures are theoretically linked, the decision framing
is quite different across the two domains.
Preferences for skewed information: There has been little empirical study of preferences for
skewness. In the first work looking at this subject we are aware of, Boiney (1993) finds a preference
for positively skewed compound lotteries, but in the context of ambiguity, rather than objective
lotteries. Treatment 4 of Eliaz and Schotter (2010) provides the only experimental investigation
of preferences over positively versus negatively skewed information structures that we are aware
of. In particular, in a two-stage compound lottery context where one option dominates the other
in all states of the world, they test whether individuals are willing to pay to obtain information
about the degree to which the dominating option is superior, even though this information will
not affect their ultimate choice. Overall, they find that individuals are willing to pay to learn
whether the option they are about to choose is strongly or mildly dominating the option they are
rejecting. In Treatment 4, they find that individuals prefer a negatively skewed signal over one that
is positively skewed, as well as over one that is uninformative. They argue that individuals demand
non-instrumental information in this setting in order to feel more confident about choosing the
dominating option. As a result, they predict and show, that individuals would not have demand
for any non-instrumental information after the choice is made. Relatedly, they also hypothesize
that demand for non-instrumental information would not exist if participants did not have a choice
to make. Note that we study demand for non-instrumental information in a context where people
neither make choices nor have any other form of agency in determining the outcome of interest. As
a result, their results are not applicable in our setting. In addition to this fundamental difference
in the proposed process, their design also has important differences that make it hard to generalize
their results to our domain. Most notably, there isn’t a delay between the receipt of information and
the full resolution of information in their experiment. As a result, regulating anticipatory emotions
is not likely to play a role in shaping preferences over information in their data. Therefore, the
focus of the present paper and that of Eliaz and Schotter (2010) are quite different.
Preferences for early versus late resolution: The literature beginning with Kreps and
Porteus (1978) has spawned a great deal of empirical testing of preferences over timing of resolution.
This literature has broadly found support for individuals preferring earlier to later resolution.
To organize our discussion of the empirical tests of these preferences, we distinguish three
branches of the literature. The first branch is motivated by macro-economic applications of the
Kreps-Porteus model, and in particular the Epstein-Zin parameterization. These models allow for
the separation of risk preferences from inter-temporal elasticity and so can accommodate a wider
range of data than traditional models. As such, the Epstein-Zin model was widely adopted by
macroeconomists to estimate both a risk preference parameter and an intertemporal elasticity using
either survey data or financial decision-making data. However, these papers do not directly test
for preferences over information. These preferences are indirectly inferred from the estimates of risk
preferences and inter-temporal elasticity. This literature offers mixed results concerning preferences
for early versus late resolution of uncertainty. For example, the data in the early investigations
provided by Epstein and Zin (1991) indicate a preference for late resolution of uncertainty. However,
more recent papers, such as Binsbergen et al. (2012), find a strong preference for early
resolution of uncertainty.
The second branch of the literature directly tests preferences over information structures, as
we do in this paper. However, most of this literature either asks hypothetical questions and/or
studies demand for information in contexts where information may be instrumentally valuable, for
example by providing planning benefits. For example, Arai (1997) explores whether individuals
have a preference for early or late resolution based on non-incentivized choices in a setting where
the information concerns future income/consumption. He finds an overall preference for early
resolution. Ahlbrecht and Weber (1997) study whether individuals’ choices are consistent with the
Kreps-Porteus model. They find that a plurality of individuals always prefer full early resolution
to full late resolution (we say plurality because they asked for subjects’ preferences for full early
versus full late multiple times). However, they find that individuals do not satisfy some ancillary
predictions of the Kreps-Porteus model when preferences are recursive and satisfy independence in
each period. Thus, they caution against interpreting their evidence as simply support for Kreps-
Porteus.
Chew and Ho (1994) test preferences for early versus late resolution using hypothetical real-
world scenarios. They find that most individuals prefer early resolution. Similarly, and again using
hypothetical choices, Lovallo and Kahneman (2000) find that subjects prefer early resolution of
information. Moreover, moving from gains to losses strengthens the preference for early resolution
of uncertainty; and, at least in the domain of gains, a negatively skewed prior (which is quite
distinct from a skewed information structure) induces a greater interest in speeding up resolution
for gains compared to positively skewed gambles. Ganguly and Tasoff (2014) find that individuals’
preference for earlier information grows with the size of the gain they face. Similarly, larger
losses lead to a preference for delaying information.
While most of the empirical data seems to suggest a preference for early versus late resolution of
uncertainty, Kocher, Krawczyk and Van Winden (2014) provide evidence of heterogeneity in preferences.
Using lottery tickets in the lab, they show that although many participants prefer lottery tickets
for an immediate drawing rather than one for the subsequent day, a substantial fraction actually
prefers delayed resolution.23 Moreover, Von Gaudecker et al. (2011), using survey evidence from
a representative sample in Holland, find that the median subject is essentially indifferent between
early and late resolution. They also demonstrate a large degree of heterogeneity in preferences.
It is possible to think of the first two branches of the literature as complementary because
the Kreps-Porteus and Epstein-Zin models predict a tight connection between preferences over the
resolution of uncertainty and the relative values of the risk parameter and elasticity of intertemporal
substitution. Brown and Kim (2014) directly test this connection by eliciting risk preferences,
discount rates, elasticity of intertemporal substitution and preferences for resolution of uncertainty.
They find a majority (approximately 60 percent) of the subjects have a preference for earlier
resolution of uncertainty. Only a negligible proportion of subjects prefer late resolution.
The third vein of the literature considers the effect of delaying (or speeding up) resolution of
uncertainty on actions of players that are indicative of their information preferences. In a
real-stakes investment task, van Winden, Krawczyk and Hopfensitz (2011) find that subjects invest more
in a risky investment if resolution is sooner. Erev and Haruvy (2010) find that subjects value a
delayed chance at winning a prize more than an immediate chance.
Although the literature has found general support for early resolution of uncertainty, there is
also substantial heterogeneity within the subject population of a given study, and across studies.
One reason for the heterogeneity across studies may be differences in framing.
Preferences for one-shot versus gradual resolution: Individuals may prefer to learn all
information at once (one-shot resolution) or to learn the information gradually, in increments over
time. This preference is distinct from the preference of late versus early resolution of uncertainty,
as it concerns lumping versus spreading out information.
23Note, however, that in their context there is some planning benefit to receiving the lottery early.
Using incentivized choices, although allowing for the possibility of instrumental value of in-
formation, Zimmerman (2013) finds no evidence that subjects are averse to gradual resolution of
information. On the other hand, when examining these preferences concerning the potential of out-
comes in the loss domain (electric shocks), Falk and Zimmerman (2014) do. Other papers examine
actions regarding financial investments that may reflect preferences regarding gradual resolution
of uncertainty. For example, Karlsson, Loewenstein and Seppi (2009) find that individuals check
portfolios more often if they hold a high prior than a low prior. Bellemare, Krause, Kroger, and
Zhang (2005) demonstrate that if individuals receive information more often about their risky in-
vestment, they tend to invest less in that option and favor a safe investment where such information
is essentially eliminated.
Preferences over compound lotteries: Some tests of the theory frame the choices as com-
pound lotteries, rather than information structures. Halevy (2007) finds that individuals prefer
one-shot lotteries to compound lotteries. Abdellaoui, Klibanoff and Placido (2015) elicit willing-
ness to pay for compound lotteries and one-shot lotteries and also find individuals to prefer one-shot
lotteries to compound lotteries. These results can be interpreted as individuals having preferences
for one-shot over gradual resolution of uncertainty. However, it isn’t clear whether subjects view a
one-stage lottery as an early resolving lottery or a late resolving lottery, therefore making it difficult
to fit these results into the information framework and draw such conclusions. Miao and Zhong
(2012) explicitly address this concern, allowing individuals to compare two stage lotteries, but
where either the first or second stage is degenerate. Doing this, they can distinguish a preference
for early versus late resolution. Interestingly, they find that individuals prefer compound lottery
structures that feature full late resolution to most other compound lottery structures — even to
those that induce full early resolution. They also find violations of independence of ≿_1.
7 Conclusion
Our results indicate individuals have strong preferences for right skewed information structures.
However, even without changing the number of outcomes or the number of signals, much work re-
mains to explore how these preferences may vary as the payoff differential across outcomes changes
or the prior probability of the high outcome changes. Moreover, there are different notions of
‘skewness’ in the literature. The most well known one, introduced by Menezes et al. (1980), involves
mean-variance preserving changes in the distribution that change the third moment (without chang-
ing the first two). Dillenberger and Segal (2013) introduce a different definition of skewed compound
lotteries. A third definition would define preferences over p and q rather than over the induced
compound lotteries. All three of these definitions coincide in our current setting. Future work can
disentangle which of these notions is most appropriate for evaluating preferences over information
structures.
Appendix A: Formal Definitions
This section will provide formal definitions for the theoretical discussion in the paper. In order to
link our discussion more closely to the existing literature, this Appendix will work with two-stage
compound lotteries, the set of which are equivalent to the set of prior, information structure pairs.
Formally, consider an interval [w, b] = X ⊂ R of money. Let ∆X be the set of all simple lotteries
on X. A lottery F ∈ ∆X is a function from X to [0, 1] such that $\sum_{x \in X} F(x) = 1$ and the number of
prizes with non-zero probability is finite. F(x) represents the probability assigned to the outcome x in
lottery F. For any lotteries F, G we let αF + (1−α)G be the lottery that yields x with probability
αF(x) + (1−α)G(x). Denote by δ_x the degenerate lottery that yields x with probability 1. Next,
denote by ∆(∆X) the set of simple lotteries over ∆X. For P, Q ∈ ∆(∆X) denote by R = αP + (1−α)Q
the lottery that yields simple (one-stage) lottery F with probability αP(F) + (1−α)Q(F). Denote
by D_F the compound lottery that is degenerate in the first stage and yields F with certainty. ≿ is a weak
order over ∆(∆X) which represents the decision-maker’s preferences over lotteries and is continuous
(in the weak topology). Moreover, we define a reduction function that maps compound lotteries
to reduced one-stage lotteries: $\phi(Q) = \sum_{F \in \Delta X} Q(F) F$.
Now consider the set of prior-information structure pairs, such that the prior f has support on
[w, b]. The main text discusses how to map a prior-information structure pair into a (unique) two
stage compound lottery. We now show that any given two-stage compound lottery maps into a
unique prior-information structure pair. Given a two-stage lottery P with support $p_1, \ldots, p_n$, we
first find the prior f from the reduced lottery:
$$\phi(P)(\omega_i) = f(\omega_i).$$
To identify I, observe that we have a set of equations
$$p_j(\omega_i) = \psi_j(\omega_i) = \frac{f(\omega_i) I_{ij}}{\sum_k f(\omega_k) I_{kj}},$$
along with the restrictions on the elements of I discussed in the main text (and with a known f). These form a
set of equations that generates a unique solution I.
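The forward direction of this mapping, from a prior and an information structure to a two-stage compound lottery, can be sketched as follows. The code is illustrative and assumes `info[i][j]` holds the probability of signal j in state i, so that each row sums to one. The function names `to_compound_lottery` and `reduce_lottery` are hypothetical.

```python
def to_compound_lottery(prior, info):
    """Map a prior f and information matrix I to a two-stage lottery.

    prior[i] is f(omega_i); info[i][j] is the probability of signal j in
    state i. Returns (signal probabilities, posteriors), where
    posteriors[j][i] = f(omega_i) I_ij / sum_k f(omega_k) I_kj.
    """
    n_states, n_signals = len(prior), len(info[0])
    signal_probs, posts = [], []
    for j in range(n_signals):
        pj = sum(prior[k] * info[k][j] for k in range(n_states))
        signal_probs.append(pj)
        # Posterior is undefined for a signal that never occurs.
        posts.append([prior[i] * info[i][j] / pj for i in range(n_states)] if pj > 0 else None)
    return signal_probs, posts

def reduce_lottery(signal_probs, posts):
    """Reduction: average the posteriors using the signal probabilities."""
    n_states = len(next(p for p in posts if p is not None))
    return [
        sum(sp * post[i] for sp, post in zip(signal_probs, posts) if post is not None)
        for i in range(n_states)
    ]

# Binary example: states (high, low), prior 0.5, structure (p, q) = (.3, .9).
prior = [0.5, 0.5]
info = [[0.3, 0.7],   # high state: favorable signal w.p. p = .3
        [0.1, 0.9]]   # low state: unfavorable signal w.p. q = .9
signal_probs, posts = to_compound_lottery(prior, info)
print(signal_probs, posts, reduce_lottery(signal_probs, posts))
```

Reducing the resulting compound lottery recovers the prior, which is the first step of the inverse mapping described above.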
We can now turn to discussing the formal properties and models related to our predictions,
using the framework of compound lotteries. First, reduction of compound lotteries implies that
individuals only care about the reduced one-stage lotteries that they face:
Reduction of Compound Lotteries: For all P,Q ∈ ∆(∆X) if φ(P ) = φ(Q) then
P ∼ Q.
In deriving additional predictions, it will be useful to formally define early and late resolving
lotteries:
• Γ = {DF |F ∈ ∆X}, the set of degenerate lotteries in ∆(∆X)
• Λ = {Q ∈ ∆(∆X)|Q(F ) > 0⇒ F = δx for some x ∈ X}, the set of compound lotteries whose
outcomes are degenerate in ∆X .
Early resolving lotteries have all uncertainty resolved in the first stage and so the second stage
lotteries are degenerate. These are equivalent to situations where the information structure reveals
all information in Period 1; thus, posteriors after observing the signal are degenerate. In contrast,
late resolving lotteries have all uncertainty resolved in the second stage and so their first stage is
degenerate. These are equivalent to situations where the information structure reveals no informa-
tion in Period 1. Thus, posteriors after receiving information are exactly the same as the priors
before receiving information. We define the restriction of ≿ to the subsets Γ and Λ as ≿_Γ and ≿_Λ.
Given these definitions, we can now state Time Neutrality.
Time Neutrality: If P ∈ Γ and Q ∈ Λ and φ(P ) = φ(Q) then P ∼ Q.
Grant, Kajii and Polak (1998) formally define a preference for early resolution of information
in the setting of compound lotteries as:
Definition: ≿ displays a preference for early resolution of uncertainty if for all Q, P ∈ ∆(∆X),
where $Q = \sum_{i=1}^{N} F_i q_i$ and
$P = \sum_{i=1}^{j-1} F_i q_i + G_1 \beta q_j + G_2 (1-\beta) q_j + \sum_{i=j+1}^{N} F_i q_i$ with β ∈ [0, 1]:
if $F_j = \beta G_1 + (1-\beta) G_2$ then $P \succsim Q$.
Grant, Kajii and Polak (1998) define a notion of “elementary linear bifurcations” which is
equivalent to a binary relation over compound lotteries. They show that one compound lottery is
an elementary linear bifurcation of another if and only if the former Blackwell dominates the latter.
Given a function V on the set of probability measures ∆(X), for each µ ∈ ∆(X) we
say that V is Gateaux differentiable at µ in ∆(X) if there is a measurable function v(·; µ) on X such
that for any ν in ∆(X) and any α ∈ (0, 1):
$$V(\alpha\nu + (1-\alpha)\mu) - V(\mu) = \alpha \int v(\zeta;\mu)\,[\nu(d\zeta) - \mu(d\zeta)] + o(\alpha)$$
where o(α) is a function with the property that o(α)/α → 0 as α → 0. v(·; µ) is the Gateaux
derivative of V at µ. V is Gateaux differentiable if V is Gateaux differentiable at all µ. We
call v(·; µ) the local utility function at µ.
We next turn to discussing recursivity.
Recursivity: For all F, G ∈ ∆X, all Q ∈ ∆(∆X) and α ∈ (0, 1), D_F ≿ D_G if and only
if αD_F + (1−α)Q ≿ αD_G + (1−α)Q.
As discussed in the text recursivity is useful because decision-makers with recursive prefer-
ences evaluate compound lotteries using a folding-back procedure — preferences over two stage
lotteries can be evaluated using preferences over one stage lotteries. Decision-makers replace the
second stage of any given compound lottery by the certainty equivalent generated by ≿_Γ. The
resulting lottery is evaluated using ≿_Λ. For example, suppose that ≿_Γ and ≿_Λ both satisfy
Independence, and denote the Bernoulli utility functions used to evaluate them as u_Γ and u_Λ,
respectively. In order to calculate the value of P (Figure ??) the decision-maker first evaluates
the possible second stage lotteries separately. Thus, she evaluates F according to u_Γ and finds
the certainty equivalent $u_\Gamma^{-1}(0.75 u_\Gamma(1) + 0.25 u_\Gamma(0))$. She also evaluates F′ according to u_Γ and
finds the certainty equivalent $u_\Gamma^{-1}(0.25 u_\Gamma(1) + 0.75 u_\Gamma(0))$. In order to evaluate P, she substitutes
for F and F′ their respective certainty equivalents. This generates a one-stage lottery that with
probability 0.5 gives outcome $u_\Gamma^{-1}(0.75 u_\Gamma(1) + 0.25 u_\Gamma(0))$ and with probability 0.5 gives outcome
$u_\Gamma^{-1}(0.25 u_\Gamma(1) + 0.75 u_\Gamma(0))$. She then evaluates this lottery using u_Λ.
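The folding-back procedure in this example can be written out in a few lines. The sketch below assumes both stages satisfy Independence. The specific utilities (linear second-stage utility and quadratic first-stage utility) are hypothetical choices made only for illustration, as are the function names.

```python
def certainty_equivalent(lottery, u, u_inv):
    """Certainty equivalent of a one-stage lottery [(outcome, prob), ...]."""
    return u_inv(sum(prob * u(x) for x, prob in lottery))

def fold_back(compound, u_gamma, u_gamma_inv, u_lambda):
    """Value of a two-stage lottery [(second_stage_lottery, prob), ...].

    Each second-stage lottery is replaced by its certainty equivalent
    under u_gamma; the resulting one-stage lottery is evaluated with u_lambda.
    """
    return sum(
        prob * u_lambda(certainty_equivalent(lot, u_gamma, u_gamma_inv))
        for lot, prob in compound
    )

# Hypothetical utilities: linear in the second stage, convex in the first.
u_gamma = lambda x: x
u_gamma_inv = lambda v: v
u_lambda = lambda x: x ** 2

F = [(1, 0.75), (0, 0.25)]
F_prime = [(1, 0.25), (0, 0.75)]
P = [(F, 0.5), (F_prime, 0.5)]                   # partially informative
late = [([(1, 0.5), (0, 0.5)], 1.0)]             # no information in Period 1
early = [([(1, 1.0)], 0.5), ([(0, 1.0)], 0.5)]   # full information in Period 1

v_P = fold_back(P, u_gamma, u_gamma_inv, u_lambda)
v_late = fold_back(late, u_gamma, u_gamma_inv, u_lambda)
v_early = fold_back(early, u_gamma, u_gamma_inv, u_lambda)
print(v_late, v_P, v_early)
```

With these illustrative utilities the composition of u_Λ with the inverse of u_Γ is convex, so the three lotteries are ranked by informativeness: full early resolution is valued most and full late resolution least.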
We say that a preference over two stage lotteries has a recursive representation (V1, V2) if the
preference can be represented with a functional V , such that V is derived using V1 and V2 in the
folding-back procedure described above. More formally, let F denote a one stage lottery, which gives
outcome x_i with probability f(x_i), which we can represent as F = (x_1, f(x_1); . . . ; x_n, f(x_n)).
For a two-stage lottery P, let p(F_i) denote the probability of receiving lottery F_i in the second stage,
and represent P as (F_1, p(F_1); . . . ; F_n, p(F_n)). Last, let CE_2(F) denote the certainty equivalent of
F using ≿_Γ.
Definition 1 Suppose preferences over two-stage lotteries can be represented by V. We say
preferences have a recursive representation (V_1, V_2), where V_1 and V_2 are utility functions over
one-stage lotteries, if and only if for all P = (F_1, p(F_1); . . . ; F_n, p(F_n)), it is the case that
V(P) = V_1(CE_2(F_1), p(F_1); . . . ; CE_2(F_n), p(F_n)).
Independence within Γ and Λ is defined as per standard for any one-stage lottery.
We now sketch out some of the functional forms that are relevant for our predictions. If
preferences satisfy recursivity, we can represent them using V_Λ and V_Γ. Because V_i for i ∈ {Λ, Γ}
is defined over two-stage lotteries that are isomorphic to one stage lotteries, we can simply define
V_i using one stage lotteries. If Period 1 and Period 2 preferences both satisfy Independence then
$V_i(F) = \sum_x u(x) F(x)$ for i ∈ {Λ, Γ}. We refer to these as the Kreps-Porteus class of preferences.
If preferences in both periods are in Gul’s class of disappointment aversion, then
$$V_i(F) = \sum_x u(x) F(x) + \beta \sum_{x \le u^{-1}(V_G(F))} (u(x) - V_G(F)) F(x)$$
where u is a function mapping from wealth to the reals, and β is a scalar. Individuals are
disappointment averse if and only if β ≥ 0. If preferences in both periods are in the rank dependent
class then
$$V_i(F) = \sum_x u(x) \left[ w\Big(\sum_{y \ge x} F(y)\Big) - w\Big(\sum_{y > x} F(y)\Big) \right]$$
where u is a function mapping from wealth to the reals, and w is a function mapping from [0, 1] to
[0, 1], such that w(0) = 0, w(1) = 1 and w is strictly increasing. Individuals are pessimistic if and
only if w is convex.
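As an illustration of the rank-dependent form above, the following Python sketch evaluates it for a finite lottery. The function name `rdu_value`, the example lottery, and the weighting function $w(t) = t^2$ are assumptions chosen for illustration and do not come from the paper.

```python
def rdu_value(outcomes, probs, u, w):
    """Rank-dependent utility of a finite lottery.

    Implements sum_x u(x) * [w(P(Y >= x)) - w(P(Y > x))], assuming the
    outcomes are distinct. All names here are illustrative.
    """
    pairs = sorted(zip(outcomes, probs))  # ascending in the outcome
    total = 0.0
    for k, (x, _) in enumerate(pairs):
        prob_at_least = sum(pr for _, pr in pairs[k:])    # P(Y >= x)
        prob_above = sum(pr for _, pr in pairs[k + 1:])   # P(Y > x)
        total += u(x) * (w(prob_at_least) - w(prob_above))
    return total


identity = lambda t: t
convex_w = lambda t: t ** 2  # assumed convex (pessimistic) weighting

# A 50-50 lottery over the normalized outcomes 0 and 1.
eu_value = rdu_value([0, 1], [0.5, 0.5], identity, identity)
pessimistic_value = rdu_value([0, 1], [0.5, 0.5], identity, convex_w)
```

With a linear w the expression reduces to expected utility, while under the assumed convex w the same lottery is valued below its expected utility.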
Because Dillenberger and Segal’s (2015) conditions on preferences are quite subtle (beyond the
assumption of recursivity) we refer interested readers to their discussion. Moreover, since Brunner-
meier and Parker’s (2005) model predicts that all information structures should be indifferent to
one another, we direct the interested reader to their paper for the details of their functional form.
We next summarize Koszegi and Rabin’s functional form. Let η be a gain-loss functional, κ a scalar
weight on expected utility, and ν a scalar weight on first-period gain-loss utility. Given a
distribution h over the payoff across states and any percentile $p \in (0, 1)$, let $u(\omega_h(p))$ denote the utility of
the payoff level at percentile p. Then the functional form is:24
$$\begin{aligned}
&\kappa E_f(u(\omega_i)) + \nu \sum_i f(\omega_i) I_{ij} \int_0^1 \eta\bigl(u(\omega_{\psi_j}(p)) - u(\omega_f(p))\bigr)\,dp \\
&+ \sum_i f(\omega_i) I_{ij} \psi_j(\omega_i) \int_0^1 \eta\bigl(u(\omega_i(p)) - u(\omega_{\psi_j}(p))\bigr)\,dp
\end{aligned}$$
Because this is complicated, we will define the function for our simple binary-binary setup.
Normalizing the Bernoulli utility of the high and low outcomes to 1 and 0, respectively, the total utility of an
information structure is:
$$\begin{aligned}
&\kappa(1 \cdot f + (1-f) \cdot 0) \\
&+ \nu\,\eta(1-0) \left( \frac{fp}{fp+(1-q)(1-f)} - f \right) \bigl(fp+(1-q)(1-f)\bigr) \\
&+ \nu\,\eta(0-1) \left( f - \frac{f(1-p)}{f(1-p)+q(1-f)} \right) \bigl(f(1-p)+q(1-f)\bigr) \\
&+ \bigl(fp+(1-f)(1-q)\bigr) \frac{fp}{fp+(1-f)(1-q)}\,\eta(1-0)\,\frac{(1-q)(1-f)}{fp+(1-q)(1-f)} \\
&+ \bigl(fp+(1-f)(1-q)\bigr) \left(1 - \frac{fp}{fp+(1-f)(1-q)}\right) \eta(0-1)\,\frac{fp}{fp+(1-q)(1-f)} \\
&+ \bigl((1-p)f+q(1-f)\bigr) \frac{f(1-p)}{(1-p)f+q(1-f)}\,\eta(1-0)\,\frac{q(1-f)}{f(1-p)+q(1-f)} \\
&+ \bigl((1-p)f+q(1-f)\bigr) \left(1 - \frac{f(1-p)}{(1-p)f+q(1-f)}\right) \eta(0-1)\,\frac{f(1-p)}{(1-p)f+q(1-f)}
\end{aligned}$$
24 Denoting beliefs in Period 0 as $f$ (our prior) and the beliefs in Period 1 (after receiving signal $s_j$) as $\psi_j$.
The last functional form we consider is Ely, Frankel and Kamenica (2014). They have two
models, both of which deliver the same predictions regarding skewness. We accommodate the more
general forms of their models and allow for individuals to care both about an expected utility portion
of their beliefs, as well as suspense or surprise. We also allow individuals to weight suspense and
surprise differently across periods. We denote ϑ as a function that turns suspense and surprise into
utils. As before, we also have a scalar weight on expected utility κ and a scalar weight on first-period
gain-loss utility of ν.
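We first consider their model of suspense, where utility is given by:
$$\begin{aligned}
&\kappa E_f(u(\omega_i)) + \nu\,\vartheta\Bigl(\sum_j f(\omega_i) I_{ij} \sum_i \bigl(\psi_j(\omega_i) - f(\omega_i)\bigr)^2\Bigr) \\
&+ \sum_j f(\omega_i) I_{ij}\,\vartheta\Bigl(\sum_i \psi_j(\omega_i) \sum_i \bigl(I - \psi_j(\omega_i)\bigr)^2\Bigr)
\end{aligned}$$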
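Simplifying in our binary-binary setup, we get:
$$\begin{aligned}
&\kappa(1 \cdot f + (1-f) \cdot 0) \\
&+ \nu\,\vartheta\Biggl( \bigl(fp+(1-q)(1-f)\bigr)\,2\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)^2 + \bigl(f(1-p)+q(1-f)\bigr)\,2\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)^2 \Biggr) \\
&+ \bigl(fp+(1-f)(1-q)\bigr)\,\vartheta\Biggl( \frac{fp}{fp+(1-f)(1-q)}\,2\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)^2 + \left(1-\frac{fp}{fp+(1-f)(1-q)}\right) 2\left(\frac{fp}{fp+(1-f)(1-q)}\right)^2 \Biggr) \\
&+ \bigl((1-p)f+q(1-f)\bigr)\,\vartheta\Biggl( \frac{f(1-p)}{(1-p)f+q(1-f)}\,2\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2 + \left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right) 2\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2 \Biggr)
\end{aligned}$$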
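EFK also provide a model of surprise:
$$\begin{aligned}
&\kappa E_f(u(\omega_i)) + \nu \sum_j f(\omega_i) I_{ij}\,\vartheta\Bigl(\sum_i \bigl(\psi_j(\omega_i) - f(\omega_i)\bigr)^2\Bigr) \\
&+ \sum_j f(\omega_i) I_{ij} \sum_i \psi_j(\omega_i)\,\vartheta\Bigl(\sum_i \bigl(I - \psi_j(\omega_i)\bigr)^2\Bigr)
\end{aligned}$$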
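In our binary-binary setting this becomes:
$$\begin{aligned}
&\kappa(1 \cdot f + (1-f) \cdot 0) \\
&+ \nu\bigl(fp+(1-q)(1-f)\bigr)\,\vartheta\left(2\left(\frac{fp}{fp+(1-q)(1-f)} - f\right)^2\right) \\
&+ \nu\bigl(f(1-p)+q(1-f)\bigr)\,\vartheta\left(2\left(f - \frac{f(1-p)}{f(1-p)+q(1-f)}\right)^2\right) \\
&+ \bigl(fp+(1-f)(1-q)\bigr)\frac{fp}{fp+(1-f)(1-q)}\,\vartheta\left(2\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)^2\right) \\
&+ \bigl(fp+(1-f)(1-q)\bigr)\left(1-\frac{fp}{fp+(1-f)(1-q)}\right)\vartheta\left(2\left(\frac{fp}{fp+(1-f)(1-q)}\right)^2\right) \\
&+ \bigl((1-p)f+q(1-f)\bigr)\frac{f(1-p)}{(1-p)f+q(1-f)}\,\vartheta\left(2\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2\right) \\
&+ \bigl((1-p)f+q(1-f)\bigr)\left(1-\frac{f(1-p)}{(1-p)f+q(1-f)}\right)\vartheta\left(2\left(\frac{f(1-p)}{(1-p)f+q(1-f)}\right)^2\right)
\end{aligned}$$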
Appendix B: Proofs
Before proving the results in the text, we prove a useful lemma.
Lemma A Suppose $f = .5$. Then if $x < y$, the posterior distribution induced by $(y, x)$ has more
downside risk, in the sense of Menezes, Geiss and Tressler (1980), than that induced by $(x, y)$.
Proof We prove that, given $x < y$ and a prior of .5, the posterior distribution induced by $(x, y)$ is a
mean-variance preserving transformation of that induced by $(y, x)$. In other words, the posterior
distributions have the same mean and variance, but the former has greater skew than the latter,
so the posterior distribution induced by the latter has more downside risk than the former. We
denote the cdf of the posterior distribution induced by the former as $F$ and of the latter as $G$.
First, observe that the two distributions induce the same mean posterior, by the law of iterated
expectations, which is simply the prior.
Second, we will show that $\int_0^x \int_0^z (G(y) - F(y))\,dy\,dz > 0$ for $x < 1$ and $\int_0^1 \int_0^z (G(y) - F(y))\,dy\,dz = 0$.
Observe that $G$ is equal to 0 on the range $\left[0, \frac{1-y}{1-y+x}\right)$, then $.5(x + 1 - y)$ on the range
$\left[\frac{1-y}{1-y+x}, \frac{y}{y+1-x}\right)$, and 1 on the range $\left[\frac{y}{y+1-x}, 1\right]$. Observe that $F$ is equal to 0 on the range $\left[0, \frac{1-x}{1+y-x}\right)$,
then $.5(1 - x + y)$ on the range $\left[\frac{1-x}{1+y-x}, \frac{x}{1+x-y}\right)$, and 1 on the range $\left[\frac{x}{1+x-y}, 1\right]$.
We denote region $A$ as $\left[0, \frac{1-y}{1-y+x}\right)$; $B$ as $\left[\frac{1-y}{1-y+x}, \frac{1-x}{1+y-x}\right)$; $C$ as $\left[\frac{1-x}{1+y-x}, \frac{y}{y+1-x}\right)$; $D$ as $\left[\frac{y}{y+1-x}, \frac{x}{1+x-y}\right)$;
and $E$ as $\left[\frac{x}{1+x-y}, 1\right]$.
Thus $G(y) - F(y)$ is 0 on $A$, then $.5(1 + x - y)$ on $B$, then $x - y$ on $C$, then $.5(1 + x - y)$ again
on $D$, and then 0 on $E$.
This implies that $\int_0^z (G(y) - F(y))\,dy$ is 0 for $z \in A$. For $z \in B$ it is $.5[1+x-y]z - .5(1-y)$.
For $z \in C$ it is $(z - .5)(x - y)$. For $z \in D$ it is $.5[1 + x - y]z - .5x$. For $z \in E$ it is 0.
We will now divide $C$ into two separate intervals: $C_1 = \left[\frac{1-x}{1+y-x}, .5\right)$ and $C_2 = \left[.5, \frac{y}{y+1-x}\right)$.
Observe that $\int_0^z (G(y) - F(y))\,dy$ is weakly greater than 0 when $z$ is in $A$, $B$ and $C_1$. Similarly,
$\int_0^z (G(y) - F(y))\,dy$ is weakly less than 0 when $z$ is in $C_2$, $D$ and $E$.
Thus, we simply need to show that $\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz > 0$ and that
$\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz = -\int_{.5}^1 \int_0^z (G(y) - F(y))\,dy\,dz$, and we will have shown both parts.
Observe that
$$\int_0^{.5} \int_0^z (G(y) - F(y))\,dy\,dz = \frac{1}{8} \left[1 - \frac{2(1-y)}{1+x-y}\right] \left[\frac{(1+x-y)(1-x)}{1-x+y} - (1-y)\right].$$
Moreover,
$$\int_{.5}^1 \int_0^z (G(y) - F(y))\,dy\,dz = \frac{1}{8} \left[\frac{2x}{1+x-y} - 1\right] [x - y] \left[\frac{2y}{1+y-x} - 1\right].$$
Routine algebra shows that the first is then equal to $\frac{1}{8} \left[-\frac{1-x-y}{1+x-y}\right] \left[\frac{y^2-y-x^2+x}{1-x+y}\right]$ and the second is
equal to $\frac{1}{8} \left[-\frac{1-x-y}{1+x-y}\right] \left[\frac{-y^2+y+x^2-x}{1-x+y}\right]$.
Thus, by Menezes, Geiss and Tressler (1980), the posterior distribution $G$ has more downside
risk than the posterior distribution $F$. □
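The moment claims in Lemma A can be checked numerically. The Python sketch below is illustrative only: the helper names and the values $x = .6$, $y = .8$ are assumptions, not part of the paper. It computes the two posterior distributions at a prior of .5 and compares their central moments.

```python
def posterior_distribution(p, q, f=0.5):
    """Return [(posterior, probability)] for the good and the bad signal."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    return [(f * p / prob_good, prob_good), (f * (1 - p) / prob_bad, prob_bad)]


def central_moment(dist, order):
    """Central moment of the given order for a finite distribution."""
    mean = sum(value * prob for value, prob in dist)
    return sum((value - mean) ** order * prob for value, prob in dist)


x, y = 0.6, 0.8  # assumed illustrative values with x < y
F = posterior_distribution(x, y)  # posteriors induced by (x, y)
G = posterior_distribution(y, x)  # posteriors induced by (y, x)

same_variance = abs(central_moment(F, 2) - central_moment(G, 2)) < 1e-12
f_more_skewed = central_moment(F, 3) > central_moment(G, 3)
```

For these values both checks evaluate to true: the two distributions share a mean of .5 and a common variance, and the third central moment is larger under $(x, y)$.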
Lemma 1 For any (p, q) ∈ S, observing a good signal increases the posterior on high outcome
relative to the prior, and observing a bad signal decreases the posterior on high outcome relative to
the prior.
Proof We will prove each part of the Lemma in turn. First we prove the first part. Recall that
for a given prior $0 < f < 1$ on a high payoff and information structure $(p, q)$, the posterior for the
high payoff given the good signal is
$$\psi_F = \frac{fp}{fp + (1-f)(1-q)}.$$
Now $\psi_F > f$ if and only if
$$\frac{fp}{fp + (1-f)(1-q)} > f,$$
which holds if and only if
$$(1-f)p > (1-f) - (1-f)q,$$
which is the same as
$$p + q > 1.$$
An analogous series of steps establishes the result for the posterior after observing a bad signal,
$$\psi_R = \frac{f(1-p)}{f(1-p) + (1-f)q}.$$
□
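A quick numerical check of Lemma 1 can be sketched in Python as follows. The function names and the parameter values in the usage note are assumptions made for illustration.

```python
def posteriors(p, q, f):
    """Posterior on the high outcome after a good and after a bad signal."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    return f * p / prob_good, f * (1 - p) / prob_bad


def signals_are_ordered(p, q, f):
    """True when the good signal raises and the bad signal lowers the posterior."""
    psi_good, psi_bad = posteriors(p, q, f)
    return psi_good > f > psi_bad
```

For instance, `signals_are_ordered(0.7, 0.6, 0.3)` holds because $p + q > 1$, whereas `signals_are_ordered(0.3, 0.4, 0.3)` fails because $p + q < 1$.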
Lemma 2 For any signal structure $(p', q') \in [0, 1] \times [0, 1]$, there exists a $(p, q) \in S$ that generates
the same posterior distribution. However, for any $T \subset S$ there exists a $(p', q') \in S$ such that there
is no element of $T$ that generates the same posterior distribution as $(p', q')$.
Proof Assume that $p + q < 1$ (observe that all signal structures on $p + q = 1$ give the same posterior
distribution). In this case, denote $p' = 1 - p$ and $q' = 1 - q$. We will work with likelihood ratios rather
than posterior beliefs. Under $(p, q)$, likelihood ratio $\frac{p}{1-q}$ occurs with probability $fp + (1-f)(1-q)$
and likelihood ratio $\frac{1-p}{q}$ occurs with probability $f(1-p) + (1-f)q$.
Under $(p', q')$, likelihood ratio $\frac{1-p'}{q'} = \frac{p}{1-q}$ occurs with probability $f(1-p') + (1-f)q' =
fp + (1-f)(1-q)$. Likelihood ratio $\frac{p'}{1-q'} = \frac{1-p}{q}$ occurs with probability $fp' + (1-f)(1-q') =
f(1-p) + (1-f)q$. Therefore $(p', q')$ generates the same posterior distribution as $(p, q)$. Moreover,
$p' + q' = (1-p) + (1-q) = 2 - p - q \ge 1$ since $p + q \le 1$. Therefore, instead of considering
some $(p, q)$ we can always consider the corresponding $p' = 1 - p$, $q' = 1 - q$. This proves
the first part.
To prove the second part, observe that in order for two signal structures $(p, q)$ and $(p', q')$ to
generate the same posteriors (so that for both signal structures, a fair weather prediction increases
the posterior relative to the prior and a rainy weather prediction decreases it) it must be the case
that $\frac{p'}{1-q'} = \frac{p}{1-q}$ and $\frac{1-p'}{q'} = \frac{1-p}{q}$.
Therefore $p' - p'q = p - pq'$ and $q - p'q = q' - pq'$, which is equivalent to $q = \frac{-p + pq' + p'}{p'}$ and
$q = \frac{q' - pq'}{1-p'}$. Simplifying, we have $\frac{-p + pq' + p'}{p'} = \frac{q' - pq'}{1-p'}$, or $p'q' - pq'p' = -p + pq' + p' + pp' - pp'q' - p'^2$.
This holds if and only if $p'q' = -p + pq' + p' + pp' - p'^2$, or $p(1 - q' - p') = -p'q' + p' - p'^2 = p'(1 - q' - p')$.
This equality is true if and only if $p = p'$ or $q' + p' = 1$. □
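The relabeling argument in the first part of the proof can be illustrated with a short Python sketch. The names below and the example values are assumptions for illustration; the check compares the posterior distribution of $(p, q)$ with that of $(1 - p, 1 - q)$.

```python
def posterior_distribution(p, q, f):
    """Sorted list of (posterior, probability) pairs induced by (p, q)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    pairs = [(f * p / prob_good, prob_good), (f * (1 - p) / prob_bad, prob_bad)]
    return sorted(pairs)


def same_distribution(dist_a, dist_b, tol=1e-12):
    """Compare two sorted posterior distributions up to a tolerance."""
    return all(
        abs(va - vb) < tol and abs(pa - pb) < tol
        for (va, pa), (vb, pb) in zip(dist_a, dist_b)
    )


# A structure with p + q < 1 and its mirrored counterpart with p' + q' > 1.
original = posterior_distribution(0.3, 0.2, f=0.4)
mirrored = posterior_distribution(0.7, 0.8, f=0.4)
```

Here `same_distribution(original, mirrored)` is true: swapping the labels of the two signals leaves the distribution over posteriors unchanged.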
Lemma 3 $(p', q')$ Blackwell dominates (is Blackwell more informative than) $(p, q)$ if and only if
$p' \ge \max\left\{ \frac{p}{1-q}(1 - q'),\; 1 - q'\frac{1-p}{q} \right\}$.
Proof Recall that one signal structure (p′, q′) is Blackwell more informative than another (p, q)
if and only if the distribution of posteriors induced by (p′, q′) is a mean preserving spread of the
distribution induced by (p, q). By the law of iterated expectations, the expected posterior under
(p′, q′) and (p, q) must be the same — the prior. Because there are only 2 signals (and so 2 posteriors)
as well as only 2 states, the problem reduces to showing that the posteriors under (p′, q′) are more
extreme (in the sense that they are farther from the prior) than the posteriors under (p, q). In
order to simplify the proofs, we will show an equivalent result — that the likelihood ratios under
(p′, q′) are more extreme (farther from 1) than the likelihood ratios under (p, q).
The likelihood ratios after observing a fair signal under $(p', q')$ and $(p, q)$ are (respectively) $\frac{p'}{1-q'}$
and $\frac{p}{1-q}$, while the likelihood ratios after observing a rainy signal are $\frac{1-p'}{q'}$ and $\frac{1-p}{q}$.
In order for the ratios under $(p', q')$ to be farther from 1 than those under $(p, q)$, we need $\frac{p'}{1-q'} \ge \frac{p}{1-q}$ and
$\frac{1-p'}{q'} \le \frac{1-p}{q}$. This is equivalent to $p' \ge \frac{p}{1-q} - \frac{p}{1-q} q'$ and $p' \ge 1 - q'\frac{1-p}{q}$. □
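The condition in Lemma 3 translates directly into a small predicate. The Python sketch below is an illustration under the assumption that $0 < q < 1$ for the dominated structure (so that both ratios are defined); the function name is hypothetical.

```python
def blackwell_dominates(p_new, q_new, p, q):
    """Check p' >= max{ p/(1-q) * (1-q'), 1 - q' * (1-p)/q } from Lemma 3.

    (p_new, q_new) plays the role of (p', q'); assumes 0 < q < 1.
    """
    bound_good = p / (1 - q) * (1 - q_new)
    bound_bad = 1 - q_new * (1 - p) / q
    return p_new >= max(bound_good, bound_bad)
```

For example, `blackwell_dominates(1, 1, 0.7, 0.7)` is true, while neither of the skewed structures `(0.6, 0.8)` and `(0.8, 0.6)` dominates the other under this test.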
Prediction 1 Fixing f , the decision maker should be indifferent between all information structures.
Proof Under reduction of compound lotteries this is true by definition. □
Prediction 2 Fixing f , the decision-maker should be indifferent between (1, 1) and (.5, .5).
Proof Under time neutrality this is true by definition. □
Prediction 3 Let $\succsim_f$ be represented by $V$, where $V$ is Gateaux differentiable. Then the local utility
function of $V$ is everywhere convex (concave) if and only if the decision-maker prefers Blackwell
more (less) informative information structures.
Proof This is proved by Grant, Kajii and Polak (1998). □
Prediction 4 Let $\succsim_{.5}$ be represented by $V$, where $V$ is Gateaux differentiable. If the local util-
ity function of $V$ is thrice differentiable and has a convex (concave) derivative everywhere, then
$(x, y) \succsim_{.5} (\precsim_{.5})\ (y, x)$ whenever $x \le y$.
Proof Assume that all local utility functions are thrice differentiable and have a positive third
derivative. Denote the local utility function by $v(\cdot\,; \mu)$. Fix two lotteries $Z_1$ and $Z_0$ induced by information
structures with a prior of .5, where $Z_0$ generates a posterior distribution which has
more downside risk than $Z_1$. We simply need to show that $W(Z_1) - W(Z_0) \ge 0$. Let
$Z(\alpha) = \alpha Z_1 + (1 - \alpha) Z_0$. By Grant, Kajii and Polak (pg 255), $\frac{d}{d\alpha} W(Z(\alpha))\big|_{\alpha=\beta}$ exists for any $\beta$ in
$(0, 1)$ and is equal to $\int v(\mu; Z(\beta))\,(Z_1(d\mu) - Z_0(d\mu))$. Observe that this is simply the expected value
of $v$ under $Z_1$ less the expected value of $v$ under $Z_0$. By Theorem 2 of Menezes, Geiss and Tressler
(1980) this is positive for any $\beta \in (0, 1)$. Integrating with respect to $\beta$ yields $W(Z(1)) - W(Z(0)) \ge 0$,
which gives the required result since $W(Z(1)) = W(Z_1)$ and $W(Z_0) = W(Z(0))$. □
Prediction 5 Suppose preferences have a recursive representation $(V_1, V_2)$ such that $V_1$ and $V_2$
have expected utility representations with Bernoulli utilities $u_1$ and $u_2$. Then $u_1 \circ u_2^{-1}$ is convex
(concave) if and only if the decision-maker prefers Blackwell more (less) informative information structures.
Moreover, if the derivative of $u_1 \circ u_2^{-1}$ is convex (concave), then $(x, y) \succsim_{.5} (\precsim_{.5})\ (y, x)$ whenever
$x \le y$.
Proof The first relationship, between Blackwell informativeness and convexity/concavity, is
proved in Grant, Kajii and Polak (1998).
For the next part, denoting $\tau = u_1 \circ u_2^{-1}$, the utility of $(p, q)$ is simply
$$\tau\left(\frac{fp}{fp+(1-f)(1-q)}\right)\bigl(fp + (1-f)(1-q)\bigr) + \tau\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)\bigl((1-p)f + q(1-f)\bigr).$$
Observe that this implies the individual is
an EU maximizer over a utility function defined over their posteriors. We know that, given $x < y$ and a
prior of .5, $(y, x)$ has more downside risk than $(x, y)$. Thus by Theorem 2 of Menezes, Geiss and
Tressler (1980), if the third derivative of $\tau$ is positive then $(x, y)$ must be preferred to $(y, x)$. □
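To make the role of the third derivative concrete, the Python sketch below evaluates the expression above for two assumed choices of $\tau$: a cubic (positive third derivative) and a quadratic (zero third derivative). The function name, the two $\tau$ functions, and the values $x = .6$, $y = .8$ are illustrative assumptions.

```python
def kreps_porteus_value(p, q, tau, f=0.5):
    """Expected value of tau over the posteriors induced by (p, q)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    post_good = f * p / prob_good
    post_bad = f * (1 - p) / prob_bad
    return prob_good * tau(post_good) + prob_bad * tau(post_bad)


cubic = lambda t: t ** 3      # assumed tau: convex on [0, 1], tau''' > 0
quadratic = lambda t: t ** 2  # assumed tau: convex, tau''' = 0

x, y = 0.6, 0.8
skew_gap_cubic = kreps_porteus_value(x, y, cubic) - kreps_porteus_value(y, x, cubic)
skew_gap_quadratic = kreps_porteus_value(x, y, quadratic) - kreps_porteus_value(y, x, quadratic)
```

Under the cubic $\tau$ the gap is strictly positive, so $(x, y)$ is preferred to $(y, x)$; under the quadratic $\tau$ the gap vanishes, since only the mean and variance of the posteriors matter in that case.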
Prediction 6 Suppose preferences have a recursive representation $(V_1, V_2)$ such that $V_1$ and $V_2$
are both in Gul’s class of disappointment aversion functionals (or rank-dependent utility) and the
decision-maker is disappointment averse (has a strictly convex weighting function). Then there
exists an $\varepsilon' > 0$ such that for all $0 < \varepsilon < \varepsilon'$, $(.5, .5) \succ_{.5} (.5 + \varepsilon, .5 + \varepsilon)$.
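Proof Recall that we have representations in period 1 and period 2 which are $(u_1, \beta_1)$ and $(u_2, \beta_2)$.
If we get a good signal, the utility is
$$\frac{\frac{fp}{fp+(1-f)(1-q)} + 0}{1 + \beta_2 \frac{(1-f)(1-q)}{fp+(1-f)(1-q)}} = \frac{fp}{fp + (1+\beta_2)(1-q)(1-f)}.$$
If we get a bad signal, the utility is $\frac{(1-p)f}{(1-p)f + (1+\beta_2)q(1-f)}$. Iterating forward to period 1, and denoting $\tau = u_1 \circ u_2^{-1}$, we have
$$\frac{\tau\left(\frac{fp}{fp+(1+\beta_2)(1-q)(1-f)}\right)\bigl(fp + (1-q)(1-f)\bigr) + \tau\left(\frac{(1-p)f}{(1-p)f+(1+\beta_2)(1-f)q}\right)(1+\beta_1)\bigl((1-p)f + (1-f)q\bigr)}{1 + \beta_1\bigl((1-p)f + (1-f)q\bigr)}.$$
Taking the value at $f = .5$ and setting $p = q$:
$$\frac{\tau\left(\frac{p}{p+(1+\beta_2)(1-p)}\right)(p + (1-p))\,.5 + \tau\left(\frac{1-p}{(1-p)+(1+\beta_2)p}\right)(1+\beta_1)((1-p) + p)\,.5}{1 + .5\beta_1((1-p) + p)} = \frac{\tau\left(\frac{p}{1+\beta_2-\beta_2 p}\right) + \tau\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)}{2 + \beta_1}.$$
As $p$ goes to .5, this approaches
$$\tau\left(\frac{.5}{1 + .5\beta_2}\right).$$
Clearly the factor $\frac{1}{2+\beta_1}$ is irrelevant for the sign of the derivative, so take the derivative of
$\tau\left(\frac{p}{1+\beta_2-\beta_2 p}\right) + \tau\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)$ with respect to $p$:
$$\tau'\left(\frac{p}{1+\beta_2-\beta_2 p}\right)\frac{1+\beta_2-\beta_2 p+\beta_2 p}{(1+\beta_2-\beta_2 p)^2} + \tau'\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)\frac{-1-\beta_2 p-\beta_2+\beta_2 p}{(1+\beta_2 p)^2}$$
or
$$\tau'\left(\frac{p}{1+\beta_2-\beta_2 p}\right)\frac{1+\beta_2}{(1+\beta_2-\beta_2 p)^2} - \tau'\left(\frac{1-p}{1+\beta_2 p}\right)(1+\beta_1)\frac{1+\beta_2}{(1+\beta_2 p)^2},$$
and then taking the limit of the derivative as $p \to .5$ gives:
$$\tau'\left(\frac{.5}{1+\beta_2-.5\beta_2}\right)\frac{1+\beta_2}{(1+\beta_2-.5\beta_2)^2} - \tau'\left(\frac{.5}{1+.5\beta_2}\right)(1+\beta_1)\frac{1+\beta_2}{(1+.5\beta_2)^2}$$
or
$$\tau'\left(\frac{.5}{1+.5\beta_2}\right)\frac{1+\beta_2}{(1+.5\beta_2)^2} - \tau'\left(\frac{.5}{1+.5\beta_2}\right)(1+\beta_1)\frac{1+\beta_2}{(1+.5\beta_2)^2} < 0.$$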
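The sign of this derivative can be checked numerically. The Python sketch below evaluates the symmetric first-period expression with denominator $2 + \beta_1$ and approximates its slope at $p = .5$ by a central difference. The identity choice for $\tau$, the parameter values, and the helper names are assumptions made purely for illustration.

```python
def gul_first_period_value(p, beta1, beta2, tau=lambda t: t):
    """Value of the symmetric structure (p, p) at a prior of .5.

    tau defaults to the identity, which is an illustrative assumption.
    """
    good = p / (1 + beta2 - beta2 * p)  # certainty equivalent after a good signal
    bad = (1 - p) / (1 + beta2 * p)     # certainty equivalent after a bad signal
    return (tau(good) + (1 + beta1) * tau(bad)) / (2 + beta1)


def central_difference(func, point, step=1e-6):
    """Symmetric finite-difference approximation of the derivative."""
    return (func(point + step) - func(point - step)) / (2 * step)


# Assumed parameters: a disappointment-averse agent in both periods.
slope_at_half = central_difference(
    lambda p: gul_first_period_value(p, beta1=0.5, beta2=0.5), 0.5
)
```

With these assumed parameters the slope is negative, so a small increase in informativeness away from $(.5, .5)$ lowers utility; setting `beta1=0` makes the slope vanish.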
Next we prove the result for RDU. If we get a good signal, utility is $w_2\left(\frac{fp}{fp+(1-f)(1-q)}\right) + 0$. If
we get a bad signal, the utility is $w_2\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)$. Iterating forward one period, utility is then:
$$\tau\left(w_2\left(\frac{fp}{fp+(1-f)(1-q)}\right)\right) w_1\bigl(fp + (1-f)(1-q)\bigr) + \tau\left(w_2\left(\frac{(1-p)f}{(1-p)f+q(1-f)}\right)\right)\Bigl(1 - w_1\bigl(fp + (1-f)(1-q)\bigr)\Bigr).$$
Observe that this function is continuous and differentiable in the neighborhood of $p = q = .5$.
Substituting in $f = .5$:
$$\tau\left(w_2\left(\frac{p}{p+(1-q)}\right)\right) w_1\bigl(.5(p + (1-q))\bigr) + \tau\left(w_2\left(\frac{1-p}{(1-p)+q}\right)\right)\Bigl(1 - w_1\bigl(.5(p + (1-q))\bigr)\Bigr).$$
And substituting in $p = q$ gives $\tau(w_2(p))\,w_1(.5) + \tau(w_2(1-p))\,(1 - w_1(.5))$.
The derivative of this with respect to $p$ is then $\tau'(w_2(p))\,w_2'(p)\,w_1(.5) - \tau'(w_2(1-p))\,w_2'(1-p) +
\tau'(w_2(1-p))\,w_2'(1-p)\,w_1(.5)$. Taking the limit of this as $p$ goes to .5 gives $\tau'(w_2(.5))\,w_2'(.5)\,[2w_1(.5) - 1]$.
When $w$ is convex, this must be negative, since $w_1(.5) \le .5$. □
Prediction 7 Suppose preferences belong to the class defined by Dillenberger and Segal (2015).
Then if $(.5, .5) \succsim_{.5} (x, x)$ for all $x \ge .5$, then for all $x \le y$, $(.5, .5) \succsim_{.5} (y, x)$. However, it is possible
that $(.5, .5) \precsim_{.5} (x, y)$.
Proof See Dillenberger and Segal (2015) for the proof. □
Prediction 8 Suppose preferences can be represented by a BP functional form. Then the decision-
maker should be indifferent between all information structures.
Proof This is proved by construction. □
Prediction 9 Suppose preferences are represented by a KR or EFK functional form. Then $(x, y) \sim_{.5} (y, x)$.
Proof We discussed KR’s functional form previously. It is:
$$\begin{aligned}
&\kappa(1 \cdot f + (1-f) \cdot 0) \\
&+ \nu\,\eta(1-0) \left( \frac{fp}{fp+(1-q)(1-f)} - f \right) \bigl(fp+(1-q)(1-f)\bigr) \\
&+ \nu\,\eta(0-1) \left( f - \frac{f(1-p)}{f(1-p)+q(1-f)} \right) \bigl(f(1-p)+q(1-f)\bigr) \\
&+ \bigl(fp+(1-f)(1-q)\bigr) \frac{fp}{fp+(1-f)(1-q)}\,\eta(1-0)\,\frac{(1-q)(1-f)}{fp+(1-q)(1-f)} \\
&+ \bigl(fp+(1-f)(1-q)\bigr) \left(1 - \frac{fp}{fp+(1-f)(1-q)}\right) \eta(0-1)\,\frac{fp}{fp+(1-q)(1-f)} \\
&+ \bigl((1-p)f+q(1-f)\bigr) \frac{f(1-p)}{(1-p)f+q(1-f)}\,\eta(1-0)\,\frac{q(1-f)}{f(1-p)+q(1-f)} \\
&+ \bigl((1-p)f+q(1-f)\bigr) \left(1 - \frac{f(1-p)}{(1-p)f+q(1-f)}\right) \eta(0-1)\,\frac{f(1-p)}{(1-p)f+q(1-f)}
\end{aligned}$$
Dropping the expected utility part, since it depends only on $f$, this becomes
$$\begin{aligned}
&\nu\eta(1)\bigl(fp - f(fp + (1-q)(1-f))\bigr) + \nu\eta(-1)\bigl(-f(1-p) + f(f(1-p) + q(1-f))\bigr) \\
&\quad + (\eta(1) + \eta(-1)) \left( \frac{f(1-f)p(1-q)}{fp+(1-f)(1-q)} + \frac{f(1-p)q(1-f)}{f(1-p)+(1-f)q} \right) \\
&= \nu(\eta(1) + \eta(-1)) f(1-f)(p+q-1) + f(1-f)(\eta(1) + \eta(-1)) \left( \frac{p(1-q)}{fp+(1-f)(1-q)} + \frac{(1-p)q}{f(1-p)+(1-f)q} \right) \\
&= (\eta(1) + \eta(-1)) f(1-f) \left( \frac{p(1-q)}{fp+(1-f)(1-q)} + \frac{(1-p)q}{f(1-p)+(1-f)q} + \nu(p+q-1) \right).
\end{aligned}$$
The results are immediate from this. Observe that the two fractions inside the parentheses under signal structure $(x, y)$
are $\frac{x(1-y)}{xf+(1-f)(1-y)} + \frac{(1-x)y}{(1-x)f+(1-f)y}$ and under $(y, x)$ are $\frac{y(1-x)}{yf+(1-f)(1-x)} + \frac{(1-y)x}{(1-y)f+(1-f)x}$. If $f = .5$ these are
equal.
Moreover, if we substitute in $q = k - p$ and $f = .5$ then we get, on the inside, $\frac{2p(1-q)}{p+(1-q)} + \frac{2(1-p)q}{(1-p)+q} +
\nu(p+q-1)$, or $\frac{2p(1-k+p)}{p+1-k+p} + \frac{2(1-p)(k-p)}{1-p+k-p} + \nu(p+k-p-1)$, or $\frac{2p(1-k+p)}{2p+1-k} + \frac{2(1-p)(k-p)}{1-2p+k} + \nu(k-1)$. The
FOC should follow. □
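The KR expression and its simplified form can be compared numerically. In the Python sketch below, `eta_gain` and `eta_loss` stand for $\eta(1)$ and $\eta(-1)$; their values, the other default parameters, and the function names are assumptions for illustration only.

```python
def kr_utility(p, q, f, kappa=1.0, nu=1.0, eta_gain=1.0, eta_loss=-2.0):
    """Term-by-term KR utility in the binary-binary setup."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    post_good = f * p / prob_good
    post_bad = f * (1 - p) / prob_bad
    expected = kappa * f
    # First-period gain-loss terms (posterior versus prior).
    first = nu * (eta_gain * (post_good - f) * prob_good
                  + eta_loss * (f - post_bad) * prob_bad)
    # Second-period gain-loss terms (outcome versus posterior).
    second = (prob_good * (post_good * eta_gain * (1 - post_good)
                           + (1 - post_good) * eta_loss * post_good)
              + prob_bad * (post_bad * eta_gain * (1 - post_bad)
                            + (1 - post_bad) * eta_loss * post_bad))
    return expected + first + second


def kr_simplified(p, q, f, kappa=1.0, nu=1.0, eta_gain=1.0, eta_loss=-2.0):
    """Simplified form: kappa*f + (eta(1)+eta(-1)) f(1-f) (...)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    inside = p * (1 - q) / prob_good + (1 - p) * q / prob_bad + nu * (p + q - 1)
    return kappa * f + (eta_gain + eta_loss) * f * (1 - f) * inside
```

Under these assumptions, `kr_utility(0.6, 0.8, 0.5)` and `kr_utility(0.8, 0.6, 0.5)` coincide, whereas at a prior such as `f=0.3` the two structures are no longer valued equally.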
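We next turn to the EFK functional forms. Using their model of suspense, and letting $f = .5$,
we get:
$$\begin{aligned}
&.5\kappa \\
&+ \nu\,\vartheta\left( (p+(1-q))\left(\frac{p}{p+(1-q)} - .5\right)^2 + ((1-p)+q)\left(.5 - \frac{1-p}{(1-p)+q}\right)^2 \right) \\
&+ .5(p+(1-q))\,\vartheta\left( \frac{p}{p+(1-q)}\,2\left(1-\frac{p}{p+(1-q)}\right)^2 + \left(1-\frac{p}{p+(1-q)}\right) 2\left(\frac{p}{p+(1-q)}\right)^2 \right) \\
&+ .5((1-p)+q)\,\vartheta\left( \frac{1-p}{(1-p)+q}\,2\left(1-\frac{1-p}{(1-p)+q}\right)^2 + \left(1-\frac{1-p}{(1-p)+q}\right) 2\left(\frac{1-p}{(1-p)+q}\right)^2 \right)
\end{aligned}$$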
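Simplifying this gives:
$$\begin{aligned}
&.5\kappa \\
&+ \nu\,\vartheta\left( \frac{p^2}{p+(1-q)} - p + .25(p+(1-q)) + \frac{(1-p)^2}{(1-p)+q} - (1-p) + .25((1-p)+q) \right) \\
&+ .5(p+(1-q))\,\vartheta\left( \frac{2p}{p+(1-q)} - 4\left(\frac{p}{p+(1-q)}\right)^2 + 2\left(\frac{p}{p+(1-q)}\right)^3 + 2\left(\frac{p}{p+(1-q)}\right)^2 - 2\left(\frac{p}{p+(1-q)}\right)^3 \right) \\
&+ .5((1-p)+q)\,\vartheta\left( \frac{2(1-p)}{(1-p)+q} - 4\left(\frac{1-p}{(1-p)+q}\right)^2 + 2\left(\frac{1-p}{(1-p)+q}\right)^3 + 2\left(\frac{1-p}{(1-p)+q}\right)^2 - 2\left(\frac{1-p}{(1-p)+q}\right)^3 \right) \\
={}& .5\kappa \\
&+ \nu\,\vartheta\left( \frac{p^2}{p+(1-q)} + \frac{(1-p)^2}{(1-p)+q} - .5 \right) \\
&+ .5(p+(1-q))\,\vartheta\left( \frac{2p}{p+(1-q)}\left(1-\frac{p}{p+(1-q)}\right) \right) \\
&+ .5((1-p)+q)\,\vartheta\left( \frac{2(1-p)}{(1-p)+q}\left(1-\frac{1-p}{(1-p)+q}\right) \right)
\end{aligned}$$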
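Under structure $(x, y)$ this becomes:
$$\begin{aligned}
&.5\kappa + \nu\,\vartheta\left( \frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5 \right) \\
&+ .5(x+(1-y))\,\vartheta\left( \frac{2x}{x+(1-y)}\left(1-\frac{x}{x+(1-y)}\right) \right) + .5((1-x)+y)\,\vartheta\left( \frac{2(1-x)}{(1-x)+y}\left(1-\frac{1-x}{(1-x)+y}\right) \right) \\
={}& .5\kappa + \nu\,\vartheta\left( \frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5 \right) \\
&+ .5(x+(1-y))\,\vartheta\left( \frac{2x}{x+(1-y)} \cdot \frac{1-y}{x+(1-y)} \right) + .5((1-x)+y)\,\vartheta\left( \frac{2(1-x)}{(1-x)+y} \cdot \frac{y}{(1-x)+y} \right)
\end{aligned}$$
Under structure $(y, x)$ the utility becomes:
$$\begin{aligned}
&.5\kappa + \nu\,\vartheta\left( \frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5 \right) \\
&+ .5(y+(1-x))\,\vartheta\left( \frac{2y}{y+(1-x)}\left(1-\frac{y}{y+(1-x)}\right) \right) + .5((1-y)+x)\,\vartheta\left( \frac{2(1-y)}{(1-y)+x}\left(1-\frac{1-y}{(1-y)+x}\right) \right) \\
={}& .5\kappa + \nu\,\vartheta\left( \frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5 \right) \\
&+ .5(y+(1-x))\,\vartheta\left( \frac{2y}{y+(1-x)} \cdot \frac{1-x}{y+(1-x)} \right) + .5((1-y)+x)\,\vartheta\left( \frac{2(1-y)}{(1-y)+x} \cdot \frac{x}{(1-y)+x} \right)
\end{aligned}$$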
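Subtracting the second from the first gives:
$$\begin{aligned}
&.5\kappa - .5\kappa \\
&+ \nu\,\vartheta\left( \frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5 \right) - \nu\,\vartheta\left( \frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5 \right) \\
&+ .5(x+(1-y))\,\vartheta\left( \frac{2x}{x+(1-y)} \cdot \frac{1-y}{x+(1-y)} \right) - .5((1-y)+x)\,\vartheta\left( \frac{2(1-y)}{(1-y)+x} \cdot \frac{x}{(1-y)+x} \right) \\
&+ .5((1-x)+y)\,\vartheta\left( \frac{2(1-x)}{(1-x)+y} \cdot \frac{y}{(1-x)+y} \right) - .5(y+(1-x))\,\vartheta\left( \frac{2y}{y+(1-x)} \cdot \frac{1-x}{y+(1-x)} \right) \\
={}& \nu\left[ \vartheta\left( \frac{x^2}{x+(1-y)} + \frac{(1-x)^2}{(1-x)+y} - .5 \right) - \vartheta\left( \frac{y^2}{y+(1-x)} + \frac{(1-y)^2}{(1-y)+x} - .5 \right) \right]
\end{aligned}$$
Simplifying the above expression gives
$$\begin{aligned}
&\nu\left[ \vartheta\left( \frac{x^2 - x^3 + x^2 y + x - 2x^2 + x^3 + 1 - 2x + x^2 - y + 2xy - yx^2}{x - x^2 + yx + 1 - x + y - y + yx - y^2} - .5 \right) \right. \\
&\quad \left. - \vartheta\left( \frac{y^2 - y^3 + xy^2 + y - 2y^2 + y^3 + 1 - 2y + y^2 - x + 2yx - xy^2}{y - y^2 + yx + 1 - y + x - x + yx - x^2} - .5 \right) \right] \\
={}& \nu\left[ \vartheta\left( \frac{-x + 1 - y + 2xy}{(1+x-y)(1+y-x)} - .5 \right) - \vartheta\left( \frac{1 - y - x + 2yx}{(1+y-x)(1+x-y)} - .5 \right) \right] \\
={}& \nu\left[ \vartheta\left( \frac{1 - x - y + 2xy}{(1+x-y)(1+y-x)} - .5 \right) - \vartheta\left( \frac{1 - x - y + 2yx}{(1+y-x)(1+x-y)} - .5 \right) \right] \\
={}& 0
\end{aligned}$$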
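The cancellation derived for the suspense model can be confirmed numerically. The Python sketch below implements the binary-binary suspense expression; the square-root choice for $\vartheta$, the default weights, and the function name are assumptions for illustration and are not taken from the paper.

```python
import math


def suspense_utility(p, q, f=0.5, kappa=1.0, nu=1.0, theta=math.sqrt):
    """Binary-binary suspense utility; theta defaults to sqrt (an assumption)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    post_good = f * p / prob_good
    post_bad = f * (1 - p) / prob_bad
    # First period: theta of the expected squared belief movement.
    first = nu * theta(prob_good * 2 * (post_good - f) ** 2
                       + prob_bad * 2 * (f - post_bad) ** 2)

    def resolution(post):
        # Expected squared movement from posterior `post` to certainty.
        return post * 2 * (1 - post) ** 2 + (1 - post) * 2 * post ** 2

    second = (prob_good * theta(resolution(post_good))
              + prob_bad * theta(resolution(post_bad)))
    return kappa * f + first + second


suspense_gap = suspense_utility(0.6, 0.8) - suspense_utility(0.8, 0.6)
```

At a prior of .5 the gap between $(x, y)$ and $(y, x)$ is zero up to floating-point error for this choice of $\vartheta$.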
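We next derive the result for EFK’s model of surprise. Again, taking the functional form
discussed above, and substituting in $f = .5$, we obtain:
$$\begin{aligned}
&.5\kappa + .5\nu(p+(1-q))\,\vartheta\left(2\left(\frac{p}{p+(1-q)} - .5\right)^2\right) \\
&+ .5\nu((1-p)+q)\,\vartheta\left(2\left(.5 - \frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ .5(p+(1-q))\frac{p}{p+(1-q)}\,\vartheta\left(2\left(1-\frac{p}{p+(1-q)}\right)^2\right) \\
&+ .5(p+(1-q))\left(1-\frac{p}{p+(1-q)}\right)\vartheta\left(2\left(\frac{p}{p+(1-q)}\right)^2\right) \\
&+ .5((1-p)+q)\frac{1-p}{(1-p)+q}\,\vartheta\left(2\left(1-\frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ .5((1-p)+q)\left(1-\frac{1-p}{(1-p)+q}\right)\vartheta\left(2\left(\frac{1-p}{(1-p)+q}\right)^2\right)
\end{aligned}$$
which is equivalent to
$$\begin{aligned}
.5\Biggl[ &\kappa + \nu(p+(1-q))\,\vartheta\left(2\left(\frac{p}{p+(1-q)} - .5\right)^2\right) + \nu((1-p)+q)\,\vartheta\left(2\left(.5 - \frac{1-p}{(1-p)+q}\right)^2\right) \\
&+ p\,\vartheta\left(2\left(\frac{1-q}{p+(1-q)}\right)^2\right) + (1-q)\,\vartheta\left(2\left(\frac{p}{p+(1-q)}\right)^2\right) \\
&+ (1-p)\,\vartheta\left(2\left(\frac{q}{(1-p)+q}\right)^2\right) + q\,\vartheta\left(2\left(\frac{1-p}{(1-p)+q}\right)^2\right) \Biggr]
\end{aligned}$$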
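Under structure $(x, y)$ this becomes:
$$\begin{aligned}
.5\Biggl[ &\kappa + \nu(x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) + \nu((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) \\
&+ x\,\vartheta\left(2\left(\frac{1-y}{x+(1-y)}\right)^2\right) + (1-y)\,\vartheta\left(2\left(\frac{x}{x+(1-y)}\right)^2\right) \\
&+ (1-x)\,\vartheta\left(2\left(\frac{y}{(1-x)+y}\right)^2\right) + y\,\vartheta\left(2\left(\frac{1-x}{(1-x)+y}\right)^2\right) \Biggr]
\end{aligned}$$
Under structure $(y, x)$ the utility becomes:
$$\begin{aligned}
.5\Biggl[ &\kappa + \nu(y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) + \nu((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ y\,\vartheta\left(2\left(\frac{1-x}{y+(1-x)}\right)^2\right) + (1-x)\,\vartheta\left(2\left(\frac{y}{y+(1-x)}\right)^2\right) \\
&+ (1-y)\,\vartheta\left(2\left(\frac{x}{(1-y)+x}\right)^2\right) + x\,\vartheta\left(2\left(\frac{1-y}{(1-y)+x}\right)^2\right) \Biggr]
\end{aligned}$$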
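Subtracting the second from the first gives:
$$\begin{aligned}
.5\Biggl[ &\kappa + \nu(x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) - \kappa - \nu(y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) \\
&+ \nu((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) - \nu((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ x\,\vartheta\left(2\left(\frac{1-y}{x+(1-y)}\right)^2\right) - x\,\vartheta\left(2\left(\frac{1-y}{(1-y)+x}\right)^2\right) \\
&+ (1-y)\,\vartheta\left(2\left(\frac{x}{x+(1-y)}\right)^2\right) - (1-y)\,\vartheta\left(2\left(\frac{x}{(1-y)+x}\right)^2\right) \\
&+ (1-x)\,\vartheta\left(2\left(\frac{y}{(1-x)+y}\right)^2\right) - (1-x)\,\vartheta\left(2\left(\frac{y}{y+(1-x)}\right)^2\right) \\
&+ y\,\vartheta\left(2\left(\frac{1-x}{(1-x)+y}\right)^2\right) - y\,\vartheta\left(2\left(\frac{1-x}{y+(1-x)}\right)^2\right) \Biggr]
\end{aligned}$$
We can simplify this to:
$$\begin{aligned}
={}& .5\nu\Biggl[ (x+(1-y))\,\vartheta\left(2\left(\frac{x}{x+(1-y)} - .5\right)^2\right) - (y+(1-x))\,\vartheta\left(2\left(\frac{y}{y+(1-x)} - .5\right)^2\right) \\
&\quad + ((1-x)+y)\,\vartheta\left(2\left(.5 - \frac{1-x}{(1-x)+y}\right)^2\right) - ((1-y)+x)\,\vartheta\left(2\left(.5 - \frac{1-y}{(1-y)+x}\right)^2\right) \Biggr] \\
={}& .5\nu\Biggl[ (x+(1-y))\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{x+(1-y)}\right)^2\right) - (y+(1-x))\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{y+(1-x)}\right)^2\right) \\
&\quad + ((1-x)+y)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{(1-x)+y}\right)^2\right) - ((1-y)+x)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{(1-y)+x}\right)^2\right) \Biggr] \\
={}& .5\nu\Biggl[ (1+x-y)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{x+(1-y)}\right)^2\right) - (1+y-x)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{y+(1-x)}\right)^2\right) \\
&\quad + (1+y-x)\,\vartheta\left(2\left(\frac{.5y - .5(1-x)}{(1-x)+y}\right)^2\right) - (1+x-y)\,\vartheta\left(2\left(\frac{.5x - .5(1-y)}{(1-y)+x}\right)^2\right) \Biggr] \\
={}& 0
\end{aligned}$$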
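The same symmetry can be checked numerically for the surprise model. The Python sketch below implements the binary-binary surprise expression; as before, the square-root $\vartheta$, the default weights, and the function name are illustrative assumptions.

```python
import math


def surprise_utility(p, q, f=0.5, kappa=1.0, nu=1.0, theta=math.sqrt):
    """Binary-binary surprise utility; theta defaults to sqrt (an assumption)."""
    prob_good = f * p + (1 - f) * (1 - q)
    prob_bad = f * (1 - p) + (1 - f) * q
    post_good = f * p / prob_good
    post_bad = f * (1 - p) / prob_bad
    # First period: theta is applied to each realized belief movement.
    first = nu * (prob_good * theta(2 * (post_good - f) ** 2)
                  + prob_bad * theta(2 * (f - post_bad) ** 2))

    def resolution(post):
        # Expected theta of the movement from `post` to certainty.
        return (post * theta(2 * (1 - post) ** 2)
                + (1 - post) * theta(2 * post ** 2))

    second = prob_good * resolution(post_good) + prob_bad * resolution(post_bad)
    return kappa * f + first + second


surprise_gap = surprise_utility(0.6, 0.8) - surprise_utility(0.8, 0.6)
```

As with suspense, the gap between the two skewed structures is zero up to floating-point error at a prior of .5.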
Appendix C: Experimental Protocol
The experimenters gave participants a ticket from a raffle ticket roll in the order in which
they entered the lab. The ticket assignment was simple and public, making it transparent to all
participants that each participant had an equal chance in the upcoming lottery. The participants
read the instructions displayed on their screens and waited for the experimenter to begin the study.
The instructions on the screen informed participants that the study was 75 minutes long and had
two parts and that they would receive $7 for participating in the study. These instructions also
informed the participants that they would be participating in a lottery with the ticket they received
as they entered the room. With a 50% chance, they would earn an additional $10, and with a 50%
chance they would not earn any additional money.
The experimenter told the participants to put on their headphones in order to listen to the
instructions that would be given on the next page. The instructional video explained that whether
a particular ticket wins or loses the lottery is determined by the last digit of the ticket number
and the outcome of a 10-sided die throw. They learned that the experimenter would roll the die
and cover it with a cup after seeing the die outcome. They were told that if the die outcome was an
odd (even) number and the last digit of the ticket the participant was holding was also odd (even), the
participant would win $10, and that if the last digit of the ticket and the die outcome failed to match in
this way, the participant would not win any money. Importantly, the instructions emphasized that
none of the participants would learn the outcome of the die and thus whether they won or lost,
before the experiment was over. They were told that one of the participants would be invited to
lift the cup and read the number on the die out loud at the end of the experiment for everyone to
learn the outcome. The participants were also told that they would enter their ticket number and
the experimenter would supply a code to be entered so that the computer program would know
whether they won or lost, right from the beginning of the experiment. Then, participants were
given hypothetical examples of this process and understood how the computer would be able to
know more than they did and would be able to generate clues if needed. The instructions also
explained that the first part of the study was expected to take around half the allotted time
for the experiment and that the participants would be answering five questions, each with two
clue-generating options, about their preferences about what kind of clues they would like to get about
whether their ticket won or lost. They were told that one of these five questions would be chosen
at random to be carried out at the end of the first part and that they would observe the clue
generated by the option they chose in that question at that time. The participants understood that
they would sit with that clue for the rest of the experiment until they were able to learn whether
they won or lost the lottery at the end of the 75 minutes. The participants were told that they
would be answering questions unrelated to the lottery in the second part and that these questions
did not have any informational or monetary value associated with them.
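The winning rule and the random selection of the question to be carried out can be sketched in a few lines of Python. This is only an illustration of the protocol described above: the function names are hypothetical, the die faces are assumed to be numbered 1 to 10, and the paper does not describe how the experimental software was implemented.

```python
import random


def ticket_wins(ticket_last_digit, die_outcome):
    """A ticket wins when its last digit and the die outcome share the same parity."""
    return ticket_last_digit % 2 == die_outcome % 2


def draw_question_to_implement(num_questions=5, rng=random):
    """Pick one of the questions uniformly at random (numbered from 1)."""
    return rng.randrange(1, num_questions + 1)


# With faces assumed to be 1..10, any ticket wins on exactly half of the faces.
winning_faces = [face for face in range(1, 11) if ticket_wins(7, face)]
```

Because five of the ten faces match the parity of any given last digit, every participant faces the same 50% chance of winning regardless of the ticket received.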
After listening to these instructions, the participants were asked if they had any difficulties with
the video or audio components of the program. Only 1 person did, and his/her microphone was
adjusted immediately.
When the instructional video was over, participants were asked to enter the last digit of their
ticket number and the experimenter rolled a 10-sided die on the table publicly and covered it with
a cup so that the outcome was not visible to the participants. The experimenter informed the
participants, "I’ve rolled the die. At this point, the outcome of the lottery for everyone is determined.
I will now look at the outcome and give you a code to enter, so that the computer knows what the
outcome was," and gave one of the following codes: sugar, milk, cake, candy, coffee, butter; where
sugar, cake or coffee informed the computer that the die outcome was an odd number. We used
more than one code and changed it around across sessions to prevent participants from learning
the codes across sessions.
After entering the code and the last digit of their ticket numbers, the participants worked on the
study on their own. They first answered some comprehension questions regarding the instructions
they received. The program notified them if they answered any question incorrectly.
On the next page, they were asked to rate their happiness in order to elicit an initial baseline
happiness measure. The question asked “Please indicate how happy/unhappy you are feeling in the
current moment by sliding the scale. -100 means you are feeling ‘very unhappy’, 100 means you are
feeling ‘very happy’, 0 means you are feeling ‘neutral’.” After this question, they were informed
that they were proceeding to part 1 of the study where they would be making choices about the
kind and amount of information they would like to get about whether their ticket won or lost the
lottery.
Before each question in part 1, they listened to video instructions by using their headphones and
answered comprehension questions. The program informed them of the correct answers if they
made any mistakes in the comprehension questions, before they could proceed to making choices.
After making a choice and indicating their preference strength for the option they chose on the
following page, they proceeded to the next video that explained the following question, until all
five questions were answered.
The videos for each question were all structured in the following manner: 1) The two options in
the question were presented, and the text indicating the contents of each box in the options was
read. 2) For each option, the box from which the ball would be drawn if the participant won the
lottery was highlighted, followed by the box from which the ball would be drawn if the participant
lost the lottery. 3) The percentage of instances in which a red or a black ball would be drawn from
Option 1 was indicated and explained. 4) The meaning (posterior probability of winning or losing)
associated with observing a red or a black ball from Option 1 was defined and explained. 5) Steps
3 and 4 were repeated for Option 2. 6) Option 1 and Option 2 were displayed next to one another
and a summary of the information regarding the likelihood of observing each ball color and the
posterior probability of winning associated with each color was included below each option. This
final comparison visual is the same graphic as the one that the participants saw when they were
making a choice between the two options. The video instructions did not provide any
information beyond that already included on their screens at the time of making
a choice. However, we believe that watching the video instructions before making a choice led
participants to pay closer attention to this information and gave them a better
understanding of how the posterior probabilities were calculated.
An example video can be found at: YOUTUBE LINK
After watching the video and completing the comprehension questions, the participants arrived
at a page that displayed the two options graphically and explained each verbally. All questions are
included at the end of this document.
After answering all five questions, one question was randomly chosen for each participant to be
carried out and the program randomly drew a ball from the option the participant chose in that
question. The program displayed the two options in the chosen question along with the participant’s
choice in that question on the screen. It also indicated whether the ball drawn from the option
the participant chose was red or black. Given the color of the ball drawn from the option, and the
information about the posteriors included in the graphics of the option, the participant was asked
to enter the probability that s/he won the lottery (which s/he could simply read from the graphic
if s/he paid attention).
On the next page, the participants were asked to rate their happiness in that moment using the
same scale as before. On the following pages, they were also asked to rate how optimistic/pessimistic
they felt about winning the lottery, to note whether they had any questions or confusions about
part 1 and to provide a short explanation for the reason behind their choices in the first three
information-preference questions in part 1.
In the second part of the experiment, they were asked hypothetical questions that each presented
two options to elicit their risk preferences, ambiguity aversion, ability to reduce compound lotteries
and attitude differences towards common ratios.
At the end, when all participants were done (or when time was running out), one participant
was invited to lift the cup and announce the die outcome. All participants were asked to indicate
this outcome and whether they won or lost the lottery as a result on their screens. On the next
page, right after learning the outcome of the lottery, the participants were asked once again to rate
their happiness in that moment.
The experimenters went to each participant’s stall to pay him or her in private. The experimenter
checked the ticket number, paid the participant in cash, asked him or her to fill out the receipt
form and answer one more question on the last page of the study, and advanced the participant’s
program to that last page. On the last page, after receiving the cash, the participants were asked
once again to rate their happiness in that moment.
Appendix D: Experimental Design — Single Pairwise Comparison
With initial funding from the Ross School of Business and the University of Oxford, we conducted
a real-choice study in May 2014. The study was a 1x2 between-subjects experiment (two sets of
information structures). Each participant was paid $7 for participating in the hour-long experiment.
At the beginning of the experiment, each participant was given a red ticket with a 5-digit number
on it and was informed of his or her chance of winning (50%). Winners received an additional
$10. A die was thrown to determine the winning ticket numbers, but it was covered with a cup
so that participants could not see the outcome of the throw. The outcome was therefore determined at
the beginning of the experiment, but each participant remained uncertain whether s/he had won (see
footnote 25). The participants were told that the cup would be lifted at the end of the experiment
and that all participants could see the outcome of the die throw at that time.
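The execution rule in footnote 25 (a ticket wins when its parity matches the parity of the die outcome) can be sketched as follows. This is only an illustration: the function names, the uniform draw of 5-digit ticket numbers, and the simulation size are assumptions made here, not details reported in the study.

```python
import random


def wins(ticket_number: int, die_outcome: int) -> bool:
    """Return True when the ticket and the die outcome share the same parity."""
    return ticket_number % 2 == die_outcome % 2


def simulate_win_rate(n_tickets: int = 100_000, seed: int = 0) -> float:
    """Share of winning tickets for a single die throw.

    Assumption: ticket numbers are drawn uniformly from the 5-digit range.
    """
    rng = random.Random(seed)
    die_outcome = rng.randint(1, 6)  # one throw, hidden under the cup
    tickets = [rng.randint(10_000, 99_999) for _ in range(n_tickets)]
    return sum(wins(t, die_outcome) for t in tickets) / n_tickets


if __name__ == "__main__":
    print(f"simulated win rate: {simulate_win_rate():.3f}")
```

Because half of all 5-digit numbers are even, the simulated win rate should be close to the stated 50% chance whichever face the die shows.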
After receiving these instructions, the participants first completed a 10-minute training session
that gave them experience in using the willingness-to-accept protocol. After this seemingly unrelated
task, we explained two signal structures (Option 1 and Option 2, detailed below) and asked them
to make a choice between the two. Once they had made a choice, and before they saw the signal, they
were asked to indicate the amount they would be willing to accept (WTA) in order to change their
choice. This amount measures the utility difference between the two informational options. Then
their choices were carried out using a DGM procedure and they saw a signal from the information
structure their WTA answers indicated. In the last 30 minutes of the study, they made choices
in hypothetical risk scenarios. At the end of the study, the die outcome was announced and the
holders of the winning tickets were paid an additional $10.
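The appendix does not spell out how the WTA answers were resolved. As a rough sketch only, the code below assumes a random-offer rule of the Becker-DeGroot-Marschak type: an offer is drawn at random, and the participant switches options (and receives the offer) when the offer is at least the stated WTA. The function name, the uniform offer distribution, and the `max_offer` bound are all hypothetical.

```python
import random
from typing import Optional, Tuple


def resolve_choice(chosen: str, other: str, wta: float,
                   max_offer: float = 10.0,
                   rng: Optional[random.Random] = None) -> Tuple[str, float]:
    """Return (option whose signal is shown, payment for switching).

    Assumption: the offer is uniform on [0, max_offer]; this is not from the paper.
    """
    if rng is None:
        rng = random.Random()
    offer = rng.uniform(0.0, max_offer)
    if offer >= wta:
        # The offer covers the stated WTA: switch and receive the offer.
        return other, offer
    # Otherwise the original choice stands and nothing is paid.
    return chosen, 0.0


if __name__ == "__main__":
    shown, payment = resolve_choice("Option 2", "Option 1", wta=2.5,
                                    rng=random.Random(42))
    print(shown, round(payment, 2))
```

Under a rule of this kind, the stated amount affects only whether the switch happens, not how much is paid, which is why such rules are used to elicit truthful valuations.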
There were two sets of information structures, presented across sessions (between-subjects
design). Set 1 operationalized the information structure in Figure 4 (also presented as Q2 in the
main experiment), and Set 2 operationalized the information structure in Figure 5.
Figure 5: Set 2
In both sets, Option 1 (negatively skewed structure) is more accurate at predicting the worse
outcome (not winning) and Option 2 (positively skewed structure) is more accurate at predicting
the better outcome (winning). Across all priors, Option 1 is more likely to give slightly good news
(red ball) and Option 2 is more likely to give slightly bad news (black ball) compared to the prior.
While the likelihood of getting bad news (seeing a black ball) is higher in Option 2, conditional
on the color of the ball (red or black), the posteriors induced by Option 2 are higher than the
posteriors induced by Option 1 for the same ball color.
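The properties described above can be checked with Bayes' rule. The sketch below uses purely hypothetical signal likelihoods (they are not the values in Figures 4 or 5) chosen so that Option 1 is negatively skewed and Option 2 is positively skewed.

```python
from typing import Tuple


def posteriors(prior: float, p_red_given_win: float,
               p_red_given_lose: float) -> Tuple[float, float, float]:
    """Return (P(red), P(win | red), P(win | black)) via Bayes' rule."""
    p_red = prior * p_red_given_win + (1 - prior) * p_red_given_lose
    post_red = prior * p_red_given_win / p_red
    post_black = prior * (1 - p_red_given_win) / (1 - p_red)
    return p_red, post_red, post_black


# Hypothetical likelihoods (P(red | win), P(red | lose)), for illustration only.
OPTION_1 = (0.9, 0.5)  # negatively skewed: a black ball is strong bad news
OPTION_2 = (0.5, 0.1)  # positively skewed: a red ball is strong good news

if __name__ == "__main__":
    for prior in (0.3, 0.5, 0.7):
        red1, win_red1, win_black1 = posteriors(prior, *OPTION_1)
        red2, win_red2, win_black2 = posteriors(prior, *OPTION_2)
        print(f"prior={prior:.1f}  "
              f"Option 1: P(red)={red1:.2f}, posteriors=({win_red1:.2f}, {win_black1:.2f})  "
              f"Option 2: P(red)={red2:.2f}, posteriors=({win_red2:.2f}, {win_black2:.2f})")
```

With these illustrative numbers, Option 1 shows a red ball more often than Option 2 at every prior, while Option 2 yields the higher posterior of winning for each ball color, matching the qualitative pattern described in the text.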
25 Execution details: a participant wins if the ticket number and the die throw outcome are both even or both odd.