
Journal of Economic Literature Vol. XLIV (September 2006), pp. 631–661

Extending the Bounds of Rationality: Evidence and Theories of Preferential Choice

JÖRG RIESKAMP, JEROME R. BUSEMEYER, AND BARBARA A. MELLERS∗

Most economists define rationality in terms of consistency principles. These principles place “bounds” on rationality—bounds that range from perfect consistency to weak stochastic transitivity. Several decades of research on preferential choice has demonstrated how and when people violate these bounds. Many of these violations are interconnected and reflect systematic behavioral principles. We discuss the robustness of the violations and review the theories that are able to predict them. We further discuss the adaptive functions of the violations. From this perspective, choices do more than reveal preferences; they also reflect subtle, yet often quite reasonable, dependencies on the environment.


∗ Rieskamp: Max Planck Institute for Human Development, Berlin, Germany. Busemeyer: Indiana University. Mellers: University of California, Berkeley. We gratefully acknowledge helpful comments on previous versions of this article by Claudia Gonzalez-Vallejo, Mark J. Machina, Anthony A. Marley, and Peter P. Wakker. We also thank Donna Alexander for editing a draft of this article. The second author was partially supported by the China Europe International Business School and by NSF SES-0083511 awarded to Indiana University, Bloomington, and the third author was partially supported by NSF SES-0111944 awarded to the University of California, Berkeley.

1. Introduction

Preferences are inherently subjective and arise from a mixture of aspirations, thoughts, motives, emotions, beliefs, and desires. This inherent subjectivity means that preferences are not easily evaluated against objective criteria without knowledge of an individual’s goals. For example, it is impossible to know whether an individual’s preference for chocolate over vanilla ice cream is “rational” without knowing the attitudes, values, perceptions, beliefs, and goals of that individual. For this reason, most theorists evaluate the rationality of behavior using principles of consistency and coherence within a system of preferences and beliefs.

The view that rationality only refers to internal coherence and logical consistency has been sharply criticized on multiple grounds. Many have argued (e.g., Gerd Gigerenzer 1996a) that consistency principles are insufficient for defining rationality. If the achievement of an individual’s goal does not imply consistency, it is questionable whether functional behavior that violates consistency principles should be called “irrational.” In addition, people are imperfect information processors and limited in both knowledge and computational capacity.


Figure 1. Five Consistency Principles and Their Relatedness

Herbert A. Simon (1956, 1983) and others (Gigerenzer and Reinhard Selten 2001) use the term bounded rationality to describe human behavior. Consistency violations might well be adaptive when time, knowledge, or resources are scarce (e.g., Robert J. Aumann 1962; Mark J. Machina 1982; Michele Cohen and Jean-Yves Jaffray 1988; Gigerenzer 1991, 1996b; Lola L. Lopes and Gregg C. Oden 1991; Kenneth R. Hammond 1996; Lopes 1996; Philippe Mongin 2000).

Several decades of research on preferential choice has led to an impressive body of empirical knowledge on how and when people violate consistency principles of preferential choice. The purpose of this article is to review these basic facts. Our review differs from previous reviews (e.g., Paul J. H. Schoemaker 1982; Chris Starmer 2000) in terms of the consistency principles we examine. Previous reviews focused primarily on consistency across operations applied to properties of choice objects. For example, the independence axiom of expected utility theory requires consistency across linear transformations of probabilities and thus is restricted to the domain of decisions under risk. In contrast, we are interested in consistency principles that go beyond assumptions about the properties or attributes of the choice objects. For example, the transitivity axiom is applicable to a wide range of choice objects, including both risky gambles and risk-free commodity bundles.

We organize our paper around five consistency principles that are summarized in figure 1. Each principle helps to define a class of theories. The principles differ in type (logically independent principles—one principle does not imply anything about another) and strength (logically dependent principles—a stronger principle implies a weaker one). The principles have been labeled from I to V, but they only allow a partial ordering of constraints.

The first principle is perfect consistency or invariant preferences across occasions. Perfect consistency implies the other four principles. The second principle, strong stochastic transitivity, is logically equivalent to the third principle, independence from irrelevant alternatives (Amos Tversky and Edward J. Russo 1969). Regularity is the fourth principle, but there is no logical connection between independence from irrelevant alternatives and regularity (one does not logically imply the other). Independence from irrelevant alternatives implies the fifth principle of weak stochastic transitivity.

A primary goal of this paper is to assess the degree to which preferential choices obey the principles. For each principle, we summarize the empirical evidence. When are the violations serious and robust? When are they small or limited? By evaluating the types of violations and the magnitude of those violations, we can determine which principles must be relaxed in descriptive theories of preferential choice. Furthermore, we show that many violations are related and suggest some underlying behavioral regularities that should be addressed in a general descriptive theory.

A secondary goal is to discuss theories of preferential choice in terms of the consistency principles. We will review classes of theories and discuss their compliance or lack of compliance with the five consistency principles. When evaluating the different theories, we will consider the extent to which they can describe empirical violations of the consistency principles. We will also consider the complexity of the theories. Selten (1991) argued that, if two models are equally successful in predicting existing data, the parsimonious model that, in principle, is only able to predict a small range of data should be preferred over the more complex model that, a priori, can predict a larger range of data. In a similar vein, Mark Pitt, In Jae Myung, and Shaobo Zhang (2002) have argued that a model’s generalizability, or the ability of the model to predict independent new data, is essential to model selection. A complex model might be superior to a simpler model when fitting observed behavior, but, due to the danger of overfitting, the complex model might be predicting noise rather than systematic regularities and, therefore, is less suitable for predicting new independent behavior.

A third goal of this paper is to step back and reconsider the meaning of rationality. We discuss the advantages and disadvantages of each principle from different perspectives and address the question of how far the bounds of rationality must be pushed to accommodate basic and systematic regularities in human choice.

2. Perfect Consistency

The standard utility theory of preference begins by positing a choice set, X = {A, B, C, …}, with elements, A, B, and C, denoting the options. The choice might involve meals, movies, or mutual funds. A preference relation, ≥p, is defined for all pairs of options, such that A ≥p B means that option A is preferred to option B, or that the decisionmaker is indifferent. Finally, a set of axioms is imposed on the preference relations to permit the construction of a utility function u: X → R. The utility function assigns a real-valued utility to each option (e.g., u(A) is the utility assigned to option A), and the order implied by the utilities is used to predict the empirical preference ordering, that is, u(A) ≥ u(B) if, and only if, A ≥p B.

Standard utility theories include expected utility theory (John von Neumann and Oskar Morgenstern 1947), rank-dependent utility theories (e.g., John Quiggin 1982; Chew Soo Hong, Edi Karni, and Zvi Safra 1987; Menahem E. Yaari 1987; Jerry R. Green and Bruno Jullien 1989; R. Duncan Luce 1990), and multiattribute utility theories (Ralph L. Keeney and Howard Raiffa 1976; Detlof von Winterfeldt and Ward Edwards 1986). All such theories are based on the axiom of transitivity. Transitivity states that for any triad of options, A, B, and C, if two preferences hold, A ≥p B and B ≥p C, then a third one should follow, A ≥p C. This axiom is a cornerstone of normative and descriptive theories because it implies that the utilities can be represented as transitive real numbers (Paul Samuelson 1953; Edwards 1954, 1961).

To test this axiom, one must measure preferences. There are at least two commonly used methods to “elicit” preferences—one based on choices and the other based on judged prices, typically referred to as certainty equivalents. Unfortunately, these methods do not always agree. For example, when presented with a choice between two gambles, A and B, individuals may prefer gamble A on the basis of choice, but state a higher price for gamble B (for reviews, see David M. Grether and Charles R. Plott 1979; Paul Slovic and Sarah Lichtenstein 1983; Tversky, Slovic, and Daniel Kahneman 1990; Barbara A. Mellers, Shi-Jie Chang, Michael H. Birnbaum, and Lisa D. Ordonez 1992). The psychological reasons for different preferences obtained with different response modes are controversial, and there is no real consensus about which method is better. Both methods of eliciting preferences have pros and cons. Nonetheless, many researchers prefer to use choices, and for this reason, we focus on choice1 (for an alternative perspective, see Luce, Mellers, and Chang 1993).

1 It is reasonable to assume that when a decisionmaker states his or her price or certainty equivalent for an option, he or she will often cognitively perform a series of comparisons between the option and candidate prices, although these comparisons are not indispensable.

Differences in measured preferences occur not only across methods but also across occasions within an individual. In a classic study, Frederick Mosteller and Philip Nogee (1951) demonstrated that individuals were often inconsistent in their preferences for simple gambles over repeated occasions. Participants were asked to accept or reject binary gambles with fixed-outcome probabilities and increasing amounts to win. The tendency for an individual to select the gamble did not suddenly jump from zero to one at an acceptable winning payoff. Instead, acceptance frequencies were a smooth S-shaped function when plotted against the winning amount. The preference function was strikingly similar to a psychometric function, such as those derived from the discrimination of stimuli along psychophysical continua, such as loudness, heaviness, and brightness. More recent evidence shows that participants typically change their minds in approximately 25 percent of choice pairs (approximately equal in expected value but differing in risk) that are presented twice (Colin F. Camerer 1989; John D. Hey and Chris Orme 1994; Peter Wakker, Ido Erev, and Elke U. Weber 1994; T. Parker Ballinger and Nathaniel T. Wilcox 1997; Jerome R. Busemeyer, Ethan Weg, Rachel Barkan, Xuyang Li, and Zhengping Ma 2000).

In an extensive study of choice consistency, Hey (2001) asked participants to make one hundred choices between pairs of gambles that were repeated in five sessions. Participants were paid according to the outcome of the chosen gamble. Overall rates of inconsistencies varied greatly across participants, ranging from 1 percent to 21 percent. On average, participants changed their choices 11 percent of the time from the first to the second session. Not a single participant had consistent preferences across the first two sessions. Rates of inconsistency decreased slightly over sessions: participants reversed their choices 9 percent of the time from the fourth to the fifth session.2 Virtually all of the participants were inconsistent throughout the entire experiment. Only one of the fifty-three participants made identical choices in the last three sessions.

2 Note, however, that in Hey (2001) participants were forced to select one gamble out of the pairs instead of also allowing expressions of indifference. Thus, inconsistent choices do not necessarily represent systematic or unsystematic inconsistencies, but could be due to indifference between particular gambles.

In sum, we have crossed the first bound of rationality. The literature is replete with choice variability over time, contexts, and occasions. To accommodate these results and those of many other studies, we must relax the assumption that choices are deterministic, and accept the inherent variability of preferences.

3. Strong Stochastic Transitivity

We now turn to probabilistic theories of choice (for an overview, see Anthony A. J. Marley 1992). These theories posit the existence of a function that maps choice pairs into probabilities. Hereafter, p(A|{A,B}) denotes the probability of choosing A from the set {A, B}. Choice probabilities are often estimated by presenting an individual with the same choice on multiple occasions (with other problems interwoven between repetitions). Weak probabilistic choice theories imply that either A is preferred to B if, and only if, p(A|{A,B}) > .50, or B is preferred to A if, and only if, p(B|{A,B}) > .50 (indifference holds in the very rare case that p(A|{A,B}) = .50; for a review, see Luce 2000). Strong probabilistic choice theories make precise predictions about the probability mass of each option in the set.

There are two classes of strong probabilistic theories—random utility theories and fixed utility theories (see Gordon Becker, Morris H. Degroot, and Jacob Marschak 1963; Luce and Patrick Suppes 1965). According to random utility theories, individuals select the option with the highest utility using a deterministic decision rule and variable utilities that differ over time and contexts (Louis L. Thurstone 1959; Becker, Degroot, and Marschak 1963; Tom Domencich and Daniel L. McFadden 1975; Charles F. Manski 1977; McFadden 1981, 2001; Marley and Hans Colonius 1992; for a review, see Marley 1992). In contrast, fixed utility theories (also called “simple scalability theories”) assume that individuals select the option with the highest utility using a probabilistic decision rule and deterministic utilities. If a theory fulfills the property of simple scalability, each option in a choice set can be assigned a scale value such that the choice probabilities are a monotone function of the scale values of the options (Tversky 1972a). A third approach assumes that both the utilities and the decision process are deterministic, but choices are based on randomly selected strategies (see, e.g., Machina 1985; John W. Payne, James R. Bettman, and Eric J. Johnson 1993; Graham Loomes and Robert Sugden 1995; Jörg Rieskamp and Philipp E. Otto 2006). However, this approach has not addressed the phenomena reviewed in this article.

Random utility theories, representing the first class of strong probabilistic choice theories, are defined as follows: Given a set of n options, each option, Ai, is assigned a random utility U(Ai), defined by a single joint distribution function for all choice sets.3 Option Ai is always chosen when it has the highest utility of the available options, and its choice probability is:

(1) p(Ai) = p[U(Ai) > U(Aj)],

for all j = 1, 2, …, n, j ≠ i.

3 More formally, assume that X is a complete set of options and Y is a subset of X (Y ⊆ X), that is, Y = {A1, …, Am} and X = {A1, …, An} with n ≥ m. Define UX = [U1, U2, …, Un] as a random utility vector with the probability distribution function F(u1, …, un) = F(u) and density f(u1, …, un) = f(u). Then p(Ai|Y) equals the probability of sampling a random utility vector UX within the set Δi = {u ∈ R^n | ui ≥ uj for j = 1, …, m}. This probability equals the integral of the distribution within the region Δi: p(Ai|Y) = ∫Δi dF(u) = ∫Δi f(u) du.

A simplifying assumption commonly used in conjunction with standard random utility models is that the utility of an option can be expressed as

(2) U(Ai) = Σk βk · xik + εi,

where x are the k attributes of the options, β are the weights assigned to the attributes, and ε is an error. Random utility models tend to differ according to the assumptions made about the errors. If it is assumed that the error component is identically and independently distributed following an extreme value distribution, the multinomial logit model results, which is the most common random utility choice model (Kenneth E. Train 2003; for an overview, see McFadden 2001). According to this model, the probability that option Ai is chosen is defined as:

(3) p(Ai) = exp(Σk βk · xik) / Σj=1,…,n exp(Σk βk · xjk),


where x and β are defined in equation 2. The key assumption of the model is the independent error component, which implies that the alternatives’ utilities are also independent.
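As a concrete illustration of equation 3, the following Python sketch computes multinomial logit choice probabilities for a small option set; the attribute values and weights are made up for the example and are not taken from the article.

```python
import numpy as np

def mnl_choice_probabilities(X, beta):
    """Multinomial logit (equation 3): a softmax over the deterministic
    utilities sum_k beta_k * x_ik of the options in the choice set."""
    v = X @ beta                 # deterministic utility of each option
    expv = np.exp(v - v.max())   # subtract the maximum for numerical stability
    return expv / expv.sum()

# Illustrative options (rows) described by two attributes (columns).
X = np.array([[3.0, 1.0],   # option A
              [1.0, 3.0],   # option B
              [2.0, 2.0]])  # option C
beta = np.array([0.8, 0.5])  # illustrative attribute weights

print(mnl_choice_probabilities(X, beta))
print(mnl_choice_probabilities(X[:2], beta))  # A and B only
```

Because the errors are independent, dropping option C leaves the ratio of the choice probabilities for A and B unchanged, which is exactly the independence property discussed in the following sections.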

Fixed utility theories, representing the second class of strong probabilistic choice theories, are probabilistic extensions of deterministic utility theories that share a property known as “simple scalability” (Becker, Degroot, and Marschak 1963; Luce and Suppes 1965; Tversky 1972a). Fixed utility theories imply that a single scale value can be assigned to each option in a complete set of options and that this value will be independent of the other options in the choice set presented. For example, u(A) and u(B) can be real numbers that represent the utilities of options A and B. The choice probability for an arbitrary pair of options is determined by means of a function, F, ranging from zero to one, where:

(4) p(A|{A,B}) = F[u(A), u(B)].

F is strictly increasing in the first argument and strictly decreasing in the second, and F[u(A), u(B)] = .50 when u(A) = u(B). All simple fixed utility theories imply strong stochastic transitivity, which is a natural extension of the deterministic axiom of transitivity (see Edwards 1961).

The consistency principle of strong stochastic transitivity states that for any arbitrary triplet of options, A, B, and C:

If p(A|{A,B}) ≥ .50 and p(B|{B,C}) ≥ .50,

then p(A|{A,C}) ≥ max{p(A|{A,B}), p(B|{B,C})}.

The first condition, p(A|{A,B}) ≥ .50, implies that u(A) ≥ u(B). The second condition, p(B|{B,C}) ≥ .50, implies that u(B) ≥ u(C). By the transitivity of real numbers, u(A) ≥ u(C). The first consequence, p(A|{A,C}) ≥ p(A|{A,B}), follows from u(B) ≥ u(C) and the assumption that F is a decreasing function of the second argument. The second consequence, p(A|{A,C}) ≥ p(B|{B,C}), follows from u(A) ≥ u(B) and the assumption that F is an increasing function of the first argument. The class of fixed utility theories includes Luce’s (1959) choice model and many more recent theories. Although not all random utility theories imply strong stochastic transitivity, Thurstone’s (1927) Case V model and the multinomial logit model described above do (for proofs, see Luce and Suppes 1965). For a review, see Hey and Orme (1994) and David W. Harless and Camerer (1994).

Does human choice behavior obey the principle of strong stochastic transitivity? An overwhelming number of studies suggest otherwise. Many of the original experiments were carried out several decades ago (Clyde H. Coombs 1958; Coombs and Dean G. Pruitt 1961; David H. Krantz 1967; Tversky and Russo 1969; Lennart Sjoberg 1975, 1977; Sjoberg and Dora Capozza 1975). In one study (Coombs and Pruitt 1961), participants were asked to choose between monetary gambles with different variances. In 25 percent of all possible triplets, choices violated strong stochastic transitivity. In another study of choices between gambles with cigarettes as outcomes that were given to the participants, Betty J. Griswold and Luce (1962) found strong stochastic transitivity violations in 26 percent of all triplets. Donald L. Rumelhart and James G. Greeno (1971) presented participants with pairs of celebrities and asked them to select the celebrity with whom they would prefer to have a one-hour conversation. Violations of strong stochastic transitivity appeared in 46 percent of the possible triplets.

Mellers and Karen Biagini (1994) reviewed the literature on strong stochastic transitivity violations and argued that the presence or absence of such violations depends on the comparability of options (Krantz 1967; Tversky and Russo 1969). When people compare multiattribute options, the similarity of values along one attribute enhances differences on other attributes. These comparability effects are easily linked to cognitive processing. When evaluating options, people pay more attention to attributes that differ. Consider, for example, two jobs with comparable salaries, but different commuting times. Due to the similarity of salaries, people place greater attention and importance on differences in commuting times.

Comparability effects imply that violations of strong stochastic transitivity should follow a specific predictable pattern. Mellers and Biagini (1994) reanalyzed several data sets and found strong experimental evidence. To illustrate the effect, consider the three gambles shown in table 1. Gamble A offers a 29 percent chance of winning $3, otherwise $0; gamble B has a 5 percent chance of winning $17.50, otherwise $0; and gamble C has a 9 percent chance of winning $9.70, otherwise $0. Gambles are matched in expected values. Furthermore, gambles B and C have relatively similar levels of probability (i.e., 5 percent and 9 percent). Mellers, Chang, Birnbaum, and Ordonez (1992) found that A was preferred to B, p(A|{A,B}) = .59, and B was preferred to C, p(B|{B,C}) = .72. Strong stochastic transitivity implies that the preference for A over C should be at least as high as .72. Instead, p(A|{A,C}), the preference for A over C, was only .51. This finding was observed with and without financial incentives (i.e., paying participants the gambles’ outcomes). One could argue that either the preference for B over C was “too strong” or the preference for A over C was “too weak.” Mellers and Biagini argued that the similarity of probability levels for gambles B and C (5 percent and 9 percent, respectively) increased the perceived differences in payoffs ($17.50 and $9.70, respectively), making B seem much better than C.4

4 In their paper, Mellers et al. (1992) presented strength of preference judgments, which were converted to choice proportions for this review.

Mellers and Biagini (1994) proposed a contrast-weighting model to explain this pattern of strong stochastic transitivity violations. Consider a choice between two gambles, where gamble A has some probability, pA, of winning xA, and gamble B has some probability, pB, of winning xB. The probability of selecting gamble A over B is the difference between the products of utilities and subjective probabilities as follows:

(5) p(A|{A,B}) = J[u(xA)^α · s(pA)^β − u(xB)^α · s(pB)^β],

where J is a response function (e.g., a cumulative logistic function), and α and β are weights associated with utilities and subjective probabilities, respectively. The weight associated with the utilities, α, depends on the difference between the subjective probabilities (i.e., s(pA) − s(pB)). The bigger the probability difference, the smaller the value of α. The same logic holds for β; the bigger the difference between the utilities, the smaller the weight attached to probabilities. Similar values on one attribute enhance differences on other attributes.
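A minimal Python sketch of this mechanism, assuming (as the notation in equation 5 suggests) that α and β enter as exponents and that J is a cumulative logistic function; the rule mapping attribute differences to weights and all numerical values are illustrative, since the article describes the dependence only qualitatively.

```python
import math

def logistic(x):
    # J in equation 5: a cumulative logistic response function (illustrative).
    return 1.0 / (1.0 + math.exp(-x))

def contrast_weight(diff, sensitivity=5.0):
    # Illustrative rule: the exponent placed on one attribute grows as the
    # options become more similar on the other attribute.
    return 1.0 / (1.0 + sensitivity * abs(diff))

def p_choose_A(uA, uB, sA, sB):
    """Contrast-weighting choice probability (equation 5), assuming the
    weights alpha and beta act as exponents."""
    alpha = contrast_weight(sA - sB)  # weight on utilities
    beta = contrast_weight(uA - uB)   # weight on subjective probabilities
    return logistic(uA ** alpha * sA ** beta - uB ** alpha * sB ** beta)

# With similar probabilities the payoff difference dominates (alpha is large);
# with dissimilar probabilities it is discounted (alpha is small).
print(contrast_weight(0.04), contrast_weight(0.24))
print(p_choose_A(1.0, 0.6, 0.05, 0.09))  # illustrative utilities and subjective probabilities
```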

The stochastic difference model (Claudia Gonzalez-Vallejo 2002) is another psychological model that can explain violations of strong stochastic transitivity by assuming that options are compared intradimensionally. For each dimension, the model determines the proportional advantage or disadvantage of an option compared with the other option. For example, if gamble A’s outcome is larger than gamble B’s outcome (i.e., xA > xB), the proportional advantage of gamble A is computed as (xA − xB)/xA. If the winning probability of gamble A is smaller than the winning probability of B, this represents a proportional disadvantage defined as (pB − pA)/pB. The disadvantage is subtracted from the advantage, providing a proportional difference d. The decisionmaker prefers the first option when d is larger than (δ + ε), where δ is a personal decision threshold parameter and ε is a normally distributed error with mean zero and a standard deviation, σ, as a free parameter.
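The following Python sketch implements this comparison rule under the stated assumptions; the threshold δ and error standard deviation σ are free parameters of the model, so the values used here are arbitrary.

```python
from statistics import NormalDist

def proportional_difference(xA, xB, pA, pB):
    """Signed proportional difference d for the stochastic difference model:
    advantages minus disadvantages, each scaled by the larger value on that
    dimension (as in the description above)."""
    d_outcome = (xA - xB) / max(xA, xB)
    d_probability = (pA - pB) / max(pA, pB)
    return d_outcome + d_probability

def p_choose_A(xA, xB, pA, pB, delta=0.1, sigma=0.3):
    # A is preferred when d > delta + eps with eps ~ N(0, sigma^2), so the
    # choice probability is Phi((d - delta) / sigma). delta and sigma are
    # free parameters; the values here are arbitrary.
    d = proportional_difference(xA, xB, pA, pB)
    return NormalDist(mu=0.0, sigma=sigma).cdf(d - delta)

# Gambles A and C from table 1: A offers $3.00 with probability .29,
# C offers $9.70 with probability .09.
print(p_choose_A(3.00, 9.70, 0.29, 0.09))
```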


TABLE 1
GAMBLES USED BY MELLERS ET AL. (1992) TO ILLUSTRATE COMPARABILITY EFFECTS

                          A        B        C
Probability of Winning    .29      .05      .09
Amounts to Win            $3.00    $17.50   $9.70

4. Independence from Irrelevant Alternatives

Another principle implied by fixed utility theories is the independence from irrelevant alternatives (IIA). This principle states that the relative preference for two options, such as A and B, should not vary when each is evaluated with respect to option C or D (see Roy Radner and Marschak 1954; Tversky and Russo 1969; Tversky 1972a).5

5 Also called “property alpha” by Sen (1993); for an overview of the different notions, see Iain McLean (1995).

More formally, this principle states that for any arbitrary quadruple of options, A, B, C, and D, the following property must hold:

If p(A|{A,C}) ≥ p(B|{B,C}), then p(A|{A,D}) ≥ p(B|{B,D}).

The condition on the left implies that u(A) ≥ u(B), and this order implies the consequence on the right. There is an equivalence relation between strong stochastic transitivity and the IIA. Tversky and Russo (1969) proved that one principle is satisfied if, and only if, the other is satisfied. Tversky (1972a) also proved that the IIA is sufficient for guaranteeing the construction of a fixed utility theory.

There is a stronger version of the IIA principle to which economists often refer: According to Luce (1959), for an option set T containing the options A and B, the following property must hold:

(6) p(A|{A,B}) / p(B|{A,B}) = p(A|S) / p(B|S),

where S is any possible subset of T including A and B.

This stronger version requires not only that the order of the choice probabilities for A and B be independent of the option set, but also that the ratio of the choice probabilities be independent of the option set.
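A small numeric illustration of the ratio property in equation 6, using Luce’s (1959) choice rule in which choice probabilities are proportional to fixed scale values; the scale values below are arbitrary.

```python
def luce_probabilities(values, option_set):
    """Luce's (1959) choice rule: probabilities proportional to fixed scale values."""
    total = sum(values[o] for o in option_set)
    return {o: values[o] / total for o in option_set}

values = {"A": 3.0, "B": 2.0, "C": 1.0}   # arbitrary scale values

pair = luce_probabilities(values, {"A", "B"})
triple = luce_probabilities(values, {"A", "B", "C"})

# Equation 6: the ratio p(A)/p(B) is the same in every set containing A and B.
print(pair["A"] / pair["B"], triple["A"] / triple["B"])   # both 1.5
```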

Psychologists have commonly focused upon testing the weaker and more general version of the IIA principle, violations of which of course imply violations of the stronger independence principle. Violations of the weak version are more remarkable since they represent changes in the order of preferences, whereas violations of the strong version may be caused by only slight changes in the strength of preferences. Therefore the focus of our review is on the weak version. Many studies show that people violate the weaker form of the independence principle. Busemeyer and James T. Townsend (1993) reviewed this literature and argued that these violations were driven by the same comparability effects that lead to violations of strong stochastic transitivity. For example, Busemeyer (1985) investigated the independence axiom in sequential choices between a gamble and a “sure thing.” Participants, who were paid according to the gambles’ outcomes, had to choose between gambles: There was a low-risk gamble, A (normally distributed with mean zero and standard deviation 5¢), and a high-risk gamble, B (normally distributed with mean zero and standard deviation 50¢). There were also two sure things: C was −1¢ and D was 1¢. Participants preferred A to C, and B to C. Furthermore, p(A|{A,C}) was greater than p(B|{B,C}) (i.e., p(A|{A,C}) = .75, and p(B|{B,C}) = .56). According to the independence principle, p(A|{A,D}) should be greater than p(B|{B,D}). However, the opposite order appeared (i.e., p(A|{A,D}) = .30, and p(B|{B,D}) = .50).

Busemeyer and Townsend (1993) argued that choice probability between two gambles, A and B, is an increasing function of the ratio d = (u(A) − u(B))/σ, where the numerator is the expectation of the differences between the random utilities of the gambles, and the denominator is the standard deviation of the differences between the random utilities of the gambles. The standard deviation, σ, determines how easy or difficult it is to discriminate the mean difference. Choices between the low-risk gamble and the sure thing are easy to discriminate because the standard deviation, σ, is small, whereas choices between the high-risk gamble and the sure thing are difficult to discriminate because the standard deviation, σ, is large.

Violations of the independence principle have also been found when choice sets contain more than two options. A famous example originally described by Gerard Debreu (1960) illustrates this case. Envisage a music lover who makes choices between classic recordings. When presented with a choice between a recording of Beethoven’s Eighth Symphony (option B) and a recording of the Debussy quartet (option D), the music enthusiast has a higher probability of choosing B. Then a different recording of Beethoven’s Eighth Symphony by the same orchestra, but with a different conductor, is added to the choice set (option B’). When evaluating all three options, the music enthusiast has the highest probability of choosing the recording of the Debussy quartet. The probability of choosing a recording of the Eighth Symphony is now divided between the two recordings of the Eighth Symphony, and thus is smaller.

This preference pattern violates fixed utility theories for the following reason: The initial binary choice between Beethoven and Debussy implies that F[u(B), u(D)] > .50 and, thus, u(B) > u(D). The advantage of Beethoven over Debussy should continue in the presence of a second Beethoven recording. That is, with three options, p(B|{B,D,B’}) = F[u(B), u(D), u(B’)] > F[u(D), u(B), u(B’)] = p(D|{B,D,B’}) because F is an increasing function of the first argument and a decreasing function of the remaining arguments. But that is not what happens. Tversky (1972b) called this violation of the independence principle the similarity effect.

Debreu’s example has been frequently referred to and adapted to many applications in economics. For instance, the “red-bus/blue-bus problem” represents a famous adaptation to research on transportation and travel behavior (e.g., Train 2003). Here a traveler has the choice between a car and a blue bus for traveling to work. Now a second bus, which differs from the blue bus only in its red color, is added to the option set. When choosing among the three options, decisionmakers might split their preferences for buses between the red bus and the blue bus, violating the IIA principle.

The most common random utility theory, the multinomial logit model described above, cannot account for these violations because it assumes that the errors are statistically independent. The joint distribution of utilities for the set of options Y is assumed to be identically and statistically independent for all choice sets. Since these independence violations seem intuitively reasonable, the insufficiency of the multinomial logit model to explain them has initiated an extensive development of new versions of random utility theories. These new versions of random utility theories, predominantly proposed by economists, allow correlations in the joint distributions of utilities (for an overview, see Robert J. Meyer and Barbara E. Kahn 1991; McFadden 2001). These theories have been successfully applied to describe choices in a variety of domains, including transportation alternatives (David A. Hensher 1986; McFadden 1974), occupations (Peter Schmidt and Robert P. Strauss 1975), and automobiles (Steven T. Berry 1994; Berry, James Levinsohn, and Ariel Pakes 1999). Three important classes of random utility models can be distinguished: generalized extreme value models, multinomial probit or Thurstone models, and mixed logit models (for an overview, see William H. Greene 2000 or Train 2003).

The most prominent of the generalized extreme value models (McFadden 1978) is the nested logit model, which assumes that the choice set can be partitioned into subsets, called nests. The error components in equation 2 may be correlated within each subset but are uncorrelated across nests. With this assumption, the IIA principle holds for all options within a nest but not for options across nests. These models can explain particular violations of IIA.

The second class of random utility models, multinomial Thurstone or probit models, including Richard D. Bock’s and Lyle V. Jones’s (1968) Thurstone model and Jerry A. Hausman’s and David Wise’s (1978) conditional probit model, can also explain violations of the IIA. For these models, the error components in equation 2 are assumed to be drawn from correlated normal distributions. Dependencies between these distributions capture correlations between the options. These models are less restrictive than the nested logit models and allow the prediction of “any pattern of substitutions among options” (Meyer and Kahn 1991, p. 98).

The third and most recent class of random utility models is the mixed logit models, also called random coefficient models (Berry 1994; Berry, Levinsohn, and Pakes 1995, 1999; Hensher and Greene 2003; David Revelt and Train 1998). Mixed logit models assume that the weights, β, of the explanatory variables k are random parameters that vary across individuals q. According to these models, the utility of an option is:

(7) Uq(Ai) = Σk βkq · xikq + εiq,

where xikq are explanatory variables, βkq are the weights for each variable, which are assumed to be random variables following a particular distribution, such as a normal distribution with a particular correlation matrix, Σ, and εiq are error components that are independently and identically extreme value distributed. On the one hand, the weights are allowed to vary across individuals to capture the heterogeneity of individuals’ preferences; on the other hand, the weights are assumed to be stable within an individual to permit correlations between utilities. These correlations can capture similarity effects, and thereby violations of IIA.

To illustrate how the mixed logit model can predict similarity effects, consider the following problem. Suppose two different types of cars have equal shares of the automobile market. The first car, A, is rather expensive but very luxurious, whereas the second car, B, is cheap but offers only the standard equipment. Mixed logit models can explain the equal shares by assuming that the weights given to price and comfort vary across consumers. Half of the consumers might give a relatively high weight to the price, whereas the other half gives a relatively low weight to the price, so that different utilities for the two cars, but similar market shares, result. Now suppose a new car, C, enters the market. It is cheap and similar to car B. Consumers who place a relatively high weight on price will find this car appealing, whereas the other consumers, who preferred car A, will not be attracted to the new car. Therefore, the market share of the cheap car B will decrease due to the market entry of the second cheap car C, whereas the market share of the expensive car A stays stable, which represents an IIA violation. Thus, heterogeneity across individuals can produce correlations among the cars’ utilities. Moreover, the distributions from which the weights are drawn can be correlated, so a variety of similarity effects can be represented.
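The following Python simulation sketches this car example under the stated logic; the attribute values, the two-point distribution of the price weight, and all other numbers are illustrative assumptions, not estimates from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Attributes (price, comfort) for the three cars in the text's example;
# the numbers are illustrative.
cars = {"A": np.array([9.0, 9.0]),   # expensive, luxurious
        "B": np.array([3.0, 3.0]),   # cheap, standard equipment
        "C": np.array([3.0, 3.1])}   # cheap, nearly identical to B

def simulate_shares(option_names, n_consumers=100_000):
    """Mixed logit simulation: the price weight varies across consumers
    (heterogeneous tastes), the error is i.i.d. Gumbel within a consumer."""
    X = np.stack([cars[name] for name in option_names])
    # Random coefficients: roughly half the population is price sensitive.
    beta_price = rng.choice([-1.5, -0.2], size=n_consumers)
    beta_comfort = np.full(n_consumers, 0.8)
    v = beta_price[:, None] * X[:, 0] + beta_comfort[:, None] * X[:, 1]
    eps = rng.gumbel(size=(n_consumers, len(option_names)))
    choices = np.argmax(v + eps, axis=1)
    return {name: float(np.mean(choices == i)) for i, name in enumerate(option_names)}

print(simulate_shares(["A", "B"]))        # roughly even shares
print(simulate_shares(["A", "B", "C"]))   # C draws its share mainly from B
```

With these illustrative numbers, car A keeps roughly its binary-choice share when C enters, while C draws its share almost entirely from B, a substitution pattern the plain multinomial logit model cannot produce.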


To explain specific similarity effects, it can be helpful to represent the mixed logit model in another form that is mathematically equivalent to equation 7, but decomposes the random weights βkq as follows:

(8) Uq(Ai) = Σk αk · xikq + Σk γkq · zikq + εiq,

where αk are fixed weights, γkq are random weights with a mean of zero, and xikq and zikq are explanatory variables. Since the variables zikq are multiplied by weights with a mean of zero, the sum of the explanatory variables with the random weights, γkq, and the error components, εiq, represents the error component of the utility of option i. Notice that, unlike the error components, εiq, the random weights, γkq, can vary across individuals but not across options. Depending on the values of zikq, options’ utilities can be correlated. For instance, if an explanatory variable, zikq, is a dummy variable that reflects membership in a nest, the mixed logit model can approximate the predictions of nested logit models. McFadden and Train (2000) have shown that, under quite general assumptions, any random utility model can be approximated by a mixed logit model.

The mixed logit models and the other newer versions of the random utility models are relatively complex when compared to the multinomial logit model, so that computational problems long limited their popularity. Now that parameter estimation has become easier, these models are making a comeback. However, the lack of parsimony makes it difficult to interpret the parameters, and the models are rather ad hoc (e.g., membership in different nests for the nested logit models) and, therefore, do not shed much light on why the similarity effects occur. If subsets of options for different choices cannot be defined a priori, one should not generalize the predictions of the model to new choices. Likewise, when fitting a mixed logit model to data, one can describe different types of similarity effects, but it is often difficult to predict the effects a priori. Thus, these models are less suitable for making predictions about choices if new options are added to the option set (see Meyer and Kahn 1991). In general, the recent random utility theories have been criticized as being “black box” models that do not specify the psychological process of choice (Eliahu Stern and Harry W. Richardson 2005).

Psychologists have addressed similarity effects, such as the violations of the IIA, by developing descriptive theories. Tversky (1972b) proposed a probabilistic choice theory called elimination by aspects. The theory makes the crucial assumption of attention switching, which means that the decisionmaker’s attention randomly switches between the attributes of the available options (i.e., their aspects). For instance, in the “red-bus/blue-bus problem” the decisionmaker might consider convenience and environmental friendliness as two relevant attributes, with the car having the aspect of being convenient and the buses having the aspect of being environmentally friendly. When the decisionmaker’s attention first focuses on convenience, the two buses are eliminated from further consideration and the car is chosen. When, however, attention is first given to environmental friendliness, the car is eliminated, and afterwards one of the two buses is chosen. In this process the blue bus splits its share with the red bus, so that the chances of selecting one of the two buses are smaller than the chance of choosing the car.

More generally, let T be a three-alternative set, T = {A,B,C}. Each option has aspects associated with it, some shared and some unique. Figure 2 illustrates a special case. Let K be the sum of the utilities of all of the aspects, unique aspects u(a), u(b), and u(c), and shared aspects u(ab), u(ac), and u(bc). The probability of selecting A from the set, T, is:

(9) p(A|T) = [u(a) + u(ab) · p(A|{A,B}) + u(ac) · p(A|{A,C})]/K.


Figure 2. An Alternative Set Where Alternatives Share Aspects

Equation 9 shows the three ways that A could be chosen. First, a could be selected from the set of aspects with probability u(a)/K. If so, A would be chosen with probability 1. Second, ab could be selected with probability u(ab)/K, and then A would be chosen over B with probability

p(A|{A,B}) = (u(a) + u(ac))/(u(a) + u(ac) + u(b) + u(bc)).

Finally, ac could be selected with probability u(ac)/K, and A would be chosen over C with probability

p(A|{A,C}) = (u(a) + u(ab))/(u(a) + u(ab) + u(c) + u(bc)).
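A short Python sketch of these formulas for the aspect structure in figure 2; the aspect utilities are chosen to mimic a red-bus/blue-bus situation and are purely illustrative.

```python
def eba_probabilities(u):
    """Elimination by aspects for the structure in figure 2: options A, B, C
    with unique aspects a, b, c and pairwise shared aspects ab, ac, bc.
    Returns the choice probabilities from the full set T = {A, B, C}."""
    K = sum(u.values())

    def pair(x_unique, x_shared, y_unique, y_shared):
        # Binary choice after the aspect shared by the pair is eliminated.
        return (x_unique + x_shared) / (x_unique + x_shared + y_unique + y_shared)

    pAB = pair(u["a"], u["ac"], u["b"], u["bc"])
    pAC = pair(u["a"], u["ab"], u["c"], u["bc"])
    pBC = pair(u["b"], u["ab"], u["c"], u["ac"])
    pA = (u["a"] + u["ab"] * pAB + u["ac"] * pAC) / K          # equation 9
    pB = (u["b"] + u["ab"] * (1 - pAB) + u["bc"] * pBC) / K
    pC = (u["c"] + u["ac"] * (1 - pAC) + u["bc"] * (1 - pBC)) / K
    return {"A": pA, "B": pB, "C": pC}

# Red-bus/blue-bus style example: A is the car (large unique aspect),
# B and C are the two buses, which share most of their aspects.
u = {"a": 4.0, "b": 0.5, "c": 0.5, "ab": 0.0, "ac": 0.0, "bc": 4.0}
print(eba_probabilities(u))
pair_AB = (u["a"] + u["ac"]) / (u["a"] + u["ac"] + u["b"] + u["bc"])
print(pair_AB)   # binary p(A|{A,B}) for comparison, about .47
```

With these values the two similar options split the probability mass associated with their large shared aspect, while option A retains nearly its binary-choice share, which is the similarity effect described above.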

The attention switching assumption of the elimination by aspects theory resembles the mechanism the mixed logit model uses to explain IIA violations, since both theories assume that the weights of attributes are not fixed. The distinction is that the elimination by aspects theory assumes that weights vary due to fluctuations in each individual’s attention, whereas the mixed logit model assumes that the weights are fixed for each individual but vary across individuals.

The compromise effect (Itamar Simonson 1989; Tversky and Simonson 1993) is another, more recently observed violation of independence with multiple alternatives. Let T be a three-option set, T = {B,C,D}, each of which is described by two attributes. Option B has the most advantageous attribute value for one attribute, whereas option D has the most advantageous attribute value for the other attribute. Option C has attribute values that lie between the attribute values of options B and D along both attributes. When presented with only two options, B and C, option B is more popular, so that p(B|{B,C}) ≥ p(C|{B,C}); however, when presented with all three options, the relation between B and C reverses, so that p(C|{B,C,D}) > p(B|{B,C,D}), thus violating IIA.

Simonson demonstrated the effect in a study in which participants made choices among cameras. One camera was high in quality and price, and the other was low in quality and price. The third option, called the compromise, had moderate values on both dimensions. When individuals were presented with binary choices, participants were roughly indifferent between the two options. However, when given a choice among all three cameras, individuals chose the compromise camera more often than either of the two extremes. This finding violates the independence principle.

To explain the compromise effect, Tversky and Simonson (1993) developed the componential context theory, which requires a dimensional representation of the options in a multidimensional space. In this theory, Tversky and Simonson distinguished between the background context defined by prior choice sets and the local context defined by the immediate choice set. If the background context is held constant, then an option, A, is selected from a set, T, if it maximizes:

(10) V(A|T) = Σk βk vk(A) + θ ΣB∈T R(A,B),

where the overall value of choosing A from set T depends on the global context and the immediate context. The first term on the right side of the equation reflects the global context and is the sum of the products of each value vk associated with A along each dimension k times the relative contribution of that dimension. The second term reflects the immediate context: θ, the relative contribution of the immediate context, times the sum of the advantages and disadvantages of A relative to other options in the choice set. The relative advantage of A over B is defined as:

(11) R(A,B) = Σk advk(A,B) / [Σk advk(A,B) + Σk disk(A,B)],

for which advk(A,B) = vk(A) − vk(B) if vk(A) − vk(B) is positive and otherwise zero. If option A has a disadvantage relative to B on dimension k, so that vk(A) − vk(B) is negative, then disk(A,B) = δ[vk(B) − vk(A)], and otherwise zero. The theory makes the crucial assumption that δ is a convex function incorporating loss aversion. Due to the loss aversion assumption, the disadvantage of A relative to B is larger than the corresponding advantage of B over A. In the compromise example, the compromise alternative C has small advantages and disadvantages relative to B and D, whereas both B and D have relatively large advantages and disadvantages with respect to each other. Due to the convex loss aversion function, the large disadvantages of B and D make them less appealing, whereas due to C’s small disadvantages, it becomes the most attractive alternative.
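To see how equations 10 and 11 can produce the compromise effect, here is a small Python sketch; the convex loss-aversion function δ(x) = x^1.5, the value θ = 1, and the attribute values are illustrative assumptions, not quantities reported by Tversky and Simonson.

```python
def relative_advantage(a, b, loss_aversion=lambda x: x ** 1.5):
    """R(A,B) from equation 11. Advantages are raw positive differences;
    disadvantages are passed through a convex loss-aversion function delta
    (the exponent 1.5 is an illustrative choice)."""
    adv = sum(max(ai - bi, 0.0) for ai, bi in zip(a, b))
    dis = sum(loss_aversion(max(bi - ai, 0.0)) for ai, bi in zip(a, b))
    return adv / (adv + dis)

def context_value(option, others, weights):
    """V(A|T) from equation 10 with theta = 1 (illustrative)."""
    global_value = sum(w * v for w, v in zip(weights, option))
    local_value = sum(relative_advantage(option, other) for other in others)
    return global_value + local_value

# Two-attribute options: B and D are extremes, C is the compromise.
B, C, D = (10.0, 2.0), (6.0, 6.0), (2.0, 10.0)
weights = (1.0, 1.0)

# Binary choice: B and C are evaluated symmetrically, hence indifference.
print(context_value(B, [C], weights), context_value(C, [B], weights))

# Triple choice: the compromise C now has the highest contextual value.
print(context_value(B, [C, D], weights),
      context_value(C, [B, D], weights),
      context_value(D, [B, C], weights))
```

With these numbers, B and C are valued identically in the binary comparison, but once D is added the compromise option C receives the highest contextual value, matching the compromise effect described above.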

Interestingly, the elimination by aspects theory (Tversky 1972b) fails to predict compromise effects (see Appendix A for a proof), and the componential context theory fails to explain similarity effects (see Appendix B for a proof). Neither theory provides a general and coherent account of independence violations. Nevertheless, the theories demonstrate that fixed utility theories, in which options are evaluated independently of other options, are not suitable for explaining the competitive effects of options we reviewed.

Eldar B. Shafir, Daniel N. Osherson, and Edward E. Smith (1993) have argued that, to explain these competitive effects, a comparative approach to decision theories is required. Theories following this comparative approach, like the elimination by aspects theory or the componential context theory, incorporate comparison processes among options in the evaluation process. Shafir, Osherson, and Smith (1993) proposed an advantage model, which also takes into account comparisons among options. However, since the theory is restricted to pairwise choices and is defined as a deterministic theory, it cannot handle the competitive effects of option sets with more than two options or the probabilistic nature of choice.

In sum, two more bounds of rationality have now been crossed. There are numerous studies and robust examples of how and when people violate strong stochastic transitivity and the closely related principle of IIA. These consistency principles must be relaxed to account for human choice behavior.

5. Regularity

Random utility theories obey a property of choice known as regularity. This principle asserts that the addition of an option to a choice set should never increase the probability of selecting an option from the original set. More formally, assume that X is a complete set of options and Y is a subset of X (Y ⊆ X), that is, Y = {A1, …, Am} and X = {A1, …, An} with n ≥ m. If Y is presented for choice, and Ai is a member of both Y and X, then the probability of choosing Ai from the set Y is always equal to or greater than the probability of selecting Ai from the set X, that is, p(Ai|Y) ≥ p(Ai|X). This property holds for random utility theories since it is less likely that Ai has the highest utility in a larger set, X, with n options than in a smaller set, Y, with m options:

p(Ai|Y) = p(U(Ai) = max{U(A1), …, U(Am)})
  ≥ p(U(Ai) = max{U(A1), …, U(Am)}) × p(U(Ai) = max{U(A1), …, U(An)} | U(Ai) = max{U(A1), …, U(Am)})
  = p(Ai|X).

The regularity principle also holds for the newer versions of random utility theories, including the mixed logit model (see Appendix C for a proof).
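As an informal check of the inequality above, the following Python sketch estimates by simulation the probability that an option has the highest utility in a two-option set and in a three-option set for a probit-style random utility model; the means and covariances are arbitrary, but, as regularity requires, the joint distribution of the first two utilities is the same in both sets.

```python
import numpy as np

rng = np.random.default_rng(1)

def choice_share(mean_utilities, cov, n_draws=200_000):
    """Share of draws on which option 0 has the highest random utility,
    for a multivariate-normal (probit-style) random utility model."""
    u = rng.multivariate_normal(mean_utilities, cov, size=n_draws)
    return float(np.mean(np.argmax(u, axis=1) == 0))

# Option A1 against one competitor (set Y) and against two competitors (set X).
mean_Y, cov_Y = [1.0, 0.8], [[1.0, 0.3], [0.3, 1.0]]
mean_X, cov_X = [1.0, 0.8, 0.9], [[1.0, 0.3, 0.1],
                                  [0.3, 1.0, 0.1],
                                  [0.1, 0.1, 1.0]]

# Regularity: p(A1|Y) >= p(A1|X), because A1 can only lose probability mass
# when an extra option becomes available.
print(choice_share(mean_Y, cov_Y), choice_share(mean_X, cov_X))
```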

Do people obey the principle of regularity? Many studies show robust violations of this principle. The addition of a new option can increase the choice probability for an option in the original set. Such violations are called either asymmetrical dominance effects (Joel Huber, Payne, and Christopher Puto 1982) or attraction effects (Huber and Puto 1983; for an overview, see Timothy B. Heath and Subimal Chatterjee 1995). The difference between these two effects refers to the attribute levels of the new option that is added to the set.

Suppose people are indifferent between two options, A and B. The “decoy” option, D, is then added to the choice set. D is said to be “asymmetrically dominated” if it has values that are worse than those of another option, such as B, along one attribute and its values are no better than those of B on all other attributes. Asymmetric dominance effects occur if the probability of choosing B increases when D is added to the choice set. Attraction effects are similar to asymmetric dominance effects, but D is not asymmetrically dominated by B. Instead, D simply seems a lot worse.

These effects are illustrated with the options in table 2. Suppose a decisionmaker is indifferent between two laptop computers, A and B. A is relatively heavy, but has a large display. B is lighter, with a smaller display. Now the decisionmaker considers a third computer, D, that is heavier than B, but has the same sized display. The mere presence of D makes B seem better. The probability of selecting B from the set of three computers (A, B, and D) is greater than the probability of selecting B from the set of two computers (A and B). This increase in probability is the asymmetric dominance effect. Now suppose that the decoy, D, is not asymmetrically dominated by B, yet it still seems inferior to B (e.g., the display for D is 13.2 inches, rather than 13 inches). Here, the increase in the probability of B with the inclusion of D is called the attraction effect. In both cases, the result is a violation of regularity.

TABLE 2
ILLUSTRATION OF ASYMMETRIC DOMINANCE EFFECTS WITH CHOICES AMONG LAPTOPS

                A        B        D
Weight          6 lbs    3 lbs    3.5 lbs
Display Size    15 ins   13 ins   13 ins

Asymmetrical dominance effects have been found in choices among gambles (Douglas H. Wedell 1991), consumer products and services (Wedell and Jonathan C. Pettibone 1996), job applicants (Scott Highhouse 1996), and political candidates in U.S. elections (Yigang Pan, Sue O’Curry, and Robert Pitts 1995). The effects occur with choices and other measures of preference (Dan Ariely and Thomas S. Wallsten 1995; Sanjay Mishra, U.N. Umesh, and Donald E. Stem 1993; Pan and Donald R. Lehmann 1993; Tversky and Simonson 1993). Finally, the effects are highly robust in both within-subject designs (repeated choices by the same individuals) and between-subject designs (different choices across groups of individuals).

Random utility theories cannot account for violations of regularity because the joint distribution of utilities for the option set, Y, is assumed to remain the same when options are added to create a larger set, X. However, the empirical evidence suggests that the joint distribution of utilities for the larger set changes depending on the options that are included in the set. A psychological theory that specifies the cognitive process of choice and can explain dependencies among options has been proposed by Busemeyer and Townsend (1993) and Robert M. Roe, Busemeyer, and Townsend (2001). It is called decision field theory. According to this theory, choices are the result of a dynamic process during which the decisionmaker retrieves, compares, and integrates random utilities over time. During the deliberation period, a random utility is retrieved for each option, Ai, at each moment in time t, denoted Ut(Ai). These utilities are integrated across time by a linear dynamic process to form a preference state:

(14) Pt(Ai) = s · Pt−1(Ai) + Ut(Ai) − Σj≠i cij · Pt−1(Aj),

where s is a weight that reflects the memory of recent preferences. The coefficients, cij = cji, are a function of the distances between the options in a multidimensional attribute space.

The theory postulates that the decisionmaker forms preference states for options that are the result of previous preference states plus the currently retrieved utilities for options minus negative influences of alternative options in the choice set. If a preference state for an option crosses a threshold, the decisionmaker selects that option. During the dynamic process of comparing the alternatives, attention switches from one attribute to another; thus, the theory also includes an attention switching assumption, as proposed by the elimination by aspects theory (Tversky 1972b), which allows the explanation of similarity effects. In addition, the theory assumes that options compete with each other (in the case where cij > 0) and that similar options compete more heavily than dissimilar options by exerting stronger negative influences. These competitive effects can predict violations of regularity (see Roe, Busemeyer, and Townsend 2001; Busemeyer and Adele Diederich 2002). The theory also predicts that these violations of regularity should increase when decisionmakers deliberate longer, a prediction that has been supported in several experiments (Simonson 1989; Wedell 1991; Ravi Dhar, Stephen M. Nowlis, and Steven J. Sherman 2000).
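A minimal Python sketch of the accumulation process in equation 14, simulating single decisions until a preference state crosses a threshold; the mean utilities, inhibition coefficients, memory weight, and threshold are all illustrative, and the sketch only shows the mechanics of the preference dynamics, not a reproduction of the reported effects.

```python
import numpy as np

rng = np.random.default_rng(2)

def dft_choice(mean_u, noise_sd, C, s=0.95, threshold=3.0, max_steps=10_000):
    """One simulated decision under equation 14: preference states integrate
    noisy utilities over time, dampened by the memory weight s, while options
    inhibit each other through the coefficients c_ij (diagonal of C is zero).
    The option whose preference state first exceeds the threshold is chosen."""
    P = np.zeros(len(mean_u))
    for _ in range(max_steps):
        U = rng.normal(mean_u, noise_sd)     # random utilities U_t(A_i)
        P = s * P + U - C @ P                # equation 14
        if P.max() >= threshold:
            break
    return int(np.argmax(P))

# Three options with equal mean utility; options 1 and 2 are similar and
# therefore inhibit each other more strongly than they inhibit option 0.
mean_u = np.array([0.0, 0.0, 0.0])
C = np.array([[0.00, 0.01, 0.01],
              [0.01, 0.00, 0.05],
              [0.01, 0.05, 0.00]])

choices = [dft_choice(mean_u, noise_sd=1.0, C=C) for _ in range(2_000)]
print([choices.count(i) / len(choices) for i in range(3)])   # simulated choice shares
```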

Another psychological theory that can explain dependencies between options is a nonlinear adaptive network model called the leaky, competing accumulator model (Marius Usher and James L. McClelland 2004). This theory is very similar to decision field theory in the sense that it also assumes that preferences develop in a dynamic manner over time. Likewise, it includes the attention switching assumption to explain similarity effects. In contrast to decision field theory, it assumes loss aversion to explain regularity violations. Thus, it combines the attention switching assumption of the elimination by aspects theory (Tversky 1972b) with the loss aversion assumption of the componential context theory (Tversky and Simonson 1993) to explain IIA and regularity violations.

6. Weak Stochastic Transitivity

Weak stochastic transitivity is perhaps the weakest bound on rationality of the five principles that we consider. This principle states that for any arbitrary triple of options, A, B, and C:

If p(A|{A,B}) ≥ .50 and p(B|{B,C}) ≥ .50, then p(A|{A,C}) ≥ .50.

Weak stochastic transitivity lies at the heart of weak probabilistic choice theories. Those theories based on standard utility assumptions must satisfy the principle because u(A) ≥ u(B) ≥ u(C) → p(A|{A,C}) ≥ .50. Furthermore, many strong probabilistic choice theories must also satisfy weak stochastic transitivity, including all simple scalable or fixed utility theories, elimination by aspects theory, and decision field theory. The contrast-weighting model and the stochastic difference model can, however, account for violations of weak stochastic transitivity. Among the random utility models considered here, only the mixed logit model allows violations of weak stochastic transitivity (for an example, see Appendix D). In contrast, the other random utility models, such as the generalized extreme value model, obey the principle of weak stochastic transitivity (see Appendix E for a proof).
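As a simple illustration of the principle, the helper below scans a matrix of pairwise choice probabilities for violating triples. The probabilities used here are the ones derived for the mixed logit example in Appendix D; everything else is an illustrative assumption.

    from itertools import permutations

    def wst_violations(p):
        """Return triples (i, j, k) with p[i][j] >= .5 and p[j][k] >= .5 but p[i][k] < .5."""
        n = len(p)
        return [(i, j, k) for i, j, k in permutations(range(n), 3)
                if p[i][j] >= 0.5 and p[j][k] >= 0.5 and p[i][k] < 0.5]

    # Pairwise choice probabilities p[i][j] = p(Ai|{Ai,Aj}), taken from Appendix D.
    p = [[0.50, 0.53, 0.47],
         [0.47, 0.50, 0.53],
         [0.53, 0.47, 0.50]]
    print(wst_violations(p))   # [(0, 1, 2), (1, 2, 0), (2, 0, 1)]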

It is interesting to look closely at a classic study by Tversky (1969) to understand the conditions under which violations of weak stochastic transitivity can occur. Tversky asked participants to choose between pairs of gambles, as shown in table 3. Gambles varied in probabilities and payoffs, but expected values were similar. When presented with gambles having similar probabilities, some participants frequently preferred the gamble with the larger payoff. That is, they preferred A to B, B to C, C to D, and D to E. But when presented with gambles that substantially differed in probabilities, they preferred the gamble with the greater probability of winning. That is, they preferred E to A. Tversky’s results were later replicated by Harold R. Lindman and James Lyons (1978) and extended by Jonathan W. Leland (1994).

TABLE 3
GAMBLES USED BY TVERSKY (1969) TO DEMONSTRATE INTRANSITIVE PREFERENCES

                         A       B       C       D       E
Probability of Winning   7/24    8/24    9/24    10/24   11/24
Payoff                   $5.00   $4.75   $4.50   $4.25   $4.00

Tversky (1969) proposed an additive difference model based on the concept of a just noticeable difference (Luce 1956, p. 180) to account for the violations. A just noticeable difference is the amount of change required in a physical stimulus before people detect a difference between similar stimuli on some percentage of occasions. Tversky argued that differences in probabilities for the adjacent gambles (i.e., A and B, or B and C) were less than a just noticeable difference, or not sufficiently great for people to differentiate between gambles. The additive difference model is a noncompensatory, lexicographic semiorder. A model is noncompensatory if the information from one dimension cannot be compensated by any combination of information from other less important dimensions (see also Rieskamp and Ulrich Hoffrage 1999). In this account, decisionmakers select the gamble with the highest value on the most important dimension, but if the difference between gambles on that dimension is too small to be detected, decisionmakers consider the next most important dimension and choose the gamble with the highest value on that dimension, and so on.
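The gambles in table 3 make this concrete. The sketch below implements such a lexicographic semiorder, assuming, purely for illustration, that probability differences of less than 2/24 fall below the just noticeable difference; under that assumption the pairwise choices reproduce the intransitive pattern described above.

    from fractions import Fraction

    # The five gambles from table 3 as (probability of winning, payoff) pairs.
    gambles = {"A": (Fraction(7, 24), 5.00), "B": (Fraction(8, 24), 4.75),
               "C": (Fraction(9, 24), 4.50), "D": (Fraction(10, 24), 4.25),
               "E": (Fraction(11, 24), 4.00)}
    JND = Fraction(2, 24)   # assumed just noticeable difference on the probability dimension

    def choose(x, y):
        """Pick by probability if the difference is noticeable, otherwise by payoff."""
        (px, vx), (py, vy) = gambles[x], gambles[y]
        if abs(px - py) >= JND:
            return x if px > py else y
        return x if vx > vy else y

    print(choose("A", "B"), choose("B", "C"), choose("C", "D"), choose("D", "E"))  # A B C D
    print(choose("A", "E"))  # E, closing an intransitive cycle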

Ariel Rubinstein (1988) and Leland (1994) offer a similar account to describe choices between gambles. They argue that people follow a sequential decision process in which they first check whether one gamble dominates the other. Then they compare the probabilities across gambles and the outcomes across gambles. If levels of either attribute are perceived as being similar on one dimension and dissimilar on the other dimension, the dissimilar dimension becomes decisive. Only after these steps are completed do decisionmakers move to an unspecified third step of selecting a gamble. This account, also based on perceived similarity, predicts the intransitive choices described above. The contrast-weighting model offered by Mellers and Biagini (1994) is a more precise version of this idea in which the effects of similarity and dissimilarity are reflected in the weights associated with the contrasts.

The idea of a noncompensatory decision process that focuses on one dimension at a time, combined with a just noticeable difference, is a reasonable explanation for such violations, especially when options differ along many dimensions and the differences vary substantially. It is efficient to place greater attention on dimensions with larger differences and to ignore or discount dimensions with smaller differences (Mellers and Biagini 1994). Furthermore, theories that permit violations of weak stochastic transitivity (Tversky 1969; Mellers and Biagini 1994; Gonzalez-Vallejo 2002) assume that decisionmakers compare options intradimensionally, that is, focusing on each dimension sequentially. Intradimensional comparisons are usually easier than interdimensional comparisons, especially when dimensions differ in measurement units (Tversky 1969). Finally, intradimensional comparisons simplify the detection of dominance relationships.

Regret theories (David E. Bell 1982; Peter C. Fishburn 1982; Loomes and Sugden 1982) are another class of theories that allow intransitive choices. These theories assume that the decisionmaker prefers the option with the highest utility, but the utility of an option depends on an independent evaluation of the option and a comparative evaluation with the other option, as follows:

(15)  u(Ai) = ∑s ps · [u(xsi) + w · R(u(xsi), u(xsj))],

where Ai is the option under consideration, Aj is the alternative option, s refers to states with consequences x, and where w = 1 if u(xsi) > u(xsj), and w = −1 otherwise. The nonlinear function R(u(xsi), u(xsj)) is increasing in its first argument and decreasing in its second argument; a possible function could be R[u(xsi), u(xsj)] = [u(xsi) − u(xsj)]².

To illustrate, consider three pairwise choices between the gambles A, B, and C, with outcomes determined by throwing a die. Gamble A has the smallest payoff of $2 with certainty, gamble B has a medium payoff of $3 if the die throw results in a one, two, three, or four, and gamble C has the highest payoff of $5 if the die throw results in a one or two. In a choice between A and B, A has the higher utility because the anticipated regret associated with B if the die comes up five or six is disproportionately large relative to the regret associated with A if a one, two, three, or four were thrown. Furthermore, the decisionmaker will prefer B to C using the same logic. However, in a choice between A and C, the decisionmaker will prefer C, because the anticipated regret with A if a one or two is thrown is extremely high. As this example shows, regret theories can explain intransitive choices, but they cannot explain compromise or attraction effects, nor can they handle the probabilistic nature of choice.
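A minimal numerical sketch of equation (15) for these three gambles is given below. It assumes linear utility u(x) = x and the quadratic comparison function R(x, y) = (x − y)² mentioned above; both are illustrative choices, but they suffice to produce the cycle described in the text.

    # Payoffs of the three gambles for die faces 1..6: A pays $2 for sure,
    # B pays $3 on a one through four, and C pays $5 on a one or two.
    A = [2, 2, 2, 2, 2, 2]
    B = [3, 3, 3, 3, 0, 0]
    C = [5, 5, 0, 0, 0, 0]

    def regret_value(payoffs_i, payoffs_j):
        """Equation (15) with linear utility and R(x, y) = (x - y)**2 over six equiprobable states."""
        total = 0.0
        for xi, xj in zip(payoffs_i, payoffs_j):
            w = 1 if xi > xj else -1
            total += (xi + w * (xi - xj) ** 2) / 6
        return round(total, 2)

    print(regret_value(A, B), ">", regret_value(B, A))   # 2.67 > 1.33: A is preferred to B
    print(regret_value(B, C), ">", regret_value(C, B))   # 3.67 > 0.0:  B is preferred to C
    print(regret_value(A, C), "<", regret_value(C, A))   # 1.67 < 2.0:  C is preferred to A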


How common are violations of weak stochastic transitivity? John M. Davis (1958) found some violations but decided they were nonsystematic. Donald Davidson and Marschak (1959) studied preferences for bets and found that 7 percent to 14 percent of choice triples violated transitivity. Kenneth R. MacCrimmon (1968) examined the choices of thirty-eight business managers and found that eight exhibited some intransitivities, although the violations did not appear to be reliable. William T. Harbaugh, Kate Krause, and Timothy R. Berry (2001) examined transitivity in children and found relatively few violations, although many inconsistencies. The intransitivities found by Tversky (1969) were based on a preselected group of participants for whom inconsistent behavior had been observed before. In a recent summary of the literature, Luce (2000) pointed out that intransitivities may occur, but that they may be due to the simple fact that choices are not always consistent (see also Jean Claude Falmagne and Geoffrey Iverson 1979).

Other researchers have found systematic violations of weak stochastic transitivity (Henry Montgomery 1977; Robert H. Ranyard 1977; David V. Budescu and Wendy Weiss 1987; Birnbaum, Jamie N. Patton, and Melissa K. Lott 1999; Lindman and Lyons 1978; Peter Roelofsma and Daniel Read 2000). However, in stark contrast to violations of strong stochastic transitivity, these violations are relatively rare. That is, there are reliable circumstances where they occur, but those circumstances are fairly unusual (see Becker, Degroot, and Marschak 1963; Krantz 1967; Rumelhart and Greeno 1971; Sjoberg 1975, 1977; Sjoberg and Capozza 1975; Mellers, Chang, Birnbaum, and Ordonez 1992). For these reasons, it can be argued that the principle of weak stochastic transitivity should generally be retained as a bound of rationality. Descriptive theories that do not allow violations of weak stochastic transitivity usually perform reasonably well when predicting choice behavior in many, perhaps even most, situations.

7. Discussion

The goal of the present review is to examine how, when, and why people violate consistency principles that embody the traditional view of rationality. Five constraints on preferential choice were addressed. Those constraints, or consistency principles, are partially ordered on a continuum from strictest to weakest. Figure 1 shows that the principles include perfect consistency, strong stochastic transitivity, IIA, regularity, and, finally, weak stochastic transitivity. The evidence for and against each principle was examined. The empirical effects, their connections to principles, and brief definitions of the effects are presented in table 4. The frequency and breadth of the violations declined as we proceeded from stricter to weaker bounds. There was strong evidence against perfect consistency, strong stochastic transitivity, independence, and regularity. This evidence is important because it implies that descriptive theories must relax these constraints in order to make accurate predictions.

7.1 Comparison of Models

Deterministic theories of choice, such as the subjective expected utility theory (Leonard J. Savage 1954) or multiattribute utility theory (Keeney and Raiffa 1976), cannot accommodate the fact that choices are probabilistic. The “conventional strategy” (Starmer 2000) for handling this problem is to claim that inconsistencies are random. Though tempting, this strategy fails to explain how and when choice probabilities will vary across sets. To accomplish the latter, two classes of probabilistic choice theories were proposed: fixed utility theories (e.g., Debreu 1958; Luce 1958) and random utility theories (e.g., Thurstone 1959; Becker, Degroot, and Marschak 1963; McFadden 2000).
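The contrast between the two classes can be illustrated with a toy binary choice. The scale values, the Luce ratio rule standing in for a fixed utility theory, and the normal perturbations standing in for a random utility theory are all assumptions made only for this sketch.

    import numpy as np

    # Assumed scale values for two options A and B.
    rng = np.random.default_rng(1)
    uA, uB = 3.0, 2.0

    # Fixed utility: the choice probability is a fixed function of the scale values.
    p_fixed = uA / (uA + uB)

    # Random utility: utilities are perturbed on every occasion and the momentarily
    # better option is chosen; the probability is the share of occasions A wins.
    draws = 100_000
    p_random = np.mean(uA + rng.normal(0, 1, draws) > uB + rng.normal(0, 1, draws))

    print(round(p_fixed, 2), round(float(p_random), 2))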


TABLE 4
SUMMARY OF EFFECTS

Principles                        Empirical Evidence              Brief Description of Effects
Strong Stochastic Transitivity    Comparability Effects           p(A|{A,B}) > .5, p(B|{B,C}) > .5, and p(A|{A,C}) > .5, but p(B|{B,C}) > p(A|{A,C}) when B and C have similar values on one dimension
Independence from Irrelevant      Similarity Effects              p(B1|{B1,D}) > p(D|{B1,D}), but p(B1|{B1,D,B2}) < p(D|{B1,D,B2})
  Alternatives                    Compromise Effects              p(B|{B,C}) > p(C|{B,C}), but p(C|{A,B,C}) > p(B|{A,B,C}) when values of C are between values of A and B on all dimensions
Regularity                        Asymmetric Dominance Effects    p(A|{A,B,D}) > p(A|{A,B}) when D is dominated by A, but not by B
                                  Attraction Effects              p(A|{A,B,D}) > p(A|{A,B}) when D is much inferior to A, but not to B
Weak Stochastic Transitivity      Transitivity Violations         p(A|{A,B}) > .5, p(B|{B,C}) > .5, but p(C|{A,C}) > .5

Fixed utility theories that assume a well-defined preference function imply strong stochastic transitivity and the IIA. Again, there is a considerable body of empirical evidence, including similarity effects, comparability effects, and the compromise effect, showing violations of these principles. Random utility theories imply a consistency principle of regularity. There is a substantial body of evidence, including attraction effects and asymmetric dominance effects, against regularity.

A number of psychological theories have been proposed to accommodate the violations. Table 5 summarizes many of the theories previously mentioned and shows which effects they can and cannot predict. Tversky’s (1972b) elimination by aspects theory provides an elegant account of similarity effects (violations of independence) but fails to account for asymmetric dominance effects and attraction effects (violations of regularity). Tversky and Simonson (1993) developed the componential context theory to account for compromise effects, but their theory cannot describe similarity effects. Violations of strong and weak stochastic transitivity can be explained by the contrast-weighting model (Mellers and Biagini 1994) and the stochastic difference model (Gonzalez-Vallejo 2002). But since these models have only been defined for binary choices, they cannot predict asymmetric dominance effects, attraction effects, compromise effects, or similarity effects. The most general theories considered, decision field theory (Busemeyer and Townsend 1993; Roe, Busemeyer, and Townsend 2001) and the leaky, competing accumulator model (Usher and McClelland 2004), can explain most effects, except violations of weak stochastic transitivity.

In sum, we have reviewed several theories that can predict, to differing degrees, violations of the consistency principles. We now face the more general question of how to select a descriptive theory of preferential choice. Most theorists agree that two criteria are essential for model selection: predictability and parsimony (see Heinz Linhart and Walter Zucchini 1986; Pitt, Myung, and Zhang 2002). The best theory is one that accurately predicts human behavior and is simple. Unfortunately, when fitting a model to the data, accuracy and parsimony usually work in opposition; the more complex the model, the better the fit.

TABLE 5
SUMMARY OF PRINCIPLES, VIOLATIONS, AND THEORETICAL ACCOUNTS

Principles                        Violations                      Theoretical Accounts
Consistency                       Variations in Choices           MNL, GEV, MLM, DFT, LCA, CWM, SDM, EBA
Strong Stochastic Transitivity    Comparability Effects           GEV, MLM, DFT, LCA, CWM, SDM, EBA, ADM
Independence from Irrelevant      Similarity Effects              GEV, MLM, DFT, LCA, EBA
  Alternatives                    Compromise Effects              GEV, MLM, DFT, LCA, CCT
Regularity                        Asymmetric Dominance Effects    DFT, LCA, CCT
                                  Attraction Effects              DFT, LCA, CCT
Weak Stochastic Transitivity      Transitivity Violations         MLM, CWM, SDM, ADM

where MNL = Multinomial Logit Model, GEV = Generalized Extreme Value Model, MLM = Mixed Logit Model, DFT = Decision Field Theory, LCA = Leaky, Competing Accumulator Model, CWM = Contrast-Weighting Model, SDM = Stochastic Difference Model, EBA = Elimination by Aspects, CCT = Componential Context Theory, and ADM = Additive Difference Model.

For choice contexts with equally similar or equally discriminable options, violations of IIA are less common and simple scalable models should suffice. However, this case is fairly unusual and fails to hold in most consumer choice tasks (e.g., whenever there are correlated attribute values or unequal variances in payoffs). If options do vary in similarity or discriminability but all of the obviously deficient options have been removed, regularity is likely to be satisfied although IIA could be violated. In this situation, recent random utility models like the mixed logit model can be effective. Thus, it can happen that multiple theories predict the same violations. For example, both mixed logit models and decision field theory are capable of explaining similarity effects. However, the mixed logit model requires different parameters, whereas decision field theory can explain these results using the same set of parameters. In this instance, decision field theory provides a simpler and more parsimonious explanation. Moreover, unlike decision field theory and the leaky, competing accumulator model, mixed logit models are unable to explain violations of the regularity principle. There are many consumer choice situations that contain deficient options (e.g., a product that is extremely overpriced because of its brand name), and violations of regularity would be expected. When this occurs, one needs to turn to more complex models that do not require regularity. In sum, only the more complex models, such as decision field theory, the leaky, competing accumulator model, or the mixed logit model, can explain several of the behavioral effects that we have reviewed. Future work will need to focus on direct comparisons of these models to explore how accurately they predict human behavior.

Another important principle to consider for model selection is generalizability. The goal of model selection is to find a theory that is capable of predicting not only the observed choices in an experiment but also other behaviors that are independent of the particular experiment used for parameter estimation. When predicting new data, simpler models are often more robust and achieve higher levels of accuracy under new conditions. This may be particularly important for the design of new products. Simpler models, built on psychological principles rather than ad hoc parameters, promise to be more effective for purposes of generalization. For example, Tversky’s (1972b) elimination by aspects model provides mechanisms to make a priori predictions about the effects of new options.

Is there a proper balance between predictability and complexity? The answer depends on the application and the circumstances. Descriptive theories should be able to predict violations of consistency principles when violations are commonplace. A review of the literature suggests that violations are frequent for all of the principles that we examined, except perhaps weak stochastic transitivity. These violations are quite systematic, but relatively rare (cf. Mellers and Biagini 1994). In fact, William H. Morrison (1963) noted that the theoretically possible number of intransitive triples in a complete set of pairwise comparisons is quite limited: given n options, the maximum proportion of intransitive triples is (n + 1)/(4(n − 2)). As n increases, this maximum proportion approaches one fourth. This result implies that even if a decision process produces the maximum number of intransitive choices, the intransitive triples will still be relatively atypical. Therefore, increasing the complexity of a theory to account for violations of weak stochastic transitivity appears less justified.
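Evaluating the expression for a few set sizes shows how quickly the bound shrinks toward one fourth (a quick check, not part of the original argument):

    # The bound discussed above, (n + 1)/(4(n - 2)), evaluated for a few set sizes.
    for n in (3, 5, 10, 50, 1000):
        print(n, round((n + 1) / (4 * (n - 2)), 3))   # 1.0, 0.5, 0.344, 0.266, 0.251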

7.2 When Are Violations of Consistency Principles Reasonable?

We have argued that some violations of consistency principles are common. Now we address the question of whether these violations are reasonable. Rationality is traditionally defined as compliance with consistency principles, and inconsistency implies “irrationality” (Shafir and Robyn A. LeBoeuf 2002). This approach has been widely challenged (e.g., Aumann 1962; Machina 1982; Cohen and Jaffray 1988; Gigerenzer 1991, 1996a, 1996b; Lopes and Oden 1991; Hammond 1996; Lopes 1996; McFadden 1999; Mongin 2000). Selten (2001, p. 15) argued that “behavior should not be called irrational simply because it fails to conform to norms of full rationality” and that “bounded rationality is not irrationality.” Bounded rationality is a more accurate description of human behavior, taking into account that people make decisions with limited time, knowledge, and computational power. This view, initially proposed by Simon (1956, 1983), recognizes that the lack of compliance with consistency principles could, in fact, be functional. For example, the benefit of a quick decision could easily outweigh the cost of some inconsistency.

How reasonable are violations of the five consistency principles discussed here? We will begin with perfect consistency, a principle that follows from deterministic choice theories. If individuals had “true” preferences, they might find themselves at a distinct disadvantage if they deviated from those preferences, especially in the long run. However, in dynamic environments, where individuals are continually faced with new sets of options, some degree of variability is beneficial for exploring the environment. A competitive environment is another case where some degree of variability could be desirable. The decisionmaker whose choices are difficult to forecast has an advantage over a predictable opponent. In fact, in some zero-sum games, stochastic decisions, representing mixed strategies, constitute the only Nash equilibrium and thus define optimal behavior. In sum, these real-world advantages of variable preferences could easily carry over into artificial, stable laboratory settings.

Numerous studies cast doubt on the idea that people possess “true” preferences in the sense of fixed utility theories. Instead, the evidence suggests that people construct or discover their preferences when presented with option sets. The unfolding of these processes depends on the circumstances of the choice situation (i.e., the time pressure, the stakes, and the way in which preferences are elicited) and on the options under consideration (i.e., the context and the frame). The attributes an individual uses for comparing options might depend on the option set. Their relative “importance” could depend on the similarity of attribute values, and thus explain comparability effects. Focusing the individual’s attention on attributes with different values seems like an efficient and reasonable form of decision making.

Violations of IIA could also be reasonable. For instance, Greene (2000, p. 864), in his standard econometrics textbook, writes that the principle “is not a particularly appealing restriction to place on consumer behavior.” David Brownstone and Train (1999, p. 110) argue that the substitution effects that are implied by the IIA principle “can be unrealistic in many settings,” and McFadden (2001, p. 358) argues that they “may not be behaviorally plausible.” Likewise, Hensher and Greene (2003) regard the principle as “questionable.”

The famous “red-bus blue-bus” example described above shows that in many situations it appears unreasonable to follow the IIA principle. Amartya Sen (1993) presented another illustrative example of a decision as to whether to take an apple out of a fruit basket at a dinner party. A person who behaves decently might not take the apple if it is the last. But, if more than one apple is left in the basket, the same person might take an apple. Of course, the additional apples are not really “independent” because they change the consequences of the original option (i.e., norms of politeness). However, this is exactly why violations of independence occur: when people construct their preferences in a choice situation, additional options can change the consequences of other options or the perception of these consequences.

A decision process that constructs or discovers preferences could also explain violations of regularity. The addition of new options to a choice set (i.e., adding a less attractive option) systematically influences preferences for options in the original set, violating the regularity principle. These types of contextual effects need not be unreasonable. When evaluating choice sets, such as those in table 2, participants in an experiment might infer that a dominated option, such as D, was included in the choice set to provide useful information (Paul H. Grice 1975, 1989; see also Denis J. Hilton 1995). Participants might infer that laptop computers with large displays, such as A, are relatively uncommon and perhaps less desirable than laptop computers with smaller displays, such as B and D. If the market for larger displays is small, B could be the best choice.

Birger Wernerfelt (1995) made a similar argument by suggesting that people use the available options to draw inferences about their own tastes and utilities. Decisionmakers who know their relative but not their absolute utilities could make assumptions about the correct choice from the choice set if they believed the options reflected population utilities. They would use the market offerings to assign utilities and infer their own preferences. Drazen Prelec, Wernerfelt, and Florian Zettelmeyer (1997) developed a context model that reflects simple inferences about one’s own preferences (ideal points) given what is available (product “addresses”). They generate predictions that can simulate compromise and attraction effects. Though promising, their model requires additional information (category ratings) that reflects a decisionmaker’s ideal point and his or her perceptions of product addresses.

In sum, the different violations of consistency principles can be justifiable within a “bounded rationality” framework. Bounded rationality stresses how choices depend on the characteristics of the environment. It also stresses how the selection of simple strategies can be quite adaptive. For example, Gigerenzer and Daniel G. Goldstein (1996) demonstrated that a simple lexicographic heuristic provided reasonable predictions in a paired-comparison inference task. If a heuristic is fairly good, it might be beneficial, even if it occasionally produces intransitive choices. Consistent with this, Rieskamp and Otto (2006) showed that simple heuristics can predict individuals’ choices quite well. Warren Thorngate (1980) and Payne, Bettman, and Johnson (1988) demonstrated that simple heuristics can also be efficient when choosing among options with a flat maximum. Another heuristic that can lead to intransitivities is the majority rule (e.g., Kenneth O. May 1954; Maya Bar-Hillel and Avishai Margalit 1988). With this rule, the decisionmaker simply counts the number of dimensions on which an option has an advantage (disregarding the magnitude of the advantages) and selects the option with the greatest number of advantages. By focusing only on the relative advantage of an option compared to alternative options, the majority rule can violate transitivity. Nonetheless, these heuristics, such as the lexicographic or majority rule, require only a small subset of the available information, which is then easy to process. Heuristics can be regarded as “approximation methods” of more complex strategies, and by “using such methods . . . we implicitly assume that the world is not designed to take advantage of our approximation methods,” whereas experiments are often “designed with exactly that goal in mind” (Tversky 1969, p. 46).
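The cycle that the majority rule can produce is easy to see in a small sketch; the three options and their dimension values below are hypothetical, chosen only so that the dimension-wise rankings rotate.

    from itertools import combinations

    # Three hypothetical options rated on three dimensions.
    options = {"A": (3, 1, 2), "B": (2, 3, 1), "C": (1, 2, 3)}

    def majority_choice(x, y):
        """Choose the option that wins on more dimensions, ignoring the size of the margins."""
        wins_x = sum(xi > yi for xi, yi in zip(options[x], options[y]))
        wins_y = sum(yi > xi for xi, yi in zip(options[x], options[y]))
        return x if wins_x > wins_y else y

    for x, y in combinations(options, 2):
        print(x, "vs", y, "->", majority_choice(x, y))
    # A beats B and B beats C, yet C beats A: an intransitive cycle.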

The traditional argument that transitivity is a cornerstone of rationality is based on the observation that a decisionmaker who violates the axiom may be “taken” by a money pump (see Davidson, John Charles C. McKinsey, and Suppes 1955; Robin P. Cubitt and Sugden 2001). Being “taken” means that a decisionmaker who starts with an option and cycles through a series of choices, paying each time for a new option, will eventually end up with less money and the original option. It is unquestionable that such intransitive choices are costly and unreasonable. However, we do not encounter money pumps very often and, if we did, we would probably learn to avoid them. As experimental results show (Yun-Peng Chu and Ruey-Ling Chu 1990), building a money pump is not easy because people typically recognize their intransitive choices and quickly change their behavior. Thus, money pump “arguments are logical bogeymen” (Lopes 1996, p. 187) that demonstrate how irrational behavior in principle could occur, but that do not show that irrational behavior in fact occurs.

The regular, systematic, and robust violations of consistency set the requirements for descriptive theories of choice. The violations we have discussed are interrelated in subtle and interesting ways; they illustrate systematic behavioral principles. Whether they represent rational or irrational behavior might be analogous to asking whether the glass is half empty or half full. What really matters is (1) understanding how and why people make choices and (2) being able to predict those choices. Debates about rationality focus attention far too narrowly. A broader conversation, one that considers reasonable behavior, adaptive behavior, and the environment in which choices occur, is long overdue. We look forward to this shift in focus and the related evidence and theories that will unfold in the next several decades.

APPENDICES

Appendix A: Proof that the elimination by aspects theory cannot explain compromise effects.


Consider a choice set consisting of three cameras labeled A, B, and C. A is a high-quality camera with many high-quality features and is very expensive; B is a low-quality camera with no high-quality features and is quite cheap; C is between the two extremes, having some, but not all, of the high-quality features and is moderately priced. The compromise camera shares some features with both extreme options. With the compromise effect, p(A|{A,B}) = p(A|{A,C}) = p(B|{B,C}). But, when the three options are evaluated together, p(C|{A,B,C}) > p(A|{A,B,C}) and p(C|{A,B,C}) > p(B|{A,B,C}). The componential context theory (Tversky and Simonson 1993) predicts the compromise effect. In contrast, the elimination by aspects theory (Tversky 1972b) cannot predict the effect, as shown below. In the elimination by aspects theory, alternatives are composed of common and unique positive aspects:

a = unique aspects of A,
b = unique aspects of B,
c = unique aspects of C,
ab = common aspects between A and B, but not with C,
ac = common aspects between A and C, but not with B, and
bc = common aspects between B and C, but not with A.

Let u be a scale that assigns a positive value to these aspects so that, for instance, u(a) represents the positive value of aspects a to the decisionmaker. Cameras A and C share some high-quality features, whereas cameras A and B have little in common, so that u(ac) > u(ab). Cameras B and C share the positive aspect of not being very expensive, which camera A does not share, so that u(bc) > u(ab). Thus, on the basis of the similarity relations among the choice options, it can be assumed that:

(A1)  u(ac) > u(ab) and u(bc) > u(ab).

These assumptions, regarding the mapping from options to similarity relations, are essential for EBA to account for the similarity effect (see Tversky 1972b). For binary choices, EBA predicts:

(A2)  p(A|{A,B}) = [u(a) + u(ac)]/[u(a) + u(ac) + u(b) + u(bc)].

The compromise effect implies that the choice probabilities for the three binary choices are equal so that:

(A3) u(a) + u(ac) = u(b) + u(bc),

(A4) u(a) + u(ab) = u(c) + u(bc), and

(A5) u(b) + u(ab) = u(c) + u(ac).

From (A1) and (A5) it is known that u(b) > u(c), and from (A1) and (A4) it is known that u(a) > u(c). For triadic choices, EBA predicts

p(A|{A,B,C}) = [u(a) + u(ab) · p(A|{A,B}) + u(ac) · p(A|{A,C})]/u(K),

where u(K) = [u(a) + u(b) + u(c) + u(ab) + u(ac) + u(bc)]. The compromise effect implies

p(C|{A,B,C}) > p(A|{A,B,C}), or
[u(c) + .5(u(ac) + u(bc))]/u(K) > [u(a) + .5(u(ab) + u(ac))]/u(K), and

p(C|{A,B,C}) > p(B|{A,B,C}), or
[u(c) + .5(u(ac) + u(bc))]/u(K) > [u(b) + .5(u(ab) + u(bc))]/u(K).

If both sides of the inequalities are multiplied by 2 and u(K) is cancelled out, the result is:

(A6)  [u(c) + u(bc) + u(c) + u(ac)] > [u(a) + u(ac) + u(a) + u(ab)],

(A7)  [u(c) + u(bc) + u(c) + u(ac)] > [u(b) + u(bc) + u(b) + u(ab)].

With the substitutions from (A3) and (A4), inequality (A7) can be rewritten as

[u(a) + u(ab) + u(c) + u(ac)] > [u(a) + u(ac) + u(b) + u(ab)].

After canceling on both sides, it is found that u(c) > u(b), contradicting the earlier conclusion that u(b) > u(c) from (A1) and (A5). Similarly, with the substitutions from (A4), inequality (A6) can be rewritten as

[u(a) + u(ab) + u(c) + u(ac)] > [u(a) + u(ac) + u(a) + u(ab)].

After canceling on both sides, it is found that u(c) > u(a), contradicting the earlier conclusion that u(a) > u(c) from (A1) and (A4). Therefore, EBA cannot predict the compromise effect.
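A numerical companion to this proof (not part of the original argument) samples aspect values that satisfy (A1) together with equations (A3)–(A5) and confirms that the EBA probability of the compromise option C never exceeds that of both extremes; the sampling scheme itself is an illustrative assumption.

    import random

    # Sample aspect values that satisfy (A1) together with (A3)-(A5) and check whether
    # EBA ever makes the compromise option C the most popular triadic choice.
    random.seed(0)

    def eba_triadic(a, b, c, ab, ac, bc):
        """EBA probabilities for {A, B, C}; binary probabilities equal .5 by construction."""
        K = a + b + c + ab + ac + bc
        pA = (a + 0.5 * (ab + ac)) / K
        pB = (b + 0.5 * (ab + bc)) / K
        pC = (c + 0.5 * (ac + bc)) / K
        return pA, pB, pC

    compromise_found = False
    for _ in range(100_000):
        ab = random.uniform(0.0, 1.0)
        ac = random.uniform(ab, ab + 1.0)   # u(ac) > u(ab), as in (A1)
        bc = random.uniform(ab, ab + 1.0)   # u(bc) > u(ab), as in (A1)
        c = random.uniform(0.0, 1.0)
        a = c + bc - ab                     # from (A4)
        b = c + ac - ab                     # from (A5)
        pA, pB, pC = eba_triadic(a, b, c, ab, ac, bc)
        if pC > pA and pC > pB:
            compromise_found = True
    print("compromise effect produced by EBA:", compromise_found)   # False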

Appendix B: Proof that the componential context theory cannot explain similarity effects.

Consider a choice set consisting of three cameras labeled A, B, and D. A is a high-quality camera with many high-quality features and is expensive; B is a low-quality camera with no extra features and is very cheap. Camera D is a slightly higher-quality camera than A; it has more quality features and a slightly higher price than A. D is very similar to A, and both A and D are quite different from B. The similarity effect implies that p(A|{A,B}) = p(A|{A,D}) = p(B|{B,D}). But, when the three options are evaluated together, either p(B|{A,B,D}) > p(A|{A,B,D}) or p(B|{A,B,D}) > p(D|{A,B,D}). The proof that the componential context theory cannot predict the similarity effect can be demonstrated with the first of the two triadic choices.

The componential context theory is based on the idea that people consider the advantages of one option over those of another. Camera A has a quality advantage over B, and camera B has a price advantage over A. The quality advantage of A over B is denoted adv(A,B) and the price advantage of B over A is denoted adv(B,A). For triadic choice sets, the model assumes:

(B1)  p(A|{A,B,D}) = F[V(A|{A,B,D}), V(B|{A,B,D}), V(D|{A,B,D})],

where F is an increasing function of the first argument and a decreasing function of the second two. Furthermore:

(B2)  V(A|{A,B,D}) = ∑k βk vk(A) + θ[R(A,B) + R(A,D)],

where ∑k βk vk(A) = v(A) represents the value v of option A, neglecting advantages of other options, and:

(B3)  R(A,B) = adv(A,B)/[adv(A,B) + dis(A,B)] = ∑k advk(A,B)/[∑k advk(A,B) + ∑k disk(A,B)],

where advk(A,B) = vk(A) − vk(B) if A has a positive advantage over B on attribute k, and zero otherwise, and disk(A,B) = δ[vk(B) − vk(A)] if option A has a disadvantage relative to B on attribute k, and zero otherwise, and where δ is a convex function consistent with loss aversion, so that the value for the disadvantage of A over B is larger than the corresponding advantage of B over A. Convexity is required to explain the compromise effect (Tversky and Simonson 1993).

The fact that the binary choices are all equal implies: adv(B,D) = adv(D,B) and v(B) = v(D); adv(B,A) = adv(A,B) and v(A) = v(B); adv(D,A) = adv(A,D) and v(A) = v(D).

For the similarity effect, as described above, the largest advantage on price occurs between B and D, the next largest advantage occurs between B and A, and the smallest advantage occurs between A and D. That is:

(B4) adv(B,D) > adv(B,A) > adv(A,D).

To explain the similarity effect we must obtain

(B5)  V(B|{A,B,D}) > V(A|{A,B,D}),

and because

adv(B,A)/[adv(B,A) + dis(B,A)] − adv(A,B)/[adv(A,B) + dis(A,B)] = 0,

(B5) is true when

adv(B,D)/[adv(B,D) + dis(B,D)] > adv(A,D)/[adv(A,D) + dis(A,D)],

or dis(A,D)/adv(A,D) > dis(B,D)/adv(B,D).

But if adv(B,D) > adv(A,D), as in (B4), this inequality must be false for convex δ functions. Thus, the componential context model cannot predict the similarity effect.

Appendix C: Proof that mixed logit models satisfy regularity.

Assume that X is a complete set of options and Y is a subset of X (Y ⊂ X), that is, Y = {A1,…,Am} and X = {A1,…,An} with n ≥ m. Define L(Ai|Y;β) as the probability that option Ai is chosen from a set Y given a fixed set of coefficients β. For example, L may be defined by a mixed logit model with random coefficients β (see, for example, McFadden and Train 2000). Assume that L(Ai|Y;β) satisfies regularity; in other words, given Y ⊂ X, it is assumed that L(Ai|Y;β) − L(Ai|X;β) > 0 for every β. The choice probability p(Ai|Y) is then given by a probability mixture of L(Ai|Y;β), defined by the integral over the mixing density f:

p(Ai|Y) = ∫ L(Ai|Y;β) · f(β) dβ.

Then the difference

p(Ai|Y) − p(Ai|X) = ∫ L(Ai|Y;β) · f(β) dβ − ∫ L(Ai|X;β) · f(β) dβ = ∫ [L(Ai|Y;β) − L(Ai|X;β)] · f(β) dβ > 0

is always positive, which implies that p(Ai|Y) > p(Ai|X). Therefore, the mixed logit model satisfies regularity.
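The discrete sketch below mirrors this argument with three equally likely coefficient vectors in place of the integral; the attribute values and coefficients are assumptions chosen only for illustration.

    import numpy as np

    # Assumed attribute values for three options and three equally likely coefficient
    # vectors; the discrete mixture stands in for the integral over f(beta).
    X_attr = np.array([[1.0, 2.0],    # A1
                       [2.0, 1.0],    # A2
                       [1.5, 1.5]])   # A3, the option added to enlarge the set
    betas = [np.array([0.8, 0.3]), np.array([-0.2, 1.1]), np.array([0.5, 0.5])]

    def logit(option, subset, beta):
        """Standard logit probability of `option` within `subset` for a fixed beta."""
        v = np.exp(X_attr[subset] @ beta)
        return v[subset.index(option)] / v.sum()

    p_small = np.mean([logit(0, [0, 1], b) for b in betas])      # p(A1|{A1,A2})
    p_large = np.mean([logit(0, [0, 1, 2], b) for b in betas])   # p(A1|{A1,A2,A3})
    print(round(p_small, 3), ">=", round(p_large, 3))            # regularity holds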

Appendix D: Example of how the mixed logit model can explain violations of weak stochastic transitivity.

Assume that Y is a set of options Y = {A1,…,Am}, for instance a number of college applicants, following Tversky’s (1969) example. Each candidate i is described by three attribute values xik: intellectual stability (k = 1), emotional stability (k = 2), and social facility (k = 3). Suppose the first candidate has the attribute values x11 = 66, x12 = 90, and x13 = 85, the second candidate has the values x21 = 78, x22 = 70, and x23 = 55, and the third candidate has the values x31 = 90, x32 = 32, and x33 = 4. We can apply the mixed logit model as defined by equation 7, such that the choice probability when choosing between the first and second candidates is

p(A1|{A1,A2}) = exp(∑k βkq · x1k)/[exp(∑k βkq · x1k) + exp(∑k βkq · x2k)].

For convenience, define viq = exp(∑k βkq · xik). Following the logic of the mixed logit model, assume the vector of weights βq is selected with equal probability of 1/3 from the set Q = {(β11 = −1, β21 = 3.2, β31 = −2.6), (β12 = 1, β22 = −3.025, β32 = 2.45), (β13 = 0.2, β23 = −0.595, β33 = 0.51)}. Then the probability of choosing the first candidate when comparing the first and second candidates is:

p(A1|{A1,A2}) = [v11/(v11 + v21) + v12/(v12 + v22) + v13/(v13 + v23)]/3
= [e^1/(e^1 + e^3) + e^2/(e^2 + e^1) + e^3/(e^3 + e^2)]/3 = 0.53.

Correspondingly, the probability of choosing the second candidate when comparing the second and third candidates can be determined, so that p(A2|{A2,A3}) = 0.53 results. Finally, when comparing the first and the third candidates, the probability p(A1|{A1,A3}) = .47 results, which violates weak stochastic transitivity.
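The arithmetic can be checked directly; the short script below recomputes the three pairwise probabilities from the attribute values and weight vectors given in the example.

    import numpy as np

    # Attribute values of the three candidates and the three equally likely weight
    # vectors, exactly as given in the example above.
    x = np.array([[66.0, 90.0, 85.0],
                  [78.0, 70.0, 55.0],
                  [90.0, 32.0, 4.0]])
    Q = np.array([[-1.0, 3.2, -2.6],
                  [1.0, -3.025, 2.45],
                  [0.2, -0.595, 0.51]])

    def p(i, j):
        """Mixed logit probability of candidate i over candidate j, averaged over Q."""
        vi, vj = np.exp(Q @ x[i]), np.exp(Q @ x[j])
        return np.mean(vi / (vi + vj))

    print(round(p(0, 1), 2), round(p(1, 2), 2), round(p(0, 2), 2))   # 0.53 0.53 0.47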

Appendix E: Proof that the generalized extreme value model satisfies weak stochastic transitivity.

Assume that Y is a set of options Y = {A1,…,Am}. The generalized extreme value model defines the binary choice probability as follows:

p(Ai|{Ai,Aj}) = exp(βXi/θij)/[exp(βXi/θij) + exp(βXj/θij)],

where Xi is the vector of attributes of option Ai, β is a vector of weights assigned to the attributes, and θij is a free similarity parameter that depends on the similarity between the pair of options for each choice set and that can account for violations of IIA. Note that

p(Ai|{Ai,Aj}) > .50 → p(Ai|{Ai,Aj})/p(Aj|{Ai,Aj}) = exp(βXi/θij)/exp(βXj/θij) > 1 → βXi > βXj.

Therefore, if p(Ai|{Ai,Aj}) > .50 and p(Aj|{Aj,Ak}) > .50, then βXi > βXj and βXj > βXk → βXi > βXk → p(Ai|{Ai,Ak}) > .50.

REFERENCES

Ariely, Dan, and Thomas S. Wallsten. 1995. “Seeking Subjective Dominance in Multidimensional Space: An Exploration of the Asymmetric Dominance Effect.” Organizational Behavior and Human Decision Processes, 63(3): 223–32.

Aumann, Robert J. 1962. “Utility Theory Without the Completeness Axiom.” Econometrica, 30(3): 445–62.

Ballinger, T. Parker, and Nathaniel T. Wilcox. 1997. “Decisions, Error and Heterogeneity.” Economic Journal, 107(443): 1090–1105.

Bar-Hillel, Maya, and Avishai Margalit. 1988. “How Vicious are Cycles of Intransitive Choice?” Theory and Decision, 24(2): 119–45.

Becker, Gordon, Morris H. Degroot, and Jacob Marschak. 1963. “Probabilities of Choices Among Very Similar Objects: An Experiment to Decide Between Two Models.” Behavioral Science, 8(4): 306–11.

Bell, David E. 1982. “Regret in Decision Making Under Uncertainty.” Operations Research, 30(5): 961–81.

Berry, Steven T. 1994. “Estimating Discrete-Choice Models of Product Differentiation.” RAND Journal of Economics, 25(2): 242–62.

Berry, Steven T., James Levinsohn, and Ariel Pakes. 1995. “Automobile Prices in Market Equilibrium.” Econometrica, 63(4): 841–90.

Berry, Steven, James Levinsohn, and Ariel Pakes. 1999. “Voluntary Export Restraints on Automobiles: Evaluating a Trade Policy.” American Economic Review, 89(3): 400–430.

Birnbaum, Michael H., Jamie N. Patton, and Melissa K. Lott. 1999. “Evidence Against Rank-Dependent Utility Theories: Tests of Cumulative Independence, Interval Independence, Stochastic Dominance, and Transitivity.” Organizational Behavior and Human Decision Processes, 77(1): 44–83.

Bock, Richard D., and Lyle V. Jones. 1968. The Measurement and Prediction of Judgment and Choice. San Francisco: Holden-Day.

Brownstone, David, and Kenneth E. Train. 1999. “Forecasting New Product Penetration with Flexible Substitution Patterns.” Journal of Econometrics, 89(1–2): 109–29.

Budescu, David V., and Wendy Weiss. 1987. “Reflection of Transitive and Intransitive Preferences: A Test of Prospect Theory.” Organizational Behavior and Human Decision Processes, 39(2): 184–202.

Busemeyer, Jerome R. 1985. “Decision Making Under Uncertainty: A Comparison of Simple Scalability, Fixed-Sample, and Sequential-Sampling Models.” Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(3): 538–64.

Busemeyer, Jerome R., and Adele Diederich. 2002. “Survey of Decision Field Theory.” Mathematical Social Sciences, 43(3): 345–70.

Busemeyer, Jerome R., and James T. Townsend. 1993. “Decision Field Theory: A Dynamic–Cognitive Approach to Decision Making in an Uncertain Environment.” Psychological Review, 100(3): 432–59.

Busemeyer, Jerome R., Ethan Weg, Rachel Barkan, Xuyang Li, and Zhengping Ma. 2000. “Dynamic and Consequential Consistency of Choices Between Paths of Decision Trees.” Journal of Experimental Psychology: General, 129(4): 530–45.

Camerer, Colin F. 1989. “An Experimental Test of Several Generalized Utility Theories.” Journal of Risk and Uncertainty, 2(1): 61–104.

Chu, Yun-Peng, and Ruey-Ling Chu. 1990. “The Subsidence of Preference Reversals in Simplified and Marketlike Experimental Settings: A Note.” American Economic Review, 80(4): 902–11.

Cohen, Michele, and Jean-Yves Jaffray. 1988. “Is Savage’s Independence Axiom a Universal Rationality Principle?” Behavioral Science, 33(1): 38–47.

Coombs, Clyde H. 1958. “On the Use of Inconsistency of Preferences in Psychological Measurement.” Journal of Experimental Psychology, 55(1): 1–7.

Coombs, Clyde H., and Dean G. Pruitt. 1961. “Some Characteristics of Choice Behavior in Risky Situations.” Annals of the New York Academy of Sciences, 89(5): 784–94.

Cubitt, Robin P., and Robert Sugden. 2001. “On Money Pumps.” Games and Economic Behavior, 37(1): 121–60.

Davidson, Donald, and Jacob Marschak. 1959. “Experimental Tests of a Stochastic Decision Theory.” In Measurement: Definitions and Theories, ed. West C. Churchman and Philburn Ratoosh. New York: Wiley, 233–69.

Davidson, Donald, John Charles C. McKinsey, and Patrick Suppes. 1955. “Outlines of a Formal Theory of Value, I.” Philosophy of Science, 22(2): 140–60.

Davis, John M. 1958. “The Transitivity of Preferences.” Behavioral Science, 3(1): 26–33.

Debreu, Gerard. 1958. “Stochastic Choice and Cardinal Utility.” Econometrica, 26(3): 440–44.


Debreu, Gerard. 1960. “Review of R. D. Luce, Individual Choice Behavior: A Theoretical Analysis.” American Economic Review, 50(1): 186–88.

Dhar, Ravi, Stephen M. Nowlis, and Steven J. Sherman. 2000. “Trying Hard or Hardly Trying: An Analysis of Context Effects in Choice.” Journal of Consumer Psychology, 9(4): 189–200.

Domencich, Tom, and Daniel L. McFadden. 1975. Urban Travel Demand: A Behavioral Analysis. Amsterdam: North-Holland.

Edwards, Ward. 1954. “The Theory of Decision Making.” Psychological Bulletin, 51(4): 380–417.

Edwards, Ward. 1961. “Behavioral Decision Theory.” Annual Review of Psychology, 12: 473–98.

Falmagne, Jean Claude, and Geoffrey Iverson. 1979. “Conjoint Weber Laws and Additivity.” Journal of Mathematical Psychology, 20(2): 164–83.

Fishburn, Peter C. 1982. “Nontransitive Measurable Utility.” Journal of Mathematical Psychology, 26(1): 31–67.

Gigerenzer, Gerd. 1991. “How to Make Cognitive Illusions Disappear: Beyond ‘Heuristics and Biases’.” In European Review of Social Psychology, Vol. 2, ed. Wolfgang Stroebe and Miles Hewstone. New York: Wiley, 83–115.

Gigerenzer, Gerd. 1996a. “Rationality: Why Social Context Matters.” In Interactive Minds: Life-Span Perspectives on the Social Foundation of Cognition, ed. Paul B. Baltes and Ursula M. Staudinger. Cambridge: Cambridge University Press, 319–46.

Gigerenzer, Gerd. 1996b. “On Narrow Norms and Vague Heuristics: A Reply to Kahneman and Tversky.” Psychological Review, 103(3): 592–96.

Gigerenzer, Gerd, and Daniel G. Goldstein. 1996. “Reasoning the Fast and Frugal Way: Models of Bounded Rationality.” Psychological Review, 103(4): 650–69.

Gigerenzer, Gerd, and Reinhard Selten, eds. 2001. Bounded Rationality: The Adaptive Toolbox. Cambridge: MIT Press.

Gonzalez-Vallejo, Claudia. 2002. “Making Trade-Offs: A Probabilistic and Context-Sensitive Model of Choice Behavior.” Psychological Review, 109(1): 137–55.

Green, Jerry R., and Bruno Jullien. 1989. “Ordinal Independence in Nonlinear Utility Theory: Erratum.” Journal of Risk and Uncertainty, 2(1): 119.

Greene, William H. 2000. Econometric Analysis. Upper Saddle River, N.J.: Prentice-Hall.

Grether, David M., and Charles R. Plott. 1979. “Economic Theory of Choice and the Preference Reversal Phenomenon.” American Economic Review, 69(4): 623–38.

Grice, Paul H. 1975. “Logic and Conversation.” In Syntax and Semantics 3: Speech Acts, ed. Peter Cole and Jerry L. Morgan. New York: Academic Press, 43–58.

Grice, Paul H. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.

Griswold, Betty J., and R. Duncan Luce. 1962. “Choices among Uncertain Outcomes: A Test of a Decomposition and Two Assumptions of Transitivity.” American Journal of Psychology, 75(1): 35–44.

Hammond, Kenneth R. 1996. Human Judgment and Social Policy. New York: Oxford University Press.

Harbaugh, William T., Kate Krause, and Timothy R. Berry. 2001. “GARP for Kids: On the Development of Rational Choice Behavior.” American Economic Review, 91(5): 1539–45.

Harless, David W., and Colin F. Camerer. 1994. “The Predictive Utility of Generalized Expected Utility Theories.” Econometrica, 62(6): 1251–89.

Hausman, Jerry A., and David A. Wise. 1978. “A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences.” Econometrica, 46(2): 403–26.

Heath, Timothy B., and Subimal Chatterjee. 1995. “Asymmetric Decoy Effects on Lower-Quality versus Higher-Quality Brands: Meta-Analytic and Experimental Evidence.” Journal of Consumer Research, 22(3): 268–84.

Hensher, David A. 1986. “Dimensions of Automobile Demand: An Overview of an Australian Research Project.” Environment and Planning A, 18(10): 1339–74.

Hensher, David A., and William H. Greene. 2003. “The Mixed Logit Model: The State of Practice.” Transportation, 30(2): 133–76.

Hey, John D. 2001. “Does Repetition Improve Consistency?” Experimental Economics, 4(1): 5–54.

Hey, John D., and Chris Orme. 1994. “Investigating Generalizations of Expected Utility Theory Using Experimental Data.” Econometrica, 62(6): 1291–1326.

Highhouse, Scott. 1996. “Context-Dependent Selection: The Effects of Decoy and Phantom Job Candidates.” Organizational Behavior and Human Decision Processes, 65(1): 68–76.

Hilton, Denis J. 1995. “The Social Context of Reasoning: Conversational Inference and Rational Judgment.” Psychological Bulletin, 118(2): 248–71.

Hong, Chew Soo, Edi Karni, and Zvi Safra. 1987. “Risk Aversion in the Theory of Expected Utility with Rank Dependent Probabilities.” Journal of Economic Theory, 42(2): 370–81.

Huber, Joel, John W. Payne, and Christopher Puto. 1982. “Adding Asymmetrically Dominated Alternatives: Violations of Regularity and the Similarity Hypothesis.” Journal of Consumer Research, 9(1): 90–98.

Huber, Joel, and Christopher Puto. 1983. “Market Boundaries and Product Choice: Illustrating Attraction and Substitution Effects.” Journal of Consumer Research, 10(1): 31–44.

Keeney, Ralph L., and Howard Raiffa. 1976. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Cambridge University Press.

Krantz, David H. 1967. “Rational Distance Functions for Multidimensional Scaling.” Journal of Mathematical Psychology, 4(2): 226–45.

Leland, Jonathan W. 1994. “Generalized Similarity Judgments: An Alternative Explanation for Choice Anomalies.” Journal of Risk and Uncertainty, 9(2): 151–72.

Lindman, Harold R., and James Lyons. 1978. “Stimulus Complexity and Choice Inconsistency Among Gambles.” Organizational Behavior and Human Decision Processes, 21(2): 146–59.


Linhart, Heinz, and Walter Zucchini. 1986. Model Selection. New York: Wiley.

Loomes, Graham, and Robert Sugden. 1982. “Regret Theory: An Alternative Theory of Rational Choice under Uncertainty.” Economic Journal, 92(368): 805–24.

Loomes, Graham, and Robert Sugden. 1995. “Incorporating a Stochastic Element into Decision Theories.” European Economic Review, 39(3–4): 641–48.

Lopes, Lola L. 1996. “When Time Is of the Essence: Averaging, Aspiration, and the Short Run.” Organizational Behavior and Human Decision Processes, 65(3): 179–89.

Lopes, Lola L., and Gregg C. Oden. 1991. “The Rationality of Intelligence.” In Probability and Rationality: Studies on L. Jonathan Cohen’s Philosophy of Science, ed. Ellery Eels and Tomasz Maruszewski. Amsterdam: Editions Rodopi, 199–223.

Luce, R. Duncan. 1956. “Semiorders and a Theory of Utility Discrimination.” Econometrica, 24(2): 178–91.

Luce, R. Duncan. 1958. “A Probabilistic Theory of Choice.” Econometrica, 26(2): 193–224.

Luce, R. Duncan. 1959. Individual Choice Behavior. New York: Wiley.

Luce, R. Duncan. 1990. “Rational Versus Plausible Accounting Equivalences in Preference Judgments.” Psychological Science, 1(4): 225–34.

Luce, R. Duncan. 2000. Utility of Gains and Losses: Measurement-Theoretical and Experimental Approaches. Mahwah, N.J.: Lawrence Erlbaum Associates.

Luce, R. Duncan, Barbara A. Mellers, and Shi-jie Chang. 1993. “Is Choice the Correct Primitive? On Using Certainty Equivalents and Reference Levels to Predict Choices among Gambles.” Journal of Risk and Uncertainty, 6(2): 115–43.

Luce, R. Duncan, and Patrick Suppes. 1965. “Preference, Utility, and Subjective Probability.” In Handbook of Mathematical Psychology, Vol. 3, ed. R. Duncan Luce, Robert B. Bush, and Eugene Galanter. New York: Wiley, 249–410.

Luce, R. Duncan, and Detlof von Winterfeldt. 1994. “What Common Ground Exists for Descriptive, Prescriptive, and Normative Utility Theories?” Management Science, 40(2): 263–79.

MacCrimmon, Kenneth R. 1968. “Descriptive and Normative Implications of the Decision-Theory Postulates.” In Risk and Uncertainty, ed. Karl Borch and Jan Mossin. New York: St. Martin’s Press, 3–23.

Machina, Mark J. 1982. “‘Expected Utility’ Analysis without the Independence Axiom.” Econometrica, 50(2): 277–323.

Machina, Mark J. 1985. “Stochastic Choice Functions Generated from Deterministic Preferences over Lotteries.” Economic Journal, 95(379): 575–94.

Manski, Charles F. 1977. “Structure of Random Utility Models.” Theory and Decision, 8(3): 229–54.

Marley, Anthony Alfred J. 1992. “A Selective Review of Recent Characterizations of Stochastic Choice Models Using Distribution and Functional Equation Techniques.” Mathematical Social Sciences, 23(1): 5–29.

Marley, Anthony Alfred J., and Hans Colonius. 1992. “The ‘Horse Race’ Random Utility Model for Choice Probabilities and Reaction Times, and its Competing Risks Interpretation.” Journal of Mathematical Psychology, 36(1): 1–20.

May, Kenneth O. 1954. “Transitivity, Utility, and Aggregation in Preferences Patterns.” Econometrica, 22(1): 1–13.

McFadden, Daniel. 1974. “Conditional Logit Analysis of Qualitative Choice Behavior.” In Frontiers in Econometrics, ed. Paul Zarembka. New York: Academic Press, 105–42.

McFadden, Daniel. 1978. “Modeling the Choice of Residential Location.” In Spatial Interaction Theory and Planning Models, ed. Anders Karlqvist, Lars Lundqvist, Folke Snickars, and Jorgen W. Weibull. New York: North-Holland, 75–96.

McFadden, Daniel. 1981. “Econometric Models of Probabilistic Choice.” In Structural Analysis of Discrete Data with Economic Applications, ed. Charles F. Manski and Daniel McFadden. Cambridge: MIT Press, 198–272.

McFadden, Daniel. 1999. “Rationality for Economists?” Journal of Risk and Uncertainty, 19(1–3): 73–105.

McFadden, Daniel. 2001. “Economic Choices.” American Economic Review, 91(3): 351–78.

McFadden, Daniel, and Kenneth E. Train. 2000. “Mixed MNL Models for Discrete Response.” Journal of Applied Econometrics, 15(5): 447–70.

McLean, Iain. 1995. “Independence of Irrelevant Alternatives before Arrow.” Mathematical Social Sciences, 30(2): 107–26.

Mellers, Barbara A., and Karen Biagini. 1994. “Similarity and Choice.” Psychological Review, 101(3): 505–18.

Mellers, Barbara A., Shi-Jie Chang, Michael H. Birnbaum, and Lisa D. Ordonez. 1992. “Preferences, Prices, and Ratings in Risky Decision Making.” Journal of Experimental Psychology: Human Perception and Performance, 18(2): 347–61.

Meyer, Robert J., and Barbara E. Kahn. 1991. “Probabilistic Models of Consumer Choice Behavior.” In Handbook of Consumer Behavior, ed. Thomas S. Robertson and Harold R. Kassarjian. Upper Saddle River, N.J.: Prentice Hall, 85–123.

Mishra, Sanjay, U.N. Umesh, and Donald E. Stem. 1993. “Antecedents of the Attraction Effect: An Information-Processing Approach.” Journal of Marketing Research, 30(3): 331–49.

Mongin, Philippe. 2000. “Does Optimization Imply Rationality?” Synthese, 124(1): 73–111.

Montgomery, Henry. 1977. “A Study of Intransitive Preferences Using a Think-Aloud Procedure.” In Decision-Making and Change in Human Affairs, ed. Helmut Jungermann and Gerard de Zeeuw. Dordrecht: Reidel, 347–64.

Morrison, William H. 1963. “Testable Conditions for Triads of Paired Comparison Choices.” Psychometrika, 28(4): 369–90.

Mosteller, Frederick, and Philip Nogee. 1951. “An Experimental Measurement of Utility.” Journal of Political Economy, 59(5): 371–404.

von Neumann, John, and Oskar Morgenstern. 1947. Theory of Games and Economic Behavior. Princeton, N.J.: Princeton University Press.

Pan, Yigang, and Donald R. Lehmann. 1993. “The Influence of New Brand Entry on Subjective Brand Judgments.” Journal of Consumer Research, 20(1): 76–86.

Pan, Yigang, Sue O’Curry, and Robert Pitts. 1995. “TheAttraction Effect and Political Choice in TwoElections.” Journal of Consumer Psychology, 4(1):85–101.

Payne, John W., James R. Bettman, and Eric J. Johnson.1988. “Adaptive Strategy Selection in DecisionMaking.” Journal of Experimental Psychology:Learning, Memory, and Cognition, 14(3): 534–52.

Payne, John W., James R. Bettman, and Eric J.Johnson. 1993. The Adaptive Decisionmaker.Cambridge: Cambridge University Press.

Pitt, Mark A., In Jae Myung, and Shaobo Zhang. 2002. “Toward a Method of Selecting Among Computational Models of Cognition.” Psychological Review, 109(3): 472–91.

Prelec, Drazen, Birger Wernerfelt, and Florian Zettelmeyer. 1997. “The Role of Inference in Context Effects: Inferring What You Want from What Is Available.” Journal of Consumer Research, 24(1): 118–25.

Quiggin, John. 1982. “A Theory of Anticipated Utility.” Journal of Economic Behavior and Organization, 3(4): 323–43.

Radner, Roy, and Jacob Marschak. 1954. “Note on Some Proposed Decision Criteria.” In Decision Processes, ed. Robert M. Thrall, Clyde H. Coombs, and Robert L. Davis. New York: Wiley, 61–68.

Ranyard, Robert H. 1977. “Risky Decisions Which Violate Transitivity and Double Cancellation.” Acta Psychologica, 41(6): 449–59.

Revelt, David, and Kenneth E. Train. 1998. “Mixed Logit with Repeated Choices: Households’ Choices of Appliance Efficiency Level.” Review of Economics and Statistics, 80(4): 647–57.

Rieskamp, Jörg, and Ulrich Hoffrage. 1999. “When Do People Use Simple Heuristics, and How Can We Tell?” In Simple Heuristics That Make Us Smart, ed. Gerd Gigerenzer, Peter M. Todd, and the ABC Research Group. New York: Oxford University Press, 141–67.

Rieskamp, Jörg, and Philipp E. Otto. 2006. “SSL: A Theory of How People Learn to Select Strategies.” Journal of Experimental Psychology: General, 135(2): 202–36.

Roe, Robert M., Jerome R. Busemeyer, and James T. Townsend. 2001. “Multialternative Decision Field Theory: A Dynamic Connectionist Model of Decision Making.” Psychological Review, 108(2): 370–92.

Roelofsma, Peter, and Daniel Read. 2000. “Intransitive Intertemporal Choice.” Journal of Behavioral Decision Making, 13(2): 161–77.

Rubinstein, Ariel. 1988. “Similarity and Decision-Making under Risk (Is There a Utility Theory Resolution to the Allais Paradox?)” Journal of Economic Theory, 46(1): 145–53.

Rumelhart, Donald L., and James G. Greeno. 1971. “Similarity Between Stimuli: An Experimental Test of the Luce and Restle Choice Models.” Journal of Mathematical Psychology, 8(3): 370–81.

Samuelson, Paul. 1953. Foundations of Economic Analysis. Cambridge: Harvard University Press.

Savage, Leonard J. 1954. The Foundations of Statistics. New York: Wiley.

Schmidt, Peter, and Robert P. Strauss. 1975. “Estimation of Models with Jointly Dependent Qualitative Variables: A Simultaneous Logit Approach.” Econometrica, 43(4): 745–55.

Schoemaker, Paul J. H. 1982. “The Expected Utility Model: Its Variants, Purposes, Evidence and Limitations.” Journal of Economic Literature, 20(2): 529–63.

Selten, Reinhard. 1991. “Properties of a Measure of Predictive Success.” Mathematical Social Sciences, 21(2): 153–67.

Selten, Reinhard. 2001. “What Is Bounded Rationality?” In Bounded Rationality: The Adaptive Toolbox, ed. Gerd Gigerenzer and Reinhard Selten. Cambridge: MIT Press, 13–36.

Sen, Amartya. 1993. “Internal Consistency of Choice.” Econometrica, 61(3): 495–521.

Shafir, Eldar B., and Robyn A. Leboeuf. 2002. “Rationality.” Annual Review of Psychology, 53(1): 491–517.

Shafir, Eldar B., Daniel N. Osherson, and Edward E. Smith. 1993. “The Advantage Model: A Comparative Theory of Evaluation and Choice Under Risk.” Organizational Behavior and Human Decision Processes, 55(3): 325–78.

Simon, Herbert A. 1956. “Rational Choice and the Structure of the Environment.” Psychological Review, 63(2): 129–38.

Simon, Herbert A. 1983. “Alternative Visions of Rationality.” In Reason in Human Affairs, ed. Herbert A. Simon. Stanford: Stanford University Press, 7–35.

Simonson, Itamar. 1989. “Choice Based on Reasons: The Case of Attraction and Compromise Effects.” Journal of Consumer Research, 16(2): 158–74.

Sjoberg, Lennart. 1975. “Uncertainty of Comparative Judgments and Multidimensional Structure.” Multivariate Behavioral Research, 10(2): 207–18.

Sjoberg, Lennart. 1977. “Choice Frequency and Similarity.” Scandinavian Journal of Psychology, 18(2): 103–15.

Sjoberg, Lennart, and Dora Capozza. 1975. “Preference and Cognitive Structures of Italian Political Parties.” Giornale Italiano di Psicologia, 2(3): 391–402.

Slovic, Paul, and Sarah Lichtenstein. 1983. “Preference Reversals: A Broader Perspective.” American Economic Review, 73(4): 596–605.

Starmer, Chris. 2000. “Developments in Non-Expected Utility Theory: The Hunt for a Descriptive Theory of Choice under Risk.” Journal of Economic Literature, 38(2): 332–82.

Stern, Eliahu, and Harry W. Richardson. 2005. “Behavioural Modelling of Road Users: Current Research and Future Needs.” Transport Reviews, 25(2): 159–80.

Thorngate, Warren. 1980. “Efficient Decision Heuristics.” Behavioral Science, 25(3): 219–25.

Thurstone, Louis L. 1927. “A Law of Comparative Judgment.” Psychological Review, 34(4): 273–86.

Thurstone, Louis L. 1959. The Measurement of Values. Chicago: University of Chicago Press.

Train, Kenneth E. 2003. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press.

Tversky, Amos. 1969. “Intransitivity of Preferences.” Psychological Review, 76(1): 31–48.

Tversky, Amos. 1972a. “Choice by Elimination.” Journal of Mathematical Psychology, 9(4): 341–67.

Tversky, Amos. 1972b. “Elimination by Aspects: A Theory of Choice.” Psychological Review, 79(4): 281–99.

Tversky, Amos, and J. Edward Russo. 1969. “Substitutability and Similarity in Binary Choices.” Journal of Mathematical Psychology, 6(1): 1–12.

Tversky, Amos, and Itamar Simonson. 1993. “Context-Dependent Preferences.” Management Science, 39(10): 1179–89.

Tversky, Amos, Paul Slovic, and Daniel Kahneman. 1990. “The Causes of Preference Reversal.” American Economic Review, 80(1): 204–17.

Usher, Marius, and James L. McClelland. 2004. “Loss Aversion and Inhibition in Dynamical Models of Multialternative Choice.” Psychological Review, 111(3): 757–69.

Wakker, Peter, Ido Erev, and Elke U. Weber. 1994. “Comonotonic Independence: The Critical Test between Classical and Rank-Dependent Utility Theories.” Journal of Risk and Uncertainty, 9(3): 195–230.

Wedell, Douglas H. 1991. “Distinguishing Among Models of Contextually Induced Preference Reversals.” Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(4): 767–78.

Wedell, Douglas H., and Jonathan C. Pettibone. 1996. “Using Judgments to Understand Decoy Effects in Choice.” Organizational Behavior and Human Decision Processes, 67(3): 326–44.

Wernerfelt, Birger. 1995. “A Rational Reconstruction of the Compromise Effect: Using Market Data to Infer Utilities.” Journal of Consumer Research, 21(4): 627–33.

von Winterfeldt, Detlof, and Ward Edwards. 1986. Decision Analysis and Behavioral Research. Cambridge: Cambridge University Press.

Yaari, Menahem E. 1987. “The Dual Theory of Choice under Risk.” Econometrica, 55(1): 95–115.
