Nonlinear Pricing in Village Economies - Stanford University · PDF fileNonlinear Pricing in...
-
Upload
duongkhanh -
Category
Documents
-
view
216 -
download
1
Transcript of Nonlinear Pricing in Village Economies - Stanford University · PDF fileNonlinear Pricing in...
Nonlinear Pricing in Village Economies∗
Orazio Attanasio† Elena Pastorino‡
June 2015
Abstract
We propose a model of price discrimination to account for the nonlinearity of unit prices of basicfood items in developing countries. We allow consumers to differ in marginal willingness and absoluteability to pay for a good, incorporate consumers’ subsistence constraints, and model outside optionsfrom purchasing a good that depend on consumers’ preferences, like self-production or access to othermarkets. We obtain a simple characterization of optimal nonlinear pricing, and show that nonlinearpricing leads to higher levels of consumption, lower marginal prices, and higher consumer surplusthan implied by the standard nonlinear pricing model. The model is nonparametrically identified un-der common assumptions. We derive nonparametric and semiparametric estimators of the model’sprimitives that can be easily implemented using individual-level data commonly available for bene-ficiaries of conditional cash transfer programs in developing countries. The model well accounts forthe data, and at the estimated primitives, most consumers benefit from nonlinear pricing comparedto linear pricing. We also find that consumption distortions are less pronounced for individuals whopurchase small quantities, despite the steep decline in unit prices with quantity purchased, contrary towhat implied by the standard model. We show that when sellers have market power, which we detectin our data, policies like cash transfers that affect households’ ability to pay can effectively strengthenthe incentive for sellers to price discriminate across consumers and so give rise to asymmetric pricechanges for low and high quantities, which can exacerbate the consumption distortions typically as-sociated with nonlinear pricing. We find evidence of these patterns of price changes in our data inresponse to transfers. These results confirm the importance of our proposed extensions of the standardnonlinear pricing model in evaluating the distributional effects of nonlinear pricing.
Keywords: Nonlinear pricing; Budget Constraints; Cash Transfers; Structural Estimation
JEL Codes: D82, I38, J18, L11, O12, O13, O15
∗We thank Aureo de Paola, V.V. Chari, Tom Holmes, Bruno Jullien, Toru Kitagawa, Aprajit Mahajan, Costas Meghir,Holger Sieg, Kjetil Storesletten, and Joel Waldfogel for helpful comments. Eugenia Gonzalez Aguado, Sanghmitra Gautam,and Sergio Salgado Ibanez have provided excellent research assistance.†University College London and Institute for Fiscal Studies.‡University of Minnesota and Federal Reserve Bank of Minneapolis.
1 IntroductionQuantity discounts in the form of unit prices declining with quantity appear to be pervasive in developing
countries. McIntosh (2003), for instance, documents differences in the prices of drinking water paid by
poor households in the Philippines, whereas Fabricant et al. (1999) and Pannarunothai and Mills (1997)
document differences in the price of health care and services in Thailand and Sierra Leone. Attanasio and
Frayne (2006) show evidence that consumers purchasing basic staples in Colombian villages face price
schedules rather than linear prices: within a village, households buying larger quantities pay substantially
lower prices for homogeneous commodities. Similar patterns are observed in rural Mexico, as documented
here. This evidence is commonly interpreted as a symptom of the fact that “the poor pay more” than rich
households for the same goods. Since in these countries the poor are close to subsistence, nonlinear prices
are usually considered to have especially undesirable distributional consequences.
This view is consistent with the intuition from the standard model of nonlinear pricing (see Maskin and
Riley (1984)), which assumes that consumers only differ in their marginal willingness to pay for a good.
This model explains quantity discounts as arising from a seller’s incentive to screen consumers through
the offer of multiple price and quantity combinations. Its main insight is that the ability of a seller to
discriminate across consumers though nonlinear pricing not only implies that the consumption of (nearly)
all consumers is depressed relative to first best but, crucially, that consumption distortions are more severe
for consumers who purchase the smallest quantities, typically the poorest ones in developing countries.
The standard model assumes that consumers are unconstrained in their ability to pay for a good and
have access to similar alternatives to trading with a particular seller or in a particular market. This frame-
work then naturally explains the dispersion in unit prices for goods that absorb a small fraction of con-
sumers’ incomes, in settings in which consumers have available similar outside consumption opportuni-
ties.1 As such, the standard model abstracts from key features of food markets in developing countries.
In these countries, households typically face subsistence constraints on the consumption of basic sta-
ples, spend a large fraction of their income on food, and often have available alternative heterogeneous
consumption possibilities, through self-production or highly-subsidized government stores, that are un-
common in developed countries.
To rationalize the occurrence of quantity discounts in these settings, we propose an alternative model of
price discrimination that explicitly formalizes households’ subsistence constraints, and allows households
to differ in both their marginal willingness to pay and absolute ability to pay for a good. The model also
incorporates a rich set of alternatives to purchasing in a particular market that differ across consumers. We
show that in these settings, nonlinear pricing typically has distributional effects that run counter standard
intuition, as it leads to consumption below and above first best in a given population of consumers, yielding
levels of consumer surplus in general higher than those implied by the standard model.
1Formally, consumers are assumed to be able to pay more than their reservation prices for a good. See Che and Gale (2000).
1
The model we propose can be structurally identified from information in a given market on the distri-
bution of prices and quantities. Furthermore, it is possible to distinguish different versions of the model
that have very different welfare implications. When we bring the model to the data from a sample of
villages in rural Mexico, we find that the model fits extremely well the observed differences in prices and
quantities within and across villages, and that the standard model often used in the literature to explain
quantity discounts is strongly rejected.
Our analysis starts with the observation that when facing subsistence constraints, consumers can be
formally thought as facing an additional budget constraint just on the expenditure on a seller’s good. That
is, in the language of the literature on nonlinear pricing and auctions, consumers are budget-constrained
and their constraints depend on their preferences and incomes. We show that, even when consumers differ
in both their marginal willingness and absolute ability to pay, a model with budget-constrained consumers
maps into a class of nonlinear pricing models with so-called countervailing incentives, in which consumers
have heterogeneous reservation utilities (see Jullien (2000)). By formally exploiting this equivalence
across models, we then prove that in the richer environment we consider, a simple characterization of
nonlinear pricing can be obtained. In such an environment, nonlinear pricing has more desirable welfare
properties than those implied by the standard model, as it leads to higher levels of consumption, lower
marginal prices, and also higher consumer surplus. Intuitively, since a budget credibly conveys to a seller
that there is maximum price a consumer will pay, such a constraint effectively limits a seller’s ability to
extract consumer surplus. In these situations, nonlinear pricing might be preferable to linear pricing even
if sellers have market power.
The reason for why consumption is higher, and therefore marginal prices are lower, in our framework
than predicted by the standard model is that, unlike in the standard model, in our framework an optimal
menu entails overprovision as well as underprovision of quantity relative to first best. The intuition behind
this result can be explained simply. In the standard nonlinear pricing model, in order to discriminate
across consumers, a seller just needs to prevent consumers from effectively understating their preferences
and purchasing less than their taste for a good would imply. In our model, instead, a consumer may
have an incentive to overstate her preference. The reason for this reversal of the logic of the standard
model is that in our model a consumer who is ‘marginal’, that is, indifferent between purchasing and not
purchasing from the seller, may be one with an intermediate or even high preference for the good—in the
standard model, the only marginal consumer is the one with the lowest valuation for a good. This situation
occurs when consumers with intermediate or high valuations for a good have also better outside options
or face tighter subsistence constraints. If so, then a seller needs to offer such a consumer an attractive or
inexpensive enough price and quantity combination to induce her to buy. But then, the price and quantity
combination meant for this marginal consumer can also be attractive to lower valuation consumers: by
offering large enough quantities, even above first best, a seller can then discourage these lower valuation
consumers from purchasing quantities intended for higher valuation ones.
2
As for why consumer surplus may be higher than implied by the standard model, note that a budget
constraint on the expenditure on a seller’s good effectively constrains the price a seller can ask. For
similar outside options across models, the combination of higher quantities, lower marginal prices, and this
expenditure limit naturally gives rise to higher levels of utility for consumers than predicted by the standard
model. Despite the scope for sellers to extract consumer surplus through quantity-specific prices, nonlinear
pricing has also an obvious efficiency-enhancing dimension to it, especially salient when consumers are
differentially constrained in their access to a market. By allowing a seller to tailor prices and quantities
to a consumer’s marginal willingness to pay, nonlinear pricing enables a seller to trade at a profit with
consumers whose subsistence constraints and outside options would lead them to demand especially large
quantities under linear pricing. Such consumers would be excluded from the market under linear pricing,
and are therefore better off under nonlinear pricing.
By a similar argument, ignoring consumers’ budget constraints, and the endogenous price response
to changes in consumers’ ability to pay, can lead to an incorrect assessment of the impact of common
food subsidization policies, like conditional cash transfers, aimed at expanding households’ consumption
possibilities. Intuitively, by increasing consumers’ budgets, these policies not only stimulate consumption
but also provide an incentive for a seller to take advantage of the greater ability to pay of consumers. In
general, targeted cash transfers that are more generous for poorer households, typically the purchasers
of the smallest quantities, lead not just to a greater demand for a seller’s good but also to higher prices.
Depending on the distribution of outside options across consumers, transfers can then give rise to increases
in unit prices for low quantities but decreases in unit prices for high quantities, thus increasing the degree
of price discrimination and exacerbating some of the associated consumption distortions.
We prove that our model is nonparametrically identified under assumptions standard in the empirical
auction and nonlinear pricing literature (see Guerre, Perrigne, and Vuong (2000) and Perrigne and Vuong
(2010)) and can be non- or semiparametrically estimated using household-level expenditure data, typically
available for beneficiaries of conditional cash transfer programs in developing countries. Intuitively, our
identification and estimation strategy exploits the joint information about the primitives of both buyers
and sellers contained in the price schedule and in the distribution of quantity purchases in a village.
The model fits the data remarkably well. The standard model, in contrast, is robustly rejected in nearly
all villages. Estimates of the model primitives conform to all restrictions of the model not imposed in
estimation, like the monotonicity of hazards of the distribution of consumers’ unobserved willingness
to pay, and the inverse relationship between marginal utility and quantity consumed. We find that the
distribution of consumers’ tastes, and so marginal willingness to pay, is much more dispersed than the
distribution of observed quantities in each village, which implies a potentially strong incentive for sellers
to price discriminate across consumers. We also estimate a large degree of curvature in utility, which
suggests the potential for rich distributional implications of nonlinear pricing.
We estimate that sellers indeed have market power in all the villages in our sample, and exercise it
3
by distorting the quantities offered to most households. Interestingly, we find that nonlinear pricing leads
to both underconsumption for consumers of the smallest quantities and overconsumption for consumers
of the largest quantities but distortions are more pronounced for the purchasers of the largest quantities,
which empirically correspond to the relatively richer households. Although quantity distortions are of
smaller magnitude for consumers with lower marginal willingness to pay, price distortions are for them
more severe so that, on balance, purchasers of the smallest quantities would benefit more from a more
competitive market for rice.
When comparing welfare under observed nonlinear pricing to a counterfactual scenario in which
sellers charge linear prices, we find that for most consumers—except those purchasing the smallest
quantities—are better off under nonlinear pricing. For consumers who prefer nonlinear pricing, the greater
quantities and lower marginal prices that nonlinear pricing implies, relative to linear pricing, give rise not
just to higher levels of consumer surplus but also of social surplus. Consumer and social surplus for these
consumers are higher under nonlinear pricing partly due to the higher degree of market participation than
nonlinear pricing generates. Indeed, as consistent with our model, we find that consumers with both small
and high reservation utilities or budgets would be excluded from the market under linear pricing. We also
find that the standard nonlinear pricing model would systematically overestimate the gain from nonlin-
ear pricing: by implying, by construction, much lower reservations utilities than our model, the standard
model predicts much higher linear prices and thus lower consumer surplus under linear pricing.
We finally assess the effect of conditional cash transfers on prices in the villages in our data. Cash
transfers, as in the case of the Mexican Progresa program implemented in some of the villages in our
sample, have been increasingly popular in recent years, both in Latin America and in other developing
countries. A few studies have analyzed the impact of transfers on the price of commodities. Hoddinott
et al. (2000), for instance, study the impact of Progresa on household consumption and, in the process,
examine the price effect of Progresa concluding that: “There is no evidence that Progresa communities
paid higher food prices than similar control communities” (p. 33). Analogously, Angelucci and De Giorgi
(2009), when assessing the impact of Progresa on the consumption of non-eligible households, consider
the possibility that their results are mediated by changes in local prices but dismiss this possibility based
on their empirical analysis. The consensus, therefore, seems to be that Progresa did not have noticeable
effects on local prices. Analogous evidence has been documented by Cunha et al. (2014), who study
a large food assistance program for poor households in Mexico, the Programa de Apoyo Alimentario
(PAL), which was supposed to complement Progresa in that it was targeted to extremely marginalized and
isolated communities. That program was evaluated through a four-arm randomized control trial, in which,
in addition to a control group, the 200 villages in the evaluation sample were divided into 3 other groups,
one receiving an in-kind transfer and the other two receiving a monetary transfer. They find that relative
to the control group, in-kind transfers gave rise to a 4 percent fall in prices whereas cash transfers gave
rise to a positive but negligible price increase.
4
All of these studies focus on changes in average (unit) prices associated with the introduction of cash
or in-kind transfers but fail to account for the nonlinearity of unit prices when assessing the impact of
transfers on prices. Our model implies that income transfers to consumers would induce a seller to change
his price schedule altogether. In line with this prediction, we find evidence of a significant effect of
transfers on prices: in response to transfers, the schedule of unit prices becomes significantly steeper with
unit prices increasing at low quantities but decreasing at low quantities. Since the resulting average effect
of cash transfers on unit prices is much less pronounced, we also show that ignoring the variability of unit
prices with quantity leads to a much smaller estimate of the price effect of transfers, as consistent with
the empirical specifications and results in the literature. On the contrary, by increasing consumers’ ability
to pay, cash transfers also effectively increase a seller’s market power, the degree of price discrimination,
and can potentially lead to much lower consumer surplus gains than typically inferred.
2 Quantity Discounts: the Case of MexicoAs mentioned in the introduction, quantity discounts are common in several markets in developing coun-
tries. Attanasio and Frayne (2006), for instance, estimate the supply schedule for several basic food
staples, including rice, carrots and beans in Colombian villages and show substantial quantity discounts.
For instance, they find that the elasticity of the price of rice to the quantity bought is as large as −0.11
in their preferred specification. They estimate even larger discounts (in absolute value) for other specifi-
cations and for commodities such as beans or carrots. In what follows, we use a large dataset from rural
communities in Mexico to study similar patterns and test the model we propose to explain them.
The dataset we use, described in detail in the Supplementary Appendix, was collected to evaluate the
impact of the conditional cash program Progresa, which was started in 1997 under the Zedillo administra-
tion. The program consists of cash transfers to eligible families with children, conditional on behavior like
attendance to classes, taking young children to health centers, and sending school-aged children to school.
For participating households, on average, grants amount to 25% of their income, therefore constituting a
substantial fraction of it. Within the localities targeted by the first wave of the program expansion that
occurred between 1998 and 2000, about 70% of households qualified as eligible for the program.
The Mexican government decided to evaluate the program and its expansion using a randomized con-
trolled trial. In 1997, the program selected 506 localities in 7 states, each belonging to one of 191 larger
administrative units (municipalities), to be included in the evaluation sample. Of these localities, 320, ran-
domly chosen, were assigned to early treatment, which started in the middle of 1998, and the remaining
186 were assigned to late treatment, which started in December 1999. The households in these localities
were followed for several periods. In what follows, we use the wave collected in May 1999.
The evaluation data, which has been used extensively in recent years, are remarkable for several rea-
sons. First, it provides a census of 506 villages in that all households in the localities are surveyed. Second,
the data is very rich and exhaustive. For the purpose of our paper, we note that it contains information,
5
for each of 36 food commodities, both on the quantity consumed, on the amount purchased, and on the
outlay that such purchase involved. It is therefore possible to compute, for each purchase observed in the
village, unit values for many commodities. Third, the data contain information on a variety of locality-
level variables, including some prices collected in local stores. This information can be used to check
the reliability of the unit values one can construct from the household survey. Attanasio et al. (2009)
found that unit values approximate well local prices, and these, in turn, match reasonably well data from
national sources on prices. The locality dataset also contains some information on the market structure in
each village. Finally, the fact that the program was randomly implemented in a subset of the villages—at
least for the first waves, including the one we are using—is useful as it introduces random and substantial
variation in the amount of resources available to some households, which can be used to study some of
the implications of the models we analyze.
Having being targeted by Progresa, the villages included in the evaluation survey are small, remote,
and “marginalized,” according to an index used by the Mexican government to target social programs.
The average number of households in a locality is just over 50. These localities are not, however, the
most marginalized and poorest villages in Mexico, as to be eligible for Progresa their inhabitants had to
have access to some basic infrastructure such as health centres and schools. On average, about 78% of the
households of the villages in the evaluation survey are eligible for the program, which is designed for the
poorest sectors of Mexican society. Obviously, beneficiary households are “poorer” than non-beneficiary
ones, as can be verified in a variety of dimensions, from the ownership of durables to the fraction of total
consumption devoted to food. However, even non-eligible households in the villages we study are quite
poor. In the overall sample, including eligible and non-eligible households, food accounts for nearly 70%
of their total budget. This share is higher for eligible households but even among non-eligible households,
food constitutes a substantial part of their expenditure. Whilst most of the households in our sample are
poor, it should also be pointed out that there is a substantial amount of heterogeneity in our sample as
different households differ in the degree of poverty. There is also substantial variability across villages in
the level of poverty, reflected in the significant variation of eligibility rates across localities.
One of the main reasons for our use of the May 1999 wave of the survey is the richness of the con-
sumption and expenditure data it contains, in addition to a number of demographic and socioeconomic
variables. In the case of food and drink, the survey contains information on weekly expenditure and quan-
tity purchased for 36 goods, together with the quantity consumed and home produced, as mentioned. The
food items recorded include fruits and vegetables, grains and pulses, and meat and animal products. The
list is supposed to be exhaustive of the foods consumed by households.
Given that the survey contains information on quantities purchased and consumed for each recorded
item as well as total household expenditure on each item, it is possible to determine unit values for each
food item as measured by the ratio of expenditure to quantity consumed. From now on, we term unit
6
values as prices.2 Given its high degree of storability, relative homogeneity, and the large frequency
of households purchasing it—on average 86% of observations overall and 82% in our sample—in our
empirical analysis we focus on rice. The evidence on quantity discounts, however, can be obtained for a
variety of commodities.
For a sense of the presence of quantity discounts and their importance, we perform the exercise in
Attanasio and Frayne (2006) on our data for a variety of different commodities. In particular, we try to
identify the price schedule faced by consumers by regressing the log of price of a commodity on the log
of quantity. To take into account that quantity is endogenously determined and that the estimates of the
coefficient on such a variable could be affected by the presence of measurement error (and by the fact
that “prices” are obtained by dividing the total value by the quantity purchased, which then appears on the
right hand side of the equation), we use an instrumental variable approach as Attanasio and Frayne (2006),
where variables that determine demand but are not likely to affect supply, such as total household income
and demographic composition, are used to instrument quantity. In Table 1, we report results for rice,
beans, sugar, tomatoes and tortillas. The top panel of the Table contains results obtained on all available
data, whereas in the bottom one we focus on municipalities with at least 100 households. The number of
observations varies across commodities because of the different frequency of transactions for which we
have information. The elasticity of prices to quantities along the supply schedules we identify is largest
for rice, −0.2. It is also high for tomatoes (−0.15) and tortillas (−0.16); it is smaller (in absolute value)
for beans and sugar but statistically different from zero. We note that the estimates we obtain do not vary
when we restrict attention to a smaller set of municipalities.
The models we study below relate the shape of the price schedule to the distribution of quantities in
a given market. Therefore, to perform the empirical exercise we propose, we need to define a market.
Ideally, one would like to consider a relatively isolated market in which one or any number of sellers
face heterogeneous buyers and enjoy some degree of market power. It would then be natural to use the
“locality” as the unit of analysis. As our strategy will rely on estimating the distribution of quantities and
prices within each “village,” given the size of the localities and the number of transactions we observe,
using localities results in too few observed transactions. We therefore decided to use the “municipality” as
the unit of analysis. Each municipality is composed of several localities, not all of which are included in
the survey. The municipality is the administrative unit and, therefore, the assumption that it constitutes the
relevant market is not too far fetched. As mentioned above, the 506 localities belong to 191 municipalities.
In all but one of these municipalities, we observe purchases of rice. In the empirical exercise below, we
focus on the market for rice, as it is one of the commodities with the lowest number of zeros in the sample
we use. To minimize the impact of measurement error, we focus on rice purchases up to 3 kilos at a
median price not higher than 16 pesos. These restrictions imply the loss of very few observations. In
2For measurement issues involved in the construction of unit values, see Attanasio et al. (2009). The authors show that unitvalues are to a large extent validated by the available information on store-level prices in each village.
7
Table 1: Price ScheduleFull Sample
Rice Beans Sugar Tomatoes TortillasLn(quantity) −0.204 −0.093 −0.070 −0.157 −0.163
(0.026) (0.013) (0.016) (0.019) (0.028)Constant 1.945 2.334 1.806 1.721 1.535
(0.011) (0.009) (0.008) (0.008) (0.064)
Observations 13301 19499 20476 20223 5277R2 0.164 0.067 0.047 0.103 0.125
Restricted Sample: Villages with at Least 100 HouseholdsRice Beans Sugar Tomatoes Tortillas
Ln(quantity) −0.207 −0.095 −0.060 −0.143 −0.160(0.030) (0.015) (0.019) (0.022) (0.031)
Constant 1.947 2.33 1.803 1.731 1.539(0.013) (0.0106) (0.008) (0.009) (0.071)
Observations 10622 15414 16118 15957 4185R2 0.166 0.070 0.040 0.088 0.133
Note: instrumental variable estimates (instruments: family compositionvariables and age); clustered standard errors at village level in parentheses
this sample, households consume between 0.1 to 3 kilos of rice, with a mean of 0.80 kilos and a standard
deviation of 0.44 kilos, at a unit price ranging from 0.67 to 16 pesos per kilo, with a mean of 7.64 pesos
and a standard deviation of 2.26 pesos.
The leading model we present in the next section considers a seller in a fairly closed market facing
an heterogeneous population of consumers. As we discuss below, this model can account for virtually
any degree of competition. Having said that, it is interesting to consider the typical market structure in
the villages in our sample. Of the 497 villages in the data set for which we have this information, 4 have
a mercado publico (public market), 108 have a tienda Diconsa (government regulated store), 5 have an
almacen or botega de abasto (supply warehouse), 167 have a tiendas de abarrotes (grocery store), 9 have
a tianguis (an open air market or bazaar that is traditionally held on certain market days in a town or city
neighborhood), 4 have a mercado regional (regional market), 5 have a mercado ambulante (street market),
21 have a mercado sobre ruedas (a street market that is usually installed outdoors in one or more specific
days of the week), and 246 have a comercio casero (home trade). We therefore conclude that supply is
indeed highly concentrated on a handful of stores of similar type.
3 Models of Price DiscriminationAs just reviewed, prices per unit are nonlinear and imply quantity discounts in the villages in our data.
A simple model that generates these features is that of Maskin and Riley (1984) of price discrimination,
which assumes that a seller screens consumers by their marginal willingness to pay according to the
quantities they purchase. This model, however, can be too restrictive for the context we study because
8
it assumes that consumers have the same reservation utility. Therefore, we build on the model of Jullien
(2000) who assumes that consumers differ not only in their willingness to pay for a good but also in their
reservation utility in order to capture flexibly the value of consumption possibilities alternative to trading
with a particular seller.
Suitable interpretations of consumers’ reservation utilities can then accommodate different environ-
ments of interest in our context. A particularly relevant case arises when consumers face subsistence
constraints in consumption, which give rise to a budget constraint on the expenditure on a seller’s good.
Models with these types of budget constraints are known to be intractable (see Che and Gale (2000)): the
optimal pricing schedule in these settings is only known for special cases, when, for instance, utility is
linear in consumption (see Che and Gale (2000)) or the budget is identical across consumers so that a
seller has no incentive to price discriminate across all consumers (see Thomas (2002)). Key to our ap-
proach to characterizing nonlinear pricing in the presence of these budget constraints is that under simple
conditions, a model with heterogeneous reservation utilities is equivalent to a model with heterogeneous
budget constraints. Establishing this equivalence allows us to adapt Jullien’s results to such a setting.
The model we propose captures a variety of situations that are likely to be relevant for our application.
First, consumers in our data have access to a wide range of outside options: households in a village may
purchase a good from sellers in other villages, even sellers whose behavior is not determined by the market
as is the case for government-regulated Diconsa stores; they may have the ability to produce a good as
an alternative to purchasing it; or they may receive a good from relatives, friends, or the government as a
transfer. Second, while we analyze the problem of a single seller, since a consumer’s reservation utility
can be interpreted as the indirect utility obtained when purchasing from other sellers, the model accounts
for varying degrees of market power, ranging from monopoly to monopolistic competition to near perfect
competition. Finally, consumers may have preferences for other goods, some of which may be offered
by a same seller. We assume that a consumer’s utility is separable in the quantity of each such good and
that a seller prices his goods independently—see Stole (2006) for a discussion of situations in which this
component pricing is indeed optimal.3
As common in the nonlinear pricing literature, our framework implicitly excludes the possibility of
collusion among consumers, for instance, through resale. Interestingly, we do not observe resale among
consumers in our data. A natural question is why consumers do not form coalitions, buy bulk, and resell
the quantity purchased from the seller among themselves at linear prices. A possible answer is that our
context is that of small, isolated, and quite geographically dispersed communities in rural Mexico that
largely function as informal economies. Thus, it might be difficult for consumers to engage in the type
of contracts that would sustain resale. Conceptually, such a situation can be translated into a simple
assumption on imperfections in contracting between consumers similar to the assumption on imperfections
3Near competition obtains when every consumer’s reservation utility is arbitrarily close to first-best utility and marginalprices are close to marginal cost.
9
in contracting between sellers and consumers usually maintained in models of nonlinear pricing. 4
3.1 A Model With Heterogeneous Outside Options
We model a village as a market in which consumers (households) and a seller exchange a quantity q ∈[0,∞) of a good—rice in our case—for the monetary transfer t. Consumers’ preferences depend on a
characteristic, θ, continuously distributed across them with support [θ, θ], θ > 0, cumulative distribution
function F (θ), and probability density function f(θ), positive for θ ∈ (θ, θ). We assume that the seller
observes θ but that prices contingent on consumers’ characteristics are not enforceable. For convenience
only, we assume equivalently that the seller does not observe θ and rely on results from the literature on
mechanism design with private information in our derivations. We believe the first interpretation better
accords to our data. The seller, however, is allowed to post a nonlinear price schedule for all consumers
entailing possibly different unit prices for each offered quantity.
Upon trade, a consumer obtains utility v(θ, q) − t, with v(·, ·) twice continuously differentiable,
vq(θ, q) > 0, and vqq(θ, q) ≤ 0, whereas the seller obtains profit t − c(q), where c(·) is the cost of
producing q and is assumed to be twice continuously differentiable. We normalize the seller’s reservation
profit to zero. Let s(θ, q) = v(θ, q) − c(q) denote the social surplus from trade. We define the first-best
quantity, qFB(θ), as the one that maximizes social surplus when the consumer’s type is θ.
We assume, as standard, that vθq(θ, q) > 0 for q positive so that consumers can be ranked according
to their marginal utility from the good. We also maintain that sq(θ, q)/vθq(θ, q) decreases with q, which
ensures that a seller’s problem admits a unique solution and that first-order conditions are necessary and
sufficient to characterize its solution. This assumption plays the same role as the assumptions that s(θ, ·)is concave in q and vθ(θ, ·) is convex in q in the standard nonlinear pricing model. A consumer of type θ is
said to participate when the consumer purchases a single quantity q(θ) with probability one—the restric-
tion to deterministic contracts is without loss here. Since we focus on situations in which all consumers
purchase from the seller, q = 0 is interpreted as the limit when the contracted quantity becomes small.
Note that with exclusion, the optimal contract for types who participate would be the same as the one
characterized here.
By the revelation principle, a contract can be summarized by a menu {t(θ), q(θ)} such that the best
choice within the menu for a consumer of type θ is to choose the price t(θ) and the quantity q(θ), that is,
the menu is incentive compatible. Let u(θ) = v(θ, q(θ))− t(θ) denote the utility of a consumer of type θ
when purchasing from the seller. Define the reservation utility, u(θ), to be the consumer’s utility when the
consumer does not purchase from the seller. The seller’s optimal menu is chosen to maximize expected
4Formally, if enforcement or transaction costs, like commuting within or across villages, for each consumer were of theorder of the surplus from trade net of consumer’s and seller’s outside options or higher, then not even a coalition of all consumerscould achieve higher utility for any member than the utility each member obtains by trading with a nonlinear pricing seller.
10
profits subject to incentive compatibility and participation constraints, that is,
(IR problem) max{t(θ),q(θ)}
∫ θ
θ
[t(θ)− c(q(θ))]f(θ)dθ s.t.
(IC) v(θ, q(θ))− t(θ) ≥ v(θ, q(θ′))− t(θ′) for any θ, θ′
(IR) u(θ) ≥ u(θ) for any θ.
We refer to this model in which the seller’s constraints are IC and IR as the IR model, and define an
allocation {u(θ), q(θ)} to be implementable if it satisfies the IC and IR constraints. Clearly, the IC con-
straint is satisfied for a consumer of type θ if choosing q(θ) for the price of t(θ) maximizes the left-hand
side of the constraint. Taking first-order conditions, this requires vq(θ, q(θ))q′(θ) = t′(θ) or, equivalently,
u′(θ) = vθ(θ, q(θ)). Since vθq(θ, q) > 0, an allocation is incentive compatible, as usual, if, and only if, it
is locally incentive compatible in that u′(θ) = vθ(θ, q(θ)), the schedule q(θ) is weakly increasing (a.e.),
and the utility u(θ) is absolutely continuous.
Jullien (2000) shows that under three assumptions on the distribution of consumer types and utility,
homogeneity, potential separation, and full participation, there exists a unique optimal solution with full
participation to the seller’s problem, which is characterized by the first-order condition
vq(θ, q(θ))− c′(q(θ)) =γ(θ)− F (θ)
f(θ)vθq(θ, q(θ)) (1)
for each type, together with the complementary slackness condition associated with the IR constraints,
∫ θ
θ
[u(θ)− u(θ)]dγ(θ) = 0, (2)
with q(θ) weakly increasing.5 Note γ(θ) =∫ θθdγ(x), the cumulative multiplier on the IR constraints, has
the properties of a cumulative distribution function, that is, it is nonnegative, increasing, and γ(θ) = 1.
The integral in the definition of γ(θ) is interpreted as accommodating not just discrete and continuous
distributions but also mixed discrete-continuous ones. That is, this formulation covers the possibility
that the IR constraints bind at isolated points. For instance, in the standard nonlinear pricing model,
consumers’ reservation utility is assumed to be independent of θ, so the IR constraints simplify to u(θ) ≥ u
and bind only for the lowest type. Thus, γ(θ) = 1 for all types, and γ(θ) has a mass point at θ.6 By writing
the seller’s first-order condition in explicit form as q(θ) = l(γ, θ) for an arbitrary cumulative multiplier γ,
it follows that the solution to the seller’s problem can be expressed as q(θ) = l(γ(θ), θ) with associated
price schedule t(θ) = v(θ, q(θ))− u(θ). See the Supplementary Appendix for details.
5Homogeneity requires a menu inducing full participation with binding IR constraints for all types be incentive compatible.6It is understood that q(θ) is evaluated taking the left-limit at jump points.
11
We interpret the reservation utility, u(θ), as the value of purchasing from another seller or producing
the good at home. Note that by varying the level of the reservation utility, u(θ), the model can accommo-
date very different degrees of market power among sellers, ranging from no market power to any degree
of monopoly power. For instance, when the reservation utility equals the social surplus under first best for
all θ, that is, u(θ) = v(θ, qFB(θ))−c(qFB(θ)), the solution to the seller’s problem implies γ(θ) = F (θ)
for all types. Accordingly, consumers purchase from the seller first-best quantities at the first-best nonlin-
ear prices of c(qFB(θ)). As the reservation utility is lowered, a seller’s profit correspondingly increases,
allowing the model to capture any degree of monopoly power. The ability of this model to approximate ar-
bitrarily well perfect competition is an important advantage over the standard model for the measurement
exercises of later sections.
The characterization of the optimal menu depends crucially on the shape of the reservation utility.
As discussed in Jullien (2000), when the degree of convexity of u(θ) is high or low, only two types of
menus are optimal, referred to as the highly-convex and weakly-convex cases, which are characterized by
opposite patterns of consumption distortions relative to first best if sellers have market power. We provide
an example of these two cases in Figure 1 with power-law distributed types, v(θ, q) = θν(q), where
ν(q) = (1− d)[aq/(1− d) + b]d/d is a three-parameter HARA function, and constant marginal cost. We
let θ = 1, θ = 2.2, a = c = 1, b = 0, and d = 1/2. See the Supplementary Appendix for details.
Highly-Convex Case. This case arises when the participation constraint binds at isolated points. Since
q(θ) is continuous, γ(θ) can have mass points only at θ or θ, and so the IR constraint can bind at isolated
points only for these extreme types. When so, γ(θ) = γ for all interior types and the optimal quantity is
q(θ) = l(γ, θ). If γ = 0, then the participation constraint binds only for the highest type. As γ ≤ F (θ)
in this case, (almost) all types consume quantities above first best. If, instead, γ = 1, then the constraint
binds only for the lowest type, and the optimal menu coincides with that of the standard model. Since
γ ≥ F (θ) in this case, (almost) all types consume quantities below first best. The most interesting case
arises when 0 < γ < 1 so that the constraint binds for both the lowest and highest types. Since F (θ)
increases from 0 to 1, in this case F (θ) starts below γ, crosses γ at some type θHC , and then lies strictly
above it. By (1), consumer types below θHC consume quantities below first best whereas types above θHCconsume quantities above first best.
Weakly-Convex Case. This case arises when the participation constraint binds for an interval of types,
say, [θ1, θ2]. Let q(θ) denote quantity that implements the reservation utility profile in that u′(θ) =
vθ(θ, q(θ)) for each θ. Then, the optimal quantity is given by q(θ) = l(0, θ) for θ ≤ θ1, q(θ) = q(θ)
for θ1 ≤ θ < θ2, and q(θ) = l(1, θ) for θ ≥ θ2. The cumulative multiplier equals zero up to θ1, increases
up to one between θ1 and θ2, and equals one above θ2. Thus, there exists a type θWC in [θ1, θ2] at which
γ(θ) crosses F (θ): consumer types below θWC consume quantities above first best whereas types above
θWC consume quantities below first best.
12
Figure 1: Highly-Convex and Weakly-Convex Case0
.51
1.5
Util
ity a
nd R
eser
vatio
n U
tility
1 1.2 1.4 1.6 1.8 2 2.2Type
Utility Reservation Utility
Highly-Convex Case
qHC
0.2
.4.6
.81
Dis
trib
utio
n F
unct
ion
of T
ypes
and
Mul
tiplie
r
1 1.2 1.4 1.6 1.8 2 2.2Type
Type Distribution Function Multiplier on IR Constraint
Highly-Convex Case
q1 q2
0.5
11.
52
Util
ity a
nd R
eser
vatio
n U
tility
1 1.2 1.4 1.6 1.8 2 2.2Type
Utility Reservation Utility
Weakly-Convex Case
q1 q2qWC
0.2
.4.6
.81
Dis
trib
utio
n F
unct
ion
of T
ypes
and
Mul
tiplie
r
1 1.2 1.4 1.6 1.8 2 2.2Type
Type Distribution Function Multiplier on IR Constraint
Weakly-Convex Case
3.2 A Model with Budget Constraints
Suppose now that consumers face subsistence constraints. Formally, these constraints give rise to a lim-
ited liability constraint on consumers’ expenditure on the seller’s good that takes the form of a budget
constraint for the good. Here we show that under simple and intuitive conditions, this model is equivalent
to the one of the previous subsection.
Setup. Assume that consumers have quasi-linear preferences over the seller’s good, q, and the numeraire,
z, which represents all other goods. A consumer is characterized by a preference characteristic θ that
affects her valuation of q, as before, and by a parameter w that affects her overall budget or income, Y (w).
The consumer faces a subsistence constraint on the consumption of z of the form z ≥ z(θ, q). Then, the
consumer’s problem is
maxq,z{v(θ, q) + z} s.t. T (q) + z ≤ Y (w) and z ≥ z(θ, q). (3)
The constraint in (3) can be thought of as arising from a situation in which a certain number of calories
are necessary for survival, which can be achieved by consuming q units of the seller’s good and z units of
the numeraire. Such a calorie constraint can be expressed as Cq(θ, q) + Cz(θ)z ≥ C(θ), where Cq(θ, q)
13
and Cz(θ)z are, respectively, the calories produced by the consumption of q units of the seller’s good and
z units of the numeraire, and C(θ) is the subsistence level of calories for a consumer of type θ. We can
then rewrite the calorie constraint as z ≥ [C(θ)− Cq(θ, q)]/Cz(θ) ≡ z(θ, q) as in (3).7
Using the fact that at an optimum the consumer’s budget constraint holds as an equality and substi-
tuting z = Y (w) − T (q) into both the consumer’s objective function and the constraint z ≥ z(θ, q), the
problem in (3) can be restated as
maxq{v(θ, q) + Y (w)− T (q)} s.t. T (q) ≤ I(θ, q, w) ≡ Y (w)− z(θ, q), (4)
where Y (w) is an irrelevant constant in the objective function and I(θ, q, w) is the maximal amount that
the consumer can pay to purchase q units of the seller’s good and still meet her subsistence constraint.
Note that the constraint in (4) is a budget constraint for the seller’s good that arises from the consumer’s
subsistence constraint.
Suppose that when consumers do not purchase from the seller, they can achieve the exogenous utility
level u. Then, the seller’s optimal menu solves
(BC problem) max{t(θ),q(θ)}
∫ θ
θ
[t(θ)− c(q(θ))]f(θ)dθ s.t.
(IC) v(θ, q(θ))− t(θ) ≥ v(θ, q(θ′))− t(θ′) for any θ, θ′
(IR’) u(θ) ≥ u for any θ
(BC) t(θ) ≤ I(θ, q(θ), w) for any θ.
We refer to this model in which the seller’s constraints are IC, IR’, and BC as the BC model. We as-
sume that I(θ, q, w) is twice continuously differentiable and Iθ(·), Iq(·) ≥ 0. Although the model admits
heterogeneity in θ and w, in the text we consider the case of constant w and suppress the dependence of
I(θ, q, w) on w. We discuss the effect of this additional dimension of heterogeneity on an optimal menu
in Appendix A. In analogy to the homogeneity assumption in the IR model, we assume that there exists a
menu {t(θ), q(θ)} such that when faced with this menu, the budget constraint of each consumer holds as
equality and each consumer demands an incentive compatible quantity, q(θ):
(BC homogeneity) t(θ) = I(θ, q(θ)), t′(θ) = vq(θ, q(θ))q′(θ), and q(θ) is weakly increasing. (5)
This condition requires that there exists an incentive compatible way to induce a consumer to spend her
entire budget for the seller’s good. Importantly, under BC homogeneity, incentive compatibility can be
satisfied when the budget constraint binds. Like in the IR model, this condition then ensures that there7This formulation of the constraint generalizes the common one used, for instance, by Jensen and Miller (2008), Cqq +
Czz ≥ C, where Cq and Cz are the calories provided by one unit of q and one unit of z, and C is the subsistence calorie intake.
14
exists an implementable menu inducing all consumers to participate.
Equivalence Between Participation and Budget Constraints. The seller’s problem with constraints
IC, IR’, and BC has no known solution. Here we proceed to characterize the optimal menu indirectly by
establishing an equivalence between the BC problem and the IR problem. A natural approach, which leads
to a simple constructive argument, would be to define the income schedule of a consumer as I(θ, q(θ)) =
v(θ, q(θ)) − u(θ). Since, by definition, t(θ) = v(θ, q(θ)) − u(θ), it is immediate that in this case the BC
constraint is equivalent to the reservation utility constraint of the IR model.
Although such an approach is intuitive as it directly relates reservation utilities and budgets, it is unduly
restrictive in requiring the schedules of reservation utilities and budgets to agree for each type. For the
two problems to be equivalent, it is sufficient that the slope of the income schedule Iq(θ, q) coincides
with the marginal utility schedule, vq(θ, q), just for types whose IR constraints bind in the IR problem or,
equivalently, vq(θ, q) coincides with Iq(θ, q) for types whose BC constraints bind in the BC problem. We
establish this result by proving that under such a condition, the first-order and complementary slackness
conditions characterizing the solution to the IR problem, (1) and (2), coincide with those for the BC
problem. Formally, note that, as proved in Appendix A, the BC problem can be conveniently re-stated as
(simple BC problem) max{q(θ)}
∫ θ
θ
v(θ, q(θ))−c(q(θ)) +[F (θ)−Φ(θ)+Φ(θ)−1
f(θ)
]vθ(θ, q(θ))
+φ(θ)[I(θ,q(θ))−v(θ,q(θ))]f(θ)
f(θ)dθ, (6)
with q(θ) weakly increasing. In this problem, Φ(θ) =∫ θθφ(x)dx is the cumulative multiplier, defined
analogously to γ(θ), on the budget constraint expressed as I(θ, q, w) ≥ t(θ) = v(θ, q)− u(θ) and φ(θ) is
its derivative. For each type, the first-order conditions to this problem are
vq(θ, q(θ))−c′(q(θ))+
[F (θ)− Φ(θ) + Φ(θ)− 1
f(θ)
]vθq(θ, q(θ))+
φ(θ)[Iq(θ, q(θ))− vq(θ, q(θ))]f(θ)
=0, (7)
along with the complementary slackness condition
∫ θ
θ
{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ) = 0. (8)
Under assumptions analogous to those of the IR model, Result 1 in Appendix A states that an imple-
mentable allocation is optimal if, and only if, there exists a cumulative multiplier function Φ(θ) such that
conditions (7) and (8) are satisfied. We provide a sufficient condition for the first-order and complementary
slackness conditions of the IR and BC problems to coincide in the next proposition.
Proposition 1 (Equivalence Between Problems). The solution to the BC problem coincides with that to
the IR problem if Iq(θ, q) equals vq(θ, q) for types whose participation constraints bind in the IR problem.
For intuition, note that in both problems a seller needs to induce a consumer to purchase in the first
15
place. In principle, a seller can induce a consumer to buy by offering a high enough quantity for a given
price or a low enough price for a given quantity. The IR constraint, however, implicitly places a restriction
on the maximum price a seller can charge to a consumer, since the requirement that u(θ) ≥ u(θ) is
equivalent to v(θ, q(θ)) − u(θ) ≥ t(θ), which effectively constrains a consumer’s expenditure on the
seller’s good. The desired result follows by combining this intuition with the construction of a multiplier
for the BC constraint such that the BC constraint binds in the BC problem if, and only if, the IR constraint
binds in the IR problem. The equivalence in Proposition 1 requires the slope of consumers’ budgets and
marginal utility to agree for some types but extends to any BC problem obtained under a monotone affine
transformation of the preferences of the IR problem.
Corollary 1 (Preferences for Equivalence). Consider an IR problem with preferences v(θ, q) + z and a
BC problem with budget I(θ, q). Let Iq(θ, q) = vq(θ, q) for types whose participation constraints bind in
the IR problem. Then, the solution to the new BC problem with preferences η0(θ)+η1(θ)[v(θ, q)+z], with
η0(θ) and η1(θ) monotonically increasing, and the solution to the IR problem also coincide.
Proposition 1 and its corollary are important for several reasons. From a theoretical point of view,
models with budget-constrained consumers are usually considered intractable. Our result provides a sim-
ple argument for how a model with budget constraints can be represented as a model with a type-dependent
IR constraint, and its solution characterized, even when the preferences of the two problems do not exactly
coincide. Importantly, this result allows us to consider a number of cases of great empirical and practical
relevance. For instance, we can explicitly allow for consumers’ substitution across different goods—as
long as utility is separable between the good under consideration and other goods—in the presence of
subsistence constraints affecting the consumption of each of them. We can also evaluate the effect of
policies, like cash transfers, that directly affect consumers’ ability to pay and, hence, budgets.
For intuition on the environments to which our equivalence result applies, assume v(θ, q) = θν(q),
a specification common in the literature. A key assumption is BC homogeneity, which requires: q(θ)
or, equivalently, its inverse θ(q) to be increasing; T (q) = t(θ(q)) to coincide with Y − z(θ(q), q); and
T ′(q) = θ(q)ν ′(q). Substituting θ(q) = T ′(q)/ν ′(q) in the condition T (q) = Y − z(θ(q), q), it is easy to
see that T (q) is determined by a non-linear first-order differential equation, T (q) = Y −z(T ′(q)/ν ′(q), q),
with boundary condition T (q) = Y − z(θ, q) for q = q(θ). Verifying that BC homogeneity holds is then
equivalent to verifying that this differential equation admits an increasing solution. We show next that this
is the case for a broad range of specifications for z(θ, q).
Proposition 2 (Subsistence Functions Compatible with Equivalence). Let utility be v(θ, q) = θν(q) and
the subsistence function be z(θ, q) = −z1(θ)− z2ν(q), where z′1(θ) = ψ(log(θ − z2)) with ψ(·) positive,
continuous, and integrable, and θ > z2 > 0. Then, a BC problem with I(θ, q) = Y − z(θ, q) satisfies the
BC homogeneity assumption and the condition of Proposition 1.
The conditions in Proposition 2 imply that BC homogeneity holds for a wide range of specifications
16
of the utility function, v(θ, q), and the subsistence function, z(θ, q)—see the proof of Proposition 2 for an
example. The assumption that z(θ, ·) decreases with q, which is equivalent to assuming that the calorie
intake from q, C(θ, ·), increases with q, is natural: the greater the amount consumed of the seller’s good,
the greater the calorie intake. The requirement that ψ(·) be positive or, equivalently, that z(·, q) be a de-
creasing function of θ, is to ensure that q(θ) is increasing. The intuition for why z(·, q) may be decreasing
with θ in practice is that if the same calorie intake can be reached through different combinations of food
items, a consumer who values more the seller’s good may require less of other goods to achieve it.8
3.3 Distributional Effects of Nonlinear Pricing
Given the equivalence just established between models with heterogeneous reservation utilities and models
with budget constraints, from now on we refer to the IR model as the augmented model and think of it as
covering both cases. We now examine some of the implications of this model for the characteristics of an
optimal menu of prices and quantities, for welfare under nonlinear and linear pricing, and for the impact of
policies, like cash transfers, that affect consumers’ ability to pay. Throughout this discussion, we maintain
for simplicity that v(θ, q) = θν(q) and c′(q) = c as in our leading specification in the empirical analysis.
Our results extend to the case of non-separable utility and increasing and convex cost functions.
Prices and Quantities. Since the optimal quantity schedule is increasing, we can define the inverse
function, θ(q), of q = q(θ), and obtain the observed price schedule, just function of quantity, as T (q) =
t(θ(q)). By expressing the local incentive compatibility condition, θν ′(q(θ))q′(θ) = t′(θ), as θν ′(q(θ)) =
T ′(q(θ)), we can then rewrite (1) simply as
T ′(q(θ))− cT ′(q(θ))
=γ(θ)− F (θ)
θf(θ), (9)
with T (q(θ)) given by θν(q(θ)) − u(θ) and u(θ) = u(θ′) +∫ θθ′ν(q(x))dx for some given θ′. The price
schedule exhibits quantity discounts if T ′′(q) ≤ 0 or the unit price p(q) = T (q)/q decreases with q.9
Proposition 3 (Quantity Discounts). If in the highly-convex case
minθ,ρ∈[0,1)
({2θf 2(θ) + [ρ− F (θ)]θf ′(θ)}/{θf 2(θ)− [ρ− F (θ)]f(θ)}
)≥ 1, (10)
and in the weakly-convex case θf 2(θ) ≥ F (θ)[f(θ) + θf ′(θ)] for θ in [θ, θ1], then T ′′(q) ≤ 0 for all types
in the highly-convex case and types in [θ, θ1] and [θ2, θ] in the weakly-convex case.
Note that condition (10) is always satisfied if the type distribution is, for instance, uniform. Observe
also that with constant or convex marginal cost, the price schedule cannot imply quantity discounts at all
8See Lancaster (1966) on the distinction between the caloric and taste attributes of goods and Jensen and Miller (2008) onthe relationship between these attributes and subsistence constraints.
9Note that the two characterizations of quantity discounts are equivalent when q(θ) = 0.
17
quantities in the weakly-convex case: the fact that γ(θ) = 0 and γ(θ) = 1 implies that T ′(q(θ)) = c′(q(θ))
and T ′(q(θ)) = c′(q(θ)) so T ′(q) cannot be decreasing at all quantities. In Appendix A we provide
conditions under which the price schedule entails quantity premia for any type between θ1 and θ2.
As discussed, the highly- and weakly-convex cases of the augmented model have very different impli-
cations for the type of consumption distortions that nonlinear pricing gives rise to. In the highly-convex
case, quantity discounts imply consumption levels below first best for low consumer types but above first
best for high consumer types. The reverse consumption pattern arises in the weakly-convex case. In gen-
eral, by comparing (9) to the first-order condition for the first-best allocation, T ′(q(θ)) = c, it is immediate
that when the difference γ(θ) − F (θ) is positive, the quantity provided by a seller to a consumer type is
below first best whereas when the difference γ(θ)− F (θ) is negative, the quantity provided is above first
best. As both instances of the augmented model give rise to unit prices declining with quantity, quantity
discounts by Proposition 3 are actually consistent with opposite patterns of consumption distortions.
We can provide an intuition for why an optimal menu with quantity discounts leads some consumers
to purchase less than first-best quantities (underprovision) and others more than first-best quantities (over-
provision) through a simple example with only two types of consumers: high, θ, and low, θ, types—the
logic naturally extends to general utility, cost, and continuous type distribution functions.10 Note that a
seller maximizes profits by inducing the two groups of consumers to pay different prices by purchasing
different quantities. Underprovision arises when consumers’ reservation utility does not increase fast with
θ so that as long as the low-type participates, the high-type participates too. Then, by setting a high
enough marginal price decreasing with quantity, a seller can induce a low-type consumer to purchase a
small quantity, at a price at which this consumer just reaches her reservation utility level, but a high-type
consumer to purchase a strictly higher quantity. Indeed, since higher types face a higher marginal benefit
from consuming the good, an understatement of preferences by a high consumer type, through the pur-
chase of a small quantity, is best discouraged by a seller by decreasing the quantity offered to lower types
relative to first best: by offering smaller and smaller quantities to lower types, a seller makes purchasing
these quantities unattractive to higher types. This argument for why a seller provides low-type consumers
with quantities below first best also explains underprovision in the standard model.
The opposite situation occurs when the reservation utility increases fast with θ and it is much higher
for a high-type than for a low-type consumer. In this case, a seller needs to offer an attractive enough
price and quantity combination to a high-type consumer to induce her to purchase. A seller can profitably
do so by offering a high enough quantity to a high-type consumer, even above first best, but at a price at
which the high type’s utility equals her reservation level. By also making the marginal price decrease with
quantity, and sufficiently high at low quantities, a seller can ensure that a low-type consumer purchases
10Formally, incentive constraints are upward binding for all consumer types whose closest marginal type—a type indifferentbetween purchasing and not—is above them. Incentive constraints are downward binding for all consumer types whose closestmarginal type is below them. In the standard model, the only marginal type is the lowest one, so quantity underprovision affectsall types except for the highest one, whose consumption is undistorted.
18
a strictly lower quantity and, thus, separate the two consumer groups. Intuitively, an overstatement of
preferences by low consumer types is best discouraged by a seller by increasing the quantity offered to
higher types relative to first best: as larger and larger quantities are less and less appealing to lower types,
offering higher types large enough quantities makes overstating preference more costly to low types.
Figure 2: Menus Under Augmented and Standard Models
01
23
4T
otal
Pric
e
0 1 2 3Quantity
Augmented Model Standard Model
24
68
1012
Pric
e P
er U
nit
0 1 2 3Quantity
Augmented Model Standard Model
01
23
Qua
ntity
1 1.2 1.4 1.6 1.8 2Marginal Willingness to Pay
Augmented Model Standard Model
Welfare Implications of Alternative Nonlinear Pricing Models. A clear ranking emerges between
the allocations implied by the augmented model and by the standard nonlinear pricing model in terms
of quantity provided, marginal prices, and consumer surplus. The reason is that when the augmented
model leads to allocations different from those implied by the standard model, these allocations entail
higher quantities and lower marginal prices, since for γ(θ) to be different from its value (of one) in
the standard model, it must be necessarily smaller than it. But if so, then it is immediate by (1) that
consumption distortions are smaller under the augmented model, since T ′(q) is closer to c. Then, for
similar (or higher) profiles of reservation utility in the two models, consumer surplus must also be higher.
Intuitively, if it is profitable for the seller to trade with a consumer, then the more attractive outside
consumption opportunities are, the better the seller’s offered price and quantity must be to induce the
consumer to purchase. See Figure 2 for an illustration of the optimal menu in the two models for a
parameterization similar to the one in Figure 1 and Appendix A for details.
Proposition 4 (Augmented vs. Standard Models). Let u be the reservation utility in the standard nonlinear
pricing model. If u(θ) ≥ u, then the augmented nonlinear pricing model implies higher consumption and
consumer surplus and lower marginal prices for each consumer than the standard model.
Nonlinear vs. Linear Pricing. A natural question is whether consumers are better off under nonlinear
pricing compared to linear pricing. In the standard linear pricing problem, a seller charges the unit price
pm for any quantity demanded. A consumer of type θ then chooses q to maximize θν(q) − pmq. By
the consumer’s first-order condition for the choice of q, θν ′(q) = pm, the demand function of a type θ
consumer is qm(θ) = (ν ′)−1 (pm/θ) and consumer surplus is um(θ) = θν(qm(θ)) − pmqm(θ). Given
aggregate demand Q(pm) =∫θqm(θ)f(θ)dθ, a seller’s problem just consists of choosing the linear price
19
pm that maximizes total profit, (pm − c)Q(pm). If the seller has market power, then by usual arguments
consumers are offered quantities below first best at marginal prices above marginal cost.
Whether consumers prefer nonlinear or linear pricing depends on the degree of market participation
that the two pricing schemes induce. When all consumers participate under both schemes and nonlinear
pricing entails quantity discounts, consumers are better off under linear pricing. Intuitively, linear pric-
ing is preferred when the quantity purchased by a consumer is higher under linear than under nonlinear
pricing. In this case, a seller’s ability to price discriminate simply exacerbates the quantity underprovision
typical of non-competitive linear pricing. Perhaps more surprising is that with full participation under
both pricing schemes, consumers prefer linear to nonlinear pricing even when quantities are smaller un-
der linear pricing. The reason is that a seller who can price discriminate across consumers asks for “too
high a price” for the greater quantity he is willing to provide under nonlinear pricing.
Proposition 5 (Nonlinear vs. Linear Pricing With Full Participation). Assume full participation under
nonlinear and linear pricing. Suppose the price schedule exhibits quantity discounts (T ′′(q) ≤ 0 and
p′(q) ≤ 0). If qm(θ) ≥ q(θ), or q(θ) > qm(θ) and γ(θ) < 1, then a consumer of type θ enjoys higher
consumer surplus under linear than under nonlinear pricing.
This proposition implies that for nonlinear pricing to be preferred to linear pricing in the presence
of quantity discounts, some consumers must be excluded from trade under the augmented model.11 In
particular, a consumer who is excluded under linear pricing but included under nonlinear pricing prefers
nonlinear pricing. The next result shows that under a mild sufficient condition, which implies there exist
consumers “at risk” of exclusion, consumers who participate under nonlinear pricing, s(θ, q(θ)) ≥ u(θ),
but have reservation quantities above first best, q(θ) > qFB(θ), would be excluded under linear pricing
and, hence, are better off under nonlinear pricing.12 See Example 1 in Appendix A for an illustration.
Proposition 6 (Nonlinear vs. Linear Pricing With Exclusion). Let ν ′′(·) < 0. Assume s(θ, q(θ)) ≥ u(θ)
and q(θ) > qFB(θ) for consumer types in the interval [θ′, θ′′]. If there exists θ ∈ [θ′, θ′′] with um(θ) = u(θ),
then an interval of consumer types in [θ, θ′′] are excluded from trade under linear pricing but included
under nonlinear pricing and so enjoy higher consumer surplus under nonlinear pricing.
The proposition implies that the availability of desirable enough outside consumption opportunities
provides incentives for a seller to match them when it is possible to include different consumers at different
prices. This result then highlights an efficiency dimension of nonlinear pricing: by price discriminating
across consumers, a seller may have an incentive to serve consumers who would demand unprofitably
large quantities under linear pricing.11In general, when there is some exclusion under either pricing scheme the proposition does not apply, and a consumer
included under both schemes can prefer nonlinear to linear pricing. The proposition also applies to the standard model ifqm(θ) ≥ q(θ) but not necessarily otherwise. When quantity distortions are small, it is easy to show that under the standardmodel, consumer types close to θ may prefer nonlinear pricing, even if they participate both under nonlinear and linear pricing.
12The claim requires the existence of a type whose utility under linear pricing equals her reservation utility. For instance,this occurs when such type’s reservation utility is her first-best utility. In this case, the consumer is either included under linearpricing, and so um(θ) = u(θ), or excluded under linear pricing, in which case again um(θ) = u(θ).
20
Income Transfers. The budget constraint version of our model provides a natural framework to explore
how cash transfers affect sellers’ incentives to price discriminate and, thus, prices and consumption when
individuals have limited ability to pay. Our model implies that income transfers increase the amount of
rice purchased by consumers but also endogenously give rise to higher prices as a seller adjusts his price
schedule in response to the greater ability to pay of consumers. Thus, sellers’ market power is a key force
shaping the impact of transfers on prices and quantities. Indeed, in a model with linear pricing and perfect
competition, transfers would affect prices only if the village economy was isolated and the supply of the
good of interest limited—see also Cunha et al. (2014) on this point. This situation, however, is unlikely
in the context we study where goods like rice are easily available from outside.
When consumers face budget constraints, changes in their income affect prices by creating an incentive
for a seller to take advantage of the greater ability to pay of consumers. To see this point, imagine that
consumers receive a cash transfer τ(θ) > 0. Suppose first that the transfer is independent of consumers’
characteristics (τ(θ) = τ ). Such a transfer then leads to a uniform increase in the price schedule, as
a seller anticipates that consumers can afford to pay more, but has no impact on purchased quantities
or marginal prices: the schedule of quantities offered before the transfer is still incentive compatible.
Suppose now that the transfer depends on consumers’ characteristic, θ, and so affects individual demand.
For instance, in the villages in our data, transfers are inversely proportional to income and, thus, larger for
households purchasing smaller quantities. A progressive transfer that increases consumers’ budgets for the
seller’s good (τ ′(θ) < 0) reduces the rate at which budgets increase with θ and so effectively increases the
marginal utility schedule of all consumers whose budget constraints for the seller’s good bind, u′(θ). By
incentive compatibility, the quantity purchased by each type then increases so the corresponding marginal
price—or, equivalently, the marginal price for a given percentile of quantities before and after the transfer
as F (θ) = G(q)—decreases. As before, given that consumers’ ability to pay has increased, the overall
price schedule increases.
Proposition 7 (Transfers Under Nonlinear Pricing). Consider an income transfer decreasing with θ. In
the highly-convex case for γ ∈ (0, 1] and in the weakly-convex case, the transfer leads to a lower marginal
price schedule and a first-order stochastic improvement in the distribution of quantity purchases. In both
cases, the income transfer induces a higher price schedule.
By Proposition 7, the price schedule becomes lower and flatter after transfers are introduced.13 Since
transfers cause both quantities and prices to increase, their effect on the price per unit of the good and,
thus, on the degree of price discrimination, is in principle ambiguous. Asymmetric changes in unit prices
at low and high quantities leading to a greater intensity of price discrimination occur, however, naturally.
For instance, when the utility function exhibits decreasing absolute risk aversion and the type distribution
is uniform, it is easy to show that in the highly-convex case—a common instance in our data—the price13When γ = 0, the same result applies if q(θ) ≥ l(0, θ). This latter condition ensures that the multiplier does not increase
after the transfer, in which case offered quantities may decrease.
21
per unit increases at low quantities but decreases at high quantities in response to an increase in income,
as the optimal price increase is smaller and smaller for higher types. See the Supplementary Appendix.
Hence, the schedule of unit prices becomes steeper. Recall that in the highly-convex case, quantity is
underprovided to low types but overprovided to high types. Then, such an increase in the degree of price
discrimination is associated with smaller consumption distortions at low quantities, due to the increase in
offered quantities after the transfer, but greater consumption distortions at higher quantities, as the higher
offered quantities accentuate the overprovision implied by nonlinear pricing. Thus, when sellers have
market power, relaxing consumption constraints does not obviously lead to greater consumer surplus—as
discussed, it may lead to lower consumer surplus if the transfer is uniform across consumers—given that
the transfer provides an opportunity for sellers to extract more resources from consumers.
A further implication of this argument is that when markets for food are imperfectly competitive, in-
kind transfers may be preferable to cash transfers, especially if they lead to an increase in the consumption
floor on other goods, z(θ, q), and so to a decrease in the budget available for a seller’s good. In this case,
by the reverse logic of Proposition 7, a transfer can lead to a decrease in prices with virtually no change
in purchased quantities.
4 Identification and EstimationIn this section, we first establish the identification of the model’s primitives building on Perrigne and
Vuong (2010). We then show how to obtain nonparametric and semi-parametric estimators.
4.1 Identification
In this section, we show that the model’s primitives, namely, consumers’ utility function, v(θ, q), the
cumulative distribution function of preference characteristics, F (θ), the associated probability density
function, f(θ), and a seller’s marginal cost, c′(q), can be identified in each village under standard assump-
tions, just based on data on households’ quantity purchases and expenditures that are commonly available
for developing countries. However, u(θ) in the IR model and, hence, I(θ, q) in the BC model can only be
identified for households whose corresponding constraint binds.14
In recovering the model’s primitives, we maintain that the sufficient condition s(θ, q(θ)) ≥ u(θ) for
full participation holds: it states that a seller makes non-negative profits from each consumer’s type at
the reservation quantity. This approach is justified by the fact that all households in each of our villages
consume rice. We also make use of the normalization θ = 1, since a scaling assumption is necessary for
identification. In the discussion that follows, we treat T (q) as known and, for simplicity, we denote the
cumulative multiplier γ(·) as the multiplier.
Since the first-order condition in (1) provides our estimating equation for a seller’s cost structure,
14Any economy with reservation utility u(θ) binding on the set Θ′ is observationally equivalent to an economy with thesame primitives but reservation utility u(θ) that agrees with u(θ) on Θ′ and is appropriately adjusted for the remaining types.
22
we can identify nonparametrically marginal cost at most at finitely many points without any other in-
formation in addition to prices and quantities consumed. Based on these data and the assumption of
known or parametric marginal cost, however, the remaining primitives are nonparametrically identified
for v(θ, q) = θν(q) if the support of θ is unknown, or for any v(θ, q) satisfying the assumed curva-
ture restrictions if the support of θ is known. The multiplicative specification of utility we consider,
v(θ, q) = θν(q), is ubiquitous in the theoretical and empirical literature on auctions and nonlinear pricing
both for its analytical tractability and for reasons of identification (see Guerre, Perrigne, and Vuong (2000)
and Perrigne and Vuong (2010)). We maintain that v(θ, q) = θν(q) throughout, and discuss relaxations of
this assumption as well as similarities between out setup and those common in the nonlinear pricing and
hedonic pricing literature in the Supplementary Appendix.
The assumption of a parametric cost function is common. The specific assumption of constant marginal
cost is consistent with anecdotal evidence on the cost of provision of basic staples in the villages in our
data. The marginal cost of rice for a seller is, basically, the wholesale price of rice and it is difficult to
imagine what would induce such a cost to vary with quantity.
Marginal Cost and Multipliers on Participation Constraint. The relationship between θ and q implied
by incentive compatibility is central to the identification of the model. To see why, denote by G(q) the
distribution of quantities across consumers in a village with density function g(q), and by q ≡ q(θ) and
q ≡ q(θ), respectively, the smallest and largest observed quantity. Since q = q(θ) and q′ ≥ 0, it is
immediate that G(q) = Pr(q ≤ q) = Pr(θ ≤ q−1(q) = θ) = F (θ). Hence, F (θ) is identified by G(q).
Moreover, f(θ) = g(q)q′(θ). Our model also implies a relationship between the distribution of quantities
and marginal prices that provides direct information about a seller’s marginal cost and the set of consumers
whose participation (or budget) constraints bind. Let’s re-write the seller’s first-order condition (1) as:
log [T ′(q)] = log [c′(q)]− log
[1 +
x(q)
g(q)
], (11)
where x(q) = [G(q)− γ(θ(q))]θ′(q)/θ(q). It is then apparent that a semiparametric nonlinear relationship
links T ′(q) to c′(q) and g(q). Based on (11), information on prices and quantities is sufficient to identify
whether the highly- or weakly-convex case of our model applies. In both cases, a seller’s marginal cost,
up to a finite number of parameters, and the multiplier on the relevant constraint can be easily recovered.
Intuitively, the number of times the function x(q) equals zero, and the percentiles of the distribution of
quantities at which this occurs, are directly informative about which offered quantity is undistorted, since
T ′(q) = c′(q) when x(q) = 0 by (11) or, equivalently,G(q) = γ(θ(q)) by (1) and (11). At these quantities,
c′(q) is then identified by T ′(q), and γ(θ(q)) by G(q). Recall that in the highly-convex case, the multiplier
γ(θ(q)) is constant, so G(q) = γ and then T ′(q) = c′(q) only at one interior quantity between q and q. In
the weakly-convex case, instead, T ′(q) = c′(q) at q, q, and a unique interior quantity.15 Hence, when the
15In this case single-crossing of T ′(q) and c′(q) is due to single-crossing of q(θ) and l(γ, θ) for each γ. Indeed, [γ(θ) −
23
function x(q) crosses the x-axis only at one quantity between q and q, we infer that the highly-convex case
applies. When the function x(q) crosses the x-axis exactly three times, at q, q, and one quantity between
q and q, we infer that the weakly-convex case applies.
In the highly-convex case, the quantity q(θHC) at which x(q) equals zero identifies the multiplier,
since γ = G(q(θHC)), and marginal cost when constant, since c′(q) = T ′(q(θHC)). When the resulting
γ ∈ (0, 1), then the relevant constraint binds only for the lowest and highest types. When γ ∈ (0, 1),
then the relevant constraint binds only for the highest type. In the special case in which γ equals one and
c = T ′(q), the standard nonlinear pricing model applies, and condition (1) reduces to
1
T ′(q)= λ0 + λ1
G(q)[1− T ′(q)/T ′(q)]1−G(q)
+ λ2[T ′(q)/T ′(q)− 1]
1−G(q), (12)
with λ0 = λ1 = λ2 = 1/c. Then, with constant marginal cost, testing if λ0, λ1, and λ2 are not significantly
different from each other provides a test of the standard model, which only requires knowledge of marginal
prices and the cumulative distribution function of quantities. When marginal cost is not constant and
unknown up to a finite number of parameters, a similar argument applies: both c′(q) and x(q) are identified
by (11), which can be interpreted as a standard additive semiparametric model for c′(q) and x(q) with T ′(q)
and g(q) known. As before, when x(q) equals zero once, G(q) evaluated at this quantity identifies γ.
In the weakly-convex case, T ′(q) evaluated at any of the three quantities at which x(q) equals zero
identifies cwhen marginal cost is constant. As for the multiplier, denote byA(q) the coefficient of absolute
risk aversion. Since θ′(q)/θ(q) = T ′′(q)/T ′(q) + A(q) by local incentive compatibility, (1) also implies
G(q) = γ(θ(q))− g(q)
[1− c′(q)
T ′(q)
] [T ′′(q)
T ′(q)+ A(q)
]. (13)
That is, once c′(q) is identified, γ(θ(q)) is identified up to A(q). Then, q1 is identified as the largest
quantity at which γ(θ(q)) is zero and q2 as the smallest quantity at which γ(θ(q)) is one: the relevant
constraint binds for all consumers purchasing quantities between q1 and q2 (included).
When neither the highly-convex nor the weakly-convex case apply, (11) and (13) identify marginal
cost and multipliers by the same argument as in the weakly-convex case. The relevant constraint binds for
all consumers purchasing quantities at which the derivative of γ(θ(q)) with respect to quantity is nonzero.
Proposition 8 (Identification of Marginal Cost and Multipliers). In a village, the number of zeros of
the function x(q) in (11) identify whether the highly-convex, the weakly-convex, or neither case applies.
Marginal cost (up to a finite number of parameters) and the schedule of multipliers are identified from
the cumulative distribution and probability density functions of quantities and from the marginal price
schedule by (11) in the highly-convex case, and by (11) and (13), up to the coefficient of absolute risk
aversion, in the remaining cases. With constant marginal cost, a necessary condition for the standard
F (θ)]/f(θ) is negative at θ, positive at θ, and increases from θ to θ by the potential separation assumption. See Jullien (2000).
24
model to apply is that the ratio of λ0, λ1, and λ2 in (12) does not significantly differ from 1.
Type Distribution and Preferences. Since F (θ) =G(q) and f(θ) = g(q)/θ′(q), the seller’s first-order
condition can be rewritten as θ′(q)/θ(q) = g(q)[T ′(q) − c]/{T ′(q)[γ(θ(q)) − G(q)]}. Integrating both
sides of this expression from q to q, it follows
log(θ(q)) = log(θ(q)) +
∫ q
q
∂ log(θ(x))
∂xdx = log(θ(q)) +
∫ q
q
g(x)[T ′(x)− c]T ′(x)[γ(θ(x))−G(x)]
dx. (14)
Note that the integrand term in (14) is positive since g(q) > 0, T ′(q) > 0, and, in addition, T ′(q) ≥ c
if, and only if, γ(θ(q)) ≥ G(q). This term is also well-defined both in the highly- and weakly-convex
case, despite the existence of quantities at which γ(θ(q)) = G(q) and so T ′(q) = c; see Appendix A. We
then obtain θ(q) as the exponential transformation of the right-most side of (14). Once marginal cost and
the schedule of multipliers are identified, (14) implies that θ(q) is a known function of objects that are
observed, like q, q, and T ′(q), or identified, like g(q) and G(q). Hence, θ(q) is also identified up to θ(q).
Proposition 9 (Identification of the Type Distribution). In a village, once marginal cost and the schedule
of multipliers are identified, the type support is identified from the probability density and cumulative dis-
tribution functions of quantities and from the marginal price schedule by (14) up to scale. The probability
density function of types, f(θ), is identified from the probability density function of quantities and the
derivative of the inverse of θ(q) by (14).
To see how u(θ), at points at which the participation (or budget) constraint binds, and ν(q) are identi-
fied, note first that once θ(q) is identified, the incentive compatibility condition ν ′(q) = T ′(q)/θ(q) implies
that the base marginal utility function, ν ′(q), is also identified from T ′(q). Once ν ′(q) is identified, we can
recover ν(q) as long as ν(q) or u(θ) is known at one point, say, q′ = q(θ′). Suppose, then, that the reserva-
tion utility u(θ) is known for at least one consumer type, θ′. We can then use ν(q′) = [u(θ′)+T (q(θ′))]/θ′
to identify ν(q′), and recover ν(q) as ν(q) = ν(q′)−∫ q′qν ′(x)dx for q ≤ q′ and ν(q) = ν(q′)+
∫ qq′ν ′(x)dx
for q ≥ q′. Lastly, once ν(q) is identified, we can also identify u(θ) as u(θ) = θ(q)ν(q)− T (q) for those
consumers whose participation (or budget) constraints are identified to bind.
Proposition 10 (Identification of Utility). In a village, once marginal cost, the schedule of multipliers,
and the support of preference characteristics are identified, the base marginal utility function, ν ′(q), is
identified from the marginal price schedule by the incentive compatibility condition ν ′(q) = T ′(q)/θ(q).
Then, up to the utility of one consumer type whose participation (or budget) constraint binds, the utility
function, θ(q)ν(q), is identified. The reservation utility function, u(θ), is identified for consumer types
whose participation (or budget) constraints bind.
Identification with Parametric Utility Curvature. Assuming the utility function takes the semipara-
metric form θν(q), with ν(q) specified as a member of a flexible parametric family, we can easily identify
25
γ(·) as well as all other model’s primitives based on our information on prices and quantities. Specifically,
suppose that ν(q) is a member of the HARA family with ν(q) = (1− d)[aq/(1− d) + b]d/d, a > 0, and
aq/(1− d) + b > 0. By incentive compatibility, T ′(q) = θ(q)ν ′(q), and so
log(T ′(q)) = log(θ(q)) + log
[a
(aq
1− d+b
)d−1]
= log(aθ(q))− (1− d) log
(aq
1− d+ b
).
In this case, a simple additive semiparametric relationship links T ′(q) to q, whose semiparametric compo-
nent identifies θ(q) up to scale. As before, f(θ) is identified from g(q) and θ(q). Thus, only c and γ(·) are
left to be identified. To this purpose, observe that a seller’s first-order condition, expressed as
γ(θ(q)) +θ′(q)g(q)
θ(q)T ′(q)c = G(q) + g(q)
θ′(q)
θ(q), (15)
leads to a system of as many linear equations in γ(θ(q)) and c as distinct observed quantities. With T ′(q)
known and g(q), G(q), θ(q), and θ′(q) identified—note that θ′(q) is identified from θ(q)—it follows that
c and γ(θ(q)) are identified if γ(θ(q)) takes the same value for at least two quantities. If not, then the
participation (or budget) constraint must bind at all quantities, and so there exists (at least) one consumer
type consuming qFB(θ). At this quantity, c equals T ′(q), and so is identified. At the remaining quantities,
γ(θ(q)) is then identified by (15). The identification of the remaining model primitives follows as before.
4.2 Estimation
We estimate the model separately in each village in two steps. After recovering the price schedule and the
distribution of quantities in each village, in the first step we estimate the seller’s marginal cost, the multi-
pliers associated with the participation (or budget) constraint, the support of the distribution of consumers’
marginal willingness to pay, and consumers’ utility. In the second step, we estimate the probability density
function of consumers’ marginal willingness to pay. Omitted details are collected in Appendix C.
Price Schedule. Our data contain direct information on unit prices and on the quantities purchased by
each consumer (household) in each village. Due to the existence of multiple observations on the unit
price of a given quantity in many villages, in each village we use the median unit value of each quantity
as the unit price for that quantity so as to minimize measurement error. Then, the price schedule in a
village, T (q), can be simply obtained by fitting the resulting unit prices, multiplied by the corresponding
quantity, on observed quantities. We do so by least squares allowing for different specifications in each
village.16 Note that when the fit of the estimated specification is good, the error that fitting may cause can
be considered minimal and, thus, ignored. Since the adjusted R2 for all estimated specifications is never
below 0.90 and often close to 1, we treat the price schedule as observed in inference—see Perrigne and
16We can treat quantity as exogenous since the information on the quantity purchased by each consumer provides directinformation on the price schedule of the seller and T (q) is a deterministic function of q under our model.
26
Vuong (2010) for the same assumption.
Distribution of Quantities. Denote by N the number of consumers purchasing rice in a village and
by qi the quantity purchased by consumer i. Then, G(q) can be estimated using a counting process as
G(q) = N−1∑N
i=1 1(qi ≤ q), where 1(·) is an indicator function and q ∈ [q, q]. We estimate g(q) via
a univariate kernel density estimator as g(q) = (Nhq)−1∑N
i=1Kq ((q − qi)/hq) for a suitable choice of
kernel function Kq(·) (Epanechnikov) and bandwidth hq.
Marginal Cost and Multipliers on the Participation Constraint. We test whether the standard non-
linear pricing model or the augmented model applies, and in this latter case identify the relevant case of
the augmented model, as follows. We specify x(q) as a flexible fractional polynomial. Fractional polyno-
mials allow for logarithms and non-integer powers, thus encompassing a wider range of shapes than those
obtained with regular polynomials. Given the granularity of our data, estimating x(q) nonparametrically
would be impractical. We then estimate (11) by generalized method of moments subject to the constraint
that c ranges between T ′(qmax) and T ′(qmin): since the model implies that the participation (or budget)
constraint binds for at least one consumer type, there must exist a quantity at which T ′(q) = c. Note that
this procedure allows us to obtain in one step estimates of c and x(q), and achieve standard convergence
rates in the next estimation step, while allowing for a flexible yet parsimonious specification of x(q).
We determine the relevant case of the augmented model by testing for the number of zeros of the
function x(q). Specifically, we conduct a multiple test of the hypotheses that x(q) equals zero at any of
theN points in the support of quantities in a village. We do so by computing the p-values of the hypotheses
that x(qi) = 0 at each distinct quantity i in the village. Since we test multiple linear constraints in each
village, we computed Hom-adjusted p-values to suitably bound the probability of falsely rejecting one of
the null hypotheses. If all p-values are above a given confidence level (five percent) except for one quantity,
then we reject the hypothesis that x(q) equals zero more than once. Correspondingly, we infer that the
highly-convex case of the augmented model applies to the village. If, instead, all p-values are above the set
confidence level except for three quantities, two of which are the lowest and highest quantities in a village,
then we infer that the weakly-convex case applies to the village. (Recall that in the weakly-convex case,
x(q) should be zero at q, q, and at an intermediate interior quantity.) Otherwise, we infer that the village
cannot be categorized as an instance of either the highly- or weakly-convex case.
When the highly-convex case applies, we estimate q(θHC) as the unique quantity q(θHC) at which
x(q) equals zero, and the constant multiplier γ as γ = G(q(θHC)). Note that if q(θHC) = q so that γ = 1,
then we cannot reject that the standard model applies to a village. When, instead, the weakly-convex case
applies, we specify γ(θ(q)) as a logistic function andA(q) as a flexible positive function of q, and estimate
them by generalized method of moments from by (13) as
G(q) = γ(q;ψγ)− g(q)
[1− c
T ′(q)
] [T ′′(q)
T ′(q)+ A(q;ψA)
],
27
where ψγ and ψA are vectors of parameters. We recover q1 as the largest quantity at which γ(q;ψγ) is
not significantly different from zero, and q2 as the smallest quantity at which γ(q;ψγ) is not significantly
different from one. Then, γ(q) = 0 below q1, γ(q) = γ(q;ψγ) between q1 and q2, and γ(q) = 1 above q2.
Support and Probability Density Function of Types. Since θi = θ(qi) by (14), given a kernel function
Kθ(·) (Epanechnikov) and bandwidth hθ, we estimate the density f(θ) nonparametrically as
f(θ) = (Nhθ)−1∑N
i=1Kθ((θ − θi)/hθ). (16)
Utility Function. We estimate marginal utility as ν ′(qi)=T ′(qi)/θ(qi) from the local incentive compat-
ibility condition θν ′(q)=T ′(q). Then, θiν(qi) can be obtained from θi = θ(qi) and ν ′(qi).
5 Estimation ResultsHere, we present evidence on the fit of the model to the data and estimates of the model’s primitives.
We then assess the consumption distortions associated with observed nonlinear pricing, and the degree
of sellers’ market power. We also compare the distributional properties of the observed allocations in
each village to the counterfactual ones that would emerge if sellers could not price discriminate across
consumers by charging different prices for different quantities. Finally, we consider the effect of the
Progresa cash transfers on prices and compare the predictions of our model to this evidence.
5.1 Village Categorization
As discussed in Section 2, our data cover 191 villages, which we take to coincide with the administrative
definition of a Mexican municipality. At least 100 households consume rice in 38 of these villages, 31 of
which satisfy a necessary condition for q(θ) to be increasing under the standard model. This condition is
a mild regularity requirement on the price schedule, which we impose in estimation, that just depends on
marginal prices and the distribution of quantity purchases in a village. See Appendix B for details.
Given the observed price schedule and distribution of quantity purchases, our first step estimation of
sellers’ marginal cost and multipliers on the participation (or budget) constraint converged on 24 of these
31 villages.17 We refer to these 24 villages as our estimation sample. Following the procedure outlined in
Section 4.2, we categorize villages as highly-convex, weakly-convex, or not classifiable as either instance
of our model. Based on this procedure, we found that 11 of these 24 villages conform to the highly-
convex case of our model, no village conforms to the weakly-convex case, and the remaining 13 villages
cannot be categorized as either instance. We refer to these 13 villages as the non-regular sample and to the
remaining 11 villages of the highly-convex case as the regular sample, which we focus on here. Estimates
of the model’s primitives on the 13 villages of the non-regular sample are presented in Appendix B.
17In practice, the numerous points of singularity of the right side of (11), together with the relative flatness of the scheduleof unit prices, created numerical indeterminacy that prevented the estimation of c and x(q) for those villages.
28
Figure 3: Unit Prices and Probability Density Function of Quantities5
7.5
1012
.515
Uni
t Pric
e
0 1 2 3Quantity
Regular Sample
01
23
Den
sity
Fun
ctio
n of
Qua
ntiti
es
0 1 2 3Quantity
Regular Sample
57.
510
12.5
15U
nit P
rice
0 1 2 3Quantity
Non-regular Sample
01
23
Den
sity
Fun
ctio
n of
Qua
ntiti
es
0 1 2 3Quantity
Non-regular Sample
In Figure 3, we plot the schedule of unit prices and the probability density function of observed quanti-
ties in each of the 24 villages in our estimation sample, separating the 11 villages of the regular sample (top
plots) from the 13 villages of the non-regular sample (bottom plots). Note that in these 13 villages, prices
per unit (bottom left plot) are less steep than in the 11 villages in the regular sample (top left plot), even
increasing with quantity. Also, the probability density function of quantities in these 13 villages (bottom
right plot) is more compressed than in the remaining 11 villages, as consumers are more evenly distributed
across quantities. For an intuition on these differences across samples, note that in the highly-convex case,
γ(θ) = γ and a seller’s first-order condition can be written as
c
T ′(q)+
γ −G(q)
g(q)θ(q)/θ′(q)= 1. (17)
Since γ − G(q) > 0 for θ < θHC , where θHC is the unique interior consumer type offered a first-best
quantity, and T ′(·) is decreasing, it is easy to see that for consumers with types θ < θHC , the denominators
in the two fractions in (17) should approximately move in opposite directions for (17) to hold. This fact
roughly implies that marginal prices, which track unit prices, and the density of quantities purchased
29
should be inversely related at low and intermediate quantities—especially if θ′(q) increases over these
quantities, as we estimate in several villages. This inverse pattern between marginal prices and the density
of quantities at low and intermediate quantities is evident in the regular sample (top panels of Figure
3). Instead, unit prices are unrelated or even positively related to the density of quantities at low and
intermediate quantities in the non-regular sample (bottom panels of Figure 3).
Note that in each of the 11 villages in the regular sample, the price per unit of rice declines with
quantity (top left panel) so unit prices are highest for the households who purchase the smallest quantities.
It is also apparent from the two top plots in Figure 3 that prices decrease more rapidly over the range of
quantities that most households purchase—up to 1.5 kilos per week. Thus, households in each village are
directly affected by the nonlinearity of the price of rice and nearly all of them face significant quantity
discounts: the unit price of the smallest quantity purchased, 0.2 kilos, is more than 15 pesos whereas the
unit price of the largest quantity, 3 kilos, ranges from 4.1 pesos to 5.7 pesos.
5.2 Model Fit and Evidence of Market Power
A central implication of the price discrimination models we have studied is that the shape of the price
schedule is determined by the cumulative distribution and probability density functions of consumers’
marginal willingness to pay (“type”), which is unobserved but is directly related to the observed distribu-
tion of quantities in each village. The first-order condition of a seller is key to these implications, since it
relates the price schedule in a village to the distribution of quantity purchases according to (11). Then, one
way to assess the fit of our model to the data is to determine the extent to which our estimates of c, g(q),
and the auxiliary function x(q) satisfy this relationship. Expressing (11) as g(q)/T ′(q) = [g(q) + x(q)]/c,
we plot in Figure 4 the estimated values of the left and right side of this expression with [g(q) + x(q)]/c
on the x-axis and g(q)/T (q) on the y-axis for each quantity in our estimation sample of 24 villages. Then,
the closer this relationship to the 45-degree line, the better the fit of the model to the data.
The two top plots in Figure 4 differ in that in the right panel, we trimmed the top and bottom 1% of
observations. The two bottom plots in Figure 4 display the same predicted relationship for the regular
sample of 11 villages conforming to the highly-convex case (left panel) and for the non-regular sample
of 13 villages that cannot be categorized as instances of the highly- or weakly-convex case (right panel).
Note that in all these samples, the model fits well the price and quantity data from each village.
This evidence on model fit as well as our village categorization are suggestive of the fact that our data
are consistent with the existence of market power among sellers. Indeed, if prices and quantities were
generated by a perfectly competitive market for rice, then marginal prices would equal marginal cost at
all quantities in each village. Since marginal cost is constant, then so would be average and marginal
prices, which is inconsistent with the left plots in Figure 3. Specifically, under perfect competition, a
seller’s first-order condition would reduce to T ′(q) = c, and so the estimated function x(q) would not
be significantly different from zero at all quantities—recall the argument in Section 4.2. This is not the
30
Figure 4: Model Fit Within and Across Villages0
.2.4
.6R
atio
of D
ensi
ty o
f Qua
ntiti
es to
Mar
gina
l Pric
es
0 .2 .4 .6Sum of Density of Quantities and Auxiliary Function Over Marginal Cost
Estimation Sample
0.1
.2.3
.4R
atio
of D
ensi
ty o
f Qua
ntiti
es to
Mar
gina
l Pric
es
0 .1 .2 .3 .4Sum of Density of Quantities and Auxiliary Function Over Marginal Cost
Estimation Sample (Trimmed)
0.2
.4.6
Rat
io o
f Den
sity
of Q
uant
ities
to M
argi
nal P
rices
0 .2 .4 .6Sum of Density of Quantities and Auxiliary Function Over Marginal Cost
Highly-Convex Case
0.1
.2.3
.4R
atio
of D
ensi
ty o
f Qua
ntiti
es to
Mar
gina
l Pric
es
0 .1 .2 .3 .4Sum of Density of Quantities and Auxiliary Function Over Marginal Cost
Villages Not Categorized
case in our sample: in the 11 villages of the regular sample conforming to the highly-convex case, x(q) is
not significantly different from zero only at one quantity, as consistent with the assumption of the highly-
convex case, whereas in the remaining 13 villages of the non-regular sample, x(q) is not significantly
different from zero only at about half of the quantities.18 We provide evidence on the degree of sellers’
market power and the type of price discrimination practiced in our villages in Section 5.4, where we also
assess the surplus loss associated with observed nonlinear pricing compared to first-best pricing.
5.3 Estimates
We present here estimates of sellers’ marginal cost, the multipliers on consumers’ participation (or budget)
constraints, the distribution of consumers’ marginal willingness to pay, and marginal utility.
Marginal Cost and Multipliers. Figure 5 reports the estimated marginal cost and multiplier on the
participation (or budget) constraint in the 11 villages of the regular sample, ordered by the value of the
multiplier. The figure also depicts pointwise asymptotic confidence bounds for the estimated values of
18By (11), testing for the number of times x(q) = 0 or, equivalently, T ′(q) = c, is equivalent to testing for the number ofquantities with the same marginal price since marginal cost is constant.
31
c and γ in each village. Note that both c and γ are fairly precisely estimated: confidence bounds are so
Figure 5: Estimated Marginal Cost and Multipliers2
34
56
Mar
gina
l Cos
t
0 5 10Village
.85
.9.9
51
Mul
tiplie
r on
Par
ticip
atio
n C
onst
rain
t
0 5 10Village
small that they are barely visible in most villages. Moreover, whereas sellers’ estimated marginal cost
noticeably varies across villages, from 2.823 to 5.914, the range of variation of the multiplier is much
smaller, from 0.876 to 1 with an average value of 0.960. However, the standard model (γ = 1) applies
only to two villages: in all other villages, we can reject that hypothesis that the standard model applies at
standard significance levels. Correspondingly, we infer that in nearly all villages, only consumers with the
lowest and highest marginal willingness to pay for rice have binding participation (or budget) constraints.
In the presence of quantity discounts, our model implies a negative relationship between marginal cost
and multiplier in the highly-convex case: since c = T ′(q(θHC)) and γ = G(q(θHC)) with T ′(·) decreasing
and G(·) increasing, it follows that the higher the marginal cost, the lower the quantity at which γ equals
G(·) and, thus, the lower the value of the multiplier. Hence, if the schedules of marginal prices in all
villages are similar enough, a negative relationship between marginal cost and multiplier should also
emerge across villages. This monotonicity is apparent in Figure 5 where it only fails between villages 10
and 11: in both villages the multiplier is equal to 1, and thus c = T ′(q), but the estimated marginal cost
is much higher in village 11 than in village 10. This large difference in estimated marginal cost between
the two villages is a direct consequence of the large difference in the schedule of marginal prices between
these two villages at all quantities including the largest one, q. See Figure 12 in Appendix B.
Recall that in the highly-convex case there is a single consumer type θHC consuming the first-best
quantity. Households with types below θHC consume quantities of rice below first best whereas households
with types above it consume quantities above first best. Our estimates of c and γ imply that while a large
fraction of households consume quantities below first best in all villages, the quantity q(θHC) above which
overconsumption occurs varies significantly across villages, starting at relatively low quantities in about
half of the villages. That most households are estimated to under-consume is due to the fact that the
estimated q(θHC) is in the fourth quartile of the cumulative distribution of quantities in nine villages and
32
in the third quartile of the distribution of quantities in the remaining two villages. However, this threshold
quantity differs markedly across villages: it falls between 1 and 1.5 kilos in two villages, between 1.5
to 2 kilos in three villages, between 2 to 2.5 kilos in five villages, and between 2.5 and 3 kilos in just
one village. For a sense of the magnitude of this threshold quantity, recall from Figure 3 that quantities
consumed range from 0.1 to 3 kilos across villages but only two households overall consume less than 0.2
kilo. Hence, underconsumption, a common concern among policy makers about the rural poor, is present
in our villages but is neither uniform across households or villages nor exclusive: a significant proportion
of households indeed consume quantities above first best.
Figure 6: Distribution of Types
12
34
56
Est
imat
ed T
ype
0 1 2 3Quantity
Type Support
05
1015
Est
imat
ed R
ever
se H
azar
d R
ate
1 2 3 4 5 6Type
Reverse Hazard Rate
02
46
Est
imat
ed H
azar
d R
ate
1 2 3 4 5 6Type
Hazard Rate
0.2
.4.6
.81
Est
imat
ed C
umul
ativ
e D
istr
ibut
ion
Fun
ctio
n
1 2 3 4 5 6Type
Cumulative Distribution Function
Distribution of Types. In Figure 6, we display the estimates of the type support (top left panel), θ(q), as
a function of the purchased quantity, and of the reverse hazard rate function (top right panel), F (θ)/f(θ),
in each village. Note that in each village, consumers’ estimated marginal willingness to pay for rice,
θ(q), increases with the quantity purchased, as consistent with the incentive compatibility condition of
our model, and the estimated reverse hazard function increases with θ, as consistent with the monotone
hazard rate condition implied by our separation assumption. As also required by our model, the estimated
33
hazard rate function, [1 − F (θ)]/f(θ), decreases everywhere with θ (bottom left panel). Note that none
of these restrictions have been imposed in estimation. As apparent from Figures 13 and 14 in Appendix
B, the function θ(q) and the density f(θ) are also fairly precisely estimated. We interpret this evidence as
validating our estimates of the distribution of types in each village.
Our estimates of θ(q) imply a greater dispersion in consumers’ marginal willingness to pay for rice
than at first evident from the relatively compressed distribution of observed quantities in Figure 3. Indeed,
the empirical probability density function of quantities in the top right panel of Figure 3 shows that most
consumers purchase small to intermediate quantities, between 0.25 kilos and 1.5 kilos of rice. Since the
cumulative distribution function of types coincides with the cumulative distribution function of quantities,
F (θ) = G(q), one might be tempted to conclude that as quantities are not much dispersed, so must
be tastes. That this inference is incorrect can be seen, for instance, from the top two panels of Figure
6, which show that the support of types is more than twice as wide as the support of quantities across
villages. This large degree of variation in consumers’ tastes implies a potentially strong incentive for
sellers to discriminate across consumers, as consumers markedly differ in their marginal willingness to
pay for rice. We examine the extent to which sellers exert market power through price discrimination in
Section 5.4 below, where we find evidence that sellers indeed behave noncompetitively.
Figure 7: Marginal Utility
02
46
8E
stim
ated
Bas
e M
argi
nal U
tility
0 1 2 3Quantity
Base Marginal Utility
24
68
10E
stim
ated
Mar
gina
l Util
ity
0 1 2 3Quantity
Marginal Utility
Marginal Utility Function. In Figure 7, we plot estimates of the base marginal utility function, ν ′(q),
and marginal utility function, θ(q)ν ′(q), at each quantity in each village. As it can be seen in Figure 15 in
Appendix B, the function ν ′(q) is fairly precisely estimated in each village. Note that both functions, ν ′(q)
and θ(q)ν ′(q), decrease with quantity in all villages, as consistent with the model, despite the fact that we
did not impose any such monotonicity constraint in estimation. Since θ(q) increases with q whereas ν ′(q)
decreases with q, however, estimated marginal utility decreases much less rapidly than estimated base
marginal utility. Hence, the distribution of marginal utilities is more compressed than the distribution of
base marginal utilities.
34
This relative compression of marginal utilities across consumers is still compatible with a potentially
large degree of market power for sellers, which they can exercise by price discriminating across con-
sumers. To see why, note that marginal utility is significantly downward sloping: rice is valued very
differently at the margin at different levels of consumption. Hence, in the villages in our regular sam-
ple, sellers not only have an incentive to price discriminate across consumers, as discussed, but also the
possibility to separate consumers by their intensity of preference through a menu of different quantities
differentially priced at the margin, depending on the level of a consumer’s demand.
Relative to the case of linear utility, the curvature we estimate further suggests the potential for rich
distributional implications of nonlinear pricing compared to alternative pricing schemes naturally of inter-
est, such as first-best or linear pricing. We explore these implications of our estimates in the next section,
where we link the scope of price discrimination, as captured by the distribution of consumers’ tastes and
marginal utility and by sellers’ marginal cost, to the type and intensity of price discrimination that we infer
sellers practice in our villages.
5.4 Distortions Associated with Price Discrimination
Our model allows for varying degrees of market power, which correspond to different degrees of efficiency
of the resulting price and quantity combinations available to consumers. Specifically, since a consumer’s
first-order condition is θν ′(q) = T ′(q), the closer marginal prices, T ′(q), are to marginal cost, c, in a
village, the closer the market for rice in the village is to an efficient one, in which marginal utility equals
marginal cost. In this case too, of course, the distribution of “gains from trade” between consumers and
producers depends on the ability of sellers to extract consumer surplus through nonlinear pricing.
A natural question is then whether the nonlinear pricing schedules we observe in each village reflect
the behavior of sellers who efficiently (first-degree) price discriminate across households, in the sense that
T ′(q) = c, or that of sellers who practice a distortionary type of standard second-degree price discrimina-
tion.19 In the first case, nonlinear pricing may have undesirable distributional implications but it leads to
efficient levels of consumption of rice. In particular, it does not restrict the access of any household, even
the poorest, to a basic staple like rice. Thus, although households’ ability to pay may still be of concern to
policy makers, the marginal prices, however high, that households face imply no distortion to individual
or social consumption of rice.
To assess the extent to which observed prices correspond more to first or second-degree (nonlinear)
price discrimination, we test whether the necessary condition T ′(q) = c for first-degree price discrimina-
tion holds. In Figure 12 in Appendix B, we show that in each village the marginal price schedule, {T ′(q)},is outside the 95% confidence interval around the estimated marginal cost. Based on this evidence, we
conclude that sellers have market power in the villages in our regular sample, and exercise it by distorting
the quantities offered to most households.
19This outcome occurs in our model when u(θ) = θν(qFB(θ))− cqFB(θ)− δ for some constant δ across consumers. In this
35
Figure 8: Nonlinear Pricing vs. First-Best Pricing-1
-.8
-.6
-.4
-.2
0N
onlin
ear
vs. F
irst-
Bes
t Pric
ing:
Soc
ial S
urpl
us
0 1 2 3Nonlinear Pricing Quantity
Social Surplus
01
23
Firs
t-B
est Q
uant
ity
0 1 2 3Nonlinear Pricing Quantity
Quantity Consumed
Since we reject the hypothesis that observed pricing is efficient, we now examine the severity of the
distortions associated with nonlinear pricing. In the left panel of Figure 8, for each quantity observed
consumed under nonlinear pricing, we graph the difference in social surplus under nonlinear pricing and
under first-best pricing. Specifically, we compute the percentage difference in the social surpluses arising
in each village under the observed (non-linear) price-quantity menu and that arising under the counter-
factual first-best price-quantity menu when T ′(q) = c, computed as SSnp(θ)/SSfb(θ) − 1, where the
subscript ‘np’ stands for nonlinear pricing and ‘fb’ for first-best pricing. The loss in social surplus implied
by observed nonlinear pricing ranges (except for one consumer type in one village) across quantities and
villages from 20% to 100% of the surplus associated with the first-best allocation. Importantly, this loss is
almost uniformly larger at lower quantities, and so largest for the lowest consumer types in each village.
In the right panel of Figure 8, we plot first best quantities (on the y-axis) against observed quantities
under nonlinear pricing. The dotted lines join the quantities in each village, while the solid line is the 45-
degree line. As implied by our estimates of marginal cost and the multipliers on consumers’ participation
(or budget) constraints, in most villages, households who purchase the smallest quantities consume less
than under first best whereas households who purchase the largest quantities consume more than under
first best. However, distortions seem more pronounced for households who consume relatively more than
for households who consume relatively less—they are highest for consumers with intermediate marginal
valuations of rice.
Figure 8 provides an interesting picture of the inefficiencies induced by nonlinear pricing by contrast-
ing the greater loss in consumer surplus (left panel) to the smaller distortions in consumption (right panel)
at smaller quantities relative to higher quantities under nonlinear pricing compared to first-best pricing.
For consumers with lower types, quantity distortions are less pronounced but price distortions are more
severe than for consumers with higher types, and more so that, on balance, lower consumer types would
case, the seller offers first-best quantities and charges a price at which every consumer reaches her reservation utility.
36
benefit more than higher ones from a more competitive market for rice.
5.5 Nonlinear vs. Linear Pricing
It has been argued that the ability of sellers to price discriminate through quantity discounts in developing
countries hurts poor consumers more: according to this argument, this type of nonlinear pricing may
limit the access to basic goods and services for consumers purchasing the smallest quantities, typically
the poorest households, since they face the highest unit prices—see Attanasio and Frayne (2006) for
references. Based on our theoretical framework and our estimates, we can examine which consumers are
hurt (or benefit) from the price discrimination we observe in our villages. We address this question by
contrasting consumer (and social surplus) under nonlinear and linear pricing. In particular, we ask what
level of consumer surplus would arise if the seller was constrained (by legislation, say) to practice linear
pricing. This exercise entails not only a comparison of the price and quantity combinations generated
by a nonlinear and a linear pricing scheme, but also of the size of the market served by sellers under
each scheme. As discussed in Section 3.3, the linear price a seller would choose when prevented from
discriminating may entail excluding some consumers, even if such a consumer participates in the market
under nonlinear pricing. We also discuss how this comparison differs across our augmented model and
the standard model. This exercise will then shed light on the nature and size of the bias that assuming,
counterfactually, that the standard model applies to all villages gives rise to.
Augmented Model. Given our estimates of consumers’ marginal willingness to pay, utility and sellers’
marginal cost, we compare consumer (and social surplus) in each village under the observed nonlinear
pricing allocation and under the counterfactual linear pricing one that would emerge if sellers were pre-
vented from price discriminating. One difficulty we need to face is the fact that the reservation utility
is only identified for consumer types whose participation (or budget) constraints bind. In the absence of
a global point estimate, we bound the reservation utility from below and from above and compute the
consumer and social surplus under these two extreme scenarios. In particular, we first bound the reserva-
tion utility from below by setting u(θ) = u(θ) for types smaller than θ and u(θ) = u(θ) for the highest
type. This schedule of reservation utilities is the lowest possible that is consistent with our model. We
then bound the reservation utility from above by setting u(θ) = u(θ) for all types.20 This schedule of
reservation utility corresponds to the highest possible one consistent with our model. We label the former
case as low reservation utility and the latter one as high reservation utility.
Given these bounds, we can compute the change in consumer surplus (and social surplus) between the
nonlinear and linear pricing solutions, type by type. In Figure 9, we plot the relative change in consumer
surplus, CS, under nonlinear pricing relative to the linear pricing counterfactual: CSnp(θ)/CSlp(θ) − 1
where the subscript ‘np’ stands for nonlinear pricing, and ‘lp’ for linear pricing. This relative loss is
20Note that when γ = 1, the standard model applies and so u(θ) = u(θ) for all types, since the participation or budgetconstraint only binds for the lowest type.
37
computed in each village and for each quantity (and therefore type) with CSnp(θ) = u(θ), CSlp(θ) =
θν(qm(θ)) − pmqm(θ). Recall that qm(θ) denotes the quantity demanded by a consumer of type θ under
linear pricing, and pm is the linear price in a village. Analogously, we can compute the relative change in
social surplus SSnp(θ)/SSlp(θ) − 1. See Appendix B for details. (Note that when γ = 1, the standard
model applies and so u(θ) = u(θ) for all types, since the participation or budget constraint only binds for
the lowest type.)
As the top left panel of Figure 9 shows, in most villages consumers who buy the smallest quantities
prefer linear to nonlinear pricing—note that the range of the y-axis is first negative then positive. Approxi-
mately three quarters of the remaining consumers, however, have positive values of CSnp(θ)/CSlp(θ)− 1
and, hence, are better off under nonlinear pricing. As apparent from the top right panel of Figure 9,
SSnp(θ)/SSlp(θ) − 1 is positive for nearly all consumer types in each village, even for those who are
worse off under nonlinear pricing. For these types, the higher producer surplus associated with nonlin-
ear pricing is large enough that social surplus is higher under nonlinear pricing despite consumer surplus
being lower.
Figure 9: Nonlinear vs. Linear Pricing Under Augmented Model
-1-.
50
.51
1.5
Non
linea
r vs
. Lin
ear
Pric
es: C
onsu
mer
Sur
plus
0 1 2 3Nonlinear Pricing Quantity
Low Reservation Utility
-10
12
3N
onlin
ear
vs. L
inea
r P
rices
: Soc
ial S
urpl
us
0 1 2 3Nonlinear Pricing Quantity
Low Reservation Utility
-1-.
50
.51
1.5
Non
linea
r vs
. Lin
ear
Pric
es: C
onsu
mer
Sur
plus
0 1 2 3Nonlinear Pricing Quantity
High Reservation Utility
-10
12
3N
onlin
ear
vs. L
inea
r P
rices
: Soc
ial S
urpl
us
0 1 2 3Nonlinear Pricing Quantity
High Reservation Utility
38
An interpretation for these findings is as follows. Under linear pricing, sellers provide smaller quan-
tities, thereby inducing consumers to purchase less than under nonlinear pricing, but also charge lower
prices for these quantities. As the top left panel of Figure 9 shows, the benefit of lower prices outweighs
the utility loss from lower consumption for the lowest consumer types, so for them consumer surplus is
higher under linear pricing.
The reduced ability of sellers to exert market power under linear pricing also implies a lower producer
surplus from nearly all types relative to nonlinear pricing. Overall, then, for these lowest consumer types,
consumer surplus is lower but social surplus is higher under nonlinear pricing—indeed, social surplus is
distinctively higher for some consumer types purchasing less than 0.5 kilos, as it can be seen from the
yellow and orange lines in the top right panel of Figure 9.
For the remaining consumer types who benefit from nonlinear pricing, the greater quantities and lower
marginal prices that nonlinear pricing implies, relative to linear pricing, give rise not just to higher levels of
social surplus but also of consumer surplus. As consistent with Propositions 5 and 6, consumer surplus and
social surplus are higher for these types under nonlinear pricing also due to the higher degree of market
participation than nonlinear pricing generates. As implied by Proposition 6, consumers with access to
large alternative levels of consumption (relative to their taste for rice) are unprofitable to serve under
linear pricing. Indeed, in 8 of the 11 villages in our regular sample, at least one consumer type is excluded
from trade under linear pricing: the highest type in six villages (villages 1, 5 to 8, and 10), the two lowest
types in one village (village 9), and the two lowest and the highest type in the remaining village (village
3). Social surplus does not change across nonlinear and linear pricing for those consumer types who do
not participate under linear pricing, and thus experience utility u(θ), but participate and obtain utility close
to u(θ) under nonlinear pricing. As apparent from Figure 9, in our villages these consumers are either the
smallest or the highest types.
As for the high reservation utility case, the bottom left panel of Figure 9 shows that nearly all con-
sumers prefer linear to nonlinear pricing, especially those who purchase relatively small quantities. Intu-
itively, when consumers’ reservation utility is high, sellers charge lower linear prices—indeed lower than
in the previous experiment. These lower prices benefit not just consumers with relatively low marginal
willingness to pay, who face steep unit prices under nonlinear pricing, but also higher consumer types:
the combination of their high reservation utilities and the reduced ability for sellers to extract consumer
surplus under linear pricing implies higher levels of utility for consumers with higher types too. Interest-
ingly, in nearly all villages in which exclusion occurs, consumers who do not participate in the market
are of middle to high type rather than the lowest and highest consumer types, as in the first version of
the experiment. Specifically, consumers excluded from trade under linear pricing are: the highest type in
three villages (villages 1, 7, and 10), the two lowest types in one village (village 9), the two highest types
in one village (village 6), the three highest types in two villages (villages 3 and 8), and the four highest
types in the remaining village (village 5).
39
Standard Model. An interesting exercise is to compare nonlinear and linear pricing allocations under
the counterfactual assumption that the standard model, where reservation utility is constant and does not
depend on consumer’s type, applies to all villages. To this purpose, we re-estimate the model’s primitives
in each village where our model applies under this assumption, and then perform an experiment analo-
gous to the one performed in the context of our model. That is, we compare consumer, producer, and
social surplus under the resulting nonlinear pricing allocation and under the linear pricing allocation that
would emerge if sellers were prevented from price discriminating, given the primitives estimated under the
standard model, which also assumes u(θ) = u(θ) for all consumers so that the participation (or budget)
constraint just binds for the lowest type. We report the results of this experiment in Figure 10. As before,
we plot the differences in consumer and social surplus, CSnp(θ)/CSlp(θ)− 1 and SSnp(θ)/SSlp(θ)− 1,
respectively, as functions of the quantity demanded by a given consumer type under nonlinear pricing,
q = q(θ).
Figure 10: Nonlinear vs. Linear Pricing Under Standard Model
-10
12
3N
onlin
ear
vs. L
inea
r P
rices
: Con
sum
er S
urpl
us
0 1 2 3Nonlinear Pricing Quantity
Standard Model0
24
68
Non
linea
r vs
. Lin
ear
Pric
es: S
ocia
l Sur
plus
0 1 2 3Nonlinear Pricing Quantity
Standard Model
By contrasting the left panel of Figure 10 with the top left panel of Figure 9 (the first version of our
experiment), it emerges that in this case too, consumer surplus is mostly lower for the lowest consumer
types when sellers can price discriminate and by an amount comparable to the one in the top left panel
of Figure 9. Intermediate and high consumer types mostly gain from nonlinear pricing. Their gain in
consumer surplus relative to linear pricing predicted by the standard model is, however, much larger than
predicted by our model: the range of consumer surplus increases in the left panel of Figure 10 is twice as
large as in the top left panel of Figure 9.
The pattern of differences in social surplus across nonlinear and linear pricing is also similar to the
one implied by our model but, like with consumer surplus, the gain in social surplus under nonlinear pric-
ing predicted by the standard model is larger. Specifically, comparing the right panel of Figure 10 to the
top right panel of Figure 9 shows how the greater efficiency of nonlinear pricing implied by the standard
model is distinctively more pronounced for lower types, those purchasing less than 0.5 kilos under non-
40
linear pricing, relative to our model. For high consumer types, the gain in social surplus associated with
nonlinear pricing is mostly comparable to the one predicted by our model in most villages.
Recall that the standard model implies greater consumption distortions for lower than for higher con-
sumer types since it requires [θν ′(q(θ)) − c]/ν ′(q(θ)) to equal [1 − F (θ)]/f(θ), which decreases with
θ by the monotone hazard rate condition—a maintained assumption in both the standard model and our
model. However, given the low reservation utility profile implied by the standard model, under this model
sellers have a greater ability to extract consumer surplus through linear pricing than under our model.
This argument explains the greater consumer surplus associated with nonlinear pricing relative to linear
pricing when the standard model is assumed to apply. Under the standard model, fewer consumers are ex-
cluded from trade under linear pricing and sellers charge higher linear prices than predicted by our model,
thereby depressing consumer surplus. Indeed, exclusion occurs only in villages 3, 5, and 9, and involves
the smallest two types in villages 3 and 9 but only the smallest type in village 5.
As mentioned in Section 3, the standard model can be interpreted as a special case of our model with
highly-convex reservation utility when γ = 1. It is interesting to note that, although all villages in our
regular sample turn out to correspond to the highly-convex case and our estimates of γ are relatively close
to 1, the exercise performed here shows that the standard model does not constitute a good approximation
to our data in that the welfare implications of the two models are very different. This is because even
small deviations of γ from 1 are symptomatic of very different behaviors on the part of sellers and con-
sumers. See also Appendix B for the bias to the estimates of consumers’ marginal willingness to pay and
marginal utility that would result from assuming that the standard model applies to a particular village
when, instead, it is rejected.
5.6 The Impact of Income Transfers
As discussed in Section 3.3, our model with budget constraints can be used to examine the consequences
of a targeted transfer, such as the one implemented through the Progresa program, on prices and quantities.
Proposition 7 implies that income transfers to consumers induce sellers to modify the shape of their pricing
schedules. This prediction of our model is particularly sharp for villages that can be classified “highly-
convex” or “weakly-convex” instances of our model. To examine whether this prediction is borne out in
the data, we analyze the 24 villages in our estimation sample that can be classified as highly-convex—as
explained, no village in our estimation sample conforms to the weakly-convex case. For these villages,
we then examine the extent to which the Progresa transfer has affected prices.
To this purpose, for each quantity and each such village, we compute the price implied for that quan-
tity/consumer type by the model’s estimates of T (q) in that particular village, as detailed in Section 4,
and use them to estimate a specification similar to the one reported in Table 1.21 Specifically, we regress
the log of the predicted unit price on the log of the quantity purchased in each village. The resulting
21Figure 4 implies that these theoretical predictions constitute a very good fit of actual data.
41
log-linear relationship could be considered as an approximation to the theoretical schedule, as implied
by our model’s estimate. We allow this schedule to be shifted by the presence of the Progresa transfer,
which is available only in some of the villages, by letting slope and intercept of this relationship to depend
on whether a given village is targeted by Progresa or not.22 We estimate this relationship by OLS and
present the results we obtain in Table 2, which also reports clustered standard errors at the village level in
parenthesis—once again, the commodity we are considering is rice.
A small literature has examined the effect of the Progresa program on prices of agricultural commodi-
ties. As mentioned in the introduction, Hodinott et al. (2000) and Angelucci and DeGiorgi (2009) found
no evidence that Progresa transfers induced a systematic increase in the (average) price of basic staples.
In column (1) of Table 2, we perform an exercise similar to theirs on the unit price predicted by our model
by regressing it on a constant and an indicator of the program. Consistent with this finding and similar
other findings in the literature, we observe that the parameter for the presence of Progresa is small in size
and implies an average increase in unit prices of about 1% in response to the program, not statistically
different from zero.
In column (2), we perform an exercise similar to that reported in Table 1 but using the prices predicted
by our estimates. Note that we explicitly account for the variability of unit prices with quantity with an
additional covariate, log-quantity, relative to the specification in column (1). We find that the elasticity of
unit prices to quantity is−0.215 and is, indeed, statistically different form zero. The size of this coefficient
is not too different from that reported in Table 1. In column (3), instead, we add the treatment dummy to
the specification in column (2). Once again, we find that the coefficient on this variable is not statistically
different from zero whereas the coefficient on log-quantity is not too dissimilar from that reported in the
previous specification.
Table 2: Impact of Cash Transfers on Prices
Rice Unit Values Restricted Sample:Villages with at Least 100 Households
1 2 3 4Treatment 0.011 − 0.020 0.005
(0.020 − (0.018) (0.017)Ln(quantity) − −0.215 −0.216 −0.193
(0.033) (0.034) (0.033)Ln(quantity)* − − − −0.039Treatment − − − (0.014)Constant 1.997 1.924 1.913 1.921
(0.023) (0.015) (0.017) (0.015)
Observations 4462 4462 4662 4462R2 0.0012 0.6179 0.6219 0.6267
Finally, in column (4), we add an interaction between the program dummy and quantity to the spec-
22 It should be remembered that, for evaluation purposes, the Progresa program was randomly allocated across villages.
42
ification in column (3). The coefficient on log-quantity is slightly lower (in absolute value) at −0.19
whereas the coefficient on the interaction between the program dummy and log-quantity is estimated at
−0.039 and is statistically different from zero. The results in column (4) imply that the price schedule
becomes steeper, and thus implies a higher quantity discount, in villages where the program is operating
than in villages where it is not. Not only this result is consistent with Proposition 7, but, to the best of our
knowledge, also constitutes one of the first pieces of evidence of an impact of a cash transfer on prices.
This “tilting” of the schedule of unit prices could explain the failure of many researchers to find an effect
of the program on average unit prices—Cunha et al. (2014) find a small but negligible increase in prices
in response to cash transfers but under the assumption that unit prices are constant with quantity.
Our model could also be used to interpret the fall in prices reported by Cunha et al. (2014) for the
villages in their sample that receive in-kind transfers. If the basket received by consumers includes the
commodity sold by the seller, such transfers could affect consumers’ subsistence constraints and, in par-
ticular, increase the consumption floor on other goods, thus reducing consumers’ budgets for the good
under consideration. In-kind transfers can then have an effect opposite to the one of cash transfers, in
particular, they may lead to a decrease in prices, as consistent with the findings of Cunha et al. (2014).23
These results, both theoretical and empirical, are important to assess the impact of cash transfers, a
commonly used policy tool in developing countries. In particular, we note that such cash transfers may
imply an upward shift in the price schedule and an increase in the intensity of price discrimination, which
we observe in our data. This price increase has obvious implications on the consumer surplus enjoyed by
households beneficiaries of the program but also on the surplus of non-eligible households. In particular,
the consumer surplus of eligible households is reduced relative to the surplus those households would
enjoy if prices did not change in response to transfers.
6 ConclusionWe have proposed a model of nonlinear pricing in which consumers differ in their tastes for goods, ability
to pay for them, outside options, and face subsistence constraints that give rise to budget constraints for a
seller’s good. In this case, the implications of nonlinear pricing for consumer, producer, and social surplus
are in general fundamentally different from those arising from standard models of nonlinear pricing, in
which outside options are identical across consumers and consumers are assumed unconstrained in their
purchasing decisions. In particular, in our environment quantity discounts for large volumes can be as-
sociated with overprovision of quantity at low volumes. We have proved that this more general model
is non- and semiparametrically identified under common assumptions. We have derived nonparametric
and semiparametric estimators of the model’s primitives that can be readily implemented using publicly
available data from conditional cash transfer programs, common in several developing countries. We have
23The authors extend their intuition on the price effect of cash transfers to the imperfectly competitive case of Cournotcompetition among sellers. However, in this case they assume directly conditions on aggregate demand that guarantee that thesame price effect as in the perfectly competitive case arises.
43
also showed that cash transfers, by affecting consumers’ ability to pay, provide sellers with a greater op-
portunity to extract consumer surplus. Thus, cash transfers in general lead to higher prices and, in most
villages in our sample, a greater intensity of price discrimination, as predicted by our model. Overall,
our estimation results suggest the importance of accounting for heterogeneity in consumers’ preferences,
consumption opportunities, and constraints in assessing the welfare implications of nonlinear pricing.
ReferencesANGELUCCI M., G. DE GIORGI (2009): “Indirect Effects of an Aid Program: How Do Cash Transfers
Affect Ineligibles’ Consumption?”, American Economic Review, 99(1), pages 486-508.
ATTANASIO, O., V. DI MARO, V. LECHENE, and D. PHILLIPS (2009): “The Welfare Consequences
of Increases in Food Prices in Rural Mexico and Colombia”, mimeo.
ATTANASIO, O., and C. FRAYNE (2006): “Do the Poor Pay More?”, mimeo.
BARON, D.P., and R.B. MYERSON (1982): “Regulating a Firm with Unknown Cost”, Econometrica,
50(4), 911-930.
CHE, Y.-K., and I. GALE (2000): “The Optimal Mechanism for Selling to a Budget-Constrained Buyer”,
Journal of Economic Theory, 92(2), 198-233.
CUNHA, J., G. DE GIORGI, and S. JAYACHANDRAN (2014): “The Price Effects of Cash Versus In-
Kind Transfers”, mimeo.
EKELAND, I., J.J. HECKMAN, and L. NESHEIM (2004): “Identification and Estimation of Hedonic
Models”, Journal of Political Economy, 112(S1), S60–S109.
GUERRE, E., I. PERRIGNE, and Q. VUONG (2000): “Optimal Nonparametric Estimation of First-Price
Auctions”, Econometrica, 68(3), 525-574.
HALL, P. (1992): “Effect of Bias Estimation on Coverage Accuracy of Bootstrap Confidence Intervals
for a Probability Density”, Annals of Statistics, 20(2), 675–694.
HECKMAN, J.J., , R.L. MATZKIN, and L. NESHEIM (2010): “Nonparametric Identification and Esti-
mation of Nonadditive Hedonic Models”, Econometrica, 78(5), 1569–1591.
HODDINOTT, J., E. SKOUFIAS and R. WASHBURN (2000): “The impact of Progresa on Consumption:
A Final Report”, IFPRI, mimeo.
JENSEN, R.T., and N.H. MILLER (2008): “Giffen Behavior and Subsistence Consumption”, American
Economic Review, 98(4), 1553–1577.
JULLIEN, B. (2000): “Participation Constraints in Adverse Selection Models”, Journal of Economic
Theory, 93(1), 1–47.
LANCASTER, K.J. (1966): “A New Approach to Consumer Theory,” Journal of Political Economy,
74(2), 132–157.
LESLIE, P. (2004): “Price Discrimination in Broadway Theatre”, Rand Journal of Economics, 35(3),
520–541.
44
MASKIN, E., and J. RILEY (1984): “Monopoly with Incomplete Information”, Rand Journal of Eco-
nomics, 15(2), 171–196.
MCINTOSH, A.C. (2003): “The poor pay more for their water. ”’ Habitat Debate, September 2003 Vol.
9 No. 3 , http://www.unhabitat.org/hd/hdv9n3/12.asp.
PERRIGNE, I., and Q. VUONG (2010): “Nonlinear Pricing in Yellow Pages”, mimeo.
ROBIN, J.-M. (1993): “Econometric Analysis of the Short-Run Fluctuations of Households’ Purchases”,
Review of Economic Studies, 60(4), 923-934.
STOLE, L.A. (2006): “Price Discrimination and Competition”, mimeo.
THOMAS, L. (2002): “Nonlinear Pricing with Budget Constraint”, Economics Letters, 75(2), 257–263.
A Omitted Proofs and DetailsCumulative Multiplier in the Highly-Convex Case: Consider the case in which v(θ, q) = θν(q) andc′(q) = c. Here we derive an expression that defines the multiplier in the highly-convex case in terms ofprimitives. Recall that in the highly-convex case, the cumulative multiplier γ(θ) is equal to a constant, γ,at all points θ ∈ [θ,θ). Here we solve for γ in the interesting case in which 0 < γ < 1 for [θ,θ). In theother two cases, the multiplier is trivial: γ(θ) = 1 for all θ ∈ [θ,θ] and γ(θ) = 0 for all θ ∈ [θ,θ). First,observe that (1) implies that
ν ′(q(θ)) = cf(θ)/[θf(θ) + F (θ)− γ], (18)
so q(θ) = (ν ′)−1(cf(θ)/[θf(θ) + F (θ) − γ]). Note that u(θ) = u(θ) and u(θ) = u(θ) when 0 < γ < 1.Hence, u(θ)− u(θ) = u(θ)− u(θ) so
u(θ)− u(θ) =
∫ θ
θ
u′(x)dx =
∫ θ
θ
ν(q(x))dx =
∫ θ
θ
ν
((ν ′)−1
(cf(x)
xf(x) + F (x)− γ
))dx,
where the first equality follows from u(θ)−u(θ) = u(θ)−u(θ) and the fundamental theorem of calculus,and the second equality by the local incentive compatibility condition u′(θ) = ν(q(θ)).Proof of Proposition 1: This result establishes that the first-order and complementary slackness condi-tions for the simple BC problem in (6) are necessary and sufficient to characterize an optimal menu. Theproof of this result requires that a version of the assumptions of potential separation, homogeneity, andfull participation in Jullien (2000) for the IR model hold for the BC model. Like in the IR model, thepotential separation assumption requires l(γ, θ) to be a weakly increasing function of θ for all γ ∈ [0, 1],for which sufficient conditions are
∂
∂θ
(sq(θ, q)
vθq(θ, q)
)≥ 0 and
d
dθ
(F (θ)
f(θ)
)≥ 0 ≥ d
dθ
(1− F (θ)
f(θ)
).
As explained in Jullien (2000), the first inequality in this condition implies that the conflict betweenrent extraction and efficiency is not too severe so that the marginal benefit of increasing the slope of theutility profile is weakly increasing with the type. When this occurs, the seller tends to desire convexquantity profiles, which implies that the monotonicity condition for q(θ) for incentive compatibility iseasier to satisfy. The second and third inequalities in this condition amount to a simple strengthening ofthe monotone hazard rate condition ubiquitous in the mechanism design literature: as the type increases,the relative weight of types above θ compared to below θ decreases, and the seller becomes progressively
45
more concerned about the informational rents left below θ. We have discussed the BC homogeneityassumption in the text. The full participation assumption simply ensures that the seller makes nonnegativeprofits when trading with each consumer type. Sufficient conditions for this assumption to hold are thathomogeneity is satisfied and that for each type θ, the seller makes weakly positive profits by supplying thereservation quantity q(θ) at price t(θ), which can be expressed as
t(θ)− c(q(θ)) = v(θ, q(θ))− c(q(θ))− u(θ) = s(θ, q(θ))− u(θ) ≥ 0 (19)
by using u(θ) = v(θ, q(θ))− t(θ) and s(θ, q(θ)) = v(θ, q(θ))− c(q(θ)).To derive the simple BC problem in (6), we proceed in analogy with the derivations of the simple IR
problem in the Supplementary Appendix. First, we write the BC constraint as
I(θ, q) ≥ t(θ) = v(θ, q(θ))− u(θ), (20)
since u(θ) = v(θ, q(θ))− t(θ). The BC problem can be expressed in Lagrangian-type form as
maxu,q∈Q
(∫ θ
θ
[v(θ, q(θ))− c(q(θ))− u(θ)] f(θ)dθ+
∫ θ
θ
{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ)
)(21)
s.t. u′(θ) = vθ(θ, q(θ)), (22)
where Q is the set of weakly-increasing functions q(θ) and Φ(θ) is the cumulative Lagrange multiplier,defined analogously to γ(θ), on the budget constraint expressed as in (20). Next, note that∫ θ
θ
u(θ)f(θ)dθ =
∫ θ
θ
[u(θ) + u(θ)− u(θ)]f(θ)dθ = u(θ)
∫ θ
θ
dF (θ) +
∫ θ
θ
[u(θ)− u(θ)]dF (θ)
= u(θ) +
∫ θ
θ
(∫ θ
θ
u′(x)dx
)dF (θ).
Integrating by parts and using the local incentive compatibility condition u′(θ) = vθ(θ, q(θ)), we obtain∫ θ
θ
u(θ)f(θ)dθ = u(θ) +
∫ θ
θ
(∫ θ
θ
vθ(x, q(x))dx
)dF (θ) = u(θ) +
(∫ θ
θ
vθ(x, q(x))dx
)F (θ)
∣∣∣∣θθ
−∫ θ
θ
vθ(θ, q(θ))F (θ)dθ = u(θ) +
∫ θ
θ
vθ(θ, q(θ))dθ −∫ θ
θ
vθ(θ, q(θ))F (θ)dθ. (23)
Similarly, ∫ θ
θ
u(θ)dΦ(θ) = u(θ)[Φ(θ)− Φ(θ)] +
∫ θ
θ
(∫ θ
θ
vθ(x, q(x))dx
)dΦ(θ)
= u(θ)[Φ(θ)− Φ(θ)] +
(∫ θ
θ
vθ (x, q (x)) dx
)Φ(θ)
∣∣∣∣θθ
−∫ θ
θ
vθ (θ, q (θ)) Φ(θ)dθ
= u(θ)[Φ(θ)− Φ(θ)] + Φ(θ)
∫ θ
θ
vθ (θ, q (θ)) dθ −∫ θ
θ
vθ (θ, q (θ)) Φ(θ)dθ. (24)
46
Substituting (23) and (24) into (21), we obtain∫ θ
θ
[v(θ, q(θ))− c(q(θ))− u(θ)] f(θ)dθ +
∫ θ
θ
{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ)
=
∫ θ
θ
[v(θ, q(θ))− c(q(θ))]f(θ)dθ +
∫ θ
θ
[I(θ, q(θ))− v(θ, q(θ))]dΦ(θ)− u (θ)−∫ θ
θ
vθ(θ, q (θ))dθ
+
∫ θ
θ
F (θ)vθ(θ, q (θ))dθ + u(θ)[Φ(θ)− Φ(θ)] + Φ(θ)
∫ θ
θ
vθ(θ, q (θ))dθ −∫ θ
θ
Φ (θ) vθ(θ, q (θ))dθ
which, by collecting terms, can be simplified to further obtain∫ θ
θ
[v(θ, q(θ))− c(q(θ))]f(θ)dθ +
∫ θ
θ
[F (θ)− Φ(θ) + Φ(θ)− 1
f(θ)
]vθ(θ, q(θ))f(θ)dθ
+
∫ θ
θ
φ(θ)[I(θ, q(θ))− v(θ, q(θ))]
f(θ)f(θ)dθ + u(θ)[Φ(θ)− Φ(θ)− 1].
By collecting terms one more time and dropping irrelevant constants, this expression reduces to the onein (6). The following result holds, which is the analogue of Result 4 in the Supplementary Appendix.
Result 1. Under potential separation, BC homogeneity, and full participation, the optimal allocation{u(θ), q(θ)} solves the simple BC problem if, and only if, there exists a cumulative multiplier functionΦ(θ) such that the first-order conditions (7) and the complementary slackness condition (8) are satisfied.
We now turn to prove Proposition 1. By Result 4 in the Supplementary Appendix, an implementableallocation {tIR(θ), qIR(θ), γIR(θ)}with associated utility profile u(θ) solves the IR problem if, and only if,there exists a cumulative multiplier function γ(θ) with the properties of a cumulative distribution functionsuch that the first-order conditions
vq(θ, q (θ))− c′(q(θ)) =γ(θ)− F (θ)
f(θ)vθq(θ, q(θ)) (25)
and the complementary slackness condition∫ θ
θ
[u(θ)− u(θ)] dγ(θ) = 0 (26)
hold. By Result 1 above, the allocation {tIR(θ), qIR(θ)} with associated utility profile uIR(θ) of the IRproblem solves the BC problem if, and only if, there exists a cumulative multiplier function Φ(θ) suchthat the first-order conditions
vq(θ, q (θ))− c′(q(θ)) =Φ(θ)− F (θ) + 1− Φ(θ)
f(θ)vθq(θ, q(θ)) +
φ(θ) [vq(θ, q(θ))− Iq(θ, q(θ))]f(θ)
(27)
and the complementary slackness condition∫ θ
θ
[I(θ, q(θ))− t(θ)] dΦ(θ) = 0 (28)
47
hold. Note that for Φ(θ) to be a legitimate cumulative multiplier function, it must be nonnegative andweakly increasing with θ. Clearly, Φ(θ) = a+γ(θ) for any constant a is a legitimate cumulative multiplier.Note also that with Φ(θ) = a+γ(θ), the multiplier dγ(θ) on the IR constraint is zero or strictly positive if,and only if, the multiplier dΦ(θ) on the BC constraint is zero or strictly positive. Given this observation,it is immediate that since the complementary slackness condition (26) holds for the IR problem, so mustthe complementary slackness condition (28) for the BC problem.
Then, we are left to show that at the multiplier function Φ(θ), the allocation {tIR(θ), qIR(θ)} satisfiesthe first-order conditions for the BC problem, namely (27). To this purpose, note first that
Φ(θ) + 1− Φ(θ) = a+ γ(θ) + 1− [a+ γ(θ)] = γ(θ), (29)
where the second equality holds since γ(θ) = 1; see the Supplementary Appendix for a proof that γ(θ) =1. Next, observe that, by assumption, Iq(θ, q(θ)) equals vq(θ, q(θ)) whenever the budget constraint binds.Thus, for each θ, either φ(θ) = 0 or, if not, Iq(θ, q(θ)) = vq(θ, q(θ)). Hence, the second term on the rightside of (27) equals zero for each θ. These two observations imply that the first-order conditions for the BCproblem in (27) are identical to those for the IR problem in (25). Hence, the solutions to the IR and BCproblems are the same. By the same argument as the one in the proof of Result 1 in the SupplementaryAppendix, it is also possible to show that Φ(θ) = 1.Proof of Proposition 2: Recall that I(θ, q) = Y − z(θ, q), where
z(θ, q) = −z1(θ)− z2ν(q), z′1(θ) = ψ(log(θ − z2)), and θ > z2 > 0. (30)
Let ψ(·) be a positive continuous function defined on the real line. To show that BC homogeneity issatisfied under these assumptions on z(θ, q), we proceed by constructing a menu {t(θ), q(θ)} such thatt(θ) = I(θ, q(θ)), Iq(θ, q(θ)) = θν ′(q(θ)), and q(θ) is weakly increasing. Since one can always chooset(θ) = I(θ, q(θ)), the restrictions that BC homogeneity imposes can be stated as requiring I(θ, q(θ)) =Y − z(θ, q), Iq(θ, q(θ)) = θν ′(q(θ)), and q(θ) be weakly increasing. For convenience, rather than estab-lishing that we can construct an increasing q(θ) function that satisfies I(θ, q(θ)) = Y − z(θ, q(θ)) andIq(θ, q(θ)) = θν ′(q(θ)) under (30), we show, equivalently, that we can construct an increasing functionθ(q) under (30) that satisfies {
Iq(θ(q), q) = θ(q)ν ′(q)I(θ(q), q) = Y − z(θ(q), q)
. (31)
Specifically, using (30), the derivative of the second expression in (31) with respect to q is
Iq(θ(q), q) = z′1(θ(q))θ′(q) + z2ν
′(q) = ψ(log[θ(q)− z2])θ′(q) + z2ν
′(q).
By equating the right side of this last expression to the right side of the first expression in (31), we obtain
ψ(log[θ(q)− z2])θ′(q) + z2ν
′(q) = θ(q)ν ′(q)⇔ ν ′(q) =ψ(log[θ(q)− z2])θ
′(q)
θ(q)− z2
. (32)
By integrating both sides of (32) from q(θ) to q ≤ q(θ) and using θ = θ(q(θ)), it follows that
ν(q)− ν(q(θ)) = Ψ(log[θ(q)− z2])−Ψ(log(θ − z2)), (33)
with Ψ(x) = Ψ(log(θ − z2)) +∫ x
log(θ−z2)ψ(x)dx increasing since ψ(·) is positive. Simple manipulations
of (33) yieldθ(q) = z2 + exp{(Ψ)−1(ν(q)− ν(q(θ)) + Ψ(log(θ − z2)))}.
48
Note that θ(q) is an increasing function of q, since (Ψ)−1 (·) and ν(·) are increasing functions. So, q(θ) isan increasing function of θ, as desired.
For an example, it is easy to show that if z(θ, q) = z0− z1(θ− z2)λ1 − z2ν(q) with z1, z2 > 0, θ > z2,and λ1 > 1, then by following the above steps, we obtain q(θ) with q′(θ) > 0 as
q(θ) = (ν)−1
(z1λ1
λ1 − 1[(θ − z2)λ1−1 − (θ − z2)λ1−1] + ν(q(θ))
).
The Two-Dimensional Case: Suppose that the parameter w differs across consumers so that the budgetschedule is now given by I(θ, q, w) = Y (w) − z(θ, q). The analysis of this case differs from that of thecase of constant w depending on whether the seller can discriminate across consumers based on w or,rather, only through a menu of prices at most contingent on q.Contractible Income Characteristic. Suppose that the seller can segment consumers across submarketsindexed by w and offer nonlinear prices in each submarket w so as to screen consumers based on θ. Forease of exposition, suppose that there are only two levels of w, say, wH and wL, with Y (wH) > Y (wL). Inany such submarket w, the seller’s problem is as stated in the BC problem with income Y (w) and budgetschedule I(θ, q, w) = Y (w) − z(θ, q). For the corresponding simple BC problem, the necessary andsufficient conditions for an optimal solution are given by Result 1: {t(θ, w), q(θ, w)} solves the simpleBC problem if, and only if, there exists a cumulative multiplier function Φ(θ, w) such that the first-orderconditions in (27) and the complementary slackness condition in (28) apply with I(θ, q, w) = Y (w) −z(θ, q). Our next result shows how this optimal solution varies across submarkets. For this, let
t(θ, wH) = t(θ, wL) + Y (wH)− Y (wL), q(θ, wH) = q(θ, wL), and Φ(θ, wH) = Φ(θ, wL). (34)
Result 2. If {t(θ, wL), q(θ, wL)} with associated cumulative multipliers {Φ(θ, wL)} solve the simple BCproblem in submarket wL, then {t(θ, wH), q(θ, wH)} with associated cumulative multipliers {Φ(θ, wH)}satisfying (34) solve the simple BC problem in submarket wH .
The result states that type (θ, wH) in the submarket with the higher income level chooses the samequantity as type (θ, wL) in the submarket with the lower income level, that is, q(θ, wH) = q(θ, wL).Moreover, the binding patterns of the multipliers in the two submarkets are identical in that the cumulativemultiplier binds for type (θ, wH) in submarketwH if, and only if, it binds for type (θ, wL) in submarketwL.The only difference is that type (θ, wH) pays Y (wH) − Y (wL) more for the same quantity purchased bytype (θ, wL) in submarket wL. The idea is straightforward. In the market with income Y (wL), a consumerof type θ chooses a pair (t(θ, wL), q(θ, wL)) leading to the consumption of z(θ, wL) = Y (wL)− t(θ, wL)units of the numeraire good. The consumption bundle (q(θ, wL), z(θ, wL)) must jointly provide enoughcalories so that the consumer meets the constraint z(θ, wL) ≥ z(θ, q(θ, wL)). Suppose that this constraintbinds for a set of types, that is,
z(θ, wL) = Y (wL)− t(θ, wL) = z(θ, q(θ, wL)). (35)
In submarketwH , at (t(θ, wL), q(θ, wL)) the budget constraint is slack for all types since Y (wH) > Y (wL).Clearly, in submarket wH , it is feasible for the seller to offer the same quantity as in submarket wL, thatis, q(θ, wH) = q(θ, wL), since q(θ, wL) is implementable in submarket wH too, and simply increase theprice by Y (wH)− Y (wL). In the proof of the result, we use the seller’s first-order conditions to show thatdoing so is optimal for the seller.Proof of Result 2. Let {t(θ, wL), q(θ, wL)} and the cumulative multipliers {Φ(θ, w)} solve the simple BCproblem in submarket wL. By Result 1, we know that these schedules satisfy the first-order conditions
49
(27) and the complementary slackness conditions (28) with t(θ), q(θ), Φ(θ), φ(θ), and I(θ, q) replaced byt(θ, wL), q(θ, wL), Φ(θ, wL), φ(θ, wL), and I(θ, q, wL). It is immediate that the allocations and multipliersgiven in (34) solve the corresponding first-order and complementary slackness conditions for submarketwH . To see why, note that since Iq(θ, q, w) = −zq(θ, q) is independent of w (conditional on q), thefirst-order conditions in the two submarkets are identical under (34). Consider next the complementaryslackness conditions. Since this condition holds in submarket wL, for any θ whose budget constraint forthe seller’s good binds, and so φ(θ, wL) is positive, we have
t(θ, wL) = I(θ, q(θ, wL), wL) ≡ Y (wL)− z(θ, q(θ, wL)). (36)
But then for this same θ in submarket wH , the multiplier φ(θ, wH) is also positive and
t(θ, wH) = t(θ, wL) + Y (wH)− Y (wL) = Y (wH)− z(θ, q(θ, wL)) = Y (wH)− z(θ, q(θ, wH)),
where the first and third equalities follow from (34), and the second equality from (36). Hence, the con-jectured solution satisfies the first-order conditions and complementary slackness condition for submarketwH . So, by Result 1, this conjectured solution solves the simple BC problem in submarket wH .Noncontractible Income Characteristic. Suppose now that the seller cannot segment consumers acrosssubmarkets. That is, the seller must offer the same price schedule to all consumers regardless of their w(and θ). This environment is equivalent to one in which the seller observes neither w nor θ. Assume thatw and θ are sufficiently positively dependent that w can be expressed as a nonlinear monotone function ofθ, w = ω(θ) with ω′(θ) > 0. Then, substituting w = ω(θ) into I(θ, q, w) = Y (w)− z(θ, q) gives
I(θ, q, ω(θ)) = Y (ω(θ))− z(θ, q). (37)
Under (37), Proposition 1 and Result 1 apply. To see that Proposition 2 also holds, let Y (ω(θ)) = Y +y(ω(θ)) without loss. Then, the analogous proposition holds with v(θ, q) = θν(q), z(θ, q) = −z1(θ) −z2ν(q), and y′(ω(θ))ω′(θ) + z′1(θ) = ψ(log(θ − z2)) for θ > z2 > 0.Proof of Proposition 3: Recall that T ′(q(θ)) = θν ′(q(θ)) > 0 by local incentive compatibility. LettingA(q) = −ν ′′(q)/ν ′(q), we have
T ′′(q) = θ′(q)ν ′(q) + θ(q)ν ′′(q) = θ(q)ν ′(q)
[θ′(q)
θ(q)+ν ′′(q)
ν ′(q)
]= T ′(q)
[1
θ(q)q′(θ)− A(q)
]. (38)
From the seller’s first-order condition (9), it follows that[θ − γ(θ)− F (θ)
f(θ)
]ν ′(q(θ))− c = 0,
which implies
q′(θ) = −∂∂θ
[θ − γ(θ)−F (θ)
f(θ)
]ν ′(q(θ))[
θ − γ(θ)−F (θ)f(θ)
]ν ′′(q(θ))
=
∂∂θ
[θ − γ(θ)−F (θ)
f(θ)
][θ − γ(θ)−F (θ)
f(θ)
]A(q(θ))
.
By (38), since T ′(q) > 0 and A(q) ≥ 0, we can express T ′′(q) ≤ 0 equivalently as
T ′(q)A(q(θ))
θ − γ(θ)−F (θ)f(θ)
θ ∂∂θ
[θ − γ(θ)−F (θ)
f(θ)
] − 1
≤ 0⇔ 1 ≤θ ∂∂θ
[θ − γ(θ)−F (θ)
f(θ)
]θ − γ(θ)−F (θ)
f(θ)
. (39)
50
In the highly-convex case, γ(θ) = γ. When γ ∈ [0, 1), the last inequality in (39) becomes
1 ≤θ ∂∂θ
[θ − γ−F (θ)
f(θ)
]θ − γ−F (θ)
f(θ)
=2θf 2(θ) + [γ − F (θ)]θf ′(θ)
θf 2(θ)− [γ − F (θ)]f(θ). (40)
This condition is always satisfied if (10) in the statement of the proposition holds. This argument alsoimplies that in the weakly-convex case, for types θ ∈ [θ, θ1] with γ(θ) = 0, the price schedule entailsquantity discounts provided (40) holds when γ = 0, that is,
2θf 2(θ)− θF (θ)f ′(θ)
θf 2(θ) + F (θ)f(θ)≥ 1,
which, since θf 2(θ) + F (θ)f(θ) > 0, is equivalent to θf 2(θ) ≥ F (θ)[f(θ) + θf ′(θ)].When γ(θ) = γ = 1, we have
q′(θ) =
∂∂θ
[θ − 1−F (θ)
f(θ)
][θ − 1−F (θ)
f(θ)
]A(q(θ))
≥ 1
θA(q(θ))=
ν ′(q(θ))
−θν ′′(q(θ))≥ 0, (41)
where the first inequality in the above follows from the assumption of potential separation and is strictif [1 − F (θ)]/f(θ) is strictly decreasing. Condition (41) implies 1/q′(θ) ≤ −θ(q)ν ′′(q)/ν ′(q), whichcombined with (38) yields
T ′′(q) =ν ′(q)
q′(θ)+ θ(q)ν ′′(q) ≤ ν ′(q)
[−θ(q)ν ′′(q)
ν ′(q)
]+ θ(q)ν ′′(q) = 0.
This argument also establishes that in the weakly-convex case, for types θ ∈ [θ2, θ] with γ(θ) = 1, theprice schedule entails quantity discounts.Quantity Premia in the Weakly-Convex Case: Since T ′(q(θ)) = θν ′(q(θ)) by local incentive com-patibility and A(q(θ)) = −ν ′′(q(θ))/ν ′(q(θ)) ≥ 0, from (38) it follows that T ′′(q) ≤ 0 if, and only if,A(q(θ))θq′(θ) ≥ 1. Since q(θ) = q(θ) for types in (θ1, θ2) and q′(θ) = u′′(θ)/ν ′(q(θ)), the conditionA(q(θ))θq′(θ) ≥ 1 can be also expressed as
θq′(θ)A(q(θ)) = −θq′(θ)ν ′′(q(θ))/ν ′(q(θ)) = θu′′(θ)[−ν ′′(q(θ))]/[ν ′(q(θ))]2 ≥ 1.
Hence, types in (θ1, θ2) for which the reverse inequality is satisfied, face quantity premia.Proof of Proposition 4. Let the allocations in the standard nonlinear pricing model be {us(θ), qs(θ)} andassume that u(θ) = u. A seller’s first-order conditions for the standard model and the augmented modelcan be written, respectively, as
1− c
T ′(qs)=
1− F (θ)
θf(θ)and 1− c
T ′(q)=γ(θ)− F (θ)
θf(θ). (42)
In the standard model, the participation (or budget) constraint binds only for the lowest type so thatus(θ) = u. We first examine the weakly-convex case, then the highly-convex case, and lastly the generalcase of the augmented model. Since the two models coincide for any type with γ(θ) = 1, we just focuson the case in which γ(θ) < 1.
Weakly-Convex Case. In this case, the cumulative multiplier γ(θ) equals zero until θ1, increases from
51
zero to one as θ increases from θ1 to θ2, and equals 1 between θ2 and θ. Since the allocations in the twomodels agree for θ ≥ θ2, we focus on types θ < θ2. We first show that for θ ∈ [θ, θ2), the augmentedmodel implies lower marginal prices, T ′(q(θ)) < T ′(qs(θ)), and higher consumption, q(θ) > qs(θ). Tothis purpose, note that since γ(θ) < 1 on [θ, θ2), it follows that
[1− F (θ)]/θf(θ) > [γ(θ)− F (θ)]/θf(θ),
which, by (42) and local incentive compatibility, implies that
T ′(q(θ)) = θν ′(q(θ)) < T ′(qs(θ)) = θν ′(qs(θ)). (43)
Thus, T ′(q(θ)) < T ′(qs(θ)). Moreover, since ν ′(·) is decreasing, (43) also implies that q(θ) > qs(θ). Thisresult, in turn, yields that
u(θ)=u(θ) +
∫ θ
θ
u′(θ)dθ =u(θ) +
∫ θ
θ
ν(q(θ))dθ>us(θ) +
∫ θ
θ
ν(qs(θ))dθ=us(θ) +
∫ θ
θ
u′s(θ)dθ=us(θ),
(44)where the second and third equalities in (44) follow from local incentive compatibility in both models, sothat u′(θ) = ν(q(θ)) and u′s(θ) = ν(qs(θ)), whereas the inequality in (44) follows because u(θ) > u(θ) ≥u, ν(·) is (strictly) increasing, and, as argued, q(θ) > qs(θ) on [θ, θ2).
Highly-Convex Case. In this case, γ(θ) = γ. Let, then, γ ∈ [0, 1) for θ ∈ [θ, θ). The arguments forT ′(q(θ)) < T ′(qs(θ)) and q(θ) > qs(θ) when θ ∈ [θ, θ) are nearly identical to those in the weakly-convexcase. Since the IR (or BC) constraint binds for the lowest type, in the highly-convex case u(θ) = u(θ),and the strict inequality in (44) follows because q(θ) > qs(θ) for θ ∈ [θ, θ).
General Case. For any θ with γ(θ) < 1, the same argument as in the weakly-convex case establishesT ′(q(θ)) < T ′(qs(θ)) and q(θ) > qs(θ). Since u(θ) ≥ u(θ) ≥ u, an argument analogous to the one in theweakly-convex case proves that u(θ) > us(θ).Proof of Proposition 5: We divide the proofs into two parts. Recall the assumption of full participationunder nonlinear and linear pricing.
Case a). We start by showing that if qm(θ) ≥ q(θ) and the price schedule exhibits quantity discounts inthat p′(q) ≤ 0, then the utility of a consumer of type θ is higher under linear that under nonlinear pricing,that is, um(θ) ≥ u(θ). By way of contradiction, assume that qm(θ) ≥ q(θ) and p′(q) ≤ 0 but
u(θ) = θν(q(θ))− T (q(θ)) > um(θ) = θν(qm(θ))− θν ′(qm(θ))qm(θ), (45)
where in (45) we have used the fact that under linear pricing, pm = θν ′(qm(θ)). Given that qm(θ) maxi-mizes the consumer’s utility under linear pricing, it follows
θν(q(θ))− T (q(θ)) > θν(qm(θ))− θν ′(qm(θ))qm(θ) ≥ θν(q(θ))− θν ′(qm(θ))q(θ), (46)
which impliesθν ′(qm(θ)) > T (q(θ))/q(θ). (47)
Note that the first inequality in (46) restates (45) while the second inequality follows from the fact that atthe linear price pm, any quantity demanded different from qm(θ) implies a lower utility for the consumer.The inequality in (47) follows since the left-most term in (46) is greater than the right-most term.
Next, by the assumption of quantity discounts, p′(q) = [T ′(q)− T (q)/q]/q ≤ 0 or, equivalently,
T ′(q(θ)) ≤ T (q(θ))/q(θ). (48)
52
This inequality, in turn, implies
θν ′(q(θ)) = T ′(q(θ)) ≤ T (q(θ))/q(θ) < θν ′(qm(θ)) (49)
where the equality in (49) follows from local incentive compatibility, the weak inequality from (48),and the strict inequality is simply (47). Clearly, (49) implies that θν ′(qm(θ)) > θν ′(q(θ)), which is acontradiction since qm(θ) ≥ q(θ) by assumption and ν ′(·) is decreasing. Hence, um(θ) ≥ u(θ).
Case b). We now show that if q(θ) > qm(θ), γ(θ) < 1, and the price schedule exhibits quantitydiscounts in that T ′′(q) ≤ 0, then the utility of a consumer of type θ is higher under linear that undernonlinear pricing. Consider one such type, θ, with q(θ) > qm(θ) and γ(θ) < 1. Suppose first that theweakly-convex case applies. In this case, the participation (or budget) constraint binds for θ ∈ [θ1, θ2] andis slack above θ2 with γ(θ) = 1 for θ ≥ θ2. Let then θ ∈ [θ, θ2]. By way of contradiction, suppose thatu(θ) > um(θ). We will show that if so, then we will contradict the assumption that um(θ2) ≥ u(θ2), thatis, that a consumer of type θ2 participates under linear pricing. To this purpose, rewrite u(θ) > um(θ) as
u(θ2)− [u(θ2)− u(θ)] > um(θ2)− [um(θ2)− um(θ)], (50)
which can be expressed as
u(θ2)−∫ θ2
θ
u′(x)dx > um(θ2)−∫ θ2
θ
u′m(x)dx. (51)
By using u(θ2) = u(θ2), since the IR (or BC) constraint binds at θ2, local incentive compatibility in thetwo models, that is, u′(θ) = ν(q(θ)) and u′m(θ) = ν(qm(θ)), and rearranging terms, condition (51) implies
u(θ2)− um(θ2) >
∫ θ2
θ
[ν(q(x))− ν(qm(x))] dx. (52)
We now argue that the right side of (52) is positive, which establishes the desired contradiction. To seethat the right side of (52) is positive, note that for all θ ∈ [θ, θ],
pm = θν ′(qm(θ)) = θν ′(qm(θ)) ≥ θν ′(q(θ)) = T ′(q(θ)) ≥ T ′(q(θ)) = θν ′(q(θ)), (53)
where the first two equalities follow from a consumer’s first-order condition under linear pricing, which,of course, hold for all θ, the first inequality follows from q(θ) > qm(θ) by the assumption of the case andν ′(·) decreasing, the third and fourth equalities follow from local incentive compatibility, and the secondinequality holds for θ ≥ θ since q(·) is increasing by incentive compatibility and T ′(·) is decreasing byassumption. Hence, (53) implies θν ′(qm(θ)) ≥ θν ′(q(θ)) for all θ ∈ [θ, θ], and so q(θ) ≥ qm(θ) for allθ ∈ [θ, θ], given that ν ′(·) is decreasing. Thus, the right side of (52) is positive since ν(·) is increasing.Then, u(θ2) > um(θ2) so type θ2 does not participate under linear pricing. Contradiction.
Consider next the highly-convex case. By assumption, the cumulative multiplier at θ satisfies γ(θ) < 1,so the participation (or budget) constraint binds at the highest type, that is, u(θ) = u(θ). From here, wecan repeat the steps of the contradiction argument for the weakly-convex case with u(θ) replacing u(θ2)and arrive at a similar conclusion, namely that u(θ) > um(θ) contradicts the assumption that a consumerof type θ participates under linear pricing. To this purpose, rewrite u(θ) > um(θ) as
u(θ)− [u(θ)− u(θ)] > um(θ)− [um(θ)− um(θ)]⇔ u(θ)−∫ θ
θ
u′(x)dx > um(θ)−∫ θ
θ
u′m(x)dx,
53
which, by using that u(θ) = u(θ), local incentive compatibility, and rearranging terms, yields
u(θ)− um(θ) >
∫ θ
θ
[u′(x)− u′m(x)] dx =
∫ θ
θ
[ν(q(x))− ν(qm(x))] dx. (54)
As before, (53) implies that q(θ) ≥ qm(θ) for all θ ∈ [θ, θ], which yields that the right side of (54) ispositive and so u(θ) > um(θ). Thus, type θ does not participate under linear pricing. Contradiction.
The general case is a simple extension of the arguments in the weakly-convex and highly-convex cases:since in this case the IR (or BC) constraint binds for at least an interval [θ′1, θ
′2], an argument analogous to
the above one applies.Proof of Proposition 6: Since s(θ, q(θ)) ≥ u(θ) for consumers with types θ ∈ [θ′, θ′′], the seller makesnonnegative profits from each such consumer type under nonlinear pricing and these types participate bythe argument in the proof of Proposition 1. To establish the desired claim, it is sufficient to show that thereexists a subinterval of types in [θ′, θ′′], say, [θ3, θ4], who do not participate under linear pricing. For this,suppose, by way of contradiction, that all consumer types in [θ′, θ′′] participate under linear pricing. Weprove that if so, then the seller makes negative profits under linear pricing. To prove this, let θ be a type in[θ′, θ′′] such that um(θ) = u(θ). Note that for any type θ in [θ, θ′′] who participates under linear pricing, itmust be um(θ) ≥ u(θ), which can be expanded as
um(θ) = um(θ) +
∫ θ
θ
u′m(x)dx = um(θ) +
∫ θ
θ
ν(qm(x))dx
≥ u(θ) = u(θ) +
∫ θ
θ
u′(x)dx = u(θ) +
∫ θ
θ
ν(q(x))dx, (55)
where the second equality in (55) uses the fact that u′m(θ) = ν(qm(θ)) by the consumer’s first-order condi-tion under linear pricing, θν ′(qm(x)) = pm, and the last equality uses the BC homogeneity assumption thatthe reservation quantity is incentive compatible so u′(θ) = ν(q(θ)). Since, by assumption, um(θ) = u(θ),(55) implies ∫ θ
θ
ν(qm(x))dx ≥∫ θ
θ
ν(q(x))dx. (56)
Since ν(·) is increasing, (56) implies that there exists a set of positive measure, say, [θ3, θ4], for whichqm(θ) ≥ q(θ). Since q(θ) > qFB(θ) for consumers with types in [θ′, θ′′] by assumption, and ν ′(·) is(strictly) decreasing, it follows that for consumers with θ ∈ [θ3, θ4],
pm = θν ′(qm(θ)) < θν ′(qFB(θ)) = c, (57)
where the first equality follows from a consumer’s first-order condition under linear pricing and the secondequality from the first-order condition for the first-best quantity for type θ. But (57) implies that pm < c,which contradicts optimality by the seller, as the seller can always raise the linear price and at least earnzero.Proof of Proposition 7: Recall the discussion of the equivalence between the IR and BC models in Section3.2. For simplicity of exposition only, we will base our proof on the simple version of this equivalencebetween the two models by defining
I(θ, q(θ)) = θν(q(θ))− u(θ). (58)
Consider a consumer of type θ receiving a transfer τ(θ) with τ ′(θ) ≤ 0. For example, the transfer schedule
54
could be τ(θ) = τ0 + τ1θ with τ1 ≤ 0 so that τ ′(θ) = τ1. Hence, after the transfer, the consumer’s incomeis I(θ, q(θ)) + τ(θ) and the analog of condition (58) for the equivalence of the two models under the newincome schedule is
I(θ, q(θ)) + τ(θ) = θν(q(θ))− u(θ, τ), (59)
where u(θ, τ) is the corresponding new reservation utility. Subtracting (59) from (58) gives
u(θ, τ) = u(θ)− τ(θ). (60)
To develop some intuition, note that a consumer of type θ spends t(θ) to purchase q(θ) and the rest ofher income to purchase z = Y − t(θ) ≥ z(θ, q(θ)) before the transfer is introduced. If this consumerreceives a transfer of τ(θ), then the seller can increase the price t(θ) since the consumer’s ability to payhas increased. When we translate this consumer’s budget constraint back to a participation constraintusing (59), we see that the transfer amounts to a decrease in the reservation utility of the consumer bythe amount of the transfer, which reflects the fact that the seller can now ask for a higher price whilestill satisfying the consumer’s participation constraint. Given this equivalence, in the proof we need onlyshow that replacing the original IR constraint u(θ) ≥ u(θ), where u(θ) is defined in (58), with the newconstraint
u(θ) ≥ u(θ, τ) (61)
where u(θ, τ) is given in (60), leads the amount purchased of the seller’s good to increase, marginal pricesto decrease, and the total price for each quantity to increase (or decrease) under conditions.
We now proceed to the formal argument. Let {tτ (θ), qτ (θ)} denote the equilibrium allocation withparticipation constraint (61) and {t(θ), q(θ)} denote the original allocation in the absence of transfers withparticipation constraint u(θ) ≥ u(θ). The proof is articulated in several steps. The first step establishesthat the new reservation quantity qτ (θ) is greater than the original one type by type in that
qτ (θ) ≥ q(θ) for all θ. (62)
That the reservation quantity increases after the transfer follows immediately from the homogeneity as-sumption that qτ (θ) and q(θ) satisfy in that
u′(θ, τ) = u′(θ)− τ ′(θ) = ν(qτ (θ)) and u′(θ) = ν(q(θ)),
from the fact that u′(θ, τ) ≥ u′(θ), since τ ′(θ) ≤ 0, and that ν(·) increases with q. (See the SupplementaryAppendix for details about the assumptions of the IR model.) We show next that for each type, this implies
qτ (θ) ≥ q(θ) and T ′τ (q) ≤ T ′(q). (63)
Part 1: Establishing (63). Consider first the weakly-convex case and suppose that the income transfergives rise to a new weakly-convex allocation. (This result requires the transfer to be sufficiently progres-sive so that q′τ (θ) = u′′(θ, τ)/ν ′(qτ (θ)) is not too large, as consistent with the weakly-convex case.) Recallthat at the original allocation {t(θ), q(θ)}, the multipliers {γ(θ)} are such that
γ(θ) =
0 for θ < θ1
γ(θ) for θ ∈ [θ1, θ2)1 for θ ≥ θ2
. (64)
55
The new allocation {tτ (θ), qτ (θ)} has associated multipliers {γτ (θ)} of the form
γτ (θ) =
0 for θ < θ1τ
γτ (θ) for θ ∈ [θ1τ , θ2τ )1 for θ ≥ θ2τ
. (65)
Recall also that the reservation multipliers γ(θ) and γτ (θ) are defined as the multipliers that support q(θ)and qτ (θ), respectively, in that q(θ) = l(γ(θ), θ) and qτ (θ) = l(γτ (θ), θ). Since l(·, θ) decreases with γand reservation quantities are larger after the transfer as in (62), the reservation multipliers associated withthe new allocation must be smaller in that γτ (θ) ≤ γ(θ). But then from the form of the multipliers in (64)and (65), it follows that θ1τ ≥ θ1 and θ2τ ≥ θ2. We prove these last two claims separately.
a) Claim 1: θ2τ ≥ θ2. Suppose not, that is, suppose that θ2τ < θ2. Note that θ2τ ≥ θ1, otherwise thereservation quantity would be smaller for some type after the transfer. By the form of the multipliers γτ (θ)and γ(θ), it follows that γτ (θ2τ ) = γτ (θ2τ ) = 1 and, since θ1 ≤ θ2τ < θ2, that γ(θ2τ ) = γ(θ2τ ) < 1. So,γτ (θ2τ ) > γ(θ2τ ), which contradicts the fact that γτ (θ) ≤ γ(θ).
b) Claim 2: θ1τ ≥ θ1. Suppose not, that is, suppose that θ1τ < θ1. By the form of the multipliersγτ (θ) and γ(θ), it follows that, since θ1τ < θ1 ≤ θ2τ , γτ (θ1) = γτ (θ1) > γτ (θ1τ ) = γτ (θ1τ ) = 0 andγ(θ1) = γ(θ1) = 0. So, γτ (θ1) > γ(θ1), which contradicts the fact that γτ (θ) ≤ γ(θ).
Using these facts, together with the form of the multipliers, gives that γτ (θ) ≤ γ(θ). Thus, sinceq(θ) = l(γ(θ), θ), qτ (θ) = l(γτ (θ), θ), and l(·, θ) decreases with γ, it follows that qτ (θ) ≥ q(θ). In turn,since ν ′(·) is decreasing, local incentive compatibility implies
θν ′(qτ (θ)) = T ′τ (qτ (θ)) ≤ T ′(q(θ)) = θν ′(q(θ)). (66)
Thus, we have established (63).Consider now the highly-convex case. Suppose that γ ∈ (0, 1) at the original allocation, so that the
participation constraint binds for the lowest and highest types. Then, u(θ) = u(θ) and u(θ) = u(θ).Denote by uτ (θ) the utility of a consumer of type θ after transfer is introduced. As argued, under thetransfer schedule τ(θ) with τ ′(θ) ≤ 0, the new reservation utility u(θ, τ) is steeper than the original one.So, the highly-convex case still applies. In addition,∫ θ
θ
u′(x, τ)dx = u(θ, τ)− u(θ, τ) = uτ (θ)− uτ (θ) =
∫ θ
θ
u′τ (x)dx =
∫ θ
θ
ν(qτ (x))dx
≥∫ θ
θ
u′(x)dx = u(θ)− u(θ) = u(θ)− u(θ) =
∫ θ
θ
u′(x)dx =
∫ θ
θ
ν(q(x))dx. (67)
The inequality in (67) holds because u′(θ, τ) ≥ u′(θ). The second equality in (67) holds since the par-ticipation constraint binds for the lowest and highest types after the transfer too, and the fourth equalityfollows from local incentive compatibility. The equalities in the second line of (67) hold for the samereason as the equalities in the first line. Equation (67) then implies that∫ θ
θ
ν(qτ (x))dx ≥∫ θ
θ
ν(q(x))dx, (68)
which in turn yields that l(γτ , θ) = qτ (θ) ≥ q(θ) = l(γ, θ) for a set of types with positive mass. Tounderstand this implication, note that qτ (θ) = l(γτ , θ) and q(θ) = l(γ, θ) follow by construction of anoptimal allocation with and without transfers, whereas qτ (θ) ≥ q(θ) for a set of types with positive mass
56
follows from (68) given that ν(·) is increasing. But since l(·, θ) decreases with γ and the multiplier isconstant for all interior types in the highly-convex case, it must be γτ ≤ γ for types in this set. Usingagain the fact that the multiplier is constant for all interior types in the highly-convex case, we concludethat γτ ≤ γ and so qτ (θ) ≥ q(θ) for all interior types. In turn, local incentive compatibility, togetherwith ν ′(·) decreasing, immediately implies (66). Suppose now that γ = 1 at the original allocation. If themultiplier changes after the transfer, then it must decrease and so the same argument applies. If, instead,the multiplier does not change, then qτ (θ) = q(θ) and T ′τ (q) = T ′(q). Suppose, finally, that γ = 0at the original allocation. As in the previous case, if the multiplier does not change after the transfer,then qτ (θ) = q(θ) and T ′τ (q) = T ′(q). If the multiplier changes in response to the transfer, then it mustincrease, which implies qτ (θ) ≤ q(θ) and T ′τ (q) ≥ T ′(q). However, since q(θ) > l(0, θ) and qτ (θ) ≥ q(θ),it follows that qτ (θ) > l(0, θ), and so γ is still equal to zero after the transfer. Hence, qτ (θ) = q(θ) andT ′τ (q) = T ′(q). This establishes (63).
Given thatG(q) = F (θ) and qτ (θ) ≥ q(θ), it follows that in both the highly-convex and weakly-convexcase, the distribution of quantities after the transfer is introduced first-order stochastically dominates theone before the transfer is introduced.
Part 2. Establishing Tτ (q) ≥ T (q). This result is immediate. Since, as argued, the transfer amountsto a reduction in consumers’ reservation utility in that u(θ, τ) ≤ u(θ), the allocation {t(θ), q(θ)} is stillimplementable. So, the profit of the seller cannot decrease. As shown under Part 1, the offered quantity(weakly) increases for each type and so the cost of producing each type’s quantity is higher after thetransfer. Since the seller’s profit is T (q(θ))− c(q(θ)), it follows that T (q(θ)) must increase.Integrand in the Estimator for Preference Parameter Well-Defined: Consider first the highly-convexcase of the augmented model and note that
limq→q(θHC)
g(q)[T ′(q)− c]T ′(q)[γ −G(q)]
=g(q(θHC))T ′′(q(θHC))]
−T ′(q)g(q(θHC))= −T
′′(q(θHC))]
T ′(q).
A similar argument applies to the weakly-convex case when γ = 0 (for q < q1) or γ = 1 (for q ≥ q2).Consider now the weakly-convex case when q ∈ [q1, q2). Then,
limq→q(θWC)
g(q)[T ′(q)− c]T ′(q)[γ(θ(q))−G(q)]
=g(q(θWC))T ′′(q(θWC))
T ′(q(θWC))[γ′(θWC)θ′(q(θWC))− g(q(θWC))].
Example 1 (Nonlinear vs. Linear Pricing): Suppose the base utility function, ν(q), is three-parameterHARA with ν(q) = (1 − d)[aq/(1 − d) + b]d/d, a > 0, aq/(1 − d) + b > 0, and 0 < d < 1. Denote byus(θ) the utility of a type θ consumer under the standard model. With a uniform type distribution on [θ, θ],u(θ) = 0, a = c = 1, b = 0, and d = 1/2, it follows us(θ) ≥ um(θ) if, and only if, (2θ−θ)2−(2θ−θ)2 ≥θ2. When θ = 1 and θ = 2, this expression reduces to 3θ2 − 8θ + 4 ≥ 0, a polynomial with rootsθ = 2/3 and θ = 2. Thus, um(θ) ≥ us(θ) for all consumer types. Consider now the highly-convex caseof the augmented model with γ = 1/2. In this case, u(θ) ≥ um(θ) if, and only if, 3θ2 − 6θ + 2 ≥ 0.So, u(θ) ≥ um(θ) for θ ≥ 1.58. Also, all such types demand quantities above first best. Thus, not onlyu(θ) ≥ us(θ), as implied by Proposition 4, but also nearly half of the consumers prefer nonlinear to linearpricing under the augmented model.
B Estimation DetailsTo estimate the distribution of price and quantities, a seller’s marginal cost, the multipliers on the partici-pation (or budget) constraint, the distribution of consumers’ marginal willingness to pay, and consumers’
57
marginal utility function, and to perform the counterfactual exercises described in the text, we proceedaccording to the following steps.
Step 1. We compute T ′(qi), T ′′(qi), and G(qi), i = 1, . . . , N and qi ∈ {q1, . . . , qN}, from data onprices (unit values) and quantity purchases in each village as explained in the text. We fit six differentspecifications for T (q): log(T (q)) = t0 + t1 log(q)+ t2(log(q))2, T (q) = t0 + t1q+ t2q
2, T (q) = exp{t0 +t1 log q+ t2(log q)2 + t3q}, T (q) = exp{t0 + t1 log(q)}, T (q) = t0 + t1 log(q), and T (q) = log(t0 + t1q).In each village, among those specifications that imply a positive total price, T (q), at the lowest quantity, apositive marginal price at the smallest and largest quantities in a village, and satisfy a necessary conditiondescribed next for the schedule θ(q) to be increasing under the standard model, we select the specificationof T (q) that leads to the highest (adjusted) R2. Note that the condition that guarantees θ(q) increases withq under the standard model can be formulated as follows. Recall that by local incentive compatibilityθ(q) = T ′(q)/ν ′(q) so that
∂θ(q)
∂q=T ′′(q)ν ′(q)− T ′(q)ν ′′(q)
[ν ′(q)]2≥ 0⇔ T ′′(q)
T ′(q)≥ ν ′′(q)
ν ′(q).
By integrating the left-side and right-side of the above expression with respect to q, we obtain∫ q
q
T ′′(x)
T ′(x)dx ≥
∫ q
q
ν ′′(x)
ν ′(x)dx⇔ log[T ′(x)]qq ≥ log[ν ′(x)]qq ⇔
T ′(q)
T ′(q)≥ ν ′(q)
ν ′(q).
Empirically, by Perrigne and Vuong (2010), this condition can be equivalently formulated as
T ′(qi)
T ′(q1)≥ ν ′(qi)
ν ′(q1)=T ′(qi)[1− G(qi)]
1−T ′(qN )
T ′(qi) exp{−T ′(qN)
∑i−1j=1 log[1− G(qj)]
[1
T ′(qj)− 1
T ′(qj+1)
]}T ′(q1)[1− G(q1)]
1−T ′(qN )
T ′(q1)
with i = 1, . . . , N − 1. Since G(q1) ≥ 0 and T ′(q1) ≥ T ′(qN), a necessary condition is
[1− G(qi)]1−T ′(qN )
T ′(qi) exp
{−T ′(qN)
i−1∑j=1
log[1− G(qj)]
[1
T ′(qj)− 1
T ′(qj+1)
]}≤ 1.
This is the requirement that restricts our sample of 38 villages with at least 100 households consumingrice to 31 villages, as explained in the text.
Step 2. In each village, we estimate c and γ(·), and determine the relevant case of the augmentedmodel, as discussed in the text, by estimating (11) by GMM in Stata and setting a confidence level of 5 per-cent for the corresponding test procedure. We perform the remaining estimation routines in FORTRAN90.Note that the estimator of c is normally distributed, so (asymptotic) standard errors are straightforward tocompute—ignoring the estimation of the powers in the fractional polynomial for the auxiliary functionx(q). Recall that in the regular sample we focus on, all villages conform to the highly-convex case, so themultiplier γ(·) is constant (for all interior quantities) and equal to γ. Hence, γ = G(qHC) = G((T ′)−1(c))
since, by definition, T ′(qHC) = c. Since γ is a function of G(q) and c, denoting by σ2c the asymptotic
variance of c, and by G(q)[1 − G(q)] the asymptotic variance of G(q), we can easily obtain the asymp-totic variance of γ. Recall that in computing the asymptotic distribution of the estimators of c and γ, weconsider T (q) and its derivatives as known.
Step 3. Since the empirical distribution function of quantity purchases is a step function with steps atq1 < · · · < qN , the integrals in θ(q) and ν ′(q) can be rewritten as finite sums of integrals. (See Perrigne
58
and Vuong (2010) for a similar approach.) On each of these intervals G(·) is constant. We then estimateν ′(q) as
θ(q) = exp
{∑j>2,j 6=i−1
g(qj)[T′(qj)− c]
T ′(qj)[γ(θ(qj))− G(qj)](qj − qj−1)
}and ν ′(q) as
ν ′(q) = T ′(q) exp
{−∑
j>2,j 6=i−1
g(qj)[T′(qj)− c]
T ′(qj)[γ(θ(qj))− G(qj)](qj − qj−1)
}. (69)
We estimate the density of types based on (16). We compute the standard errors of θ(q) and ν ′(q) throughthe delta method. Given the asymptotic standard error of c, the fact that γ′(c) = g(qHC)/T ′′(qHC), andthe normalization θ = 1, from
θ(q) = exp
{∫ q
q
g(x)[T ′(x)− c]T ′(x)[γ −G(x)]
dx
},
we obtain
∂θ(q)
∂c= θ(q)
−∫ q
q
g(x){γ −G(x) + [T ′(x)− c] g(qHC)
T ′′(qHC)
}T ′(x)[γ −G(x)]2
dx
,
where
limq→qHC
g(x){γ −G(x) + [T ′(x)− c] g(qHC)
T ′′(qHC)
}T ′(x)[γ −G(x)]2
= 0.
In practice, we compute ∂θ(q)/∂c as
∂θ(q)
∂c= θ(q)
−∑j>2,j 6=i−1
g(qj){γ −G(qj) + [T ′(qj)− c] g(qHC)
T ′′(qHC)
}T ′(qj)[γ −G(qj)]2
(qj − qj−1)
,
where, given the granularity of the data, for the purpose of these computations, we approximated G(q) asG(q) = exp{g0 + g1q}(1 + exp{g0 + g1q}) and estimated its parameters g0 and g1 jointly with c and γ toobtain the variance-covariance matrix of the estimators of c, g0, and g1, which is then used to determinethe standard error of the estimators of θ(q) and ν ′(q). Note that for i = 0, 1,
∂θ(q)
∂gi= θ(q) exp
∫ q
q
[T ′(x)− c]{∂g(x)∂gi
[γ −G(x)]− g(x)[∂γ∂gi− ∂G(x)
∂gi
]}T ′(x)[γ −G(x)]2
dx
,
where, using the fact that γ = G(qHC) = exp{g0 + g1qHC}/(1 + exp{g0 + g1qHC}),
∂γ
∂g0
=G(qHC)
1 + exp{g0 + g1qHC}and
∂γ
∂g1
=G(qHC)qHC
1 + exp{g0 + g1qHC},
∂G(q)
∂g0
=G(q)
1 + exp{g0 + g1q}and
∂G(q)
∂g1
=G(q)q
1 + exp{g0 + g1q}.
59
With g(q) = G(q)g1/(1 + exp{g0 + g1q}), we further obtain
∂g(q)
∂g0
=g1
(∂G(q)∂g0
+ exp{g0 + g1q}[∂G(q)∂g0−G(q)
])(1 + exp{g0 + g1q})2
,
∂g(q)
∂g1
=
[∂G(q)∂g1
g1 +G(q)]
(1 + exp{g0 + g1q})−G(q)g1q exp{g0 + g1q}
(1 + exp{g0 + g1q})2.
We then estimate (11), with x(q) specified as a fractional polynomial, β0 + β1qa1 + β2q
a2 + ... + βpqap ,
and G(q) as just described simultaneously by GMM{g(q)T ′(q)− g(q)
c+ [β0+β1qa1+β2qa2+...+βpq
ap ]
c= 0
G(q)− exp{g0+g1q}1+exp{g0+g1q} = 0
, (70)
with the powers of the fractional polynomial for x(q) treated as known. By the central limit theorem,
√N
c− cg0 − g0
g1 − g1
a∼ N(0,Σ).
So,√n(θ(q) − θ(q)) ∼ N(0, DXΣD′X) with DX = (∂θ(q)/∂c, ∂θ(q)/∂g0, ∂θ(q)/∂g1). Since ν ′(q) =
T ′(q)/θ(q), then ν ′(q) is (asymptotically) normally with mean zero and variance σ2θ [∂ν
′(q)/∂θ(q)]2 =
σ2θ [T′(q)/θ2(q)]2. Given θ(q) and f(θ), following Hall (1992), we compute the sample analog of the
variance of the kernel density estimator of f(θ) as
s2(θ) =1
(Nhθ)2
∑N
i=1Kθ
(θ − θihθ
)2
− [f(θ)]2
N,
and use it to produce asymptotic confidence (variability) bounds around the estimated density.Step 4. We calculate consumer surplus from quantity q under nonlinear pricing as CSnp(q) =
θ(q)ν(q) − T (q) and across quantities as CSnp =∫ qqCSnp(x)dG(x) =
[θ(x)ν(x) − T (x)]dG(x).
Note that θν(q) = θ[ν(q) + ν(q)− ν(q)] = θ[ν(q) +∫ qqν ′(x)dx]. Since all relevant variables are discrete,
we compute ν(q1) as q1ν ′(q1), and
ν(qi) = q1ν ′(q1) +∑i−1
j=1(qj+1 − qj)ν ′(qj+1) (71)
for i > 1. Specifically, ν(q2) = q1ν ′(q1) + (q2 − q1)ν ′(q2) = ν(q1) + (q2 − q1)ν ′(q2), ν(q3) = q1ν ′(q1) +(q2− q1)ν ′(q2) + (q3− q2)ν ′(q3) = ν(q2) + (q3− q2)ν ′(q3), and so on. Therefore, ν(qi) = ν(qi−1) + (qi−qi−1)ν ′(qi), i > 1. Accordingly, we compute consumer surplus as
CSnp =∑N
i=1[θ(qi)ν(qi)− T (qi)]rq(qi),
where rq(q1) = G(q1) and rq(qi+1) = G(qi+1)−G(qi), i = 1, . . . , N−1. Similarly, we compute producer
60
surplus as
PSnp =∑N
i=1[T (qi)− cqi] rq(qi).
Step 5. We describe here only how we perform the counterfactual exercise described in Section5.5 under the augmented model, as it is the most involved. When comparing consumer, producer, andsocial surplus under nonlinear and linear pricing, we compute a seller’s linear price and the correspondingindividual and aggregate demands as follows. From the consumer’s first-order condition θν ′(q) = pm, itfollows
q = q(pm, θ) = (ν ′)−1(pmθ
), (72)
where qm(θ) ≡ q(pm, θ) denotes the quantity demanded by a consumer of type θ under linear pricing.With Q(pm) =
∫ θθqm(x)f(x)dx, pm solves the problem
maxpm
[(pm − c)Q(pm)].
Hence, under linear pricing, (total) consumer and producer surplus are, respectively, given by
CSlp =
∫ θ
θ
CSlp(x)f(x)dx =
∫ θ
θ
[xν(qm(x))− pmqm(x)] f(x)dx,
PSlp = (pm − c)Q(pm) = (pm − c)∫ θ
θ
qm(x)f(x)dx.
Social surplus is simply the sum ofCSlp and PSlp. To solve for pm, we determine a grid p = (pm1, . . . , pmP )of 5,000 points, and for each θi we compute the schedule
q((pm1, . . . , pmP ), θi) = (q(pm1, θi), . . . , q(pmP , θi)).
To do so, given a grid q = (q1, . . . , qmax) of 5,000 equidistant points for candidate quantities, we determinethe quantity chosen by type θi for each possible price pmp, p = 1, . . . , P , by solving the system
θiν ′(q1) = pmp. . .
θiν ′(qmax) = pmp
for each pmp and selecting q(pmp, θi) as the grid quantity for which the difference |θiν ′(qg) − pmp| issmallest. Then, the quantity demanded by type θi at price pmp is
q(pmp, θi) =
{q(pmp, θi), if θiν(q(pmp, θi))− pmpq(pmp, θi) ≥ u(θi)0, otherwise
,
with u(θi) determined as detailed in Section 5.5. Aggregate demand for each price pmp, p = 1, . . . , P , is
Q(pmp) =∑N
i=1q(pmp, θi)rθ(θi),
where rθ(θ1) = G(q1) and rθ(θi+1) = G(qi+1)−G(qi) for i = 1, . . . , N − 1. We then solve for the price
61
p∗m such that (pm − c)Q(pm) is maximal. Finally, we calculate consumer surplus as
CSlp =∑N
i=1rq(q(p
∗m, θi)) max{θiν(q(p∗m, θi))− p∗mq(p∗m, θi), u(θi)},
where rq(q(p∗m, θi)) = rθ(θi), ν(q(p∗m, θ1)) = q(p∗m, θ1)ν ′(q(p∗m, θ1)), and
ν(q(p∗m, θi)) = q(p∗m, θ1)ν ′(q(p∗m, θ1)) +∑i−1
j=1[q(p∗m, θj+1)− q(p∗m, θj)]ν ′(q(p∗m, θj+1)),
for i > 1. Similarly, we compute producer surplus as PSlp = (p∗m − c)Q(p∗m).
B.1 Estimation Results: Regular SampleWe report in Figure 12 the schedule of marginal prices and the estimated marginal cost in each of the 11villages in our regular sample: we order the 11 villages according to the value of the multiplier on theparticipation (or budget) constraint, from lowest to highest, as reported in Figure 5 in the text. Note thatthe small discrepancy between the estimates of marginal cost reported in the plots in Figure 12 and thosereported in Figure 5 is due to rounding error. The reason is as follows. In each village, we estimate qHCas the quantity at which the difference between T ′(q) and the estimate c of marginal cost is smallest. Wethen use the estimate of marginal cost given by T ′(qHC) rather than c, and the corresponding estimate ofthe multiplier given by γ = G(qHC), when computing θi and ν ′(qi) so as to ensure that the integrand termin the expressions defining θi and ν ′(qi) is well-defined at points of singularity—these are the estimatesof c and γ reported in Figure 5. We also display the estimates of the type support, the probability densityfunction of types, and the base marginal utility function for each quantity in each village, together withconfidence bounds (for the estimates of the type support and the base marginal utility function) or point-wise asymptotic variability bounds (for the estimates of the density function) in Figures 13 to 15. Notethat for the density estimates, the bounds are centered on f(θ) and ignore the bias of the estimates.
B.2 Counterfactuals: Nonlinear vs. Linear Pricing Under Standard ModelAn intuitive rationale for the findings in Section 5.5 on the difference in consumer and social surplusunder nonlinear and linear pricing predicted by the standard model is as follows. As discussed, for givenconsumer types, the standard model implies lower consumption levels relative to our model. Hence,the standard model accounts for observed quantities and marginal prices by ascribing a higher marginalwillingness to pay, θ, to households purchasing any given quantity. But since θν ′(q) = T ′(q), for a givenT ′(q) schedule and higher implied θ’s, the standard model must also imply lower base marginal utilitiesthan our model. Indeed, by comparing Figure 11 to Figures 6 and 7, it is apparent that the standard modelleads to higher estimates of consumers’ marginal willingness to pay for nearly all quantities (left panel ofFigure 11) and somewhat smaller estimates of base marginal utility from any given quantity (right panelof Figure 11). These higher type estimates, in turn, imply a greater sensitivity of consumer and socialsurplus to changes in the pricing scheme, as the experiment on linear pricing has shown.
B.3 Estimation Results: Non-regular SampleIn Figures 16 to 19, we report estimates of the type support, the density function of types, the base marginalutility function, and the multipliers on the participation (or budget) constraint, together with confidencebounds (for the estimates of the type support, the base marginal utility function, and the multipliers) orpointwise asymptotic variability bounds (for the estimates of the density function), for each quantity ineach of the 13 villages in our non-regular sample, which do not conform either to the highly-convex or theweakly-convex case of the augmented model. In these villages, we estimate by GMM the type support,
62
Figure 11: Primitives Under Standard Model
05
1015
20E
stim
ated
Typ
e
0 1 2 3Quantity
Standard Model: Type Support
02
46
8E
stim
ated
Bas
e M
argi
nal U
tility
0 1 2 3Quantity
Standard Model: Base Marginal Utility
θ(q), base marginal utility, ν ′(q), and the multipliers, γ(θ(q)), from the system{log(T ′(q))− log (θ1q)+(1− d) log
(q
1−d
)= 0
γ(θ(q)) + 1q
[c
T ′(q)− 1]g(q)−G(q) = 0
, (73)
with γ(θ(q)) specified as γ(θ(q)) = exp{ϕq}/(1 + exp{ϕq}) except for villages 3, 4, 8, 9, 10, and 13,where we set ϕ = θ1. In these villages, the small number of points in the support of quantities purchasedrendered the convergence of the GMM routine problematic. We estimate the density function of types bythe same procedure we used for the regular sample, outlined above.
63
Figure 12: Marginal Price Schedule and Estimated Marginal Cost for Regular Sample
23
45
67
89
0 1 2 3Quantity
Village 1: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 2: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 .5 1 1.5 2Quantity
Village 3: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 4: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 .5 1 1.5 2Quantity
Village 5: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 6: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 7: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 8: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 1 2 3Quantity
Village 9: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 .5 1 1.5 2Quantity
Village 10: Marginal Price Schedule and Marginal Cost
23
45
67
89
0 .5 1 1.5 2Quantity
Village 11: Marginal Price Schedule and Marginal Cost
64
Figure 13: Confidence Bounds for Type Estimates for Regular Sample
-10
12
34
5T
ype
.2 .4 .6 .8 1Quantity
Type Estimates in Village 1
-10
12
34
5T
ype
0 .5 1 1.5Quantity
Type Estimates in Village 2
-50
510
15T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 3
-10
12
34
5T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 4
-10
12
34
5T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 5
-10
12
34
5T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 6
-10
12
34
5T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 7
-10
12
34
5T
ype
0 .5 1 1.5 2 2.5Quantity
Type Estimates in Village 8-1
01
23
45
Typ
e
.2 .4 .6 .8 1Quantity
Type Estimates in Village 9
-10
12
34
5T
ype
0 .5 1 1.5 2Quantity
Type Estimates in Village 10
-10
12
34
5T
ype
.2 .4 .6 .8 1Quantity
Type Estimates in Village 11
65
Figure 14: Variability Bounds for Density Function Estimates for Regular Sample
05
1015
20D
ensi
ty F
unct
ion
1 1.02 1.04 1.06Type
Density Function in Village 1
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.5 2 2.5Type
Density Function in Village 2
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 2 3 4 5 6Type
Density Function in Village 3
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.2 1.4 1.6 1.8Type
Density Function in Village 4
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.5 2 2.5Type
Density Function in Village 5
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.5 2 2.5 3Type
Density Function in Village 6
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.2 1.4 1.6 1.8Type
Density Function in Village 7
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 2 3 4 5Type
Density Function in Village 8-.
50
.51
1.5
2D
ensi
ty F
unct
ion
1 1.5 2 2.5Type
Density Function in Village 9
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.2 1.4 1.6 1.8Type
Density Function in Village 10
-.5
0.5
11.
52
Den
sity
Fun
ctio
n
1 1.5 2 2.5Type
Density Function in Village 11
66
Figure 15: Confidence Bounds for Base Marginal Utility Estimates for Regular Sample
02
46
810
12B
ase
Mar
gina
l Util
ity
.2 .4 .6 .8 1Quantity
Base Marginal Utility Estimates in Village 1
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5Quantity
Base Marginal Utility Estimates in Village 2
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 3
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 4
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 5
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 6
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 7
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2 2.5Quantity
Base Marginal Utility Estimates in Village 80
24
68
1012
Bas
e M
argi
nal U
tility
.2 .4 .6 .8 1Quantity
Base Marginal Utility Estimates in Village 9
02
46
810
12B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Base Marginal Utility Estimates in Village 10
02
46
810
12B
ase
Mar
gina
l Util
ity
.2 .4 .6 .8 1Quantity
Base Marginal Utility Estimates in Village 11
67
Figure 16: Estimates and Confidence Bounds for Types for Non-regular Sample0
510
15E
stim
ated
Typ
e
0 1 2 3Quantity
Village 1
05
1015
Est
imat
ed T
ype
.5 1 1.5 2Quantity
Village 2
05
1015
2025
Est
imat
ed T
ype
0 1 2 3Quantity
Village 3
24
68
1012
Est
imat
ed T
ype
0 .5 1 1.5 2Quantity
Village 40
510
15E
stim
ated
Typ
e
0 1 2 3Quantity
Village 5
05
1015
Est
imat
ed T
ype
0 1 2 3Quantity
Village 6
05
1015
Est
imat
ed T
ype
0 1 2 3Quantity
Village 7
05
1015
Est
imat
ed T
ype
0 .5 1 1.5 2Quantity
Village 8
02
46
810
Est
imat
ed T
ype
0 .5 1 1.5 2Quantity
Village 9
05
1015
20E
stim
ated
Typ
e
0 1 2 3Quantity
Village 10
05
1015
20E
stim
ated
Typ
e
0 1 2 3Quantity
Village 11
05
1015
2025
Est
imat
ed T
ype
0 1 2 3Quantity
Village 12
05
1015
20E
stim
ated
Typ
e
0 .5 1 1.5 2Quantity
Village 13
68
Figure 17: Estimates and Confidence Bounds for Density Function Estimates for Non-regular Sample0
.05
.1.1
5D
ensi
ty F
unct
ion
0 5 10 15Type
Density Function in Village 1
-.05
0.0
5.1
.15
Den
sity
Fun
ctio
n
4 6 8 10 12 14Type
Density Function in Village 2
0.0
5.1
Den
sity
Fun
ctio
n
0 5 10 15 20Type
Density Function in Village 3
-.05
0.0
5.1
.15
Den
sity
Fun
ctio
n
2 4 6 8 10Type
Density Function in Village 40
.05
.1.1
5D
ensi
ty F
unct
ion
0 5 10 15Type
Density Function in Village 5
0.0
5.1
.15
Den
sity
Fun
ctio
n
0 5 10 15Type
Density Function in Village 6
0.0
5.1
.15
Den
sity
Fun
ctio
n
0 5 10 15Type
Density Function in Village 7
-.1
0.1
.2.3
Den
sity
Fun
ctio
n
2 4 6 8 10 12Type
Density Function in Village 8
-.05
0.0
5.1
.15
.2D
ensi
ty F
unct
ion
0 2 4 6 8 10Type
Density Function in Village 9
0.0
2.0
4.0
6.0
8.1
Den
sity
Fun
ctio
n
0 5 10 15 20Type
Density Function in Village 10
0.0
2.0
4.0
6.0
8.1
Den
sity
Fun
ctio
n
0 5 10 15 20Type
Density Function in Village 11
0.0
2.0
4.0
6.0
8D
ensi
ty F
unct
ion
0 5 10 15 20Type
Density Function in Village 12
0.0
5.1
.15
Den
sity
Fun
ctio
n
0 5 10 15Type
Density Function in Village 13
69
Figure 18: Estimates and Confidence Bounds for Base Marginal Utility for Non-regular Sample0
24
68
10E
stim
ate
Bas
e M
argi
nal U
tility
0 1 2 3Quantity
Village 1
.51
1.5
22.
53
Est
imat
e B
ase
Mar
gina
l Util
ity
.5 1 1.5 2Quantity
Village 2
05
1015
Est
imat
e B
ase
Mar
gina
l Util
ity
0 1 2 3Quantity
Village 3
05
1015
Est
imat
e B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Village 40
510
1520
Est
imat
e B
ase
Mar
gina
l Util
ity
0 1 2 3Quantity
Village 5
05
1015
Est
imat
e B
ase
Mar
gina
l Util
ity
0 1 2 3Quantity
Village 6
02
46
810
Est
imat
e B
ase
Mar
gina
l Util
ity
0 1 2 3Quantity
Village 7
02
46
8E
stim
ate
Bas
e M
argi
nal U
tility
0 .5 1 1.5 2Quantity
Village 8
05
1015
Est
imat
e B
ase
Mar
gina
l Util
ity
0 .5 1 1.5 2Quantity
Village 9
02
46
8E
stim
ate
Bas
e M
argi
nal U
tility
0 1 2 3Quantity
Village 10
02
46
8E
stim
ate
Bas
e M
argi
nal U
tility
0 1 2 3Quantity
Village 11
01
23
45
Est
imat
e B
ase
Mar
gina
l Util
ity
0 1 2 3Quantity
Village 12
.51
1.5
22.
5E
stim
ate
Bas
e M
argi
nal U
tility
0 .5 1 1.5 2Quantity
Village 13
70
Figure 19: Estimates and Confidence Bounds for Multipliers for Non-regular Sample.6
.7.8
.91
Est
imat
ed M
ultip
liers
0 1 2 3Quantity
Village 1
.6.8
11.
2E
stim
ated
Mul
tiplie
rs
.5 1 1.5 2Quantity
Village 2
.75
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 1 2 3Quantity
Village 3
.75
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 .5 1 1.5 2Quantity
Village 4.4
.6.8
11.
2E
stim
ated
Mul
tiplie
rs
0 1 2 3Quantity
Village 5
.4.6
.81
1.2
Est
imat
ed M
ultip
liers
0 1 2 3Quantity
Village 6
.75
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 1 2 3Quantity
Village 7
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 .5 1 1.5 2Quantity
Village 8
.75
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 .5 1 1.5 2Quantity
Village 9
.8.8
5.9
.95
1E
stim
ated
Mul
tiplie
rs
0 1 2 3Quantity
Village 10
.85
.9.9
51
Est
imat
ed M
ultip
liers
0 1 2 3Quantity
Village 11
.6.7
.8.9
11.
1E
stim
ated
Mul
tiplie
rs
0 1 2 3Quantity
Village 12
.85
.9.9
51
Est
imat
ed M
ultip
liers
0 .5 1 1.5 2Quantity
Village 13
71