Nonlinear Pricing in Village Economies - Stanford University · PDF fileNonlinear Pricing in...

Nonlinear Pricing in Village Economies∗

Orazio Attanasio† Elena Pastorino‡

June 2015

Abstract

We propose a model of price discrimination to account for the nonlinearity of unit prices of basicfood items in developing countries. We allow consumers to differ in marginal willingness and absoluteability to pay for a good, incorporate consumers’ subsistence constraints, and model outside optionsfrom purchasing a good that depend on consumers’ preferences, like self-production or access to othermarkets. We obtain a simple characterization of optimal nonlinear pricing, and show that nonlinearpricing leads to higher levels of consumption, lower marginal prices, and higher consumer surplusthan implied by the standard nonlinear pricing model. The model is nonparametrically identified un-der common assumptions. We derive nonparametric and semiparametric estimators of the model’sprimitives that can be easily implemented using individual-level data commonly available for bene-ficiaries of conditional cash transfer programs in developing countries. The model well accounts forthe data, and at the estimated primitives, most consumers benefit from nonlinear pricing comparedto linear pricing. We also find that consumption distortions are less pronounced for individuals whopurchase small quantities, despite the steep decline in unit prices with quantity purchased, contrary towhat implied by the standard model. We show that when sellers have market power, which we detectin our data, policies like cash transfers that affect households’ ability to pay can effectively strengthenthe incentive for sellers to price discriminate across consumers and so give rise to asymmetric pricechanges for low and high quantities, which can exacerbate the consumption distortions typically as-sociated with nonlinear pricing. We find evidence of these patterns of price changes in our data inresponse to transfers. These results confirm the importance of our proposed extensions of the standardnonlinear pricing model in evaluating the distributional effects of nonlinear pricing.

Keywords: Nonlinear pricing; Budget Constraints; Cash Transfers; Structural Estimation

JEL Codes: D82, I38, J18, L11, O12, O13, O15

∗We thank Aureo de Paola, V.V. Chari, Tom Holmes, Bruno Jullien, Toru Kitagawa, Aprajit Mahajan, Costas Meghir,Holger Sieg, Kjetil Storesletten, and Joel Waldfogel for helpful comments. Eugenia Gonzalez Aguado, Sanghmitra Gautam,and Sergio Salgado Ibanez have provided excellent research assistance.†University College London and Institute for Fiscal Studies.‡University of Minnesota and Federal Reserve Bank of Minneapolis.

1 IntroductionQuantity discounts in the form of unit prices declining with quantity appear to be pervasive in developing

countries. McIntosh (2003), for instance, documents differences in the prices of drinking water paid by

poor households in the Philippines, whereas Fabricant et al. (1999) and Pannarunothai and Mills (1997)

document differences in the price of health care and services in Thailand and Sierra Leone. Attanasio and

Frayne (2006) show evidence that consumers purchasing basic staples in Colombian villages face price

schedules rather than linear prices: within a village, households buying larger quantities pay substantially

lower prices for homogeneous commodities. Similar patterns are observed in rural Mexico, as documented

here. This evidence is commonly interpreted as a symptom of the fact that “the poor pay more” than rich

households for the same goods. Since in these countries the poor are close to subsistence, nonlinear prices

are usually considered to have especially undesirable distributional consequences.

This view is consistent with the intuition from the standard model of nonlinear pricing (see Maskin and

Riley (1984)), which assumes that consumers only differ in their marginal willingness to pay for a good.

This model explains quantity discounts as arising from a seller’s incentive to screen consumers through

the offer of multiple price and quantity combinations. Its main insight is that the ability of a seller to

discriminate across consumers though nonlinear pricing not only implies that the consumption of (nearly)

all consumers is depressed relative to first best but, crucially, that consumption distortions are more severe

for consumers who purchase the smallest quantities, typically the poorest ones in developing countries.

The standard model assumes that consumers are unconstrained in their ability to pay for a good and

have access to similar alternatives to trading with a particular seller or in a particular market. This frame-

work then naturally explains the dispersion in unit prices for goods that absorb a small fraction of con-

sumers’ incomes, in settings in which consumers have available similar outside consumption opportuni-

ties.1 As such, the standard model abstracts from key features of food markets in developing countries.

In these countries, households typically face subsistence constraints on the consumption of basic sta-

ples, spend a large fraction of their income on food, and often have available alternative heterogeneous

consumption possibilities, through self-production or highly-subsidized government stores, that are un-

common in developed countries.

To rationalize the occurrence of quantity discounts in these settings, we propose an alternative model of

price discrimination that explicitly formalizes households’ subsistence constraints, and allows households

to differ in both their marginal willingness to pay and absolute ability to pay for a good. The model also

incorporates a rich set of alternatives to purchasing in a particular market that differ across consumers. We

show that in these settings, nonlinear pricing typically has distributional effects that run counter standard

intuition, as it leads to consumption below and above first best in a given population of consumers, yielding

levels of consumer surplus in general higher than those implied by the standard model.

1Formally, consumers are assumed to be able to pay more than their reservation prices for a good. See Che and Gale (2000).

1

The model we propose can be structurally identified from information in a given market on the distri-

bution of prices and quantities. Furthermore, it is possible to distinguish different versions of the model

that have very different welfare implications. When we bring the model to the data from a sample of

villages in rural Mexico, we find that the model fits extremely well the observed differences in prices and

quantities within and across villages, and that the standard model often used in the literature to explain

quantity discounts is strongly rejected.

Our analysis starts with the observation that when facing subsistence constraints, consumers can be

formally thought as facing an additional budget constraint just on the expenditure on a seller’s good. That

is, in the language of the literature on nonlinear pricing and auctions, consumers are budget-constrained

and their constraints depend on their preferences and incomes. We show that, even when consumers differ

in both their marginal willingness and absolute ability to pay, a model with budget-constrained consumers

maps into a class of nonlinear pricing models with so-called countervailing incentives, in which consumers

have heterogeneous reservation utilities (see Jullien (2000)). By formally exploiting this equivalence

across models, we then prove that in the richer environment we consider, a simple characterization of

nonlinear pricing can be obtained. In such an environment, nonlinear pricing has more desirable welfare

properties than those implied by the standard model, as it leads to higher levels of consumption, lower

marginal prices, and also higher consumer surplus. Intuitively, since a budget credibly conveys to a seller

that there is maximum price a consumer will pay, such a constraint effectively limits a seller’s ability to

extract consumer surplus. In these situations, nonlinear pricing might be preferable to linear pricing even

if sellers have market power.

The reason for why consumption is higher, and therefore marginal prices are lower, in our framework

than predicted by the standard model is that, unlike in the standard model, in our framework an optimal

menu entails overprovision as well as underprovision of quantity relative to first best. The intuition behind

this result can be explained simply. In the standard nonlinear pricing model, in order to discriminate

across consumers, a seller just needs to prevent consumers from effectively understating their preferences

and purchasing less than their taste for a good would imply. In our model, instead, a consumer may

have an incentive to overstate her preference. The reason for this reversal of the logic of the standard

model is that in our model a consumer who is ‘marginal’, that is, indifferent between purchasing and not

purchasing from the seller, may be one with an intermediate or even high preference for the good—in the

standard model, the only marginal consumer is the one with the lowest valuation for a good. This situation

occurs when consumers with intermediate or high valuations for a good have also better outside options

or face tighter subsistence constraints. If so, then a seller needs to offer such a consumer an attractive or

inexpensive enough price and quantity combination to induce her to buy. But then, the price and quantity

combination meant for this marginal consumer can also be attractive to lower valuation consumers: by

offering large enough quantities, even above first best, a seller can then discourage these lower valuation

consumers from purchasing quantities intended for higher valuation ones.

2

As for why consumer surplus may be higher than implied by the standard model, note that a budget

constraint on the expenditure on a seller’s good effectively constrains the price a seller can ask. For

similar outside options across models, the combination of higher quantities, lower marginal prices, and this

expenditure limit naturally gives rise to higher levels of utility for consumers than predicted by the standard

model. Despite the scope for sellers to extract consumer surplus through quantity-specific prices, nonlinear

pricing has also an obvious efficiency-enhancing dimension to it, especially salient when consumers are

differentially constrained in their access to a market. By allowing a seller to tailor prices and quantities

to a consumer’s marginal willingness to pay, nonlinear pricing enables a seller to trade at a profit with

consumers whose subsistence constraints and outside options would lead them to demand especially large

quantities under linear pricing. Such consumers would be excluded from the market under linear pricing,

and are therefore better off under nonlinear pricing.

By a similar argument, ignoring consumers’ budget constraints, and the endogenous price response

to changes in consumers’ ability to pay, can lead to an incorrect assessment of the impact of common

food subsidization policies, like conditional cash transfers, aimed at expanding households’ consumption

possibilities. Intuitively, by increasing consumers’ budgets, these policies not only stimulate consumption

but also provide an incentive for a seller to take advantage of the greater ability to pay of consumers. In

general, targeted cash transfers that are more generous for poorer households, typically the purchasers

of the smallest quantities, lead not just to a greater demand for a seller’s good but also to higher prices.

Depending on the distribution of outside options across consumers, transfers can then give rise to increases

in unit prices for low quantities but decreases in unit prices for high quantities, thus increasing the degree

of price discrimination and exacerbating some of the associated consumption distortions.

We prove that our model is nonparametrically identified under assumptions standard in the empirical

auction and nonlinear pricing literature (see Guerre, Perrigne, and Vuong (2000) and Perrigne and Vuong

(2010)) and can be non- or semiparametrically estimated using household-level expenditure data, typically

available for beneficiaries of conditional cash transfer programs in developing countries. Intuitively, our

identification and estimation strategy exploits the joint information about the primitives of both buyers

and sellers contained in the price schedule and in the distribution of quantity purchases in a village.

The model fits the data remarkably well. The standard model, in contrast, is robustly rejected in nearly

all villages. Estimates of the model primitives conform to all restrictions of the model not imposed in

estimation, like the monotonicity of hazards of the distribution of consumers’ unobserved willingness

to pay, and the inverse relationship between marginal utility and quantity consumed. We find that the

distribution of consumers’ tastes, and so marginal willingness to pay, is much more dispersed than the

distribution of observed quantities in each village, which implies a potentially strong incentive for sellers

to price discriminate across consumers. We also estimate a large degree of curvature in utility, which

suggests the potential for rich distributional implications of nonlinear pricing.

We estimate that sellers indeed have market power in all the villages in our sample, and exercise it

3

by distorting the quantities offered to most households. Interestingly, we find that nonlinear pricing leads

to both underconsumption for consumers of the smallest quantities and overconsumption for consumers

of the largest quantities but distortions are more pronounced for the purchasers of the largest quantities,

which empirically correspond to the relatively richer households. Although quantity distortions are of

smaller magnitude for consumers with lower marginal willingness to pay, price distortions are for them

more severe so that, on balance, purchasers of the smallest quantities would benefit more from a more

competitive market for rice.

When comparing welfare under observed nonlinear pricing to a counterfactual scenario in which

sellers charge linear prices, we find that for most consumers—except those purchasing the smallest

quantities—are better off under nonlinear pricing. For consumers who prefer nonlinear pricing, the greater

quantities and lower marginal prices that nonlinear pricing implies, relative to linear pricing, give rise not

just to higher levels of consumer surplus but also of social surplus. Consumer and social surplus for these

consumers are higher under nonlinear pricing partly due to the higher degree of market participation than

nonlinear pricing generates. Indeed, as consistent with our model, we find that consumers with both small

and high reservation utilities or budgets would be excluded from the market under linear pricing. We also

find that the standard nonlinear pricing model would systematically overestimate the gain from nonlin-

ear pricing: by implying, by construction, much lower reservations utilities than our model, the standard

model predicts much higher linear prices and thus lower consumer surplus under linear pricing.

We finally assess the effect of conditional cash transfers on prices in the villages in our data. Cash

transfers, as in the case of the Mexican Progresa program implemented in some of the villages in our

sample, have been increasingly popular in recent years, both in Latin America and in other developing

countries. A few studies have analyzed the impact of transfers on the price of commodities. Hoddinott

et al. (2000), for instance, study the impact of Progresa on household consumption and, in the process,

examine the price effect of Progresa concluding that: “There is no evidence that Progresa communities

paid higher food prices than similar control communities” (p. 33). Analogously, Angelucci and De Giorgi

(2009), when assessing the impact of Progresa on the consumption of non-eligible households, consider

the possibility that their results are mediated by changes in local prices but dismiss this possibility based

on their empirical analysis. The consensus, therefore, seems to be that Progresa did not have noticeable

effects on local prices. Analogous evidence has been documented by Cunha et al. (2014), who study

a large food assistance program for poor households in Mexico, the Programa de Apoyo Alimentario

(PAL), which was supposed to complement Progresa in that it was targeted to extremely marginalized and

isolated communities. That program was evaluated through a four-arm randomized control trial, in which,

in addition to a control group, the 200 villages in the evaluation sample were divided into 3 other groups,

one receiving an in-kind transfer and the other two receiving a monetary transfer. They find that relative

to the control group, in-kind transfers gave rise to a 4 percent fall in prices whereas cash transfers gave

rise to a positive but negligible price increase.

4

All of these studies focus on changes in average (unit) prices associated with the introduction of cash

or in-kind transfers but fail to account for the nonlinearity of unit prices when assessing the impact of

transfers on prices. Our model implies that income transfers to consumers would induce a seller to change

his price schedule altogether. In line with this prediction, we find evidence of a significant effect of

transfers on prices: in response to transfers, the schedule of unit prices becomes significantly steeper with

unit prices increasing at low quantities but decreasing at low quantities. Since the resulting average effect

of cash transfers on unit prices is much less pronounced, we also show that ignoring the variability of unit

prices with quantity leads to a much smaller estimate of the price effect of transfers, as consistent with

the empirical specifications and results in the literature. On the contrary, by increasing consumers’ ability

to pay, cash transfers also effectively increase a seller’s market power, the degree of price discrimination,

and can potentially lead to much lower consumer surplus gains than typically inferred.

2 Quantity Discounts: the Case of MexicoAs mentioned in the introduction, quantity discounts are common in several markets in developing coun-

tries. Attanasio and Frayne (2006), for instance, estimate the supply schedule for several basic food

staples, including rice, carrots and beans in Colombian villages and show substantial quantity discounts.

For instance, they find that the elasticity of the price of rice to the quantity bought is as large as −0.11

in their preferred specification. They estimate even larger discounts (in absolute value) for other specifi-

cations and for commodities such as beans or carrots. In what follows, we use a large dataset from rural

communities in Mexico to study similar patterns and test the model we propose to explain them.

The dataset we use, described in detail in the Supplementary Appendix, was collected to evaluate the

impact of the conditional cash program Progresa, which was started in 1997 under the Zedillo administra-

tion. The program consists of cash transfers to eligible families with children, conditional on behavior like

attendance to classes, taking young children to health centers, and sending school-aged children to school.

For participating households, on average, grants amount to 25% of their income, therefore constituting a

substantial fraction of it. Within the localities targeted by the first wave of the program expansion that

occurred between 1998 and 2000, about 70% of households qualified as eligible for the program.

The Mexican government decided to evaluate the program and its expansion using a randomized con-

trolled trial. In 1997, the program selected 506 localities in 7 states, each belonging to one of 191 larger

administrative units (municipalities), to be included in the evaluation sample. Of these localities, 320, ran-

domly chosen, were assigned to early treatment, which started in the middle of 1998, and the remaining

186 were assigned to late treatment, which started in December 1999. The households in these localities

were followed for several periods. In what follows, we use the wave collected in May 1999.

The evaluation data, which has been used extensively in recent years, are remarkable for several rea-

sons. First, it provides a census of 506 villages in that all households in the localities are surveyed. Second,

the data is very rich and exhaustive. For the purpose of our paper, we note that it contains information,

5

for each of 36 food commodities, both on the quantity consumed, on the amount purchased, and on the

outlay that such purchase involved. It is therefore possible to compute, for each purchase observed in the

village, unit values for many commodities. Third, the data contain information on a variety of locality-

level variables, including some prices collected in local stores. This information can be used to check

the reliability of the unit values one can construct from the household survey. Attanasio et al. (2009)

found that unit values approximate well local prices, and these, in turn, match reasonably well data from

national sources on prices. The locality dataset also contains some information on the market structure in

each village. Finally, the fact that the program was randomly implemented in a subset of the villages—at

least for the first waves, including the one we are using—is useful as it introduces random and substantial

variation in the amount of resources available to some households, which can be used to study some of

the implications of the models we analyze.

Having being targeted by Progresa, the villages included in the evaluation survey are small, remote,

and “marginalized,” according to an index used by the Mexican government to target social programs.

The average number of households in a locality is just over 50. These localities are not, however, the

most marginalized and poorest villages in Mexico, as to be eligible for Progresa their inhabitants had to

have access to some basic infrastructure such as health centres and schools. On average, about 78% of the

households of the villages in the evaluation survey are eligible for the program, which is designed for the

poorest sectors of Mexican society. Obviously, beneficiary households are “poorer” than non-beneficiary

ones, as can be verified in a variety of dimensions, from the ownership of durables to the fraction of total

consumption devoted to food. However, even non-eligible households in the villages we study are quite

poor. In the overall sample, including eligible and non-eligible households, food accounts for nearly 70%

of their total budget. This share is higher for eligible households but even among non-eligible households,

food constitutes a substantial part of their expenditure. Whilst most of the households in our sample are

poor, it should also be pointed out that there is a substantial amount of heterogeneity in our sample as

different households differ in the degree of poverty. There is also substantial variability across villages in

the level of poverty, reflected in the significant variation of eligibility rates across localities.

One of the main reasons for our use of the May 1999 wave of the survey is the richness of the con-

sumption and expenditure data it contains, in addition to a number of demographic and socioeconomic

variables. In the case of food and drink, the survey contains information on weekly expenditure and quan-

tity purchased for 36 goods, together with the quantity consumed and home produced, as mentioned. The

food items recorded include fruits and vegetables, grains and pulses, and meat and animal products. The

list is supposed to be exhaustive of the foods consumed by households.

Given that the survey contains information on quantities purchased and consumed for each recorded

item as well as total household expenditure on each item, it is possible to determine unit values for each

food item as measured by the ratio of expenditure to quantity consumed. From now on, we term unit

6

values as prices.2 Given its high degree of storability, relative homogeneity, and the large frequency

of households purchasing it—on average 86% of observations overall and 82% in our sample—in our

empirical analysis we focus on rice. The evidence on quantity discounts, however, can be obtained for a

variety of commodities.

For a sense of the presence of quantity discounts and their importance, we perform the exercise in

Attanasio and Frayne (2006) on our data for a variety of different commodities. In particular, we try to

identify the price schedule faced by consumers by regressing the log of price of a commodity on the log

of quantity. To take into account that quantity is endogenously determined and that the estimates of the

coefficient on such a variable could be affected by the presence of measurement error (and by the fact

that “prices” are obtained by dividing the total value by the quantity purchased, which then appears on the

right hand side of the equation), we use an instrumental variable approach as Attanasio and Frayne (2006),

where variables that determine demand but are not likely to affect supply, such as total household income

and demographic composition, are used to instrument quantity. In Table 1, we report results for rice,

beans, sugar, tomatoes and tortillas. The top panel of the Table contains results obtained on all available

data, whereas in the bottom one we focus on municipalities with at least 100 households. The number of

observations varies across commodities because of the different frequency of transactions for which we

have information. The elasticity of prices to quantities along the supply schedules we identify is largest

for rice, −0.2. It is also high for tomatoes (−0.15) and tortillas (−0.16); it is smaller (in absolute value)

for beans and sugar but statistically different from zero. We note that the estimates we obtain do not vary

when we restrict attention to a smaller set of municipalities.

The models we study below relate the shape of the price schedule to the distribution of quantities in

a given market. Therefore, to perform the empirical exercise we propose, we need to define a market.

Ideally, one would like to consider a relatively isolated market in which one or any number of sellers

face heterogeneous buyers and enjoy some degree of market power. It would then be natural to use the

“locality” as the unit of analysis. As our strategy will rely on estimating the distribution of quantities and

prices within each “village,” given the size of the localities and the number of transactions we observe,

using localities results in too few observed transactions. We therefore decided to use the “municipality” as

the unit of analysis. Each municipality is composed of several localities, not all of which are included in

the survey. The municipality is the administrative unit and, therefore, the assumption that it constitutes the

relevant market is not too far fetched. As mentioned above, the 506 localities belong to 191 municipalities.

In all but one of these municipalities, we observe purchases of rice. In the empirical exercise below, we

focus on the market for rice, as it is one of the commodities with the lowest number of zeros in the sample

we use. To minimize the impact of measurement error, we focus on rice purchases up to 3 kilos at a

median price not higher than 16 pesos. These restrictions imply the loss of very few observations. In

2For measurement issues involved in the construction of unit values, see Attanasio et al. (2009). The authors show that unitvalues are to a large extent validated by the available information on store-level prices in each village.

7

Table 1: Price ScheduleFull Sample

Rice Beans Sugar Tomatoes TortillasLn(quantity) −0.204 −0.093 −0.070 −0.157 −0.163

(0.026) (0.013) (0.016) (0.019) (0.028)Constant 1.945 2.334 1.806 1.721 1.535

(0.011) (0.009) (0.008) (0.008) (0.064)

Observations 13301 19499 20476 20223 5277R2 0.164 0.067 0.047 0.103 0.125

Restricted Sample: Villages with at Least 100 HouseholdsRice Beans Sugar Tomatoes Tortillas

Ln(quantity) −0.207 −0.095 −0.060 −0.143 −0.160(0.030) (0.015) (0.019) (0.022) (0.031)

Constant 1.947 2.33 1.803 1.731 1.539(0.013) (0.0106) (0.008) (0.009) (0.071)

Observations 10622 15414 16118 15957 4185R2 0.166 0.070 0.040 0.088 0.133

Note: instrumental variable estimates (instruments: family compositionvariables and age); clustered standard errors at village level in parentheses

this sample, households consume between 0.1 to 3 kilos of rice, with a mean of 0.80 kilos and a standard

deviation of 0.44 kilos, at a unit price ranging from 0.67 to 16 pesos per kilo, with a mean of 7.64 pesos

and a standard deviation of 2.26 pesos.

The leading model we present in the next section considers a seller in a fairly closed market facing

an heterogeneous population of consumers. As we discuss below, this model can account for virtually

any degree of competition. Having said that, it is interesting to consider the typical market structure in

the villages in our sample. Of the 497 villages in the data set for which we have this information, 4 have

a mercado publico (public market), 108 have a tienda Diconsa (government regulated store), 5 have an

almacen or botega de abasto (supply warehouse), 167 have a tiendas de abarrotes (grocery store), 9 have

a tianguis (an open air market or bazaar that is traditionally held on certain market days in a town or city

neighborhood), 4 have a mercado regional (regional market), 5 have a mercado ambulante (street market),

21 have a mercado sobre ruedas (a street market that is usually installed outdoors in one or more specific

days of the week), and 246 have a comercio casero (home trade). We therefore conclude that supply is

indeed highly concentrated on a handful of stores of similar type.

3 Models of Price DiscriminationAs just reviewed, prices per unit are nonlinear and imply quantity discounts in the villages in our data.

A simple model that generates these features is that of Maskin and Riley (1984) of price discrimination,

which assumes that a seller screens consumers by their marginal willingness to pay according to the

quantities they purchase. This model, however, can be too restrictive for the context we study because

8

it assumes that consumers have the same reservation utility. Therefore, we build on the model of Jullien

(2000) who assumes that consumers differ not only in their willingness to pay for a good but also in their

reservation utility in order to capture flexibly the value of consumption possibilities alternative to trading

with a particular seller.

Suitable interpretations of consumers’ reservation utilities can then accommodate different environ-

ments of interest in our context. A particularly relevant case arises when consumers face subsistence

constraints in consumption, which give rise to a budget constraint on the expenditure on a seller’s good.

Models with these types of budget constraints are known to be intractable (see Che and Gale (2000)): the

optimal pricing schedule in these settings is only known for special cases, when, for instance, utility is

linear in consumption (see Che and Gale (2000)) or the budget is identical across consumers so that a

seller has no incentive to price discriminate across all consumers (see Thomas (2002)). Key to our ap-

proach to characterizing nonlinear pricing in the presence of these budget constraints is that under simple

conditions, a model with heterogeneous reservation utilities is equivalent to a model with heterogeneous

budget constraints. Establishing this equivalence allows us to adapt Jullien’s results to such a setting.

The model we propose captures a variety of situations that are likely to be relevant for our application.

First, consumers in our data have access to a wide range of outside options: households in a village may

purchase a good from sellers in other villages, even sellers whose behavior is not determined by the market

as is the case for government-regulated Diconsa stores; they may have the ability to produce a good as

an alternative to purchasing it; or they may receive a good from relatives, friends, or the government as a

transfer. Second, while we analyze the problem of a single seller, since a consumer’s reservation utility

can be interpreted as the indirect utility obtained when purchasing from other sellers, the model accounts

for varying degrees of market power, ranging from monopoly to monopolistic competition to near perfect

competition. Finally, consumers may have preferences for other goods, some of which may be offered

by a same seller. We assume that a consumer’s utility is separable in the quantity of each such good and

that a seller prices his goods independently—see Stole (2006) for a discussion of situations in which this

component pricing is indeed optimal.3

As common in the nonlinear pricing literature, our framework implicitly excludes the possibility of

collusion among consumers, for instance, through resale. Interestingly, we do not observe resale among

consumers in our data. A natural question is why consumers do not form coalitions, buy bulk, and resell

the quantity purchased from the seller among themselves at linear prices. A possible answer is that our

context is that of small, isolated, and quite geographically dispersed communities in rural Mexico that

largely function as informal economies. Thus, it might be difficult for consumers to engage in the type

of contracts that would sustain resale. Conceptually, such a situation can be translated into a simple

assumption on imperfections in contracting between consumers similar to the assumption on imperfections

3Near competition obtains when every consumer’s reservation utility is arbitrarily close to first-best utility and marginalprices are close to marginal cost.

9

in contracting between sellers and consumers usually maintained in models of nonlinear pricing. 4

3.1 A Model With Heterogeneous Outside Options

We model a village as a market in which consumers (households) and a seller exchange a quantity q ∈[0,∞) of a good—rice in our case—for the monetary transfer t. Consumers’ preferences depend on a

characteristic, θ, continuously distributed across them with support [θ, θ], θ > 0, cumulative distribution

function F (θ), and probability density function f(θ), positive for θ ∈ (θ, θ). We assume that the seller

observes θ but that prices contingent on consumers’ characteristics are not enforceable. For convenience

only, we assume equivalently that the seller does not observe θ and rely on results from the literature on

mechanism design with private information in our derivations. We believe the first interpretation better

accords to our data. The seller, however, is allowed to post a nonlinear price schedule for all consumers

entailing possibly different unit prices for each offered quantity.

Upon trade, a consumer obtains utility v(θ, q) − t, with v(·, ·) twice continuously differentiable,

vq(θ, q) > 0, and vqq(θ, q) ≤ 0, whereas the seller obtains profit t − c(q), where c(·) is the cost of

producing q and is assumed to be twice continuously differentiable. We normalize the seller’s reservation

profit to zero. Let s(θ, q) = v(θ, q) − c(q) denote the social surplus from trade. We define the first-best

quantity, qFB(θ), as the one that maximizes social surplus when the consumer’s type is θ.

We assume, as standard, that vθq(θ, q) > 0 for q positive so that consumers can be ranked according

to their marginal utility from the good. We also maintain that sq(θ, q)/vθq(θ, q) decreases with q, which

ensures that a seller’s problem admits a unique solution and that first-order conditions are necessary and

sufficient to characterize its solution. This assumption plays the same role as the assumptions that s(θ, ·)is concave in q and vθ(θ, ·) is convex in q in the standard nonlinear pricing model. A consumer of type θ is

said to participate when the consumer purchases a single quantity q(θ) with probability one—the restric-

tion to deterministic contracts is without loss here. Since we focus on situations in which all consumers

purchase from the seller, q = 0 is interpreted as the limit when the contracted quantity becomes small.

Note that with exclusion, the optimal contract for types who participate would be the same as the one

characterized here.

By the revelation principle, a contract can be summarized by a menu {t(θ), q(θ)} such that the best

choice within the menu for a consumer of type θ is to choose the price t(θ) and the quantity q(θ), that is,

the menu is incentive compatible. Let u(θ) = v(θ, q(θ))− t(θ) denote the utility of a consumer of type θ

when purchasing from the seller. Define the reservation utility, u(θ), to be the consumer’s utility when the

consumer does not purchase from the seller. The seller’s optimal menu is chosen to maximize expected

4Formally, if enforcement or transaction costs, like commuting within or across villages, for each consumer were of theorder of the surplus from trade net of consumer’s and seller’s outside options or higher, then not even a coalition of all consumerscould achieve higher utility for any member than the utility each member obtains by trading with a nonlinear pricing seller.

10

profits subject to incentive compatibility and participation constraints, that is,

(IR problem) max{t(θ),q(θ)}

∫ θ

θ

[t(θ)− c(q(θ))]f(θ)dθ s.t.

(IC) v(θ, q(θ))− t(θ) ≥ v(θ, q(θ′))− t(θ′) for any θ, θ′

(IR) u(θ) ≥ u(θ) for any θ.

We refer to this model in which the seller’s constraints are IC and IR as the IR model, and define an

allocation {u(θ), q(θ)} to be implementable if it satisfies the IC and IR constraints. Clearly, the IC con-

straint is satisfied for a consumer of type θ if choosing q(θ) for the price of t(θ) maximizes the left-hand

side of the constraint. Taking first-order conditions, this requires vq(θ, q(θ))q′(θ) = t′(θ) or, equivalently,

u′(θ) = vθ(θ, q(θ)). Since vθq(θ, q) > 0, an allocation is incentive compatible, as usual, if, and only if, it

is locally incentive compatible in that u′(θ) = vθ(θ, q(θ)), the schedule q(θ) is weakly increasing (a.e.),

and the utility u(θ) is absolutely continuous.

Jullien (2000) shows that under three assumptions on the distribution of consumer types and utility,

homogeneity, potential separation, and full participation, there exists a unique optimal solution with full

participation to the seller’s problem, which is characterized by the first-order condition

vq(θ, q(θ))− c′(q(θ)) =γ(θ)− F (θ)

f(θ)vθq(θ, q(θ)) (1)

for each type, together with the complementary slackness condition associated with the IR constraints,

∫ θ

θ

[u(θ)− u(θ)]dγ(θ) = 0, (2)

with q(θ) weakly increasing.5 Note γ(θ) =∫ θθdγ(x), the cumulative multiplier on the IR constraints, has

the properties of a cumulative distribution function, that is, it is nonnegative, increasing, and γ(θ) = 1.

The integral in the definition of γ(θ) is interpreted as accommodating not just discrete and continuous

distributions but also mixed discrete-continuous ones. That is, this formulation covers the possibility

that the IR constraints bind at isolated points. For instance, in the standard nonlinear pricing model,

consumers’ reservation utility is assumed to be independent of θ, so the IR constraints simplify to u(θ) ≥ u

and bind only for the lowest type. Thus, γ(θ) = 1 for all types, and γ(θ) has a mass point at θ.6 By writing

the seller’s first-order condition in explicit form as q(θ) = l(γ, θ) for an arbitrary cumulative multiplier γ,

it follows that the solution to the seller’s problem can be expressed as q(θ) = l(γ(θ), θ) with associated

price schedule t(θ) = v(θ, q(θ))− u(θ). See the Supplementary Appendix for details.

5Homogeneity requires a menu inducing full participation with binding IR constraints for all types be incentive compatible.6It is understood that q(θ) is evaluated taking the left-limit at jump points.

11

We interpret the reservation utility, u(θ), as the value of purchasing from another seller or producing

the good at home. Note that by varying the level of the reservation utility, u(θ), the model can accommo-

date very different degrees of market power among sellers, ranging from no market power to any degree

of monopoly power. For instance, when the reservation utility equals the social surplus under first best for

all θ, that is, u(θ) = v(θ, qFB(θ))−c(qFB(θ)), the solution to the seller’s problem implies γ(θ) = F (θ)

for all types. Accordingly, consumers purchase from the seller first-best quantities at the first-best nonlin-

ear prices of c(qFB(θ)). As the reservation utility is lowered, a seller’s profit correspondingly increases,

allowing the model to capture any degree of monopoly power. The ability of this model to approximate ar-

bitrarily well perfect competition is an important advantage over the standard model for the measurement

exercises of later sections.

The characterization of the optimal menu depends crucially on the shape of the reservation utility.

As discussed in Jullien (2000), when the degree of convexity of u(θ) is high or low, only two types of

menus are optimal, referred to as the highly-convex and weakly-convex cases, which are characterized by

opposite patterns of consumption distortions relative to first best if sellers have market power. We provide

an example of these two cases in Figure 1 with power-law distributed types, v(θ, q) = θν(q), where

ν(q) = (1− d)[aq/(1− d) + b]d/d is a three-parameter HARA function, and constant marginal cost. We

let θ = 1, θ = 2.2, a = c = 1, b = 0, and d = 1/2. See the Supplementary Appendix for details.

Highly-Convex Case. This case arises when the participation constraint binds at isolated points. Since

q(θ) is continuous, γ(θ) can have mass points only at θ or θ, and so the IR constraint can bind at isolated

points only for these extreme types. When so, γ(θ) = γ for all interior types and the optimal quantity is

q(θ) = l(γ, θ). If γ = 0, then the participation constraint binds only for the highest type. As γ ≤ F (θ)

in this case, (almost) all types consume quantities above first best. If, instead, γ = 1, then the constraint

binds only for the lowest type, and the optimal menu coincides with that of the standard model. Since

γ ≥ F (θ) in this case, (almost) all types consume quantities below first best. The most interesting case

arises when 0 < γ < 1 so that the constraint binds for both the lowest and highest types. Since F (θ)

increases from 0 to 1, in this case F (θ) starts below γ, crosses γ at some type θHC , and then lies strictly

above it. By (1), consumer types below θHC consume quantities below first best whereas types above θHCconsume quantities above first best.

Weakly-Convex Case. This case arises when the participation constraint binds for an interval of types,

say, [θ1, θ2]. Let q(θ) denote quantity that implements the reservation utility profile in that u′(θ) =

vθ(θ, q(θ)) for each θ. Then, the optimal quantity is given by q(θ) = l(0, θ) for θ ≤ θ1, q(θ) = q(θ)

for θ1 ≤ θ < θ2, and q(θ) = l(1, θ) for θ ≥ θ2. The cumulative multiplier equals zero up to θ1, increases

up to one between θ1 and θ2, and equals one above θ2. Thus, there exists a type θWC in [θ1, θ2] at which

γ(θ) crosses F (θ): consumer types below θWC consume quantities above first best whereas types above

θWC consume quantities below first best.

12

Figure 1: Highly-Convex and Weakly-Convex Case0

.51

1.5

Util

ity a

nd R

eser

vatio

n U

tility

1 1.2 1.4 1.6 1.8 2 2.2Type

Utility Reservation Utility

Highly-Convex Case

qHC

0.2

.4.6

.81

Dis

trib

utio

n F

unct

ion

of T

ypes

and

Mul

tiplie

r

1 1.2 1.4 1.6 1.8 2 2.2Type

Type Distribution Function Multiplier on IR Constraint

Highly-Convex Case

q1 q2

0.5

11.

52

Util

ity a

nd R

eser

vatio

n U

tility

1 1.2 1.4 1.6 1.8 2 2.2Type

Utility Reservation Utility

Weakly-Convex Case

q1 q2qWC

0.2

.4.6

.81

Dis

trib

utio

n F

unct

ion

of T

ypes

and

Mul

tiplie

r

1 1.2 1.4 1.6 1.8 2 2.2Type

Type Distribution Function Multiplier on IR Constraint

Weakly-Convex Case

3.2 A Model with Budget Constraints

Suppose now that consumers face subsistence constraints. Formally, these constraints give rise to a lim-

ited liability constraint on consumers’ expenditure on the seller’s good that takes the form of a budget

constraint for the good. Here we show that under simple and intuitive conditions, this model is equivalent

to the one of the previous subsection.

Setup. Assume that consumers have quasi-linear preferences over the seller’s good, q, and the numeraire,

z, which represents all other goods. A consumer is characterized by a preference characteristic θ that

affects her valuation of q, as before, and by a parameter w that affects her overall budget or income, Y (w).

The consumer faces a subsistence constraint on the consumption of z of the form z ≥ z(θ, q). Then, the

consumer’s problem is

maxq,z{v(θ, q) + z} s.t. T (q) + z ≤ Y (w) and z ≥ z(θ, q). (3)

The constraint in (3) can be thought of as arising from a situation in which a certain number of calories

are necessary for survival, which can be achieved by consuming q units of the seller’s good and z units of

the numeraire. Such a calorie constraint can be expressed as Cq(θ, q) + Cz(θ)z ≥ C(θ), where Cq(θ, q)

13

and Cz(θ)z are, respectively, the calories produced by the consumption of q units of the seller’s good and

z units of the numeraire, and C(θ) is the subsistence level of calories for a consumer of type θ. We can

then rewrite the calorie constraint as z ≥ [C(θ)− Cq(θ, q)]/Cz(θ) ≡ z(θ, q) as in (3).7

Using the fact that at an optimum the consumer’s budget constraint holds as an equality and substi-

tuting z = Y (w) − T (q) into both the consumer’s objective function and the constraint z ≥ z(θ, q), the

problem in (3) can be restated as

maxq{v(θ, q) + Y (w)− T (q)} s.t. T (q) ≤ I(θ, q, w) ≡ Y (w)− z(θ, q), (4)

where Y (w) is an irrelevant constant in the objective function and I(θ, q, w) is the maximal amount that

the consumer can pay to purchase q units of the seller’s good and still meet her subsistence constraint.

Note that the constraint in (4) is a budget constraint for the seller’s good that arises from the consumer’s

subsistence constraint.

Suppose that when consumers do not purchase from the seller, they can achieve the exogenous utility

level u. Then, the seller’s optimal menu solves

(BC problem) max{t(θ),q(θ)}

∫ θ

θ

[t(θ)− c(q(θ))]f(θ)dθ s.t.

(IC) v(θ, q(θ))− t(θ) ≥ v(θ, q(θ′))− t(θ′) for any θ, θ′

(IR’) u(θ) ≥ u for any θ

(BC) t(θ) ≤ I(θ, q(θ), w) for any θ.

We refer to this model in which the seller’s constraints are IC, IR’, and BC as the BC model. We as-

sume that I(θ, q, w) is twice continuously differentiable and Iθ(·), Iq(·) ≥ 0. Although the model admits

heterogeneity in θ and w, in the text we consider the case of constant w and suppress the dependence of

I(θ, q, w) on w. We discuss the effect of this additional dimension of heterogeneity on an optimal menu

in Appendix A. In analogy to the homogeneity assumption in the IR model, we assume that there exists a

menu {t(θ), q(θ)} such that when faced with this menu, the budget constraint of each consumer holds as

equality and each consumer demands an incentive compatible quantity, q(θ):

(BC homogeneity) t(θ) = I(θ, q(θ)), t′(θ) = vq(θ, q(θ))q′(θ), and q(θ) is weakly increasing. (5)

This condition requires that there exists an incentive compatible way to induce a consumer to spend her

entire budget for the seller’s good. Importantly, under BC homogeneity, incentive compatibility can be

satisfied when the budget constraint binds. Like in the IR model, this condition then ensures that there7This formulation of the constraint generalizes the common one used, for instance, by Jensen and Miller (2008), Cqq +

Czz ≥ C, where Cq and Cz are the calories provided by one unit of q and one unit of z, and C is the subsistence calorie intake.

14

exists an implementable menu inducing all consumers to participate.

Equivalence Between Participation and Budget Constraints. The seller’s problem with constraints

IC, IR’, and BC has no known solution. Here we proceed to characterize the optimal menu indirectly by

establishing an equivalence between the BC problem and the IR problem. A natural approach, which leads

to a simple constructive argument, would be to define the income schedule of a consumer as I(θ, q(θ)) =

v(θ, q(θ)) − u(θ). Since, by definition, t(θ) = v(θ, q(θ)) − u(θ), it is immediate that in this case the BC

constraint is equivalent to the reservation utility constraint of the IR model.

Although such an approach is intuitive as it directly relates reservation utilities and budgets, it is unduly

restrictive in requiring the schedules of reservation utilities and budgets to agree for each type. For the

two problems to be equivalent, it is sufficient that the slope of the income schedule Iq(θ, q) coincides

with the marginal utility schedule, vq(θ, q), just for types whose IR constraints bind in the IR problem or,

equivalently, vq(θ, q) coincides with Iq(θ, q) for types whose BC constraints bind in the BC problem. We

establish this result by proving that under such a condition, the first-order and complementary slackness

conditions characterizing the solution to the IR problem, (1) and (2), coincide with those for the BC

problem. Formally, note that, as proved in Appendix A, the BC problem can be conveniently re-stated as

(simple BC problem) max{q(θ)}

∫ θ

θ

v(θ, q(θ))−c(q(θ)) +[F (θ)−Φ(θ)+Φ(θ)−1

f(θ)

]vθ(θ, q(θ))

+φ(θ)[I(θ,q(θ))−v(θ,q(θ))]f(θ)

f(θ)dθ, (6)

with q(θ) weakly increasing. In this problem, Φ(θ) =∫ θθφ(x)dx is the cumulative multiplier, defined

analogously to γ(θ), on the budget constraint expressed as I(θ, q, w) ≥ t(θ) = v(θ, q)− u(θ) and φ(θ) is

its derivative. For each type, the first-order conditions to this problem are

vq(θ, q(θ))−c′(q(θ))+

[F (θ)− Φ(θ) + Φ(θ)− 1

f(θ)

]vθq(θ, q(θ))+

φ(θ)[Iq(θ, q(θ))− vq(θ, q(θ))]f(θ)

=0, (7)

along with the complementary slackness condition

∫ θ

θ

{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ) = 0. (8)

Under assumptions analogous to those of the IR model, Result 1 in Appendix A states that an imple-

mentable allocation is optimal if, and only if, there exists a cumulative multiplier function Φ(θ) such that

conditions (7) and (8) are satisfied. We provide a sufficient condition for the first-order and complementary

slackness conditions of the IR and BC problems to coincide in the next proposition.

Proposition 1 (Equivalence Between Problems). The solution to the BC problem coincides with that to

the IR problem if Iq(θ, q) equals vq(θ, q) for types whose participation constraints bind in the IR problem.

For intuition, note that in both problems a seller needs to induce a consumer to purchase in the first

15

place. In principle, a seller can induce a consumer to buy by offering a high enough quantity for a given

price or a low enough price for a given quantity. The IR constraint, however, implicitly places a restriction

on the maximum price a seller can charge to a consumer, since the requirement that u(θ) ≥ u(θ) is

equivalent to v(θ, q(θ)) − u(θ) ≥ t(θ), which effectively constrains a consumer’s expenditure on the

seller’s good. The desired result follows by combining this intuition with the construction of a multiplier

for the BC constraint such that the BC constraint binds in the BC problem if, and only if, the IR constraint

binds in the IR problem. The equivalence in Proposition 1 requires the slope of consumers’ budgets and

marginal utility to agree for some types but extends to any BC problem obtained under a monotone affine

transformation of the preferences of the IR problem.

Corollary 1 (Preferences for Equivalence). Consider an IR problem with preferences v(θ, q) + z and a

BC problem with budget I(θ, q). Let Iq(θ, q) = vq(θ, q) for types whose participation constraints bind in

the IR problem. Then, the solution to the new BC problem with preferences η0(θ)+η1(θ)[v(θ, q)+z], with

η0(θ) and η1(θ) monotonically increasing, and the solution to the IR problem also coincide.

Proposition 1 and its corollary are important for several reasons. From a theoretical point of view,

models with budget-constrained consumers are usually considered intractable. Our result provides a sim-

ple argument for how a model with budget constraints can be represented as a model with a type-dependent

IR constraint, and its solution characterized, even when the preferences of the two problems do not exactly

coincide. Importantly, this result allows us to consider a number of cases of great empirical and practical

relevance. For instance, we can explicitly allow for consumers’ substitution across different goods—as

long as utility is separable between the good under consideration and other goods—in the presence of

subsistence constraints affecting the consumption of each of them. We can also evaluate the effect of

policies, like cash transfers, that directly affect consumers’ ability to pay and, hence, budgets.

For intuition on the environments to which our equivalence result applies, assume v(θ, q) = θν(q),

a specification common in the literature. A key assumption is BC homogeneity, which requires: q(θ)

or, equivalently, its inverse θ(q) to be increasing; T (q) = t(θ(q)) to coincide with Y − z(θ(q), q); and

T ′(q) = θ(q)ν ′(q). Substituting θ(q) = T ′(q)/ν ′(q) in the condition T (q) = Y − z(θ(q), q), it is easy to

see that T (q) is determined by a non-linear first-order differential equation, T (q) = Y −z(T ′(q)/ν ′(q), q),

with boundary condition T (q) = Y − z(θ, q) for q = q(θ). Verifying that BC homogeneity holds is then

equivalent to verifying that this differential equation admits an increasing solution. We show next that this

is the case for a broad range of specifications for z(θ, q).

Proposition 2 (Subsistence Functions Compatible with Equivalence). Let utility be v(θ, q) = θν(q) and

the subsistence function be z(θ, q) = −z1(θ)− z2ν(q), where z′1(θ) = ψ(log(θ − z2)) with ψ(·) positive,

continuous, and integrable, and θ > z2 > 0. Then, a BC problem with I(θ, q) = Y − z(θ, q) satisfies the

BC homogeneity assumption and the condition of Proposition 1.

The conditions in Proposition 2 imply that BC homogeneity holds for a wide range of specifications

16

of the utility function, v(θ, q), and the subsistence function, z(θ, q)—see the proof of Proposition 2 for an

example. The assumption that z(θ, ·) decreases with q, which is equivalent to assuming that the calorie

intake from q, C(θ, ·), increases with q, is natural: the greater the amount consumed of the seller’s good,

the greater the calorie intake. The requirement that ψ(·) be positive or, equivalently, that z(·, q) be a de-

creasing function of θ, is to ensure that q(θ) is increasing. The intuition for why z(·, q) may be decreasing

with θ in practice is that if the same calorie intake can be reached through different combinations of food

items, a consumer who values more the seller’s good may require less of other goods to achieve it.8

3.3 Distributional Effects of Nonlinear Pricing

Given the equivalence just established between models with heterogeneous reservation utilities and models

with budget constraints, from now on we refer to the IR model as the augmented model and think of it as

covering both cases. We now examine some of the implications of this model for the characteristics of an

optimal menu of prices and quantities, for welfare under nonlinear and linear pricing, and for the impact of

policies, like cash transfers, that affect consumers’ ability to pay. Throughout this discussion, we maintain

for simplicity that v(θ, q) = θν(q) and c′(q) = c as in our leading specification in the empirical analysis.

Our results extend to the case of non-separable utility and increasing and convex cost functions.

Prices and Quantities. Since the optimal quantity schedule is increasing, we can define the inverse

function, θ(q), of q = q(θ), and obtain the observed price schedule, just function of quantity, as T (q) =

t(θ(q)). By expressing the local incentive compatibility condition, θν ′(q(θ))q′(θ) = t′(θ), as θν ′(q(θ)) =

T ′(q(θ)), we can then rewrite (1) simply as

T ′(q(θ))− cT ′(q(θ))

=γ(θ)− F (θ)

θf(θ), (9)

with T (q(θ)) given by θν(q(θ)) − u(θ) and u(θ) = u(θ′) +∫ θθ′ν(q(x))dx for some given θ′. The price

schedule exhibits quantity discounts if T ′′(q) ≤ 0 or the unit price p(q) = T (q)/q decreases with q.9

Proposition 3 (Quantity Discounts). If in the highly-convex case

minθ,ρ∈[0,1)

({2θf 2(θ) + [ρ− F (θ)]θf ′(θ)}/{θf 2(θ)− [ρ− F (θ)]f(θ)}

)≥ 1, (10)

and in the weakly-convex case θf 2(θ) ≥ F (θ)[f(θ) + θf ′(θ)] for θ in [θ, θ1], then T ′′(q) ≤ 0 for all types

in the highly-convex case and types in [θ, θ1] and [θ2, θ] in the weakly-convex case.

Note that condition (10) is always satisfied if the type distribution is, for instance, uniform. Observe

also that with constant or convex marginal cost, the price schedule cannot imply quantity discounts at all

8See Lancaster (1966) on the distinction between the caloric and taste attributes of goods and Jensen and Miller (2008) onthe relationship between these attributes and subsistence constraints.

9Note that the two characterizations of quantity discounts are equivalent when q(θ) = 0.

17

quantities in the weakly-convex case: the fact that γ(θ) = 0 and γ(θ) = 1 implies that T ′(q(θ)) = c′(q(θ))

and T ′(q(θ)) = c′(q(θ)) so T ′(q) cannot be decreasing at all quantities. In Appendix A we provide

conditions under which the price schedule entails quantity premia for any type between θ1 and θ2.

As discussed, the highly- and weakly-convex cases of the augmented model have very different impli-

cations for the type of consumption distortions that nonlinear pricing gives rise to. In the highly-convex

case, quantity discounts imply consumption levels below first best for low consumer types but above first

best for high consumer types. The reverse consumption pattern arises in the weakly-convex case. In gen-

eral, by comparing (9) to the first-order condition for the first-best allocation, T ′(q(θ)) = c, it is immediate

that when the difference γ(θ) − F (θ) is positive, the quantity provided by a seller to a consumer type is

below first best whereas when the difference γ(θ)− F (θ) is negative, the quantity provided is above first

best. As both instances of the augmented model give rise to unit prices declining with quantity, quantity

discounts by Proposition 3 are actually consistent with opposite patterns of consumption distortions.

We can provide an intuition for why an optimal menu with quantity discounts leads some consumers

to purchase less than first-best quantities (underprovision) and others more than first-best quantities (over-

provision) through a simple example with only two types of consumers: high, θ, and low, θ, types—the

logic naturally extends to general utility, cost, and continuous type distribution functions.10 Note that a

seller maximizes profits by inducing the two groups of consumers to pay different prices by purchasing

different quantities. Underprovision arises when consumers’ reservation utility does not increase fast with

θ so that as long as the low-type participates, the high-type participates too. Then, by setting a high

enough marginal price decreasing with quantity, a seller can induce a low-type consumer to purchase a

small quantity, at a price at which this consumer just reaches her reservation utility level, but a high-type

consumer to purchase a strictly higher quantity. Indeed, since higher types face a higher marginal benefit

from consuming the good, an understatement of preferences by a high consumer type, through the pur-

chase of a small quantity, is best discouraged by a seller by decreasing the quantity offered to lower types

relative to first best: by offering smaller and smaller quantities to lower types, a seller makes purchasing

these quantities unattractive to higher types. This argument for why a seller provides low-type consumers

with quantities below first best also explains underprovision in the standard model.

The opposite situation occurs when the reservation utility increases fast with θ and it is much higher

for a high-type than for a low-type consumer. In this case, a seller needs to offer an attractive enough

price and quantity combination to a high-type consumer to induce her to purchase. A seller can profitably

do so by offering a high enough quantity to a high-type consumer, even above first best, but at a price at

which the high type’s utility equals her reservation level. By also making the marginal price decrease with

quantity, and sufficiently high at low quantities, a seller can ensure that a low-type consumer purchases

10Formally, incentive constraints are upward binding for all consumer types whose closest marginal type—a type indifferentbetween purchasing and not—is above them. Incentive constraints are downward binding for all consumer types whose closestmarginal type is below them. In the standard model, the only marginal type is the lowest one, so quantity underprovision affectsall types except for the highest one, whose consumption is undistorted.

18

a strictly lower quantity and, thus, separate the two consumer groups. Intuitively, an overstatement of

preferences by low consumer types is best discouraged by a seller by increasing the quantity offered to

higher types relative to first best: as larger and larger quantities are less and less appealing to lower types,

offering higher types large enough quantities makes overstating preference more costly to low types.

Figure 2: Menus Under Augmented and Standard Models

01

23

4T

otal

Pric

e

0 1 2 3Quantity

Augmented Model Standard Model

24

68

1012

Pric

e P

er U

nit

0 1 2 3Quantity


01

23

Qua

ntity

1 1.2 1.4 1.6 1.8 2Marginal Willingness to Pay


Welfare Implications of Alternative Nonlinear Pricing Models. A clear ranking emerges between

the allocations implied by the augmented model and by the standard nonlinear pricing model in terms

of quantity provided, marginal prices, and consumer surplus. The reason is that when the augmented

model leads to allocations different from those implied by the standard model, these allocations entail

higher quantities and lower marginal prices, since for γ(θ) to be different from its value (of one) in

the standard model, it must be necessarily smaller than it. But if so, then it is immediate by (1) that

consumption distortions are smaller under the augmented model, since T ′(q) is closer to c. Then, for

similar (or higher) profiles of reservation utility in the two models, consumer surplus must also be higher.

Intuitively, if it is profitable for the seller to trade with a consumer, then the more attractive outside

consumption opportunities are, the better the seller’s offered price and quantity must be to induce the

consumer to purchase. See Figure 2 for an illustration of the optimal menu in the two models for a

parameterization similar to the one in Figure 1 and Appendix A for details.

Proposition 4 (Augmented vs. Standard Models). Let u be the reservation utility in the standard nonlinear

pricing model. If u(θ) ≥ u, then the augmented nonlinear pricing model implies higher consumption and

consumer surplus and lower marginal prices for each consumer than the standard model.

Nonlinear vs. Linear Pricing. A natural question is whether consumers are better off under nonlinear

pricing compared to linear pricing. In the standard linear pricing problem, a seller charges the unit price

pm for any quantity demanded. A consumer of type θ then chooses q to maximize θν(q) − pmq. By

the consumer’s first-order condition for the choice of q, θν ′(q) = pm, the demand function of a type θ

consumer is qm(θ) = (ν ′)−1 (pm/θ) and consumer surplus is um(θ) = θν(qm(θ)) − pmqm(θ). Given

aggregate demand Q(pm) =∫θqm(θ)f(θ)dθ, a seller’s problem just consists of choosing the linear price

19

pm that maximizes total profit, (pm − c)Q(pm). If the seller has market power, then by usual arguments

consumers are offered quantities below first best at marginal prices above marginal cost.

Whether consumers prefer nonlinear or linear pricing depends on the degree of market participation

that the two pricing schemes induce. When all consumers participate under both schemes and nonlinear

pricing entails quantity discounts, consumers are better off under linear pricing. Intuitively, linear pric-

ing is preferred when the quantity purchased by a consumer is higher under linear than under nonlinear

pricing. In this case, a seller’s ability to price discriminate simply exacerbates the quantity underprovision

typical of non-competitive linear pricing. Perhaps more surprising is that with full participation under

both pricing schemes, consumers prefer linear to nonlinear pricing even when quantities are smaller un-

der linear pricing. The reason is that a seller who can price discriminate across consumers asks for “too

high a price” for the greater quantity he is willing to provide under nonlinear pricing.

Proposition 5 (Nonlinear vs. Linear Pricing With Full Participation). Assume full participation under

nonlinear and linear pricing. Suppose the price schedule exhibits quantity discounts (T ′′(q) ≤ 0 and

p′(q) ≤ 0). If qm(θ) ≥ q(θ), or q(θ) > qm(θ) and γ(θ) < 1, then a consumer of type θ enjoys higher

consumer surplus under linear than under nonlinear pricing.

This proposition implies that for nonlinear pricing to be preferred to linear pricing in the presence

of quantity discounts, some consumers must be excluded from trade under the augmented model.11 In

particular, a consumer who is excluded under linear pricing but included under nonlinear pricing prefers

nonlinear pricing. The next result shows that under a mild sufficient condition, which implies there exist

consumers “at risk” of exclusion, consumers who participate under nonlinear pricing, s(θ, q(θ)) ≥ u(θ),

but have reservation quantities above first best, q(θ) > qFB(θ), would be excluded under linear pricing

and, hence, are better off under nonlinear pricing.12 See Example 1 in Appendix A for an illustration.

Proposition 6 (Nonlinear vs. Linear Pricing With Exclusion). Let ν ′′(·) < 0. Assume s(θ, q(θ)) ≥ u(θ)

and q(θ) > qFB(θ) for consumer types in the interval [θ′, θ′′]. If there exists θ ∈ [θ′, θ′′] with um(θ) = u(θ),

then an interval of consumer types in [θ, θ′′] are excluded from trade under linear pricing but included

under nonlinear pricing and so enjoy higher consumer surplus under nonlinear pricing.

The proposition implies that the availability of desirable enough outside consumption opportunities

provides incentives for a seller to match them when it is possible to include different consumers at different

prices. This result then highlights an efficiency dimension of nonlinear pricing: by price discriminating

across consumers, a seller may have an incentive to serve consumers who would demand unprofitably

large quantities under linear pricing.11In general, when there is some exclusion under either pricing scheme the proposition does not apply, and a consumer

included under both schemes can prefer nonlinear to linear pricing. The proposition also applies to the standard model ifqm(θ) ≥ q(θ) but not necessarily otherwise. When quantity distortions are small, it is easy to show that under the standardmodel, consumer types close to θ may prefer nonlinear pricing, even if they participate both under nonlinear and linear pricing.

12The claim requires the existence of a type whose utility under linear pricing equals her reservation utility. For instance,this occurs when such type’s reservation utility is her first-best utility. In this case, the consumer is either included under linearpricing, and so um(θ) = u(θ), or excluded under linear pricing, in which case again um(θ) = u(θ).

20

Income Transfers. The budget constraint version of our model provides a natural framework to explore

how cash transfers affect sellers’ incentives to price discriminate and, thus, prices and consumption when

individuals have limited ability to pay. Our model implies that income transfers increase the amount of

rice purchased by consumers but also endogenously give rise to higher prices as a seller adjusts his price

schedule in response to the greater ability to pay of consumers. Thus, sellers’ market power is a key force

shaping the impact of transfers on prices and quantities. Indeed, in a model with linear pricing and perfect

competition, transfers would affect prices only if the village economy was isolated and the supply of the

good of interest limited—see also Cunha et al. (2014) on this point. This situation, however, is unlikely

in the context we study where goods like rice are easily available from outside.

When consumers face budget constraints, changes in their income affect prices by creating an incentive

for a seller to take advantage of the greater ability to pay of consumers. To see this point, imagine that

consumers receive a cash transfer τ(θ) > 0. Suppose first that the transfer is independent of consumers’

characteristics (τ(θ) = τ ). Such a transfer then leads to a uniform increase in the price schedule, as

a seller anticipates that consumers can afford to pay more, but has no impact on purchased quantities

or marginal prices: the schedule of quantities offered before the transfer is still incentive compatible.

Suppose now that the transfer depends on consumers’ characteristic, θ, and so affects individual demand.

For instance, in the villages in our data, transfers are inversely proportional to income and, thus, larger for

households purchasing smaller quantities. A progressive transfer that increases consumers’ budgets for the

seller’s good (τ ′(θ) < 0) reduces the rate at which budgets increase with θ and so effectively increases the

marginal utility schedule of all consumers whose budget constraints for the seller’s good bind, u′(θ). By

incentive compatibility, the quantity purchased by each type then increases so the corresponding marginal

price—or, equivalently, the marginal price for a given percentile of quantities before and after the transfer

as F (θ) = G(q)—decreases. As before, given that consumers’ ability to pay has increased, the overall

price schedule increases.

Proposition 7 (Transfers Under Nonlinear Pricing). Consider an income transfer decreasing with θ. In

the highly-convex case for γ ∈ (0, 1] and in the weakly-convex case, the transfer leads to a lower marginal

price schedule and a first-order stochastic improvement in the distribution of quantity purchases. In both

cases, the income transfer induces a higher price schedule.

By Proposition 7, the price schedule becomes lower and flatter after transfers are introduced.13 Since

transfers cause both quantities and prices to increase, their effect on the price per unit of the good and,

thus, on the degree of price discrimination, is in principle ambiguous. Asymmetric changes in unit prices

at low and high quantities leading to a greater intensity of price discrimination occur, however, naturally.

For instance, when the utility function exhibits decreasing absolute risk aversion and the type distribution

is uniform, it is easy to show that in the highly-convex case—a common instance in our data—the price13When γ = 0, the same result applies if q(θ) ≥ l(0, θ). This latter condition ensures that the multiplier does not increase

after the transfer, in which case offered quantities may decrease.

21

per unit increases at low quantities but decreases at high quantities in response to an increase in income,

as the optimal price increase is smaller and smaller for higher types. See the Supplementary Appendix.

Hence, the schedule of unit prices becomes steeper. Recall that in the highly-convex case, quantity is

underprovided to low types but overprovided to high types. Then, such an increase in the degree of price

discrimination is associated with smaller consumption distortions at low quantities, due to the increase in

offered quantities after the transfer, but greater consumption distortions at higher quantities, as the higher

offered quantities accentuate the overprovision implied by nonlinear pricing. Thus, when sellers have

market power, relaxing consumption constraints does not obviously lead to greater consumer surplus—as

discussed, it may lead to lower consumer surplus if the transfer is uniform across consumers—given that

the transfer provides an opportunity for sellers to extract more resources from consumers.

A further implication of this argument is that when markets for food are imperfectly competitive, in-

kind transfers may be preferable to cash transfers, especially if they lead to an increase in the consumption

floor on other goods, z(θ, q), and so to a decrease in the budget available for a seller’s good. In this case,

by the reverse logic of Proposition 7, a transfer can lead to a decrease in prices with virtually no change

in purchased quantities.

4 Identification and EstimationIn this section, we first establish the identification of the model’s primitives building on Perrigne and

Vuong (2010). We then show how to obtain nonparametric and semi-parametric estimators.

4.1 Identification

In this section, we show that the model’s primitives, namely, consumers’ utility function, v(θ, q), the

cumulative distribution function of preference characteristics, F (θ), the associated probability density

function, f(θ), and a seller’s marginal cost, c′(q), can be identified in each village under standard assump-

tions, just based on data on households’ quantity purchases and expenditures that are commonly available

for developing countries. However, u(θ) in the IR model and, hence, I(θ, q) in the BC model can only be

identified for households whose corresponding constraint binds.14

In recovering the model’s primitives, we maintain that the sufficient condition s(θ, q(θ)) ≥ u(θ) for

full participation holds: it states that a seller makes non-negative profits from each consumer’s type at

the reservation quantity. This approach is justified by the fact that all households in each of our villages

consume rice. We also make use of the normalization θ = 1, since a scaling assumption is necessary for

identification. In the discussion that follows, we treat T (q) as known and, for simplicity, we denote the

cumulative multiplier γ(·) as the multiplier.

Since the first-order condition in (1) provides our estimating equation for a seller’s cost structure,

14Any economy with reservation utility u(θ) binding on the set Θ′ is observationally equivalent to an economy with thesame primitives but reservation utility u(θ) that agrees with u(θ) on Θ′ and is appropriately adjusted for the remaining types.

22

we can identify nonparametrically marginal cost at most at finitely many points without any other in-

formation in addition to prices and quantities consumed. Based on these data and the assumption of

known or parametric marginal cost, however, the remaining primitives are nonparametrically identified

for v(θ, q) = θν(q) if the support of θ is unknown, or for any v(θ, q) satisfying the assumed curva-

ture restrictions if the support of θ is known. The multiplicative specification of utility we consider,

v(θ, q) = θν(q), is ubiquitous in the theoretical and empirical literature on auctions and nonlinear pricing

both for its analytical tractability and for reasons of identification (see Guerre, Perrigne, and Vuong (2000)

and Perrigne and Vuong (2010)). We maintain that v(θ, q) = θν(q) throughout, and discuss relaxations of

this assumption as well as similarities between out setup and those common in the nonlinear pricing and

hedonic pricing literature in the Supplementary Appendix.

The assumption of a parametric cost function is common. The specific assumption of constant marginal

cost is consistent with anecdotal evidence on the cost of provision of basic staples in the villages in our

data. The marginal cost of rice for a seller is, basically, the wholesale price of rice and it is difficult to

imagine what would induce such a cost to vary with quantity.

Marginal Cost and Multipliers on Participation Constraint. The relationship between θ and q implied

by incentive compatibility is central to the identification of the model. To see why, denote by G(q) the

distribution of quantities across consumers in a village with density function g(q), and by q ≡ q(θ) and

q ≡ q(θ), respectively, the smallest and largest observed quantity. Since q = q(θ) and q′ ≥ 0, it is

immediate that G(q) = Pr(q ≤ q) = Pr(θ ≤ q−1(q) = θ) = F (θ). Hence, F (θ) is identified by G(q).

Moreover, f(θ) = g(q)q′(θ). Our model also implies a relationship between the distribution of quantities

and marginal prices that provides direct information about a seller’s marginal cost and the set of consumers

whose participation (or budget) constraints bind. Let’s re-write the seller’s first-order condition (1) as:

log [T ′(q)] = log [c′(q)]− log

[1 +

x(q)

g(q)

], (11)

where x(q) = [G(q)− γ(θ(q))]θ′(q)/θ(q). It is then apparent that a semiparametric nonlinear relationship

links T ′(q) to c′(q) and g(q). Based on (11), information on prices and quantities is sufficient to identify

whether the highly- or weakly-convex case of our model applies. In both cases, a seller’s marginal cost,

up to a finite number of parameters, and the multiplier on the relevant constraint can be easily recovered.

Intuitively, the number of times the function x(q) equals zero, and the percentiles of the distribution of

quantities at which this occurs, are directly informative about which offered quantity is undistorted, since

T ′(q) = c′(q) when x(q) = 0 by (11) or, equivalently,G(q) = γ(θ(q)) by (1) and (11). At these quantities,

c′(q) is then identified by T ′(q), and γ(θ(q)) by G(q). Recall that in the highly-convex case, the multiplier

γ(θ(q)) is constant, so G(q) = γ and then T ′(q) = c′(q) only at one interior quantity between q and q. In

the weakly-convex case, instead, T ′(q) = c′(q) at q, q, and a unique interior quantity.15 Hence, when the

15In this case single-crossing of T ′(q) and c′(q) is due to single-crossing of q(θ) and l(γ, θ) for each γ. Indeed, [γ(θ) −

23

function x(q) crosses the x-axis only at one quantity between q and q, we infer that the highly-convex case

applies. When the function x(q) crosses the x-axis exactly three times, at q, q, and one quantity between

q and q, we infer that the weakly-convex case applies.

In the highly-convex case, the quantity q(θHC) at which x(q) equals zero identifies the multiplier,

since γ = G(q(θHC)), and marginal cost when constant, since c′(q) = T ′(q(θHC)). When the resulting

γ ∈ (0, 1), then the relevant constraint binds only for the lowest and highest types. When γ ∈ (0, 1),

then the relevant constraint binds only for the highest type. In the special case in which γ equals one and

c = T ′(q), the standard nonlinear pricing model applies, and condition (1) reduces to

1

T ′(q)= λ0 + λ1

G(q)[1− T ′(q)/T ′(q)]1−G(q)

+ λ2[T ′(q)/T ′(q)− 1]

1−G(q), (12)

with λ0 = λ1 = λ2 = 1/c. Then, with constant marginal cost, testing if λ0, λ1, and λ2 are not significantly

different from each other provides a test of the standard model, which only requires knowledge of marginal

prices and the cumulative distribution function of quantities. When marginal cost is not constant and

unknown up to a finite number of parameters, a similar argument applies: both c′(q) and x(q) are identified

by (11), which can be interpreted as a standard additive semiparametric model for c′(q) and x(q) with T ′(q)

and g(q) known. As before, when x(q) equals zero once, G(q) evaluated at this quantity identifies γ.

In the weakly-convex case, T ′(q) evaluated at any of the three quantities at which x(q) equals zero

identifies cwhen marginal cost is constant. As for the multiplier, denote byA(q) the coefficient of absolute

risk aversion. Since θ′(q)/θ(q) = T ′′(q)/T ′(q) + A(q) by local incentive compatibility, (1) also implies

G(q) = γ(θ(q))− g(q)

[1− c′(q)

T ′(q)

] [T ′′(q)

T ′(q)+ A(q)

]. (13)

That is, once c′(q) is identified, γ(θ(q)) is identified up to A(q). Then, q1 is identified as the largest

quantity at which γ(θ(q)) is zero and q2 as the smallest quantity at which γ(θ(q)) is one: the relevant

constraint binds for all consumers purchasing quantities between q1 and q2 (included).

When neither the highly-convex nor the weakly-convex case apply, (11) and (13) identify marginal

cost and multipliers by the same argument as in the weakly-convex case. The relevant constraint binds for

all consumers purchasing quantities at which the derivative of γ(θ(q)) with respect to quantity is nonzero.

Proposition 8 (Identification of Marginal Cost and Multipliers). In a village, the number of zeros of

the function x(q) in (11) identify whether the highly-convex, the weakly-convex, or neither case applies.

Marginal cost (up to a finite number of parameters) and the schedule of multipliers are identified from

the cumulative distribution and probability density functions of quantities and from the marginal price

schedule by (11) in the highly-convex case, and by (11) and (13), up to the coefficient of absolute risk

aversion, in the remaining cases. With constant marginal cost, a necessary condition for the standard

F (θ)]/f(θ) is negative at θ, positive at θ, and increases from θ to θ by the potential separation assumption. See Jullien (2000).

24

model to apply is that the ratio of λ0, λ1, and λ2 in (12) does not significantly differ from 1.

Type Distribution and Preferences. Since F (θ) =G(q) and f(θ) = g(q)/θ′(q), the seller’s first-order

condition can be rewritten as θ′(q)/θ(q) = g(q)[T ′(q) − c]/{T ′(q)[γ(θ(q)) − G(q)]}. Integrating both

sides of this expression from q to q, it follows

log(θ(q)) = log(θ(q)) +

∫ q

q

∂ log(θ(x))

∂xdx = log(θ(q)) +

∫ q

q

g(x)[T ′(x)− c]T ′(x)[γ(θ(x))−G(x)]

dx. (14)

Note that the integrand term in (14) is positive since g(q) > 0, T ′(q) > 0, and, in addition, T ′(q) ≥ c

if, and only if, γ(θ(q)) ≥ G(q). This term is also well-defined both in the highly- and weakly-convex

case, despite the existence of quantities at which γ(θ(q)) = G(q) and so T ′(q) = c; see Appendix A. We

then obtain θ(q) as the exponential transformation of the right-most side of (14). Once marginal cost and

the schedule of multipliers are identified, (14) implies that θ(q) is a known function of objects that are

observed, like q, q, and T ′(q), or identified, like g(q) and G(q). Hence, θ(q) is also identified up to θ(q).

Proposition 9 (Identification of the Type Distribution). In a village, once marginal cost and the schedule

of multipliers are identified, the type support is identified from the probability density and cumulative dis-

tribution functions of quantities and from the marginal price schedule by (14) up to scale. The probability

density function of types, f(θ), is identified from the probability density function of quantities and the

derivative of the inverse of θ(q) by (14).

To see how u(θ), at points at which the participation (or budget) constraint binds, and ν(q) are identi-

fied, note first that once θ(q) is identified, the incentive compatibility condition ν ′(q) = T ′(q)/θ(q) implies

that the base marginal utility function, ν ′(q), is also identified from T ′(q). Once ν ′(q) is identified, we can

recover ν(q) as long as ν(q) or u(θ) is known at one point, say, q′ = q(θ′). Suppose, then, that the reserva-

tion utility u(θ) is known for at least one consumer type, θ′. We can then use ν(q′) = [u(θ′)+T (q(θ′))]/θ′

to identify ν(q′), and recover ν(q) as ν(q) = ν(q′)−∫ q′qν ′(x)dx for q ≤ q′ and ν(q) = ν(q′)+

∫ qq′ν ′(x)dx

for q ≥ q′. Lastly, once ν(q) is identified, we can also identify u(θ) as u(θ) = θ(q)ν(q)− T (q) for those

consumers whose participation (or budget) constraints are identified to bind.

Proposition 10 (Identification of Utility). In a village, once marginal cost, the schedule of multipliers,

and the support of preference characteristics are identified, the base marginal utility function, ν ′(q), is

identified from the marginal price schedule by the incentive compatibility condition ν ′(q) = T ′(q)/θ(q).

Then, up to the utility of one consumer type whose participation (or budget) constraint binds, the utility

function, θ(q)ν(q), is identified. The reservation utility function, u(θ), is identified for consumer types

whose participation (or budget) constraints bind.

Identification with Parametric Utility Curvature. Assuming the utility function takes the semipara-

metric form θν(q), with ν(q) specified as a member of a flexible parametric family, we can easily identify

25

γ(·) as well as all other model’s primitives based on our information on prices and quantities. Specifically,

suppose that ν(q) is a member of the HARA family with ν(q) = (1− d)[aq/(1− d) + b]d/d, a > 0, and

aq/(1− d) + b > 0. By incentive compatibility, T ′(q) = θ(q)ν ′(q), and so

log(T ′(q)) = log(θ(q)) + log

[a

(aq

1− d+b

)d−1]

= log(aθ(q))− (1− d) log

(aq

1− d+ b

).

In this case, a simple additive semiparametric relationship links T ′(q) to q, whose semiparametric compo-

nent identifies θ(q) up to scale. As before, f(θ) is identified from g(q) and θ(q). Thus, only c and γ(·) are

left to be identified. To this purpose, observe that a seller’s first-order condition, expressed as

γ(θ(q)) +θ′(q)g(q)

θ(q)T ′(q)c = G(q) + g(q)

θ′(q)

θ(q), (15)

leads to a system of as many linear equations in γ(θ(q)) and c as distinct observed quantities. With T ′(q)

known and g(q), G(q), θ(q), and θ′(q) identified—note that θ′(q) is identified from θ(q)—it follows that

c and γ(θ(q)) are identified if γ(θ(q)) takes the same value for at least two quantities. If not, then the

participation (or budget) constraint must bind at all quantities, and so there exists (at least) one consumer

type consuming qFB(θ). At this quantity, c equals T ′(q), and so is identified. At the remaining quantities,

γ(θ(q)) is then identified by (15). The identification of the remaining model primitives follows as before.

4.2 Estimation

We estimate the model separately in each village in two steps. After recovering the price schedule and the

distribution of quantities in each village, in the first step we estimate the seller’s marginal cost, the multi-

pliers associated with the participation (or budget) constraint, the support of the distribution of consumers’

marginal willingness to pay, and consumers’ utility. In the second step, we estimate the probability density

function of consumers’ marginal willingness to pay. Omitted details are collected in Appendix C.

Price Schedule. Our data contain direct information on unit prices and on the quantities purchased by

each consumer (household) in each village. Due to the existence of multiple observations on the unit

price of a given quantity in many villages, in each village we use the median unit value of each quantity

as the unit price for that quantity so as to minimize measurement error. Then, the price schedule in a

village, T (q), can be simply obtained by fitting the resulting unit prices, multiplied by the corresponding

quantity, on observed quantities. We do so by least squares allowing for different specifications in each

village.16 Note that when the fit of the estimated specification is good, the error that fitting may cause can

be considered minimal and, thus, ignored. Since the adjusted R2 for all estimated specifications is never

below 0.90 and often close to 1, we treat the price schedule as observed in inference—see Perrigne and

16We can treat quantity as exogenous since the information on the quantity purchased by each consumer provides directinformation on the price schedule of the seller and T (q) is a deterministic function of q under our model.

26

Vuong (2010) for the same assumption.

Distribution of Quantities. Denote by N the number of consumers purchasing rice in a village and

by qi the quantity purchased by consumer i. Then, G(q) can be estimated using a counting process as

G(q) = N−1∑N

i=1 1(qi ≤ q), where 1(·) is an indicator function and q ∈ [q, q]. We estimate g(q) via

a univariate kernel density estimator as g(q) = (Nhq)−1∑N

i=1Kq ((q − qi)/hq) for a suitable choice of

kernel function Kq(·) (Epanechnikov) and bandwidth hq.

Marginal Cost and Multipliers on the Participation Constraint. We test whether the standard non-

linear pricing model or the augmented model applies, and in this latter case identify the relevant case of

the augmented model, as follows. We specify x(q) as a flexible fractional polynomial. Fractional polyno-

mials allow for logarithms and non-integer powers, thus encompassing a wider range of shapes than those

obtained with regular polynomials. Given the granularity of our data, estimating x(q) nonparametrically

would be impractical. We then estimate (11) by generalized method of moments subject to the constraint

that c ranges between T ′(qmax) and T ′(qmin): since the model implies that the participation (or budget)

constraint binds for at least one consumer type, there must exist a quantity at which T ′(q) = c. Note that

this procedure allows us to obtain in one step estimates of c and x(q), and achieve standard convergence

rates in the next estimation step, while allowing for a flexible yet parsimonious specification of x(q).

We determine the relevant case of the augmented model by testing for the number of zeros of the

function x(q). Specifically, we conduct a multiple test of the hypotheses that x(q) equals zero at any of

theN points in the support of quantities in a village. We do so by computing the p-values of the hypotheses

that x(qi) = 0 at each distinct quantity i in the village. Since we test multiple linear constraints in each

village, we computed Hom-adjusted p-values to suitably bound the probability of falsely rejecting one of

the null hypotheses. If all p-values are above a given confidence level (five percent) except for one quantity,

then we reject the hypothesis that x(q) equals zero more than once. Correspondingly, we infer that the

highly-convex case of the augmented model applies to the village. If, instead, all p-values are above the set

confidence level except for three quantities, two of which are the lowest and highest quantities in a village,

then we infer that the weakly-convex case applies to the village. (Recall that in the weakly-convex case,

x(q) should be zero at q, q, and at an intermediate interior quantity.) Otherwise, we infer that the village

cannot be categorized as an instance of either the highly- or weakly-convex case.

When the highly-convex case applies, we estimate q(θHC) as the unique quantity q(θHC) at which

x(q) equals zero, and the constant multiplier γ as γ = G(q(θHC)). Note that if q(θHC) = q so that γ = 1,

then we cannot reject that the standard model applies to a village. When, instead, the weakly-convex case

applies, we specify γ(θ(q)) as a logistic function andA(q) as a flexible positive function of q, and estimate

them by generalized method of moments from by (13) as

G(q) = γ(q;ψγ)− g(q)

[1− c

T ′(q)

] [T ′′(q)

T ′(q)+ A(q;ψA)

],

27

where ψγ and ψA are vectors of parameters. We recover q1 as the largest quantity at which γ(q;ψγ) is

not significantly different from zero, and q2 as the smallest quantity at which γ(q;ψγ) is not significantly

different from one. Then, γ(q) = 0 below q1, γ(q) = γ(q;ψγ) between q1 and q2, and γ(q) = 1 above q2.

Support and Probability Density Function of Types. Since θi = θ(qi) by (14), given a kernel function

Kθ(·) (Epanechnikov) and bandwidth hθ, we estimate the density f(θ) nonparametrically as

f(θ) = (Nhθ)−1∑N

i=1Kθ((θ − θi)/hθ). (16)

Utility Function. We estimate marginal utility as ν ′(qi)=T ′(qi)/θ(qi) from the local incentive compat-

ibility condition θν ′(q)=T ′(q). Then, θiν(qi) can be obtained from θi = θ(qi) and ν ′(qi).

5 Estimation ResultsHere, we present evidence on the fit of the model to the data and estimates of the model’s primitives.

We then assess the consumption distortions associated with observed nonlinear pricing, and the degree

of sellers’ market power. We also compare the distributional properties of the observed allocations in

each village to the counterfactual ones that would emerge if sellers could not price discriminate across

consumers by charging different prices for different quantities. Finally, we consider the effect of the

Progresa cash transfers on prices and compare the predictions of our model to this evidence.

5.1 Village Categorization

As discussed in Section 2, our data cover 191 villages, which we take to coincide with the administrative

definition of a Mexican municipality. At least 100 households consume rice in 38 of these villages, 31 of

which satisfy a necessary condition for q(θ) to be increasing under the standard model. This condition is

a mild regularity requirement on the price schedule, which we impose in estimation, that just depends on

marginal prices and the distribution of quantity purchases in a village. See Appendix B for details.

Given the observed price schedule and distribution of quantity purchases, our first step estimation of

sellers’ marginal cost and multipliers on the participation (or budget) constraint converged on 24 of these

31 villages.17 We refer to these 24 villages as our estimation sample. Following the procedure outlined in

Section 4.2, we categorize villages as highly-convex, weakly-convex, or not classifiable as either instance

of our model. Based on this procedure, we found that 11 of these 24 villages conform to the highly-

convex case of our model, no village conforms to the weakly-convex case, and the remaining 13 villages

cannot be categorized as either instance. We refer to these 13 villages as the non-regular sample and to the

remaining 11 villages of the highly-convex case as the regular sample, which we focus on here. Estimates

of the model’s primitives on the 13 villages of the non-regular sample are presented in Appendix B.

17In practice, the numerous points of singularity of the right side of (11), together with the relative flatness of the scheduleof unit prices, created numerical indeterminacy that prevented the estimation of c and x(q) for those villages.

28

Figure 3: Unit Prices and Probability Density Function of Quantities5

7.5

1012

.515

Uni

t Pric

e

0 1 2 3Quantity

Regular Sample

01

23

Den

sity

Fun

ctio

n of

Qua

ntiti

es

0 1 2 3Quantity

Regular Sample

57.

510

12.5

15U

nit P

rice

0 1 2 3Quantity

Non-regular Sample

01

23

Den

sity

Fun

ctio

n of

Qua

ntiti

es

0 1 2 3Quantity

Non-regular Sample

In Figure 3, we plot the schedule of unit prices and the probability density function of observed quanti-

ties in each of the 24 villages in our estimation sample, separating the 11 villages of the regular sample (top

plots) from the 13 villages of the non-regular sample (bottom plots). Note that in these 13 villages, prices

per unit (bottom left plot) are less steep than in the 11 villages in the regular sample (top left plot), even

increasing with quantity. Also, the probability density function of quantities in these 13 villages (bottom

right plot) is more compressed than in the remaining 11 villages, as consumers are more evenly distributed

across quantities. For an intuition on these differences across samples, note that in the highly-convex case,

γ(θ) = γ and a seller’s first-order condition can be written as

c

T ′(q)+

γ −G(q)

g(q)θ(q)/θ′(q)= 1. (17)

Since γ − G(q) > 0 for θ < θHC , where θHC is the unique interior consumer type offered a first-best

quantity, and T ′(·) is decreasing, it is easy to see that for consumers with types θ < θHC , the denominators

in the two fractions in (17) should approximately move in opposite directions for (17) to hold. This fact

roughly implies that marginal prices, which track unit prices, and the density of quantities purchased

29

should be inversely related at low and intermediate quantities—especially if θ′(q) increases over these

quantities, as we estimate in several villages. This inverse pattern between marginal prices and the density

of quantities at low and intermediate quantities is evident in the regular sample (top panels of Figure

3). Instead, unit prices are unrelated or even positively related to the density of quantities at low and

intermediate quantities in the non-regular sample (bottom panels of Figure 3).

Note that in each of the 11 villages in the regular sample, the price per unit of rice declines with

quantity (top left panel) so unit prices are highest for the households who purchase the smallest quantities.

It is also apparent from the two top plots in Figure 3 that prices decrease more rapidly over the range of

quantities that most households purchase—up to 1.5 kilos per week. Thus, households in each village are

directly affected by the nonlinearity of the price of rice and nearly all of them face significant quantity

discounts: the unit price of the smallest quantity purchased, 0.2 kilos, is more than 15 pesos whereas the

unit price of the largest quantity, 3 kilos, ranges from 4.1 pesos to 5.7 pesos.

5.2 Model Fit and Evidence of Market Power

A central implication of the price discrimination models we have studied is that the shape of the price

schedule is determined by the cumulative distribution and probability density functions of consumers’

marginal willingness to pay (“type”), which is unobserved but is directly related to the observed distribu-

tion of quantities in each village. The first-order condition of a seller is key to these implications, since it

relates the price schedule in a village to the distribution of quantity purchases according to (11). Then, one

way to assess the fit of our model to the data is to determine the extent to which our estimates of c, g(q),

and the auxiliary function x(q) satisfy this relationship. Expressing (11) as g(q)/T ′(q) = [g(q) + x(q)]/c,

we plot in Figure 4 the estimated values of the left and right side of this expression with [g(q) + x(q)]/c

on the x-axis and g(q)/T (q) on the y-axis for each quantity in our estimation sample of 24 villages. Then,

the closer this relationship to the 45-degree line, the better the fit of the model to the data.

The two top plots in Figure 4 differ in that in the right panel, we trimmed the top and bottom 1% of

observations. The two bottom plots in Figure 4 display the same predicted relationship for the regular

sample of 11 villages conforming to the highly-convex case (left panel) and for the non-regular sample

of 13 villages that cannot be categorized as instances of the highly- or weakly-convex case (right panel).

Note that in all these samples, the model fits well the price and quantity data from each village.

This evidence on model fit as well as our village categorization are suggestive of the fact that our data

are consistent with the existence of market power among sellers. Indeed, if prices and quantities were

generated by a perfectly competitive market for rice, then marginal prices would equal marginal cost at

all quantities in each village. Since marginal cost is constant, then so would be average and marginal

prices, which is inconsistent with the left plots in Figure 3. Specifically, under perfect competition, a

seller’s first-order condition would reduce to T ′(q) = c, and so the estimated function x(q) would not

be significantly different from zero at all quantities—recall the argument in Section 4.2. This is not the

30

Figure 4: Model Fit Within and Across Villages0

.2.4

.6R

atio

of D

ensi

ty o

f Qua

ntiti

es to

Mar

gina

l Pric

es

0 .2 .4 .6Sum of Density of Quantities and Auxiliary Function Over Marginal Cost

Estimation Sample

0.1

.2.3

.4R

atio

of D

ensi

ty o

f Qua

ntiti

es to

Mar

gina

l Pric

es

0 .1 .2 .3 .4Sum of Density of Quantities and Auxiliary Function Over Marginal Cost

Estimation Sample (Trimmed)

0.2

.4.6

Rat

io o

f Den

sity

of Q

uant

ities

to M

argi

nal P

rices

0 .2 .4 .6Sum of Density of Quantities and Auxiliary Function Over Marginal Cost

Highly-Convex Case

0.1

.2.3

.4R

atio

of D

ensi

ty o

f Qua

ntiti

es to

Mar

gina

l Pric

es

0 .1 .2 .3 .4Sum of Density of Quantities and Auxiliary Function Over Marginal Cost

Villages Not Categorized

case in our sample: in the 11 villages of the regular sample conforming to the highly-convex case, x(q) is

not significantly different from zero only at one quantity, as consistent with the assumption of the highly-

convex case, whereas in the remaining 13 villages of the non-regular sample, x(q) is not significantly

different from zero only at about half of the quantities.18 We provide evidence on the degree of sellers’

market power and the type of price discrimination practiced in our villages in Section 5.4, where we also

assess the surplus loss associated with observed nonlinear pricing compared to first-best pricing.

5.3 Estimates

We present here estimates of sellers’ marginal cost, the multipliers on consumers’ participation (or budget)

constraints, the distribution of consumers’ marginal willingness to pay, and marginal utility.

Marginal Cost and Multipliers. Figure 5 reports the estimated marginal cost and multiplier on the

participation (or budget) constraint in the 11 villages of the regular sample, ordered by the value of the

multiplier. The figure also depicts pointwise asymptotic confidence bounds for the estimated values of

18By (11), testing for the number of times x(q) = 0 or, equivalently, T ′(q) = c, is equivalent to testing for the number ofquantities with the same marginal price since marginal cost is constant.

31

c and γ in each village. Note that both c and γ are fairly precisely estimated: confidence bounds are so

Figure 5: Estimated Marginal Cost and Multipliers2

34

56

Mar

gina

l Cos

t

0 5 10Village

.85

.9.9

51

Mul

tiplie

r on

Par

ticip

atio

n C

onst

rain

t

0 5 10Village

small that they are barely visible in most villages. Moreover, whereas sellers’ estimated marginal cost

noticeably varies across villages, from 2.823 to 5.914, the range of variation of the multiplier is much

smaller, from 0.876 to 1 with an average value of 0.960. However, the standard model (γ = 1) applies

only to two villages: in all other villages, we can reject that hypothesis that the standard model applies at

standard significance levels. Correspondingly, we infer that in nearly all villages, only consumers with the

lowest and highest marginal willingness to pay for rice have binding participation (or budget) constraints.

In the presence of quantity discounts, our model implies a negative relationship between marginal cost

and multiplier in the highly-convex case: since c = T ′(q(θHC)) and γ = G(q(θHC)) with T ′(·) decreasing

and G(·) increasing, it follows that the higher the marginal cost, the lower the quantity at which γ equals

G(·) and, thus, the lower the value of the multiplier. Hence, if the schedules of marginal prices in all

villages are similar enough, a negative relationship between marginal cost and multiplier should also

emerge across villages. This monotonicity is apparent in Figure 5 where it only fails between villages 10

and 11: in both villages the multiplier is equal to 1, and thus c = T ′(q), but the estimated marginal cost

is much higher in village 11 than in village 10. This large difference in estimated marginal cost between

the two villages is a direct consequence of the large difference in the schedule of marginal prices between

these two villages at all quantities including the largest one, q. See Figure 12 in Appendix B.

Recall that in the highly-convex case there is a single consumer type θHC consuming the first-best

quantity. Households with types below θHC consume quantities of rice below first best whereas households

with types above it consume quantities above first best. Our estimates of c and γ imply that while a large

fraction of households consume quantities below first best in all villages, the quantity q(θHC) above which

overconsumption occurs varies significantly across villages, starting at relatively low quantities in about

half of the villages. That most households are estimated to under-consume is due to the fact that the

estimated q(θHC) is in the fourth quartile of the cumulative distribution of quantities in nine villages and

32

in the third quartile of the distribution of quantities in the remaining two villages. However, this threshold

quantity differs markedly across villages: it falls between 1 and 1.5 kilos in two villages, between 1.5

to 2 kilos in three villages, between 2 to 2.5 kilos in five villages, and between 2.5 and 3 kilos in just

one village. For a sense of the magnitude of this threshold quantity, recall from Figure 3 that quantities

consumed range from 0.1 to 3 kilos across villages but only two households overall consume less than 0.2

kilo. Hence, underconsumption, a common concern among policy makers about the rural poor, is present

in our villages but is neither uniform across households or villages nor exclusive: a significant proportion

of households indeed consume quantities above first best.

Figure 6: Distribution of Types

12

34

56

Est

imat

ed T

ype

0 1 2 3Quantity

Type Support

05

1015

Est

imat

ed R

ever

se H

azar

d R

ate

1 2 3 4 5 6Type

Reverse Hazard Rate

02

46

Est

imat

ed H

azar

d R

ate

1 2 3 4 5 6Type

Hazard Rate

0.2

.4.6

.81

Est

imat

ed C

umul

ativ

e D

istr

ibut

ion

Fun

ctio

n

1 2 3 4 5 6Type

Cumulative Distribution Function

Distribution of Types. In Figure 6, we display the estimates of the type support (top left panel), θ(q), as

a function of the purchased quantity, and of the reverse hazard rate function (top right panel), F (θ)/f(θ),

in each village. Note that in each village, consumers’ estimated marginal willingness to pay for rice,

θ(q), increases with the quantity purchased, as consistent with the incentive compatibility condition of

our model, and the estimated reverse hazard function increases with θ, as consistent with the monotone

hazard rate condition implied by our separation assumption. As also required by our model, the estimated

33

hazard rate function, [1 − F (θ)]/f(θ), decreases everywhere with θ (bottom left panel). Note that none

of these restrictions have been imposed in estimation. As apparent from Figures 13 and 14 in Appendix

B, the function θ(q) and the density f(θ) are also fairly precisely estimated. We interpret this evidence as

validating our estimates of the distribution of types in each village.

Our estimates of θ(q) imply a greater dispersion in consumers’ marginal willingness to pay for rice

than at first evident from the relatively compressed distribution of observed quantities in Figure 3. Indeed,

the empirical probability density function of quantities in the top right panel of Figure 3 shows that most

consumers purchase small to intermediate quantities, between 0.25 kilos and 1.5 kilos of rice. Since the

cumulative distribution function of types coincides with the cumulative distribution function of quantities,

F (θ) = G(q), one might be tempted to conclude that as quantities are not much dispersed, so must

be tastes. That this inference is incorrect can be seen, for instance, from the top two panels of Figure

6, which show that the support of types is more than twice as wide as the support of quantities across

villages. This large degree of variation in consumers’ tastes implies a potentially strong incentive for

sellers to discriminate across consumers, as consumers markedly differ in their marginal willingness to

pay for rice. We examine the extent to which sellers exert market power through price discrimination in

Section 5.4 below, where we find evidence that sellers indeed behave noncompetitively.

Figure 7: Marginal Utility

02

46

8E

stim

ated

Bas

e M

argi

nal U

tility

0 1 2 3Quantity

Base Marginal Utility

24

68

10E

stim

ated

Mar

gina

l Util

ity

0 1 2 3Quantity

Marginal Utility

Marginal Utility Function. In Figure 7, we plot estimates of the base marginal utility function, ν ′(q),

and marginal utility function, θ(q)ν ′(q), at each quantity in each village. As it can be seen in Figure 15 in

Appendix B, the function ν ′(q) is fairly precisely estimated in each village. Note that both functions, ν ′(q)

and θ(q)ν ′(q), decrease with quantity in all villages, as consistent with the model, despite the fact that we

did not impose any such monotonicity constraint in estimation. Since θ(q) increases with q whereas ν ′(q)

decreases with q, however, estimated marginal utility decreases much less rapidly than estimated base

marginal utility. Hence, the distribution of marginal utilities is more compressed than the distribution of

base marginal utilities.

34

This relative compression of marginal utilities across consumers is still compatible with a potentially

large degree of market power for sellers, which they can exercise by price discriminating across con-

sumers. To see why, note that marginal utility is significantly downward sloping: rice is valued very

differently at the margin at different levels of consumption. Hence, in the villages in our regular sam-

ple, sellers not only have an incentive to price discriminate across consumers, as discussed, but also the

possibility to separate consumers by their intensity of preference through a menu of different quantities

differentially priced at the margin, depending on the level of a consumer’s demand.

Relative to the case of linear utility, the curvature we estimate further suggests the potential for rich

distributional implications of nonlinear pricing compared to alternative pricing schemes naturally of inter-

est, such as first-best or linear pricing. We explore these implications of our estimates in the next section,

where we link the scope of price discrimination, as captured by the distribution of consumers’ tastes and

marginal utility and by sellers’ marginal cost, to the type and intensity of price discrimination that we infer

sellers practice in our villages.

5.4 Distortions Associated with Price Discrimination

Our model allows for varying degrees of market power, which correspond to different degrees of efficiency

of the resulting price and quantity combinations available to consumers. Specifically, since a consumer’s

first-order condition is θν ′(q) = T ′(q), the closer marginal prices, T ′(q), are to marginal cost, c, in a

village, the closer the market for rice in the village is to an efficient one, in which marginal utility equals

marginal cost. In this case too, of course, the distribution of “gains from trade” between consumers and

producers depends on the ability of sellers to extract consumer surplus through nonlinear pricing.

A natural question is then whether the nonlinear pricing schedules we observe in each village reflect

the behavior of sellers who efficiently (first-degree) price discriminate across households, in the sense that

T ′(q) = c, or that of sellers who practice a distortionary type of standard second-degree price discrimina-

tion.19 In the first case, nonlinear pricing may have undesirable distributional implications but it leads to

efficient levels of consumption of rice. In particular, it does not restrict the access of any household, even

the poorest, to a basic staple like rice. Thus, although households’ ability to pay may still be of concern to

policy makers, the marginal prices, however high, that households face imply no distortion to individual

or social consumption of rice.

To assess the extent to which observed prices correspond more to first or second-degree (nonlinear)

price discrimination, we test whether the necessary condition T ′(q) = c for first-degree price discrimina-

tion holds. In Figure 12 in Appendix B, we show that in each village the marginal price schedule, {T ′(q)},is outside the 95% confidence interval around the estimated marginal cost. Based on this evidence, we

conclude that sellers have market power in the villages in our regular sample, and exercise it by distorting

the quantities offered to most households.

19This outcome occurs in our model when u(θ) = θν(qFB(θ))− cqFB(θ)− δ for some constant δ across consumers. In this

35

Figure 8: Nonlinear Pricing vs. First-Best Pricing-1

-.8

-.6

-.4

-.2

0N

onlin

ear

vs. F

irst-

Bes

t Pric

ing:

Soc

ial S

urpl

us

0 1 2 3Nonlinear Pricing Quantity

Social Surplus

01

23

Firs

t-B

est Q

uant

ity


Quantity Consumed

Since we reject the hypothesis that observed pricing is efficient, we now examine the severity of the

distortions associated with nonlinear pricing. In the left panel of Figure 8, for each quantity observed

consumed under nonlinear pricing, we graph the difference in social surplus under nonlinear pricing and

under first-best pricing. Specifically, we compute the percentage difference in the social surpluses arising

in each village under the observed (non-linear) price-quantity menu and that arising under the counter-

factual first-best price-quantity menu when T ′(q) = c, computed as SSnp(θ)/SSfb(θ) − 1, where the

subscript ‘np’ stands for nonlinear pricing and ‘fb’ for first-best pricing. The loss in social surplus implied

by observed nonlinear pricing ranges (except for one consumer type in one village) across quantities and

villages from 20% to 100% of the surplus associated with the first-best allocation. Importantly, this loss is

almost uniformly larger at lower quantities, and so largest for the lowest consumer types in each village.

In the right panel of Figure 8, we plot first best quantities (on the y-axis) against observed quantities

under nonlinear pricing. The dotted lines join the quantities in each village, while the solid line is the 45-

degree line. As implied by our estimates of marginal cost and the multipliers on consumers’ participation

(or budget) constraints, in most villages, households who purchase the smallest quantities consume less

than under first best whereas households who purchase the largest quantities consume more than under

first best. However, distortions seem more pronounced for households who consume relatively more than

for households who consume relatively less—they are highest for consumers with intermediate marginal

valuations of rice.

Figure 8 provides an interesting picture of the inefficiencies induced by nonlinear pricing by contrast-

ing the greater loss in consumer surplus (left panel) to the smaller distortions in consumption (right panel)

at smaller quantities relative to higher quantities under nonlinear pricing compared to first-best pricing.

For consumers with lower types, quantity distortions are less pronounced but price distortions are more

severe than for consumers with higher types, and more so that, on balance, lower consumer types would

case, the seller offers first-best quantities and charges a price at which every consumer reaches her reservation utility.

36

benefit more than higher ones from a more competitive market for rice.

5.5 Nonlinear vs. Linear Pricing

It has been argued that the ability of sellers to price discriminate through quantity discounts in developing

countries hurts poor consumers more: according to this argument, this type of nonlinear pricing may

limit the access to basic goods and services for consumers purchasing the smallest quantities, typically

the poorest households, since they face the highest unit prices—see Attanasio and Frayne (2006) for

references. Based on our theoretical framework and our estimates, we can examine which consumers are

hurt (or benefit) from the price discrimination we observe in our villages. We address this question by

contrasting consumer (and social surplus) under nonlinear and linear pricing. In particular, we ask what

level of consumer surplus would arise if the seller was constrained (by legislation, say) to practice linear

pricing. This exercise entails not only a comparison of the price and quantity combinations generated

by a nonlinear and a linear pricing scheme, but also of the size of the market served by sellers under

each scheme. As discussed in Section 3.3, the linear price a seller would choose when prevented from

discriminating may entail excluding some consumers, even if such a consumer participates in the market

under nonlinear pricing. We also discuss how this comparison differs across our augmented model and

the standard model. This exercise will then shed light on the nature and size of the bias that assuming,

counterfactually, that the standard model applies to all villages gives rise to.

Augmented Model. Given our estimates of consumers’ marginal willingness to pay, utility and sellers’

marginal cost, we compare consumer (and social surplus) in each village under the observed nonlinear

pricing allocation and under the counterfactual linear pricing one that would emerge if sellers were pre-

vented from price discriminating. One difficulty we need to face is the fact that the reservation utility

is only identified for consumer types whose participation (or budget) constraints bind. In the absence of

a global point estimate, we bound the reservation utility from below and from above and compute the

consumer and social surplus under these two extreme scenarios. In particular, we first bound the reserva-

tion utility from below by setting u(θ) = u(θ) for types smaller than θ and u(θ) = u(θ) for the highest

type. This schedule of reservation utilities is the lowest possible that is consistent with our model. We

then bound the reservation utility from above by setting u(θ) = u(θ) for all types.20 This schedule of

reservation utility corresponds to the highest possible one consistent with our model. We label the former

case as low reservation utility and the latter one as high reservation utility.

Given these bounds, we can compute the change in consumer surplus (and social surplus) between the

nonlinear and linear pricing solutions, type by type. In Figure 9, we plot the relative change in consumer

surplus, CS, under nonlinear pricing relative to the linear pricing counterfactual: CSnp(θ)/CSlp(θ) − 1

where the subscript ‘np’ stands for nonlinear pricing, and ‘lp’ for linear pricing. This relative loss is

20Note that when γ = 1, the standard model applies and so u(θ) = u(θ) for all types, since the participation or budgetconstraint only binds for the lowest type.

37

computed in each village and for each quantity (and therefore type) with CSnp(θ) = u(θ), CSlp(θ) =

θν(qm(θ)) − pmqm(θ). Recall that qm(θ) denotes the quantity demanded by a consumer of type θ under

linear pricing, and pm is the linear price in a village. Analogously, we can compute the relative change in

social surplus SSnp(θ)/SSlp(θ) − 1. See Appendix B for details. (Note that when γ = 1, the standard

model applies and so u(θ) = u(θ) for all types, since the participation or budget constraint only binds for

the lowest type.)

As the top left panel of Figure 9 shows, in most villages consumers who buy the smallest quantities

prefer linear to nonlinear pricing—note that the range of the y-axis is first negative then positive. Approxi-

mately three quarters of the remaining consumers, however, have positive values of CSnp(θ)/CSlp(θ)− 1

and, hence, are better off under nonlinear pricing. As apparent from the top right panel of Figure 9,

SSnp(θ)/SSlp(θ) − 1 is positive for nearly all consumer types in each village, even for those who are

worse off under nonlinear pricing. For these types, the higher producer surplus associated with nonlin-

ear pricing is large enough that social surplus is higher under nonlinear pricing despite consumer surplus

being lower.

Figure 9: Nonlinear vs. Linear Pricing Under Augmented Model

-1-.

50

.51

1.5

Non

linea

r vs

. Lin

ear

Pric

es: C

onsu

mer

Sur

plus


Low Reservation Utility

-10

12

3N

onlin

ear

vs. L

inea

r P

rices

: Soc

ial S

urpl

us


Low Reservation Utility

-1-.

50

.51

1.5

Non

linea

r vs

. Lin

ear

Pric

es: C

onsu

mer

Sur

plus


High Reservation Utility

-10

12

3N

onlin

ear

vs. L

inea

r P

rices

: Soc

ial S

urpl

us


High Reservation Utility

38

An interpretation for these findings is as follows. Under linear pricing, sellers provide smaller quan-

tities, thereby inducing consumers to purchase less than under nonlinear pricing, but also charge lower

prices for these quantities. As the top left panel of Figure 9 shows, the benefit of lower prices outweighs

the utility loss from lower consumption for the lowest consumer types, so for them consumer surplus is

higher under linear pricing.

The reduced ability of sellers to exert market power under linear pricing also implies a lower producer

surplus from nearly all types relative to nonlinear pricing. Overall, then, for these lowest consumer types,

consumer surplus is lower but social surplus is higher under nonlinear pricing—indeed, social surplus is

distinctively higher for some consumer types purchasing less than 0.5 kilos, as it can be seen from the

yellow and orange lines in the top right panel of Figure 9.

For the remaining consumer types who benefit from nonlinear pricing, the greater quantities and lower

marginal prices that nonlinear pricing implies, relative to linear pricing, give rise not just to higher levels of

social surplus but also of consumer surplus. As consistent with Propositions 5 and 6, consumer surplus and

social surplus are higher for these types under nonlinear pricing also due to the higher degree of market

participation than nonlinear pricing generates. As implied by Proposition 6, consumers with access to

large alternative levels of consumption (relative to their taste for rice) are unprofitable to serve under

linear pricing. Indeed, in 8 of the 11 villages in our regular sample, at least one consumer type is excluded

from trade under linear pricing: the highest type in six villages (villages 1, 5 to 8, and 10), the two lowest

types in one village (village 9), and the two lowest and the highest type in the remaining village (village

3). Social surplus does not change across nonlinear and linear pricing for those consumer types who do

not participate under linear pricing, and thus experience utility u(θ), but participate and obtain utility close

to u(θ) under nonlinear pricing. As apparent from Figure 9, in our villages these consumers are either the

smallest or the highest types.

As for the high reservation utility case, the bottom left panel of Figure 9 shows that nearly all con-

sumers prefer linear to nonlinear pricing, especially those who purchase relatively small quantities. Intu-

itively, when consumers’ reservation utility is high, sellers charge lower linear prices—indeed lower than

in the previous experiment. These lower prices benefit not just consumers with relatively low marginal

willingness to pay, who face steep unit prices under nonlinear pricing, but also higher consumer types:

the combination of their high reservation utilities and the reduced ability for sellers to extract consumer

surplus under linear pricing implies higher levels of utility for consumers with higher types too. Interest-

ingly, in nearly all villages in which exclusion occurs, consumers who do not participate in the market

are of middle to high type rather than the lowest and highest consumer types, as in the first version of

the experiment. Specifically, consumers excluded from trade under linear pricing are: the highest type in

three villages (villages 1, 7, and 10), the two lowest types in one village (village 9), the two highest types

in one village (village 6), the three highest types in two villages (villages 3 and 8), and the four highest

types in the remaining village (village 5).

39

Standard Model. An interesting exercise is to compare nonlinear and linear pricing allocations under

the counterfactual assumption that the standard model, where reservation utility is constant and does not

depend on consumer’s type, applies to all villages. To this purpose, we re-estimate the model’s primitives

in each village where our model applies under this assumption, and then perform an experiment analo-

gous to the one performed in the context of our model. That is, we compare consumer, producer, and

social surplus under the resulting nonlinear pricing allocation and under the linear pricing allocation that

would emerge if sellers were prevented from price discriminating, given the primitives estimated under the

standard model, which also assumes u(θ) = u(θ) for all consumers so that the participation (or budget)

constraint just binds for the lowest type. We report the results of this experiment in Figure 10. As before,

we plot the differences in consumer and social surplus, CSnp(θ)/CSlp(θ)− 1 and SSnp(θ)/SSlp(θ)− 1,

respectively, as functions of the quantity demanded by a given consumer type under nonlinear pricing,

q = q(θ).

Figure 10: Nonlinear vs. Linear Pricing Under Standard Model

-10

12

3N

onlin

ear

vs. L

inea

r P

rices

: Con

sum

er S

urpl

us


Standard Model0

24

68

Non

linea

r vs

. Lin

ear

Pric

es: S

ocia

l Sur

plus


Standard Model

By contrasting the left panel of Figure 10 with the top left panel of Figure 9 (the first version of our

experiment), it emerges that in this case too, consumer surplus is mostly lower for the lowest consumer

types when sellers can price discriminate and by an amount comparable to the one in the top left panel

of Figure 9. Intermediate and high consumer types mostly gain from nonlinear pricing. Their gain in

consumer surplus relative to linear pricing predicted by the standard model is, however, much larger than

predicted by our model: the range of consumer surplus increases in the left panel of Figure 10 is twice as

large as in the top left panel of Figure 9.

The pattern of differences in social surplus across nonlinear and linear pricing is also similar to the

one implied by our model but, like with consumer surplus, the gain in social surplus under nonlinear pric-

ing predicted by the standard model is larger. Specifically, comparing the right panel of Figure 10 to the

top right panel of Figure 9 shows how the greater efficiency of nonlinear pricing implied by the standard

model is distinctively more pronounced for lower types, those purchasing less than 0.5 kilos under non-

40

linear pricing, relative to our model. For high consumer types, the gain in social surplus associated with

nonlinear pricing is mostly comparable to the one predicted by our model in most villages.

Recall that the standard model implies greater consumption distortions for lower than for higher con-

sumer types since it requires [θν ′(q(θ)) − c]/ν ′(q(θ)) to equal [1 − F (θ)]/f(θ), which decreases with

θ by the monotone hazard rate condition—a maintained assumption in both the standard model and our

model. However, given the low reservation utility profile implied by the standard model, under this model

sellers have a greater ability to extract consumer surplus through linear pricing than under our model.

This argument explains the greater consumer surplus associated with nonlinear pricing relative to linear

pricing when the standard model is assumed to apply. Under the standard model, fewer consumers are ex-

cluded from trade under linear pricing and sellers charge higher linear prices than predicted by our model,

thereby depressing consumer surplus. Indeed, exclusion occurs only in villages 3, 5, and 9, and involves

the smallest two types in villages 3 and 9 but only the smallest type in village 5.

As mentioned in Section 3, the standard model can be interpreted as a special case of our model with

highly-convex reservation utility when γ = 1. It is interesting to note that, although all villages in our

regular sample turn out to correspond to the highly-convex case and our estimates of γ are relatively close

to 1, the exercise performed here shows that the standard model does not constitute a good approximation

to our data in that the welfare implications of the two models are very different. This is because even

small deviations of γ from 1 are symptomatic of very different behaviors on the part of sellers and con-

sumers. See also Appendix B for the bias to the estimates of consumers’ marginal willingness to pay and

marginal utility that would result from assuming that the standard model applies to a particular village

when, instead, it is rejected.

5.6 The Impact of Income Transfers

As discussed in Section 3.3, our model with budget constraints can be used to examine the consequences

of a targeted transfer, such as the one implemented through the Progresa program, on prices and quantities.

Proposition 7 implies that income transfers to consumers induce sellers to modify the shape of their pricing

schedules. This prediction of our model is particularly sharp for villages that can be classified “highly-

convex” or “weakly-convex” instances of our model. To examine whether this prediction is borne out in

the data, we analyze the 24 villages in our estimation sample that can be classified as highly-convex—as

explained, no village in our estimation sample conforms to the weakly-convex case. For these villages,

we then examine the extent to which the Progresa transfer has affected prices.

To this purpose, for each quantity and each such village, we compute the price implied for that quan-

tity/consumer type by the model’s estimates of T (q) in that particular village, as detailed in Section 4,

and use them to estimate a specification similar to the one reported in Table 1.21 Specifically, we regress

the log of the predicted unit price on the log of the quantity purchased in each village. The resulting

21Figure 4 implies that these theoretical predictions constitute a very good fit of actual data.

41

log-linear relationship could be considered as an approximation to the theoretical schedule, as implied

by our model’s estimate. We allow this schedule to be shifted by the presence of the Progresa transfer,

which is available only in some of the villages, by letting slope and intercept of this relationship to depend

on whether a given village is targeted by Progresa or not.22 We estimate this relationship by OLS and

present the results we obtain in Table 2, which also reports clustered standard errors at the village level in

parenthesis—once again, the commodity we are considering is rice.

A small literature has examined the effect of the Progresa program on prices of agricultural commodi-

ties. As mentioned in the introduction, Hodinott et al. (2000) and Angelucci and DeGiorgi (2009) found

no evidence that Progresa transfers induced a systematic increase in the (average) price of basic staples.

In column (1) of Table 2, we perform an exercise similar to theirs on the unit price predicted by our model

by regressing it on a constant and an indicator of the program. Consistent with this finding and similar

other findings in the literature, we observe that the parameter for the presence of Progresa is small in size

and implies an average increase in unit prices of about 1% in response to the program, not statistically

different from zero.

In column (2), we perform an exercise similar to that reported in Table 1 but using the prices predicted

by our estimates. Note that we explicitly account for the variability of unit prices with quantity with an

additional covariate, log-quantity, relative to the specification in column (1). We find that the elasticity of

unit prices to quantity is−0.215 and is, indeed, statistically different form zero. The size of this coefficient

is not too different from that reported in Table 1. In column (3), instead, we add the treatment dummy to

the specification in column (2). Once again, we find that the coefficient on this variable is not statistically

different from zero whereas the coefficient on log-quantity is not too dissimilar from that reported in the

previous specification.

Table 2: Impact of Cash Transfers on Prices

Rice Unit Values Restricted Sample:Villages with at Least 100 Households

1 2 3 4Treatment 0.011 − 0.020 0.005

(0.020 − (0.018) (0.017)Ln(quantity) − −0.215 −0.216 −0.193

(0.033) (0.034) (0.033)Ln(quantity)* − − − −0.039Treatment − − − (0.014)Constant 1.997 1.924 1.913 1.921

(0.023) (0.015) (0.017) (0.015)

Observations 4462 4462 4662 4462R2 0.0012 0.6179 0.6219 0.6267

Finally, in column (4), we add an interaction between the program dummy and quantity to the spec-

22 It should be remembered that, for evaluation purposes, the Progresa program was randomly allocated across villages.

42

ification in column (3). The coefficient on log-quantity is slightly lower (in absolute value) at −0.19

whereas the coefficient on the interaction between the program dummy and log-quantity is estimated at

−0.039 and is statistically different from zero. The results in column (4) imply that the price schedule

becomes steeper, and thus implies a higher quantity discount, in villages where the program is operating

than in villages where it is not. Not only this result is consistent with Proposition 7, but, to the best of our

knowledge, also constitutes one of the first pieces of evidence of an impact of a cash transfer on prices.

This “tilting” of the schedule of unit prices could explain the failure of many researchers to find an effect

of the program on average unit prices—Cunha et al. (2014) find a small but negligible increase in prices

in response to cash transfers but under the assumption that unit prices are constant with quantity.

Our model could also be used to interpret the fall in prices reported by Cunha et al. (2014) for the

villages in their sample that receive in-kind transfers. If the basket received by consumers includes the

commodity sold by the seller, such transfers could affect consumers’ subsistence constraints and, in par-

ticular, increase the consumption floor on other goods, thus reducing consumers’ budgets for the good

under consideration. In-kind transfers can then have an effect opposite to the one of cash transfers, in

particular, they may lead to a decrease in prices, as consistent with the findings of Cunha et al. (2014).23

These results, both theoretical and empirical, are important to assess the impact of cash transfers, a

commonly used policy tool in developing countries. In particular, we note that such cash transfers may

imply an upward shift in the price schedule and an increase in the intensity of price discrimination, which

we observe in our data. This price increase has obvious implications on the consumer surplus enjoyed by

households beneficiaries of the program but also on the surplus of non-eligible households. In particular,

the consumer surplus of eligible households is reduced relative to the surplus those households would

enjoy if prices did not change in response to transfers.

6 ConclusionWe have proposed a model of nonlinear pricing in which consumers differ in their tastes for goods, ability

to pay for them, outside options, and face subsistence constraints that give rise to budget constraints for a

seller’s good. In this case, the implications of nonlinear pricing for consumer, producer, and social surplus

are in general fundamentally different from those arising from standard models of nonlinear pricing, in

which outside options are identical across consumers and consumers are assumed unconstrained in their

purchasing decisions. In particular, in our environment quantity discounts for large volumes can be as-

sociated with overprovision of quantity at low volumes. We have proved that this more general model

is non- and semiparametrically identified under common assumptions. We have derived nonparametric

and semiparametric estimators of the model’s primitives that can be readily implemented using publicly

available data from conditional cash transfer programs, common in several developing countries. We have

23The authors extend their intuition on the price effect of cash transfers to the imperfectly competitive case of Cournotcompetition among sellers. However, in this case they assume directly conditions on aggregate demand that guarantee that thesame price effect as in the perfectly competitive case arises.

43

also showed that cash transfers, by affecting consumers’ ability to pay, provide sellers with a greater op-

portunity to extract consumer surplus. Thus, cash transfers in general lead to higher prices and, in most

villages in our sample, a greater intensity of price discrimination, as predicted by our model. Overall,

our estimation results suggest the importance of accounting for heterogeneity in consumers’ preferences,

consumption opportunities, and constraints in assessing the welfare implications of nonlinear pricing.

ReferencesANGELUCCI M., G. DE GIORGI (2009): “Indirect Effects of an Aid Program: How Do Cash Transfers

Affect Ineligibles’ Consumption?”, American Economic Review, 99(1), pages 486-508.

ATTANASIO, O., V. DI MARO, V. LECHENE, and D. PHILLIPS (2009): “The Welfare Consequences

of Increases in Food Prices in Rural Mexico and Colombia”, mimeo.

ATTANASIO, O., and C. FRAYNE (2006): “Do the Poor Pay More?”, mimeo.

BARON, D.P., and R.B. MYERSON (1982): “Regulating a Firm with Unknown Cost”, Econometrica,

50(4), 911-930.

CHE, Y.-K., and I. GALE (2000): “The Optimal Mechanism for Selling to a Budget-Constrained Buyer”,

Journal of Economic Theory, 92(2), 198-233.

CUNHA, J., G. DE GIORGI, and S. JAYACHANDRAN (2014): “The Price Effects of Cash Versus In-

Kind Transfers”, mimeo.

EKELAND, I., J.J. HECKMAN, and L. NESHEIM (2004): “Identification and Estimation of Hedonic

Models”, Journal of Political Economy, 112(S1), S60–S109.

GUERRE, E., I. PERRIGNE, and Q. VUONG (2000): “Optimal Nonparametric Estimation of First-Price

Auctions”, Econometrica, 68(3), 525-574.

HALL, P. (1992): “Effect of Bias Estimation on Coverage Accuracy of Bootstrap Confidence Intervals

for a Probability Density”, Annals of Statistics, 20(2), 675–694.

HECKMAN, J.J., , R.L. MATZKIN, and L. NESHEIM (2010): “Nonparametric Identification and Esti-

mation of Nonadditive Hedonic Models”, Econometrica, 78(5), 1569–1591.

HODDINOTT, J., E. SKOUFIAS and R. WASHBURN (2000): “The impact of Progresa on Consumption:

A Final Report”, IFPRI, mimeo.

JENSEN, R.T., and N.H. MILLER (2008): “Giffen Behavior and Subsistence Consumption”, American

Economic Review, 98(4), 1553–1577.

JULLIEN, B. (2000): “Participation Constraints in Adverse Selection Models”, Journal of Economic

Theory, 93(1), 1–47.

LANCASTER, K.J. (1966): “A New Approach to Consumer Theory,” Journal of Political Economy,

74(2), 132–157.

LESLIE, P. (2004): “Price Discrimination in Broadway Theatre”, Rand Journal of Economics, 35(3),

520–541.

44

MASKIN, E., and J. RILEY (1984): “Monopoly with Incomplete Information”, Rand Journal of Eco-

nomics, 15(2), 171–196.

MCINTOSH, A.C. (2003): “The poor pay more for their water. ”’ Habitat Debate, September 2003 Vol.

9 No. 3 , http://www.unhabitat.org/hd/hdv9n3/12.asp.

PERRIGNE, I., and Q. VUONG (2010): “Nonlinear Pricing in Yellow Pages”, mimeo.

ROBIN, J.-M. (1993): “Econometric Analysis of the Short-Run Fluctuations of Households’ Purchases”,

Review of Economic Studies, 60(4), 923-934.

STOLE, L.A. (2006): “Price Discrimination and Competition”, mimeo.

THOMAS, L. (2002): “Nonlinear Pricing with Budget Constraint”, Economics Letters, 75(2), 257–263.

A Omitted Proofs and DetailsCumulative Multiplier in the Highly-Convex Case: Consider the case in which v(θ, q) = θν(q) andc′(q) = c. Here we derive an expression that defines the multiplier in the highly-convex case in terms ofprimitives. Recall that in the highly-convex case, the cumulative multiplier γ(θ) is equal to a constant, γ,at all points θ ∈ [θ,θ). Here we solve for γ in the interesting case in which 0 < γ < 1 for [θ,θ). In theother two cases, the multiplier is trivial: γ(θ) = 1 for all θ ∈ [θ,θ] and γ(θ) = 0 for all θ ∈ [θ,θ). First,observe that (1) implies that

ν ′(q(θ)) = cf(θ)/[θf(θ) + F (θ)− γ], (18)

so q(θ) = (ν ′)−1(cf(θ)/[θf(θ) + F (θ) − γ]). Note that u(θ) = u(θ) and u(θ) = u(θ) when 0 < γ < 1.Hence, u(θ)− u(θ) = u(θ)− u(θ) so

u(θ)− u(θ) =

∫ θ

θ

u′(x)dx =

∫ θ

θ

ν(q(x))dx =

∫ θ

θ

ν

((ν ′)−1

(cf(x)

xf(x) + F (x)− γ

))dx,

where the first equality follows from u(θ)−u(θ) = u(θ)−u(θ) and the fundamental theorem of calculus,and the second equality by the local incentive compatibility condition u′(θ) = ν(q(θ)).Proof of Proposition 1: This result establishes that the first-order and complementary slackness condi-tions for the simple BC problem in (6) are necessary and sufficient to characterize an optimal menu. Theproof of this result requires that a version of the assumptions of potential separation, homogeneity, andfull participation in Jullien (2000) for the IR model hold for the BC model. Like in the IR model, thepotential separation assumption requires l(γ, θ) to be a weakly increasing function of θ for all γ ∈ [0, 1],for which sufficient conditions are

∂

∂θ

(sq(θ, q)

vθq(θ, q)

)≥ 0 and

d

dθ

(F (θ)

f(θ)

)≥ 0 ≥ d

dθ

(1− F (θ)

f(θ)

).

As explained in Jullien (2000), the first inequality in this condition implies that the conflict betweenrent extraction and efficiency is not too severe so that the marginal benefit of increasing the slope of theutility profile is weakly increasing with the type. When this occurs, the seller tends to desire convexquantity profiles, which implies that the monotonicity condition for q(θ) for incentive compatibility iseasier to satisfy. The second and third inequalities in this condition amount to a simple strengthening ofthe monotone hazard rate condition ubiquitous in the mechanism design literature: as the type increases,the relative weight of types above θ compared to below θ decreases, and the seller becomes progressively

45

more concerned about the informational rents left below θ. We have discussed the BC homogeneityassumption in the text. The full participation assumption simply ensures that the seller makes nonnegativeprofits when trading with each consumer type. Sufficient conditions for this assumption to hold are thathomogeneity is satisfied and that for each type θ, the seller makes weakly positive profits by supplying thereservation quantity q(θ) at price t(θ), which can be expressed as

t(θ)− c(q(θ)) = v(θ, q(θ))− c(q(θ))− u(θ) = s(θ, q(θ))− u(θ) ≥ 0 (19)

by using u(θ) = v(θ, q(θ))− t(θ) and s(θ, q(θ)) = v(θ, q(θ))− c(q(θ)).To derive the simple BC problem in (6), we proceed in analogy with the derivations of the simple IR

problem in the Supplementary Appendix. First, we write the BC constraint as

I(θ, q) ≥ t(θ) = v(θ, q(θ))− u(θ), (20)

since u(θ) = v(θ, q(θ))− t(θ). The BC problem can be expressed in Lagrangian-type form as

maxu,q∈Q

(∫ θ

θ

[v(θ, q(θ))− c(q(θ))− u(θ)] f(θ)dθ+

∫ θ

θ

{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ)

)(21)

s.t. u′(θ) = vθ(θ, q(θ)), (22)

where Q is the set of weakly-increasing functions q(θ) and Φ(θ) is the cumulative Lagrange multiplier,defined analogously to γ(θ), on the budget constraint expressed as in (20). Next, note that∫ θ

θ

u(θ)f(θ)dθ =

∫ θ

θ

[u(θ) + u(θ)− u(θ)]f(θ)dθ = u(θ)

∫ θ

θ

dF (θ) +

∫ θ

θ

[u(θ)− u(θ)]dF (θ)

= u(θ) +

∫ θ

θ

(∫ θ

θ

u′(x)dx

)dF (θ).

Integrating by parts and using the local incentive compatibility condition u′(θ) = vθ(θ, q(θ)), we obtain∫ θ

θ

u(θ)f(θ)dθ = u(θ) +

∫ θ

θ

(∫ θ

θ

vθ(x, q(x))dx

)dF (θ) = u(θ) +

(∫ θ

θ

vθ(x, q(x))dx

)F (θ)

∣∣∣∣θθ

−∫ θ

θ

vθ(θ, q(θ))F (θ)dθ = u(θ) +

∫ θ

θ

vθ(θ, q(θ))dθ −∫ θ

θ

vθ(θ, q(θ))F (θ)dθ. (23)

Similarly, ∫ θ

θ

u(θ)dΦ(θ) = u(θ)[Φ(θ)− Φ(θ)] +

∫ θ

θ

(∫ θ

θ

vθ(x, q(x))dx

)dΦ(θ)

= u(θ)[Φ(θ)− Φ(θ)] +

(∫ θ

θ

vθ (x, q (x)) dx

)Φ(θ)

∣∣∣∣θθ

−∫ θ

θ

vθ (θ, q (θ)) Φ(θ)dθ

= u(θ)[Φ(θ)− Φ(θ)] + Φ(θ)

∫ θ

θ

vθ (θ, q (θ)) dθ −∫ θ

θ

vθ (θ, q (θ)) Φ(θ)dθ. (24)

46

Substituting (23) and (24) into (21), we obtain∫ θ

θ

[v(θ, q(θ))− c(q(θ))− u(θ)] f(θ)dθ +

∫ θ

θ

{I(θ, q(θ))− [v(θ, q(θ))− u(θ)]}dΦ(θ)

=

∫ θ

θ

[v(θ, q(θ))− c(q(θ))]f(θ)dθ +

∫ θ

θ

[I(θ, q(θ))− v(θ, q(θ))]dΦ(θ)− u (θ)−∫ θ

θ

vθ(θ, q (θ))dθ

+

∫ θ

θ

F (θ)vθ(θ, q (θ))dθ + u(θ)[Φ(θ)− Φ(θ)] + Φ(θ)

∫ θ

θ

vθ(θ, q (θ))dθ −∫ θ

θ

Φ (θ) vθ(θ, q (θ))dθ

which, by collecting terms, can be simplified to further obtain∫ θ

θ

[v(θ, q(θ))− c(q(θ))]f(θ)dθ +

∫ θ

θ

[F (θ)− Φ(θ) + Φ(θ)− 1

f(θ)

]vθ(θ, q(θ))f(θ)dθ

+

∫ θ

θ

φ(θ)[I(θ, q(θ))− v(θ, q(θ))]

f(θ)f(θ)dθ + u(θ)[Φ(θ)− Φ(θ)− 1].

By collecting terms one more time and dropping irrelevant constants, this expression reduces to the onein (6). The following result holds, which is the analogue of Result 4 in the Supplementary Appendix.

Result 1. Under potential separation, BC homogeneity, and full participation, the optimal allocation{u(θ), q(θ)} solves the simple BC problem if, and only if, there exists a cumulative multiplier functionΦ(θ) such that the first-order conditions (7) and the complementary slackness condition (8) are satisfied.

We now turn to prove Proposition 1. By Result 4 in the Supplementary Appendix, an implementableallocation {tIR(θ), qIR(θ), γIR(θ)}with associated utility profile u(θ) solves the IR problem if, and only if,there exists a cumulative multiplier function γ(θ) with the properties of a cumulative distribution functionsuch that the first-order conditions

vq(θ, q (θ))− c′(q(θ)) =γ(θ)− F (θ)

f(θ)vθq(θ, q(θ)) (25)

and the complementary slackness condition∫ θ

θ

[u(θ)− u(θ)] dγ(θ) = 0 (26)

hold. By Result 1 above, the allocation {tIR(θ), qIR(θ)} with associated utility profile uIR(θ) of the IRproblem solves the BC problem if, and only if, there exists a cumulative multiplier function Φ(θ) suchthat the first-order conditions

vq(θ, q (θ))− c′(q(θ)) =Φ(θ)− F (θ) + 1− Φ(θ)

f(θ)vθq(θ, q(θ)) +

φ(θ) [vq(θ, q(θ))− Iq(θ, q(θ))]f(θ)

(27)

and the complementary slackness condition∫ θ

θ

[I(θ, q(θ))− t(θ)] dΦ(θ) = 0 (28)

47

hold. Note that for Φ(θ) to be a legitimate cumulative multiplier function, it must be nonnegative andweakly increasing with θ. Clearly, Φ(θ) = a+γ(θ) for any constant a is a legitimate cumulative multiplier.Note also that with Φ(θ) = a+γ(θ), the multiplier dγ(θ) on the IR constraint is zero or strictly positive if,and only if, the multiplier dΦ(θ) on the BC constraint is zero or strictly positive. Given this observation,it is immediate that since the complementary slackness condition (26) holds for the IR problem, so mustthe complementary slackness condition (28) for the BC problem.

Then, we are left to show that at the multiplier function Φ(θ), the allocation {tIR(θ), qIR(θ)} satisfiesthe first-order conditions for the BC problem, namely (27). To this purpose, note first that

Φ(θ) + 1− Φ(θ) = a+ γ(θ) + 1− [a+ γ(θ)] = γ(θ), (29)

where the second equality holds since γ(θ) = 1; see the Supplementary Appendix for a proof that γ(θ) =1. Next, observe that, by assumption, Iq(θ, q(θ)) equals vq(θ, q(θ)) whenever the budget constraint binds.Thus, for each θ, either φ(θ) = 0 or, if not, Iq(θ, q(θ)) = vq(θ, q(θ)). Hence, the second term on the rightside of (27) equals zero for each θ. These two observations imply that the first-order conditions for the BCproblem in (27) are identical to those for the IR problem in (25). Hence, the solutions to the IR and BCproblems are the same. By the same argument as the one in the proof of Result 1 in the SupplementaryAppendix, it is also possible to show that Φ(θ) = 1.Proof of Proposition 2: Recall that I(θ, q) = Y − z(θ, q), where

z(θ, q) = −z1(θ)− z2ν(q), z′1(θ) = ψ(log(θ − z2)), and θ > z2 > 0. (30)

Let ψ(·) be a positive continuous function defined on the real line. To show that BC homogeneity issatisfied under these assumptions on z(θ, q), we proceed by constructing a menu {t(θ), q(θ)} such thatt(θ) = I(θ, q(θ)), Iq(θ, q(θ)) = θν ′(q(θ)), and q(θ) is weakly increasing. Since one can always chooset(θ) = I(θ, q(θ)), the restrictions that BC homogeneity imposes can be stated as requiring I(θ, q(θ)) =Y − z(θ, q), Iq(θ, q(θ)) = θν ′(q(θ)), and q(θ) be weakly increasing. For convenience, rather than estab-lishing that we can construct an increasing q(θ) function that satisfies I(θ, q(θ)) = Y − z(θ, q(θ)) andIq(θ, q(θ)) = θν ′(q(θ)) under (30), we show, equivalently, that we can construct an increasing functionθ(q) under (30) that satisfies {

Iq(θ(q), q) = θ(q)ν ′(q)I(θ(q), q) = Y − z(θ(q), q)

. (31)

Specifically, using (30), the derivative of the second expression in (31) with respect to q is

Iq(θ(q), q) = z′1(θ(q))θ′(q) + z2ν

′(q) = ψ(log[θ(q)− z2])θ′(q) + z2ν

′(q).

By equating the right side of this last expression to the right side of the first expression in (31), we obtain

ψ(log[θ(q)− z2])θ′(q) + z2ν

′(q) = θ(q)ν ′(q)⇔ ν ′(q) =ψ(log[θ(q)− z2])θ

′(q)

θ(q)− z2

. (32)

By integrating both sides of (32) from q(θ) to q ≤ q(θ) and using θ = θ(q(θ)), it follows that

ν(q)− ν(q(θ)) = Ψ(log[θ(q)− z2])−Ψ(log(θ − z2)), (33)

with Ψ(x) = Ψ(log(θ − z2)) +∫ x

log(θ−z2)ψ(x)dx increasing since ψ(·) is positive. Simple manipulations

of (33) yieldθ(q) = z2 + exp{(Ψ)−1(ν(q)− ν(q(θ)) + Ψ(log(θ − z2)))}.

48

Note that θ(q) is an increasing function of q, since (Ψ)−1 (·) and ν(·) are increasing functions. So, q(θ) isan increasing function of θ, as desired.

For an example, it is easy to show that if z(θ, q) = z0− z1(θ− z2)λ1 − z2ν(q) with z1, z2 > 0, θ > z2,and λ1 > 1, then by following the above steps, we obtain q(θ) with q′(θ) > 0 as

q(θ) = (ν)−1

(z1λ1

λ1 − 1[(θ − z2)λ1−1 − (θ − z2)λ1−1] + ν(q(θ))

).

The Two-Dimensional Case: Suppose that the parameter w differs across consumers so that the budgetschedule is now given by I(θ, q, w) = Y (w) − z(θ, q). The analysis of this case differs from that of thecase of constant w depending on whether the seller can discriminate across consumers based on w or,rather, only through a menu of prices at most contingent on q.Contractible Income Characteristic. Suppose that the seller can segment consumers across submarketsindexed by w and offer nonlinear prices in each submarket w so as to screen consumers based on θ. Forease of exposition, suppose that there are only two levels of w, say, wH and wL, with Y (wH) > Y (wL). Inany such submarket w, the seller’s problem is as stated in the BC problem with income Y (w) and budgetschedule I(θ, q, w) = Y (w) − z(θ, q). For the corresponding simple BC problem, the necessary andsufficient conditions for an optimal solution are given by Result 1: {t(θ, w), q(θ, w)} solves the simpleBC problem if, and only if, there exists a cumulative multiplier function Φ(θ, w) such that the first-orderconditions in (27) and the complementary slackness condition in (28) apply with I(θ, q, w) = Y (w) −z(θ, q). Our next result shows how this optimal solution varies across submarkets. For this, let

t(θ, wH) = t(θ, wL) + Y (wH)− Y (wL), q(θ, wH) = q(θ, wL), and Φ(θ, wH) = Φ(θ, wL). (34)

Result 2. If {t(θ, wL), q(θ, wL)} with associated cumulative multipliers {Φ(θ, wL)} solve the simple BCproblem in submarket wL, then {t(θ, wH), q(θ, wH)} with associated cumulative multipliers {Φ(θ, wH)}satisfying (34) solve the simple BC problem in submarket wH .

The result states that type (θ, wH) in the submarket with the higher income level chooses the samequantity as type (θ, wL) in the submarket with the lower income level, that is, q(θ, wH) = q(θ, wL).Moreover, the binding patterns of the multipliers in the two submarkets are identical in that the cumulativemultiplier binds for type (θ, wH) in submarketwH if, and only if, it binds for type (θ, wL) in submarketwL.The only difference is that type (θ, wH) pays Y (wH) − Y (wL) more for the same quantity purchased bytype (θ, wL) in submarket wL. The idea is straightforward. In the market with income Y (wL), a consumerof type θ chooses a pair (t(θ, wL), q(θ, wL)) leading to the consumption of z(θ, wL) = Y (wL)− t(θ, wL)units of the numeraire good. The consumption bundle (q(θ, wL), z(θ, wL)) must jointly provide enoughcalories so that the consumer meets the constraint z(θ, wL) ≥ z(θ, q(θ, wL)). Suppose that this constraintbinds for a set of types, that is,

z(θ, wL) = Y (wL)− t(θ, wL) = z(θ, q(θ, wL)). (35)

In submarketwH , at (t(θ, wL), q(θ, wL)) the budget constraint is slack for all types since Y (wH) > Y (wL).Clearly, in submarket wH , it is feasible for the seller to offer the same quantity as in submarket wL, thatis, q(θ, wH) = q(θ, wL), since q(θ, wL) is implementable in submarket wH too, and simply increase theprice by Y (wH)− Y (wL). In the proof of the result, we use the seller’s first-order conditions to show thatdoing so is optimal for the seller.Proof of Result 2. Let {t(θ, wL), q(θ, wL)} and the cumulative multipliers {Φ(θ, w)} solve the simple BCproblem in submarket wL. By Result 1, we know that these schedules satisfy the first-order conditions

49

(27) and the complementary slackness conditions (28) with t(θ), q(θ), Φ(θ), φ(θ), and I(θ, q) replaced byt(θ, wL), q(θ, wL), Φ(θ, wL), φ(θ, wL), and I(θ, q, wL). It is immediate that the allocations and multipliersgiven in (34) solve the corresponding first-order and complementary slackness conditions for submarketwH . To see why, note that since Iq(θ, q, w) = −zq(θ, q) is independent of w (conditional on q), thefirst-order conditions in the two submarkets are identical under (34). Consider next the complementaryslackness conditions. Since this condition holds in submarket wL, for any θ whose budget constraint forthe seller’s good binds, and so φ(θ, wL) is positive, we have

t(θ, wL) = I(θ, q(θ, wL), wL) ≡ Y (wL)− z(θ, q(θ, wL)). (36)

But then for this same θ in submarket wH , the multiplier φ(θ, wH) is also positive and

t(θ, wH) = t(θ, wL) + Y (wH)− Y (wL) = Y (wH)− z(θ, q(θ, wL)) = Y (wH)− z(θ, q(θ, wH)),

where the first and third equalities follow from (34), and the second equality from (36). Hence, the con-jectured solution satisfies the first-order conditions and complementary slackness condition for submarketwH . So, by Result 1, this conjectured solution solves the simple BC problem in submarket wH .Noncontractible Income Characteristic. Suppose now that the seller cannot segment consumers acrosssubmarkets. That is, the seller must offer the same price schedule to all consumers regardless of their w(and θ). This environment is equivalent to one in which the seller observes neither w nor θ. Assume thatw and θ are sufficiently positively dependent that w can be expressed as a nonlinear monotone function ofθ, w = ω(θ) with ω′(θ) > 0. Then, substituting w = ω(θ) into I(θ, q, w) = Y (w)− z(θ, q) gives

I(θ, q, ω(θ)) = Y (ω(θ))− z(θ, q). (37)

Under (37), Proposition 1 and Result 1 apply. To see that Proposition 2 also holds, let Y (ω(θ)) = Y +y(ω(θ)) without loss. Then, the analogous proposition holds with v(θ, q) = θν(q), z(θ, q) = −z1(θ) −z2ν(q), and y′(ω(θ))ω′(θ) + z′1(θ) = ψ(log(θ − z2)) for θ > z2 > 0.Proof of Proposition 3: Recall that T ′(q(θ)) = θν ′(q(θ)) > 0 by local incentive compatibility. LettingA(q) = −ν ′′(q)/ν ′(q), we have

T ′′(q) = θ′(q)ν ′(q) + θ(q)ν ′′(q) = θ(q)ν ′(q)

[θ′(q)

θ(q)+ν ′′(q)

ν ′(q)

]= T ′(q)

[1

θ(q)q′(θ)− A(q)

]. (38)

From the seller’s first-order condition (9), it follows that[θ − γ(θ)− F (θ)

f(θ)

]ν ′(q(θ))− c = 0,

which implies

q′(θ) = −∂∂θ

[θ − γ(θ)−F (θ)

f(θ)

]ν ′(q(θ))[

θ − γ(θ)−F (θ)f(θ)

]ν ′′(q(θ))

=

∂∂θ

[θ − γ(θ)−F (θ)

f(θ)

][θ − γ(θ)−F (θ)

f(θ)

]A(q(θ))

.

By (38), since T ′(q) > 0 and A(q) ≥ 0, we can express T ′′(q) ≤ 0 equivalently as

T ′(q)A(q(θ))

θ − γ(θ)−F (θ)f(θ)

θ ∂∂θ

[θ − γ(θ)−F (θ)

f(θ)

] − 1

≤ 0⇔ 1 ≤θ ∂∂θ

[θ − γ(θ)−F (θ)

f(θ)

]θ − γ(θ)−F (θ)

f(θ)

. (39)

50

In the highly-convex case, γ(θ) = γ. When γ ∈ [0, 1), the last inequality in (39) becomes

1 ≤θ ∂∂θ

[θ − γ−F (θ)

f(θ)

]θ − γ−F (θ)

f(θ)

=2θf 2(θ) + [γ − F (θ)]θf ′(θ)

θf 2(θ)− [γ − F (θ)]f(θ). (40)

This condition is always satisfied if (10) in the statement of the proposition holds. This argument alsoimplies that in the weakly-convex case, for types θ ∈ [θ, θ1] with γ(θ) = 0, the price schedule entailsquantity discounts provided (40) holds when γ = 0, that is,

2θf 2(θ)− θF (θ)f ′(θ)

θf 2(θ) + F (θ)f(θ)≥ 1,

which, since θf 2(θ) + F (θ)f(θ) > 0, is equivalent to θf 2(θ) ≥ F (θ)[f(θ) + θf ′(θ)].When γ(θ) = γ = 1, we have

q′(θ) =

∂∂θ

[θ − 1−F (θ)

f(θ)

][θ − 1−F (θ)

f(θ)

]A(q(θ))

≥ 1

θA(q(θ))=

ν ′(q(θ))

−θν ′′(q(θ))≥ 0, (41)

where the first inequality in the above follows from the assumption of potential separation and is strictif [1 − F (θ)]/f(θ) is strictly decreasing. Condition (41) implies 1/q′(θ) ≤ −θ(q)ν ′′(q)/ν ′(q), whichcombined with (38) yields

T ′′(q) =ν ′(q)

q′(θ)+ θ(q)ν ′′(q) ≤ ν ′(q)

[−θ(q)ν ′′(q)

ν ′(q)

]+ θ(q)ν ′′(q) = 0.

This argument also establishes that in the weakly-convex case, for types θ ∈ [θ2, θ] with γ(θ) = 1, theprice schedule entails quantity discounts.Quantity Premia in the Weakly-Convex Case: Since T ′(q(θ)) = θν ′(q(θ)) by local incentive com-patibility and A(q(θ)) = −ν ′′(q(θ))/ν ′(q(θ)) ≥ 0, from (38) it follows that T ′′(q) ≤ 0 if, and only if,A(q(θ))θq′(θ) ≥ 1. Since q(θ) = q(θ) for types in (θ1, θ2) and q′(θ) = u′′(θ)/ν ′(q(θ)), the conditionA(q(θ))θq′(θ) ≥ 1 can be also expressed as

θq′(θ)A(q(θ)) = −θq′(θ)ν ′′(q(θ))/ν ′(q(θ)) = θu′′(θ)[−ν ′′(q(θ))]/[ν ′(q(θ))]2 ≥ 1.

Hence, types in (θ1, θ2) for which the reverse inequality is satisfied, face quantity premia.Proof of Proposition 4. Let the allocations in the standard nonlinear pricing model be {us(θ), qs(θ)} andassume that u(θ) = u. A seller’s first-order conditions for the standard model and the augmented modelcan be written, respectively, as

1− c

T ′(qs)=

1− F (θ)

θf(θ)and 1− c

T ′(q)=γ(θ)− F (θ)

θf(θ). (42)

In the standard model, the participation (or budget) constraint binds only for the lowest type so thatus(θ) = u. We first examine the weakly-convex case, then the highly-convex case, and lastly the generalcase of the augmented model. Since the two models coincide for any type with γ(θ) = 1, we just focuson the case in which γ(θ) < 1.

Weakly-Convex Case. In this case, the cumulative multiplier γ(θ) equals zero until θ1, increases from

51

zero to one as θ increases from θ1 to θ2, and equals 1 between θ2 and θ. Since the allocations in the twomodels agree for θ ≥ θ2, we focus on types θ < θ2. We first show that for θ ∈ [θ, θ2), the augmentedmodel implies lower marginal prices, T ′(q(θ)) < T ′(qs(θ)), and higher consumption, q(θ) > qs(θ). Tothis purpose, note that since γ(θ) < 1 on [θ, θ2), it follows that

[1− F (θ)]/θf(θ) > [γ(θ)− F (θ)]/θf(θ),

which, by (42) and local incentive compatibility, implies that

T ′(q(θ)) = θν ′(q(θ)) < T ′(qs(θ)) = θν ′(qs(θ)). (43)

Thus, T ′(q(θ)) < T ′(qs(θ)). Moreover, since ν ′(·) is decreasing, (43) also implies that q(θ) > qs(θ). Thisresult, in turn, yields that

u(θ)=u(θ) +

∫ θ

θ

u′(θ)dθ =u(θ) +

∫ θ

θ

ν(q(θ))dθ>us(θ) +

∫ θ

θ

ν(qs(θ))dθ=us(θ) +

∫ θ

θ

u′s(θ)dθ=us(θ),

(44)where the second and third equalities in (44) follow from local incentive compatibility in both models, sothat u′(θ) = ν(q(θ)) and u′s(θ) = ν(qs(θ)), whereas the inequality in (44) follows because u(θ) > u(θ) ≥u, ν(·) is (strictly) increasing, and, as argued, q(θ) > qs(θ) on [θ, θ2).

Highly-Convex Case. In this case, γ(θ) = γ. Let, then, γ ∈ [0, 1) for θ ∈ [θ, θ). The arguments forT ′(q(θ)) < T ′(qs(θ)) and q(θ) > qs(θ) when θ ∈ [θ, θ) are nearly identical to those in the weakly-convexcase. Since the IR (or BC) constraint binds for the lowest type, in the highly-convex case u(θ) = u(θ),and the strict inequality in (44) follows because q(θ) > qs(θ) for θ ∈ [θ, θ).

General Case. For any θ with γ(θ) < 1, the same argument as in the weakly-convex case establishesT ′(q(θ)) < T ′(qs(θ)) and q(θ) > qs(θ). Since u(θ) ≥ u(θ) ≥ u, an argument analogous to the one in theweakly-convex case proves that u(θ) > us(θ).Proof of Proposition 5: We divide the proofs into two parts. Recall the assumption of full participationunder nonlinear and linear pricing.

Case a). We start by showing that if qm(θ) ≥ q(θ) and the price schedule exhibits quantity discounts inthat p′(q) ≤ 0, then the utility of a consumer of type θ is higher under linear that under nonlinear pricing,that is, um(θ) ≥ u(θ). By way of contradiction, assume that qm(θ) ≥ q(θ) and p′(q) ≤ 0 but

u(θ) = θν(q(θ))− T (q(θ)) > um(θ) = θν(qm(θ))− θν ′(qm(θ))qm(θ), (45)

where in (45) we have used the fact that under linear pricing, pm = θν ′(qm(θ)). Given that qm(θ) maxi-mizes the consumer’s utility under linear pricing, it follows

θν(q(θ))− T (q(θ)) > θν(qm(θ))− θν ′(qm(θ))qm(θ) ≥ θν(q(θ))− θν ′(qm(θ))q(θ), (46)

which impliesθν ′(qm(θ)) > T (q(θ))/q(θ). (47)

Note that the first inequality in (46) restates (45) while the second inequality follows from the fact that atthe linear price pm, any quantity demanded different from qm(θ) implies a lower utility for the consumer.The inequality in (47) follows since the left-most term in (46) is greater than the right-most term.

Next, by the assumption of quantity discounts, p′(q) = [T ′(q)− T (q)/q]/q ≤ 0 or, equivalently,

T ′(q(θ)) ≤ T (q(θ))/q(θ). (48)

52

This inequality, in turn, implies

θν ′(q(θ)) = T ′(q(θ)) ≤ T (q(θ))/q(θ) < θν ′(qm(θ)) (49)

where the equality in (49) follows from local incentive compatibility, the weak inequality from (48),and the strict inequality is simply (47). Clearly, (49) implies that θν ′(qm(θ)) > θν ′(q(θ)), which is acontradiction since qm(θ) ≥ q(θ) by assumption and ν ′(·) is decreasing. Hence, um(θ) ≥ u(θ).

Case b). We now show that if q(θ) > qm(θ), γ(θ) < 1, and the price schedule exhibits quantitydiscounts in that T ′′(q) ≤ 0, then the utility of a consumer of type θ is higher under linear that undernonlinear pricing. Consider one such type, θ, with q(θ) > qm(θ) and γ(θ) < 1. Suppose first that theweakly-convex case applies. In this case, the participation (or budget) constraint binds for θ ∈ [θ1, θ2] andis slack above θ2 with γ(θ) = 1 for θ ≥ θ2. Let then θ ∈ [θ, θ2]. By way of contradiction, suppose thatu(θ) > um(θ). We will show that if so, then we will contradict the assumption that um(θ2) ≥ u(θ2), thatis, that a consumer of type θ2 participates under linear pricing. To this purpose, rewrite u(θ) > um(θ) as

u(θ2)− [u(θ2)− u(θ)] > um(θ2)− [um(θ2)− um(θ)], (50)

which can be expressed as

u(θ2)−∫ θ2

θ

u′(x)dx > um(θ2)−∫ θ2

θ

u′m(x)dx. (51)

By using u(θ2) = u(θ2), since the IR (or BC) constraint binds at θ2, local incentive compatibility in thetwo models, that is, u′(θ) = ν(q(θ)) and u′m(θ) = ν(qm(θ)), and rearranging terms, condition (51) implies

u(θ2)− um(θ2) >

∫ θ2

θ

[ν(q(x))− ν(qm(x))] dx. (52)

We now argue that the right side of (52) is positive, which establishes the desired contradiction. To seethat the right side of (52) is positive, note that for all θ ∈ [θ, θ],

pm = θν ′(qm(θ)) = θν ′(qm(θ)) ≥ θν ′(q(θ)) = T ′(q(θ)) ≥ T ′(q(θ)) = θν ′(q(θ)), (53)

where the first two equalities follow from a consumer’s first-order condition under linear pricing, which,of course, hold for all θ, the first inequality follows from q(θ) > qm(θ) by the assumption of the case andν ′(·) decreasing, the third and fourth equalities follow from local incentive compatibility, and the secondinequality holds for θ ≥ θ since q(·) is increasing by incentive compatibility and T ′(·) is decreasing byassumption. Hence, (53) implies θν ′(qm(θ)) ≥ θν ′(q(θ)) for all θ ∈ [θ, θ], and so q(θ) ≥ qm(θ) for allθ ∈ [θ, θ], given that ν ′(·) is decreasing. Thus, the right side of (52) is positive since ν(·) is increasing.Then, u(θ2) > um(θ2) so type θ2 does not participate under linear pricing. Contradiction.

Consider next the highly-convex case. By assumption, the cumulative multiplier at θ satisfies γ(θ) < 1,so the participation (or budget) constraint binds at the highest type, that is, u(θ) = u(θ). From here, wecan repeat the steps of the contradiction argument for the weakly-convex case with u(θ) replacing u(θ2)and arrive at a similar conclusion, namely that u(θ) > um(θ) contradicts the assumption that a consumerof type θ participates under linear pricing. To this purpose, rewrite u(θ) > um(θ) as

u(θ)− [u(θ)− u(θ)] > um(θ)− [um(θ)− um(θ)]⇔ u(θ)−∫ θ

θ

u′(x)dx > um(θ)−∫ θ

θ

u′m(x)dx,

53

which, by using that u(θ) = u(θ), local incentive compatibility, and rearranging terms, yields

u(θ)− um(θ) >

∫ θ

θ

[u′(x)− u′m(x)] dx =

∫ θ

θ

[ν(q(x))− ν(qm(x))] dx. (54)

As before, (53) implies that q(θ) ≥ qm(θ) for all θ ∈ [θ, θ], which yields that the right side of (54) ispositive and so u(θ) > um(θ). Thus, type θ does not participate under linear pricing. Contradiction.

The general case is a simple extension of the arguments in the weakly-convex and highly-convex cases:since in this case the IR (or BC) constraint binds for at least an interval [θ′1, θ

′2], an argument analogous to

the above one applies.Proof of Proposition 6: Since s(θ, q(θ)) ≥ u(θ) for consumers with types θ ∈ [θ′, θ′′], the seller makesnonnegative profits from each such consumer type under nonlinear pricing and these types participate bythe argument in the proof of Proposition 1. To establish the desired claim, it is sufficient to show that thereexists a subinterval of types in [θ′, θ′′], say, [θ3, θ4], who do not participate under linear pricing. For this,suppose, by way of contradiction, that all consumer types in [θ′, θ′′] participate under linear pricing. Weprove that if so, then the seller makes negative profits under linear pricing. To prove this, let θ be a type in[θ′, θ′′] such that um(θ) = u(θ). Note that for any type θ in [θ, θ′′] who participates under linear pricing, itmust be um(θ) ≥ u(θ), which can be expanded as

um(θ) = um(θ) +

∫ θ

θ

u′m(x)dx = um(θ) +

∫ θ

θ

ν(qm(x))dx

≥ u(θ) = u(θ) +

∫ θ

θ

u′(x)dx = u(θ) +

∫ θ

θ

ν(q(x))dx, (55)

where the second equality in (55) uses the fact that u′m(θ) = ν(qm(θ)) by the consumer’s first-order condi-tion under linear pricing, θν ′(qm(x)) = pm, and the last equality uses the BC homogeneity assumption thatthe reservation quantity is incentive compatible so u′(θ) = ν(q(θ)). Since, by assumption, um(θ) = u(θ),(55) implies ∫ θ

θ

ν(qm(x))dx ≥∫ θ

θ

ν(q(x))dx. (56)

Since ν(·) is increasing, (56) implies that there exists a set of positive measure, say, [θ3, θ4], for whichqm(θ) ≥ q(θ). Since q(θ) > qFB(θ) for consumers with types in [θ′, θ′′] by assumption, and ν ′(·) is(strictly) decreasing, it follows that for consumers with θ ∈ [θ3, θ4],

pm = θν ′(qm(θ)) < θν ′(qFB(θ)) = c, (57)

where the first equality follows from a consumer’s first-order condition under linear pricing and the secondequality from the first-order condition for the first-best quantity for type θ. But (57) implies that pm < c,which contradicts optimality by the seller, as the seller can always raise the linear price and at least earnzero.Proof of Proposition 7: Recall the discussion of the equivalence between the IR and BC models in Section3.2. For simplicity of exposition only, we will base our proof on the simple version of this equivalencebetween the two models by defining

I(θ, q(θ)) = θν(q(θ))− u(θ). (58)

Consider a consumer of type θ receiving a transfer τ(θ) with τ ′(θ) ≤ 0. For example, the transfer schedule

54

could be τ(θ) = τ0 + τ1θ with τ1 ≤ 0 so that τ ′(θ) = τ1. Hence, after the transfer, the consumer’s incomeis I(θ, q(θ)) + τ(θ) and the analog of condition (58) for the equivalence of the two models under the newincome schedule is

I(θ, q(θ)) + τ(θ) = θν(q(θ))− u(θ, τ), (59)

where u(θ, τ) is the corresponding new reservation utility. Subtracting (59) from (58) gives

u(θ, τ) = u(θ)− τ(θ). (60)

To develop some intuition, note that a consumer of type θ spends t(θ) to purchase q(θ) and the rest ofher income to purchase z = Y − t(θ) ≥ z(θ, q(θ)) before the transfer is introduced. If this consumerreceives a transfer of τ(θ), then the seller can increase the price t(θ) since the consumer’s ability to payhas increased. When we translate this consumer’s budget constraint back to a participation constraintusing (59), we see that the transfer amounts to a decrease in the reservation utility of the consumer bythe amount of the transfer, which reflects the fact that the seller can now ask for a higher price whilestill satisfying the consumer’s participation constraint. Given this equivalence, in the proof we need onlyshow that replacing the original IR constraint u(θ) ≥ u(θ), where u(θ) is defined in (58), with the newconstraint

u(θ) ≥ u(θ, τ) (61)

where u(θ, τ) is given in (60), leads the amount purchased of the seller’s good to increase, marginal pricesto decrease, and the total price for each quantity to increase (or decrease) under conditions.

We now proceed to the formal argument. Let {tτ (θ), qτ (θ)} denote the equilibrium allocation withparticipation constraint (61) and {t(θ), q(θ)} denote the original allocation in the absence of transfers withparticipation constraint u(θ) ≥ u(θ). The proof is articulated in several steps. The first step establishesthat the new reservation quantity qτ (θ) is greater than the original one type by type in that

qτ (θ) ≥ q(θ) for all θ. (62)

That the reservation quantity increases after the transfer follows immediately from the homogeneity as-sumption that qτ (θ) and q(θ) satisfy in that

u′(θ, τ) = u′(θ)− τ ′(θ) = ν(qτ (θ)) and u′(θ) = ν(q(θ)),

from the fact that u′(θ, τ) ≥ u′(θ), since τ ′(θ) ≤ 0, and that ν(·) increases with q. (See the SupplementaryAppendix for details about the assumptions of the IR model.) We show next that for each type, this implies

qτ (θ) ≥ q(θ) and T ′τ (q) ≤ T ′(q). (63)

Part 1: Establishing (63). Consider first the weakly-convex case and suppose that the income transfergives rise to a new weakly-convex allocation. (This result requires the transfer to be sufficiently progres-sive so that q′τ (θ) = u′′(θ, τ)/ν ′(qτ (θ)) is not too large, as consistent with the weakly-convex case.) Recallthat at the original allocation {t(θ), q(θ)}, the multipliers {γ(θ)} are such that

γ(θ) =

0 for θ < θ1

γ(θ) for θ ∈ [θ1, θ2)1 for θ ≥ θ2

. (64)

55

The new allocation {tτ (θ), qτ (θ)} has associated multipliers {γτ (θ)} of the form

γτ (θ) =

0 for θ < θ1τ

γτ (θ) for θ ∈ [θ1τ , θ2τ )1 for θ ≥ θ2τ

. (65)

Recall also that the reservation multipliers γ(θ) and γτ (θ) are defined as the multipliers that support q(θ)and qτ (θ), respectively, in that q(θ) = l(γ(θ), θ) and qτ (θ) = l(γτ (θ), θ). Since l(·, θ) decreases with γand reservation quantities are larger after the transfer as in (62), the reservation multipliers associated withthe new allocation must be smaller in that γτ (θ) ≤ γ(θ). But then from the form of the multipliers in (64)and (65), it follows that θ1τ ≥ θ1 and θ2τ ≥ θ2. We prove these last two claims separately.

a) Claim 1: θ2τ ≥ θ2. Suppose not, that is, suppose that θ2τ < θ2. Note that θ2τ ≥ θ1, otherwise thereservation quantity would be smaller for some type after the transfer. By the form of the multipliers γτ (θ)and γ(θ), it follows that γτ (θ2τ ) = γτ (θ2τ ) = 1 and, since θ1 ≤ θ2τ < θ2, that γ(θ2τ ) = γ(θ2τ ) < 1. So,γτ (θ2τ ) > γ(θ2τ ), which contradicts the fact that γτ (θ) ≤ γ(θ).

b) Claim 2: θ1τ ≥ θ1. Suppose not, that is, suppose that θ1τ < θ1. By the form of the multipliersγτ (θ) and γ(θ), it follows that, since θ1τ < θ1 ≤ θ2τ , γτ (θ1) = γτ (θ1) > γτ (θ1τ ) = γτ (θ1τ ) = 0 andγ(θ1) = γ(θ1) = 0. So, γτ (θ1) > γ(θ1), which contradicts the fact that γτ (θ) ≤ γ(θ).

Using these facts, together with the form of the multipliers, gives that γτ (θ) ≤ γ(θ). Thus, sinceq(θ) = l(γ(θ), θ), qτ (θ) = l(γτ (θ), θ), and l(·, θ) decreases with γ, it follows that qτ (θ) ≥ q(θ). In turn,since ν ′(·) is decreasing, local incentive compatibility implies

θν ′(qτ (θ)) = T ′τ (qτ (θ)) ≤ T ′(q(θ)) = θν ′(q(θ)). (66)

Thus, we have established (63).Consider now the highly-convex case. Suppose that γ ∈ (0, 1) at the original allocation, so that the

participation constraint binds for the lowest and highest types. Then, u(θ) = u(θ) and u(θ) = u(θ).Denote by uτ (θ) the utility of a consumer of type θ after transfer is introduced. As argued, under thetransfer schedule τ(θ) with τ ′(θ) ≤ 0, the new reservation utility u(θ, τ) is steeper than the original one.So, the highly-convex case still applies. In addition,∫ θ

θ

u′(x, τ)dx = u(θ, τ)− u(θ, τ) = uτ (θ)− uτ (θ) =

∫ θ

θ

u′τ (x)dx =

∫ θ

θ

ν(qτ (x))dx

≥∫ θ

θ

u′(x)dx = u(θ)− u(θ) = u(θ)− u(θ) =

∫ θ

θ

u′(x)dx =

∫ θ

θ

ν(q(x))dx. (67)

The inequality in (67) holds because u′(θ, τ) ≥ u′(θ). The second equality in (67) holds since the par-ticipation constraint binds for the lowest and highest types after the transfer too, and the fourth equalityfollows from local incentive compatibility. The equalities in the second line of (67) hold for the samereason as the equalities in the first line. Equation (67) then implies that∫ θ

θ

ν(qτ (x))dx ≥∫ θ

θ

ν(q(x))dx, (68)

which in turn yields that l(γτ , θ) = qτ (θ) ≥ q(θ) = l(γ, θ) for a set of types with positive mass. Tounderstand this implication, note that qτ (θ) = l(γτ , θ) and q(θ) = l(γ, θ) follow by construction of anoptimal allocation with and without transfers, whereas qτ (θ) ≥ q(θ) for a set of types with positive mass

56

follows from (68) given that ν(·) is increasing. But since l(·, θ) decreases with γ and the multiplier isconstant for all interior types in the highly-convex case, it must be γτ ≤ γ for types in this set. Usingagain the fact that the multiplier is constant for all interior types in the highly-convex case, we concludethat γτ ≤ γ and so qτ (θ) ≥ q(θ) for all interior types. In turn, local incentive compatibility, togetherwith ν ′(·) decreasing, immediately implies (66). Suppose now that γ = 1 at the original allocation. If themultiplier changes after the transfer, then it must decrease and so the same argument applies. If, instead,the multiplier does not change, then qτ (θ) = q(θ) and T ′τ (q) = T ′(q). Suppose, finally, that γ = 0at the original allocation. As in the previous case, if the multiplier does not change after the transfer,then qτ (θ) = q(θ) and T ′τ (q) = T ′(q). If the multiplier changes in response to the transfer, then it mustincrease, which implies qτ (θ) ≤ q(θ) and T ′τ (q) ≥ T ′(q). However, since q(θ) > l(0, θ) and qτ (θ) ≥ q(θ),it follows that qτ (θ) > l(0, θ), and so γ is still equal to zero after the transfer. Hence, qτ (θ) = q(θ) andT ′τ (q) = T ′(q). This establishes (63).

Given thatG(q) = F (θ) and qτ (θ) ≥ q(θ), it follows that in both the highly-convex and weakly-convexcase, the distribution of quantities after the transfer is introduced first-order stochastically dominates theone before the transfer is introduced.

Part 2. Establishing Tτ (q) ≥ T (q). This result is immediate. Since, as argued, the transfer amountsto a reduction in consumers’ reservation utility in that u(θ, τ) ≤ u(θ), the allocation {t(θ), q(θ)} is stillimplementable. So, the profit of the seller cannot decrease. As shown under Part 1, the offered quantity(weakly) increases for each type and so the cost of producing each type’s quantity is higher after thetransfer. Since the seller’s profit is T (q(θ))− c(q(θ)), it follows that T (q(θ)) must increase.Integrand in the Estimator for Preference Parameter Well-Defined: Consider first the highly-convexcase of the augmented model and note that

limq→q(θHC)

g(q)[T ′(q)− c]T ′(q)[γ −G(q)]

=g(q(θHC))T ′′(q(θHC))]

−T ′(q)g(q(θHC))= −T

′′(q(θHC))]

T ′(q).

A similar argument applies to the weakly-convex case when γ = 0 (for q < q1) or γ = 1 (for q ≥ q2).Consider now the weakly-convex case when q ∈ [q1, q2). Then,

limq→q(θWC)

g(q)[T ′(q)− c]T ′(q)[γ(θ(q))−G(q)]

=g(q(θWC))T ′′(q(θWC))

T ′(q(θWC))[γ′(θWC)θ′(q(θWC))− g(q(θWC))].

Example 1 (Nonlinear vs. Linear Pricing): Suppose the base utility function, ν(q), is three-parameterHARA with ν(q) = (1 − d)[aq/(1 − d) + b]d/d, a > 0, aq/(1 − d) + b > 0, and 0 < d < 1. Denote byus(θ) the utility of a type θ consumer under the standard model. With a uniform type distribution on [θ, θ],u(θ) = 0, a = c = 1, b = 0, and d = 1/2, it follows us(θ) ≥ um(θ) if, and only if, (2θ−θ)2−(2θ−θ)2 ≥θ2. When θ = 1 and θ = 2, this expression reduces to 3θ2 − 8θ + 4 ≥ 0, a polynomial with rootsθ = 2/3 and θ = 2. Thus, um(θ) ≥ us(θ) for all consumer types. Consider now the highly-convex caseof the augmented model with γ = 1/2. In this case, u(θ) ≥ um(θ) if, and only if, 3θ2 − 6θ + 2 ≥ 0.So, u(θ) ≥ um(θ) for θ ≥ 1.58. Also, all such types demand quantities above first best. Thus, not onlyu(θ) ≥ us(θ), as implied by Proposition 4, but also nearly half of the consumers prefer nonlinear to linearpricing under the augmented model.

B Estimation DetailsTo estimate the distribution of price and quantities, a seller’s marginal cost, the multipliers on the partici-pation (or budget) constraint, the distribution of consumers’ marginal willingness to pay, and consumers’

57

marginal utility function, and to perform the counterfactual exercises described in the text, we proceedaccording to the following steps.

Step 1. We compute T ′(qi), T ′′(qi), and G(qi), i = 1, . . . , N and qi ∈ {q1, . . . , qN}, from data onprices (unit values) and quantity purchases in each village as explained in the text. We fit six differentspecifications for T (q): log(T (q)) = t0 + t1 log(q)+ t2(log(q))2, T (q) = t0 + t1q+ t2q

2, T (q) = exp{t0 +t1 log q+ t2(log q)2 + t3q}, T (q) = exp{t0 + t1 log(q)}, T (q) = t0 + t1 log(q), and T (q) = log(t0 + t1q).In each village, among those specifications that imply a positive total price, T (q), at the lowest quantity, apositive marginal price at the smallest and largest quantities in a village, and satisfy a necessary conditiondescribed next for the schedule θ(q) to be increasing under the standard model, we select the specificationof T (q) that leads to the highest (adjusted) R2. Note that the condition that guarantees θ(q) increases withq under the standard model can be formulated as follows. Recall that by local incentive compatibilityθ(q) = T ′(q)/ν ′(q) so that

∂θ(q)

∂q=T ′′(q)ν ′(q)− T ′(q)ν ′′(q)

[ν ′(q)]2≥ 0⇔ T ′′(q)

T ′(q)≥ ν ′′(q)

ν ′(q).

By integrating the left-side and right-side of the above expression with respect to q, we obtain∫ q

q

T ′′(x)

T ′(x)dx ≥

∫ q

q

ν ′′(x)

ν ′(x)dx⇔ log[T ′(x)]qq ≥ log[ν ′(x)]qq ⇔

T ′(q)

T ′(q)≥ ν ′(q)

ν ′(q).

Empirically, by Perrigne and Vuong (2010), this condition can be equivalently formulated as

T ′(qi)

T ′(q1)≥ ν ′(qi)

ν ′(q1)=T ′(qi)[1− G(qi)]

1−T ′(qN )

T ′(qi) exp{−T ′(qN)

∑i−1j=1 log[1− G(qj)]

[1

T ′(qj)− 1

T ′(qj+1)

]}T ′(q1)[1− G(q1)]

1−T ′(qN )

T ′(q1)

with i = 1, . . . , N − 1. Since G(q1) ≥ 0 and T ′(q1) ≥ T ′(qN), a necessary condition is

[1− G(qi)]1−T ′(qN )

T ′(qi) exp

{−T ′(qN)

i−1∑j=1

log[1− G(qj)]

[1

T ′(qj)− 1

T ′(qj+1)

]}≤ 1.

This is the requirement that restricts our sample of 38 villages with at least 100 households consumingrice to 31 villages, as explained in the text.

Step 2. In each village, we estimate c and γ(·), and determine the relevant case of the augmentedmodel, as discussed in the text, by estimating (11) by GMM in Stata and setting a confidence level of 5 per-cent for the corresponding test procedure. We perform the remaining estimation routines in FORTRAN90.Note that the estimator of c is normally distributed, so (asymptotic) standard errors are straightforward tocompute—ignoring the estimation of the powers in the fractional polynomial for the auxiliary functionx(q). Recall that in the regular sample we focus on, all villages conform to the highly-convex case, so themultiplier γ(·) is constant (for all interior quantities) and equal to γ. Hence, γ = G(qHC) = G((T ′)−1(c))

since, by definition, T ′(qHC) = c. Since γ is a function of G(q) and c, denoting by σ2c the asymptotic

variance of c, and by G(q)[1 − G(q)] the asymptotic variance of G(q), we can easily obtain the asymp-totic variance of γ. Recall that in computing the asymptotic distribution of the estimators of c and γ, weconsider T (q) and its derivatives as known.

Step 3. Since the empirical distribution function of quantity purchases is a step function with steps atq1 < · · · < qN , the integrals in θ(q) and ν ′(q) can be rewritten as finite sums of integrals. (See Perrigne

58

and Vuong (2010) for a similar approach.) On each of these intervals G(·) is constant. We then estimateν ′(q) as

θ(q) = exp

{∑j>2,j 6=i−1

g(qj)[T′(qj)− c]

T ′(qj)[γ(θ(qj))− G(qj)](qj − qj−1)

}and ν ′(q) as

ν ′(q) = T ′(q) exp

{−∑

j>2,j 6=i−1

g(qj)[T′(qj)− c]

T ′(qj)[γ(θ(qj))− G(qj)](qj − qj−1)

}. (69)

We estimate the density of types based on (16). We compute the standard errors of θ(q) and ν ′(q) throughthe delta method. Given the asymptotic standard error of c, the fact that γ′(c) = g(qHC)/T ′′(qHC), andthe normalization θ = 1, from

θ(q) = exp

{∫ q

q

g(x)[T ′(x)− c]T ′(x)[γ −G(x)]

dx

},

we obtain

∂θ(q)

∂c= θ(q)

−∫ q

q

g(x){γ −G(x) + [T ′(x)− c] g(qHC)

T ′′(qHC)

}T ′(x)[γ −G(x)]2

dx

,

where

limq→qHC

g(x){γ −G(x) + [T ′(x)− c] g(qHC)

T ′′(qHC)

}T ′(x)[γ −G(x)]2

= 0.

In practice, we compute ∂θ(q)/∂c as

∂θ(q)

∂c= θ(q)

−∑j>2,j 6=i−1

g(qj){γ −G(qj) + [T ′(qj)− c] g(qHC)

T ′′(qHC)

}T ′(qj)[γ −G(qj)]2

(qj − qj−1)

,

where, given the granularity of the data, for the purpose of these computations, we approximated G(q) asG(q) = exp{g0 + g1q}(1 + exp{g0 + g1q}) and estimated its parameters g0 and g1 jointly with c and γ toobtain the variance-covariance matrix of the estimators of c, g0, and g1, which is then used to determinethe standard error of the estimators of θ(q) and ν ′(q). Note that for i = 0, 1,

∂θ(q)

∂gi= θ(q) exp

∫ q

q

[T ′(x)− c]{∂g(x)∂gi

[γ −G(x)]− g(x)[∂γ∂gi− ∂G(x)

∂gi

]}T ′(x)[γ −G(x)]2

dx

,

where, using the fact that γ = G(qHC) = exp{g0 + g1qHC}/(1 + exp{g0 + g1qHC}),

∂γ

∂g0

=G(qHC)

1 + exp{g0 + g1qHC}and

∂γ

∂g1

=G(qHC)qHC

1 + exp{g0 + g1qHC},

∂G(q)

∂g0

=G(q)

1 + exp{g0 + g1q}and

∂G(q)

∂g1

=G(q)q

1 + exp{g0 + g1q}.

59

With g(q) = G(q)g1/(1 + exp{g0 + g1q}), we further obtain

∂g(q)

∂g0

=g1

(∂G(q)∂g0

+ exp{g0 + g1q}[∂G(q)∂g0−G(q)

])(1 + exp{g0 + g1q})2

,

∂g(q)

∂g1

=

[∂G(q)∂g1

g1 +G(q)]

(1 + exp{g0 + g1q})−G(q)g1q exp{g0 + g1q}

(1 + exp{g0 + g1q})2.

We then estimate (11), with x(q) specified as a fractional polynomial, β0 + β1qa1 + β2q

a2 + ... + βpqap ,

and G(q) as just described simultaneously by GMM{g(q)T ′(q)− g(q)

c+ [β0+β1qa1+β2qa2+...+βpq

ap ]

c= 0

G(q)− exp{g0+g1q}1+exp{g0+g1q} = 0

, (70)

with the powers of the fractional polynomial for x(q) treated as known. By the central limit theorem,

√N

c− cg0 − g0

g1 − g1

a∼ N(0,Σ).

So,√n(θ(q) − θ(q)) ∼ N(0, DXΣD′X) with DX = (∂θ(q)/∂c, ∂θ(q)/∂g0, ∂θ(q)/∂g1). Since ν ′(q) =

T ′(q)/θ(q), then ν ′(q) is (asymptotically) normally with mean zero and variance σ2θ [∂ν

′(q)/∂θ(q)]2 =

σ2θ [T′(q)/θ2(q)]2. Given θ(q) and f(θ), following Hall (1992), we compute the sample analog of the

variance of the kernel density estimator of f(θ) as

s2(θ) =1

(Nhθ)2

∑N

i=1Kθ

(θ − θihθ

)2

− [f(θ)]2

N,

and use it to produce asymptotic confidence (variability) bounds around the estimated density.Step 4. We calculate consumer surplus from quantity q under nonlinear pricing as CSnp(q) =

θ(q)ν(q) − T (q) and across quantities as CSnp =∫ qqCSnp(x)dG(x) =

∫ qq

[θ(x)ν(x) − T (x)]dG(x).

Note that θν(q) = θ[ν(q) + ν(q)− ν(q)] = θ[ν(q) +∫ qqν ′(x)dx]. Since all relevant variables are discrete,

we compute ν(q1) as q1ν ′(q1), and

ν(qi) = q1ν ′(q1) +∑i−1

j=1(qj+1 − qj)ν ′(qj+1) (71)

for i > 1. Specifically, ν(q2) = q1ν ′(q1) + (q2 − q1)ν ′(q2) = ν(q1) + (q2 − q1)ν ′(q2), ν(q3) = q1ν ′(q1) +(q2− q1)ν ′(q2) + (q3− q2)ν ′(q3) = ν(q2) + (q3− q2)ν ′(q3), and so on. Therefore, ν(qi) = ν(qi−1) + (qi−qi−1)ν ′(qi), i > 1. Accordingly, we compute consumer surplus as

CSnp =∑N

i=1[θ(qi)ν(qi)− T (qi)]rq(qi),

where rq(q1) = G(q1) and rq(qi+1) = G(qi+1)−G(qi), i = 1, . . . , N−1. Similarly, we compute producer

60

surplus as

PSnp =∑N

i=1[T (qi)− cqi] rq(qi).

Step 5. We describe here only how we perform the counterfactual exercise described in Section5.5 under the augmented model, as it is the most involved. When comparing consumer, producer, andsocial surplus under nonlinear and linear pricing, we compute a seller’s linear price and the correspondingindividual and aggregate demands as follows. From the consumer’s first-order condition θν ′(q) = pm, itfollows

q = q(pm, θ) = (ν ′)−1(pmθ

), (72)

where qm(θ) ≡ q(pm, θ) denotes the quantity demanded by a consumer of type θ under linear pricing.With Q(pm) =

∫ θθqm(x)f(x)dx, pm solves the problem

maxpm

[(pm − c)Q(pm)].

Hence, under linear pricing, (total) consumer and producer surplus are, respectively, given by

CSlp =

∫ θ

θ

CSlp(x)f(x)dx =

∫ θ

θ

[xν(qm(x))− pmqm(x)] f(x)dx,

PSlp = (pm − c)Q(pm) = (pm − c)∫ θ

θ

qm(x)f(x)dx.

Social surplus is simply the sum ofCSlp and PSlp. To solve for pm, we determine a grid p = (pm1, . . . , pmP )of 5,000 points, and for each θi we compute the schedule

q((pm1, . . . , pmP ), θi) = (q(pm1, θi), . . . , q(pmP , θi)).

To do so, given a grid q = (q1, . . . , qmax) of 5,000 equidistant points for candidate quantities, we determinethe quantity chosen by type θi for each possible price pmp, p = 1, . . . , P , by solving the system

θiν ′(q1) = pmp. . .

θiν ′(qmax) = pmp

for each pmp and selecting q(pmp, θi) as the grid quantity for which the difference |θiν ′(qg) − pmp| issmallest. Then, the quantity demanded by type θi at price pmp is

q(pmp, θi) =

{q(pmp, θi), if θiν(q(pmp, θi))− pmpq(pmp, θi) ≥ u(θi)0, otherwise

,

with u(θi) determined as detailed in Section 5.5. Aggregate demand for each price pmp, p = 1, . . . , P , is

Q(pmp) =∑N

i=1q(pmp, θi)rθ(θi),

where rθ(θ1) = G(q1) and rθ(θi+1) = G(qi+1)−G(qi) for i = 1, . . . , N − 1. We then solve for the price

61

p∗m such that (pm − c)Q(pm) is maximal. Finally, we calculate consumer surplus as

CSlp =∑N

i=1rq(q(p

∗m, θi)) max{θiν(q(p∗m, θi))− p∗mq(p∗m, θi), u(θi)},

where rq(q(p∗m, θi)) = rθ(θi), ν(q(p∗m, θ1)) = q(p∗m, θ1)ν ′(q(p∗m, θ1)), and

ν(q(p∗m, θi)) = q(p∗m, θ1)ν ′(q(p∗m, θ1)) +∑i−1

j=1[q(p∗m, θj+1)− q(p∗m, θj)]ν ′(q(p∗m, θj+1)),

for i > 1. Similarly, we compute producer surplus as PSlp = (p∗m − c)Q(p∗m).

B.1 Estimation Results: Regular SampleWe report in Figure 12 the schedule of marginal prices and the estimated marginal cost in each of the 11villages in our regular sample: we order the 11 villages according to the value of the multiplier on theparticipation (or budget) constraint, from lowest to highest, as reported in Figure 5 in the text. Note thatthe small discrepancy between the estimates of marginal cost reported in the plots in Figure 12 and thosereported in Figure 5 is due to rounding error. The reason is as follows. In each village, we estimate qHCas the quantity at which the difference between T ′(q) and the estimate c of marginal cost is smallest. Wethen use the estimate of marginal cost given by T ′(qHC) rather than c, and the corresponding estimate ofthe multiplier given by γ = G(qHC), when computing θi and ν ′(qi) so as to ensure that the integrand termin the expressions defining θi and ν ′(qi) is well-defined at points of singularity—these are the estimatesof c and γ reported in Figure 5. We also display the estimates of the type support, the probability densityfunction of types, and the base marginal utility function for each quantity in each village, together withconfidence bounds (for the estimates of the type support and the base marginal utility function) or point-wise asymptotic variability bounds (for the estimates of the density function) in Figures 13 to 15. Notethat for the density estimates, the bounds are centered on f(θ) and ignore the bias of the estimates.

B.2 Counterfactuals: Nonlinear vs. Linear Pricing Under Standard ModelAn intuitive rationale for the findings in Section 5.5 on the difference in consumer and social surplusunder nonlinear and linear pricing predicted by the standard model is as follows. As discussed, for givenconsumer types, the standard model implies lower consumption levels relative to our model. Hence,the standard model accounts for observed quantities and marginal prices by ascribing a higher marginalwillingness to pay, θ, to households purchasing any given quantity. But since θν ′(q) = T ′(q), for a givenT ′(q) schedule and higher implied θ’s, the standard model must also imply lower base marginal utilitiesthan our model. Indeed, by comparing Figure 11 to Figures 6 and 7, it is apparent that the standard modelleads to higher estimates of consumers’ marginal willingness to pay for nearly all quantities (left panel ofFigure 11) and somewhat smaller estimates of base marginal utility from any given quantity (right panelof Figure 11). These higher type estimates, in turn, imply a greater sensitivity of consumer and socialsurplus to changes in the pricing scheme, as the experiment on linear pricing has shown.

B.3 Estimation Results: Non-regular SampleIn Figures 16 to 19, we report estimates of the type support, the density function of types, the base marginalutility function, and the multipliers on the participation (or budget) constraint, together with confidencebounds (for the estimates of the type support, the base marginal utility function, and the multipliers) orpointwise asymptotic variability bounds (for the estimates of the density function), for each quantity ineach of the 13 villages in our non-regular sample, which do not conform either to the highly-convex or theweakly-convex case of the augmented model. In these villages, we estimate by GMM the type support,

62

Figure 11: Primitives Under Standard Model

05

1015

20E

stim

ated

Typ

e

0 1 2 3Quantity

Standard Model: Type Support

02

46

8E

stim

ated

Bas

e M

argi

nal U

tility

0 1 2 3Quantity

Standard Model: Base Marginal Utility

θ(q), base marginal utility, ν ′(q), and the multipliers, γ(θ(q)), from the system{log(T ′(q))− log (θ1q)+(1− d) log

(q

1−d

)= 0

γ(θ(q)) + 1q

[c

T ′(q)− 1]g(q)−G(q) = 0

, (73)

with γ(θ(q)) specified as γ(θ(q)) = exp{ϕq}/(1 + exp{ϕq}) except for villages 3, 4, 8, 9, 10, and 13,where we set ϕ = θ1. In these villages, the small number of points in the support of quantities purchasedrendered the convergence of the GMM routine problematic. We estimate the density function of types bythe same procedure we used for the regular sample, outlined above.

63

Figure 12: Marginal Price Schedule and Estimated Marginal Cost for Regular Sample

23

45

67

89

0 1 2 3Quantity

Village 1: Marginal Price Schedule and Marginal Cost

23

45

67

89

0 1 2 3Quantity


23

45

67

89

0 .5 1 1.5 2Quantity


23

45

67

89

0 1 2 3Quantity


23

45

67

89



23

45

67

89

0 1 2 3Quantity


23

45

67

89

0 1 2 3Quantity


23

45

67

89

0 1 2 3Quantity


23

45

67

89

0 1 2 3Quantity


23

45

67

89



23

45

67

89



64

Figure 13: Confidence Bounds for Type Estimates for Regular Sample

-10

12

34

5T

ype

.2 .4 .6 .8 1Quantity

Type Estimates in Village 1

-10

12

34

5T

ype

0 .5 1 1.5Quantity


-50

510

15T

ype



-10

12

34

5T

ype



-10

12

34

5T

ype



-10

12

34

5T

ype



-10

12

34

5T

ype



-10

12

34

5T

ype

0 .5 1 1.5 2 2.5Quantity

Type Estimates in Village 8-1

01

23

45

Typ

e

.2 .4 .6 .8 1Quantity


-10

12

34

5T

ype



-10

12

34

5T

ype

.2 .4 .6 .8 1Quantity


65

Figure 14: Variability Bounds for Density Function Estimates for Regular Sample

05

1015

20D

ensi

ty F

unct

ion

1 1.02 1.04 1.06Type

Density Function in Village 1

-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.5 2 2.5Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 2 3 4 5 6Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.2 1.4 1.6 1.8Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.5 2 2.5Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.5 2 2.5 3Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.2 1.4 1.6 1.8Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 2 3 4 5Type

Density Function in Village 8-.

50

.51

1.5

2D

ensi

ty F

unct

ion

1 1.5 2 2.5Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.2 1.4 1.6 1.8Type


-.5

0.5

11.

52

Den

sity

Fun

ctio

n

1 1.5 2 2.5Type


66

Figure 15: Confidence Bounds for Base Marginal Utility Estimates for Regular Sample

02

46

810

12B

ase

Mar

gina

l Util

ity

.2 .4 .6 .8 1Quantity

Base Marginal Utility Estimates in Village 1

02

46

810

12B

ase

Mar

gina

l Util

ity

0 .5 1 1.5Quantity


02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity

0 .5 1 1.5 2 2.5Quantity


24

68

1012

Bas

e M

argi

nal U

tility

.2 .4 .6 .8 1Quantity


02

46

810

12B

ase

Mar

gina

l Util

ity



02

46

810

12B

ase

Mar

gina

l Util

ity

.2 .4 .6 .8 1Quantity


67

Figure 16: Estimates and Confidence Bounds for Types for Non-regular Sample0

510

15E

stim

ated

Typ

e

0 1 2 3Quantity

Village 1

05

1015

Est

imat

ed T

ype

.5 1 1.5 2Quantity

Village 2

05

1015

2025

Est

imat

ed T

ype

0 1 2 3Quantity

Village 3

24

68

1012

Est

imat

ed T

ype


Village 40

510

15E

stim

ated

Typ

e

0 1 2 3Quantity

Village 5

05

1015

Est

imat

ed T

ype

0 1 2 3Quantity

Village 6

05

1015

Est

imat

ed T

ype

0 1 2 3Quantity

Village 7

05

1015

Est

imat

ed T

ype


Village 8

02

46

810

Est

imat

ed T

ype


Village 9

05

1015

20E

stim

ated

Typ

e

0 1 2 3Quantity

Village 10

05

1015

20E

stim

ated

Typ

e

0 1 2 3Quantity

Village 11

05

1015

2025

Est

imat

ed T

ype

0 1 2 3Quantity

Village 12

05

1015

20E

stim

ated

Typ

e


Village 13

68

Figure 17: Estimates and Confidence Bounds for Density Function Estimates for Non-regular Sample0

.05

.1.1

5D

ensi

ty F

unct

ion

0 5 10 15Type


-.05

0.0

5.1

.15

Den

sity

Fun

ctio

n

4 6 8 10 12 14Type


0.0

5.1

Den

sity

Fun

ctio

n

0 5 10 15 20Type


-.05

0.0

5.1

.15

Den

sity

Fun

ctio

n

2 4 6 8 10Type


.05

.1.1

5D

ensi

ty F

unct

ion

0 5 10 15Type


0.0

5.1

.15

Den

sity

Fun

ctio

n

0 5 10 15Type


0.0

5.1

.15

Den

sity

Fun

ctio

n

0 5 10 15Type


-.1

0.1

.2.3

Den

sity

Fun

ctio

n

2 4 6 8 10 12Type


-.05

0.0

5.1

.15

.2D

ensi

ty F

unct

ion

0 2 4 6 8 10Type


0.0

2.0

4.0

6.0

8.1

Den

sity

Fun

ctio

n

0 5 10 15 20Type


0.0

2.0

4.0

6.0

8.1

Den

sity

Fun

ctio

n

0 5 10 15 20Type


0.0

2.0

4.0

6.0

8D

ensi

ty F

unct

ion

0 5 10 15 20Type


0.0

5.1

.15

Den

sity

Fun

ctio

n

0 5 10 15Type


69

Figure 18: Estimates and Confidence Bounds for Base Marginal Utility for Non-regular Sample0

24

68

10E

stim

ate

Bas

e M

argi

nal U

tility

0 1 2 3Quantity

Village 1

.51

1.5

22.

53

Est

imat

e B

ase

Mar

gina

l Util

ity

.5 1 1.5 2Quantity

Village 2

05

1015

Est

imat

e B

ase

Mar

gina

l Util

ity

0 1 2 3Quantity

Village 3

05

1015

Est

imat

e B

ase

Mar

gina

l Util

ity


Village 40

510

1520

Est

imat

e B

ase

Mar

gina

l Util

ity

0 1 2 3Quantity

Village 5

05

1015

Est

imat

e B

ase

Mar

gina

l Util

ity

0 1 2 3Quantity

Village 6

02

46

810

Est

imat

e B

ase

Mar

gina

l Util

ity

0 1 2 3Quantity

Village 7

02

46

8E

stim

ate

Bas

e M

argi

nal U

tility


Village 8

05

1015

Est

imat

e B

ase

Mar

gina

l Util

ity


Village 9

02

46

8E

stim

ate

Bas

e M

argi

nal U

tility

0 1 2 3Quantity

Village 10

02

46

8E

stim

ate

Bas

e M

argi

nal U

tility

0 1 2 3Quantity

Village 11

01

23

45

Est

imat

e B

ase

Mar

gina

l Util

ity

0 1 2 3Quantity

Village 12

.51

1.5

22.

5E

stim

ate

Bas

e M

argi

nal U

tility


Village 13

70

Figure 19: Estimates and Confidence Bounds for Multipliers for Non-regular Sample.6

.7.8

.91

Est

imat

ed M

ultip

liers

0 1 2 3Quantity

Village 1

.6.8

11.

2E

stim

ated

Mul

tiplie

rs

.5 1 1.5 2Quantity

Village 2

.75

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs

0 1 2 3Quantity

Village 3

.75

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs


Village 4.4

.6.8

11.

2E

stim

ated

Mul

tiplie

rs

0 1 2 3Quantity

Village 5

.4.6

.81

1.2

Est

imat

ed M

ultip

liers

0 1 2 3Quantity

Village 6

.75

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs

0 1 2 3Quantity

Village 7

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs


Village 8

.75

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs


Village 9

.8.8

5.9

.95

1E

stim

ated

Mul

tiplie

rs

0 1 2 3Quantity

Village 10

.85

.9.9

51

Est

imat

ed M

ultip

liers

0 1 2 3Quantity

Village 11

.6.7

.8.9

11.

1E

stim

ated

Mul

tiplie

rs

0 1 2 3Quantity

Village 12

.85

.9.9

51

Est

imat

ed M

ultip

liers


Village 13

71

Nonlinear Pricing in Village Economies - Stanford University · PDF fileNonlinear Pricing in...

Documents

Transcript of Nonlinear Pricing in Village Economies - Stanford University · PDF fileNonlinear Pricing in...