Prediction and Determination of Household Permanent...

32
Prediction and Determination of Household Permanent Income Ramses Abul Naga DEEP Univiversite de Lausanne Robin Burgess Department of Economics London School of Economics October 2001 Abstract This paper develops a framework for looking at the determination and pre- diction of permanent income and applies it to Chinese household data. By pooling information from both determinants and correlates of permanent in- come we are able to derive predictors of permanent income which statistically outperform observed income and consumption. Simultaneous estimation of the model enables us to compare how well dierent observable welfare indicators identify household permanent income. Our framework also provides insights into the processes generating permanent income and allows us to draw in- ferences concerning the power of dierent forms of public policy to inuence permanent income. Keywords: permanent income, welfare measurement, prediction, living stan- dards, China JEL classication : I30, C31, 012 We are grateful to seminar participants at the University of Geneva, London School of Eco- nomics, University of Lausanne, University of Namur, Stanford University, University College Lon- don and to Timothy Besley, Richard Blundell, MarcelFafchamps, Arthur Goldberger, Alberto Holly, John Muellbauer and Christian Schluter for useful comments and suggestions. We are grateful to the Chinese State Statistical Bureau for the data used in this paper Corresponding author. Department of Economics, London School of Economics, Houghton Street, London WC2A 2AE, UK, [email protected] 1

Transcript of Prediction and Determination of Household Permanent...

Page 1: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Prediction and Determination of HouseholdPermanent Income∗

Ramses Abul NagaDEEP

Univiversite de Lausanne

Robin Burgess†

Department of EconomicsLondon School of Economics

October 2001

Abstract

This paper develops a framework for looking at the determination and pre-diction of permanent income and applies it to Chinese household data. Bypooling information from both determinants and correlates of permanent in-come we are able to derive predictors of permanent income which statisticallyoutperform observed income and consumption. Simultaneous estimation of themodel enables us to compare how well different observable welfare indicatorsidentify household permanent income. Our framework also provides insightsinto the processes generating permanent income and allows us to draw in-ferences concerning the power of different forms of public policy to influencepermanent income.

Keywords: permanent income, welfare measurement, prediction, living stan-dards, China

JEL classification: I30, C31, 012

∗We are grateful to seminar participants at the University of Geneva, London School of Eco-nomics, University of Lausanne, University of Namur, Stanford University, University College Lon-don and to Timothy Besley, Richard Blundell, Marcel Fafchamps, Arthur Goldberger, Alberto Holly,John Muellbauer and Christian Schluter for useful comments and suggestions. We are grateful tothe Chinese State Statistical Bureau for the data used in this paper

†Corresponding author. Department of Economics, London School of Economics, HoughtonStreet, London WC2A 2AE, UK, [email protected]

1

Page 2: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

1 Introduction

Current income and consumption are imperfect proxies of permanent income whichis the variable of real interest in a wide variety of policy settings. This is bothbecause of measurement error but also because they contain a high transitory com-ponent. Recognising these problems, a literature has emerged which seeks to identifythe most satisfactory indicator (often defined as the least noisy correlate) of perma-nent income (Deaton, 1992; Anand and Harris 1993; Chauduri and Ravallion, 1994;Blundell and Preston, 1998). This paper extends this literature by modelling perma-nent income as an unobservable by using information relevant to its determinationto obtain predictors of the latent variable. The key contributions of the paper aretwofold. First, we are able to derive predictors which pool information both fromdeterminants and correlates of permanent income. These are shown to statisticallyoutperform definitions which are based on one set of variables and not the other.Second, the analytical framework we develop enables us to gain deeper insights intothe processes generating permanent income. This in turn allows us to draw inferencesconcerning the power of different forms of public policy to influence long-run livingstandards.The usefulness of our theoretical framework for the analysis of living standards is

tested through an application to household data from two provinces in rural Chinawhich are at different stages of economic development. As data on both currentincome and consumption are collected we are able contrast how well each performsin predicting permanent income. Estimation of the permanent income equation alsoprovides insights into which factors have been effective in influencing long-run livingstandards. This has implications for public policy as it illustrates how influencingvariables other than income and consumption can be instrumental in raising long-run living standards. These inferences cannot be made from standard reduced formestimates. And this is relevant in rural China where welfare policy operates pre-dominately through non-income channels (e.g. land reform). Finally, we are ableto demonstrate how constructed predictors of permanent income outperform staticindicators in terms of predicting household permanent income.The outline for the remainder of the paper is as follows. Section 2 discusses

the theory behind estimation and prediction in a model where permanent incomeis treated as latent variable. Section 3 discusses the application of our theoreticalframework to rural China and distills the key lessons we gain in terms of the analysisof living standards. Section 4 concludes.

2 Theory

In their work on income inequality in the United States, Friedman and Kuznets (1945)suggested that observed income at a particular point in time could be decomposed intothe sum of an individuals long run income together with a transitory component. In

2

Page 3: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

his subsequent work on the consumption function, Friedman (1957) proposed a similardecomposition for current consumption: observed consumption was hypothesised tobe a function of the households permanent income and a transitory component.Letting Y denote the household’s unobserved permanent income, X1 and X2 de-

note their current income and consumption expenditure respectively and definingu1 and u2 as the corresponding transitory components, we can write the followingsystem:

X1 = Y + u1 (1)

X2 = β2Y + u2 (2)

we then augment the above system with an equation explaining Y :

Y = γ0Z + ² (3)

where Z is k × 1 column vector and ² is a disturbance term.These Z variables should represent longer term characteristics of households which

have bearing on the determination of permanent income and are likely to be measuredwith less noise than income or expenditure. Following household production theory(Muellbauer, 1983; Singh, Squire and Strauss, 1986) we employ a specification:

Y = γ1Dh + γ2Eh + γ3Ah + γ4Ch + ² (4)

where permanent income (Y ) is seen to depend on household composition (Dh), edu-cational and occupational status (Eh), stock of physical assets (Ah) and community orenvironmental characteristics such as access to amenities (Ch). Permanent income isthus viewed as a function of the human and non-human capital of the household con-ditioned by its composition which controls for position in the life-cycle (see Deaton,1992).Equation (3) is commonly referred to as the causes equation, and thus the system

consisting of (1), (2) and (3) is known as the multiple indicators-multiple causes(MIMIC) model (Zellner, 1970; Joreskog and Goldberger, 1975).Our paper extends previous work on this model by deriving alternative predictors

for the unobserved variable Y by pooling information both on X and Z variables.This new framework, which emphasises both the correlates and determinants of per-manent income, is used to extend the literature on the analysis of living standardsin three directions. (i) Decomposition of variance of static indicators into permanentand transitory components allows us to select the least noisy indicator of permanentincome. (ii) Examination of the causal equation (3) provides insights into the deter-minants of permanent income and the design of public policy to support long runliving standards. (iii) The predictors we derive, which make more intensive use ofwelfare relevant information in household data, can be shown to outperform standardalternatives in terms of predicting permanent income and identifying chronically poorhouseholds.

3

Page 4: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

The key assumption in our model is that the observed correlation between incomeand expenditure is induced by their joint dependence on the household’s permanentincome (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979; Muellbauer, 1983). Sucha formulation is consistent with the permanent income hypothesis but also covers abroader class of stochastic models where consumption exhibits the martingale prop-erty (see Deaton, 1992: 22-29). For the purpose of the present paper, it suffices to seecurrent income and expenditure being modelled as linear functions of a common setof explanatory variables Z, a common error term ², as well as specific disturbancesui for each of the X variables.The reduced form of the model can be obtained by substituting for Y (equation

(3)) into (1) and (2):

X1 = γ0Z + ²+ u1 (5)

X2 = β2γ0Z + β2²+ u2

i.e. X = βγ0Z + β²+ u

where β0 = [1 β2] and X0 = [X1 X2].

1 Let νj denote the disturbance term for thereduced form equation (5):

νj = βj²+ uj (6)

Also let Π = βγ0 denote the matrix of reduced form parameters.The causal structure of the model can be depicted in terms of a path diagram

(see Figure 1). In path diagrams observed variables are squared, latent variables arecircled, and disturbance terms are unsigned. The path diagram indicates that theerror terms, ² and u, are all assumed to be uncorrelated and the X’s are influenced bythe Z’s and ² (through their dependence on Y ). Throughout the paper we maintainthe following assumptions:

E(u2i ) = ωii, E(uiuj) = 0 (7)

cov(Zε) = [0], cov(Y u) = [0], cov(²u) = [0]

where [0] denotes a matrix of zeros of appropriate order.The two main features of the model are that (i) the vector of reduced form coef-

ficients is of the multiplicative form:

Πi = βiγ0 (8)

so that the matrix Π = βγ0 is of rank 1, and (ii) the covariance matrix of ν has afactor analytic structure (see Joreskog and Goldberger, 1975):

Σν = E

Ã"ν1ν2

# hν1 ν2

i!=

"σ²² + ω11 β2σ²²β2σ²² β22σ²² + ω22

#(9)

1An alternative normalisation device is to set variance of the causes equation equal to unity (i.e.σ²² = 1) and β1 to be unconstrained (see Joreskog and Goldberger, 1975).

4

Page 5: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

viz. it is the sum of a matrix of unit rank and a diagonal matrix of full rank:

Σν = σ²²ββ0 + Ω (10)

where Ω is the diagonal covariance matrix of the u’s whose elements are defined inequation (7).Let δ denote the set of structural parameters and Σ(δ) be the covariance matrix

implied by our statistical model. A general procedure for estimating models withlatent variables consists of choosing estimates of δ in order to minimize a distancefunction F [Σ(δ), S] between the model covariance matrix and its sample counterpartS. For the applications of Section 3, a generalised least sqaures procedure is usedin order to minimise the discrepancy between the covariance matrix of the data andthat implied by the model (Browne, 1977).Since permanent income is assumed unobservable in our framework, one of our

essential tasks is to draw inferences about this latent variable using the data onliving standards available to us. As the model contains two sets of variables, X, thecorrelates, and Z, the causes, we can essentially follow three alternative routes toprediction: (i) the X predictor (Y ∗X) — a predictor of Y using the correlates X, (ii)the Z predictor (Y ∗Z ) — a predictor of Y based on the causes equation (3), (iii) the Wpredictor (Y ∗W ) — a predictor of Y which combines the available information in bothX and Z variables.The Z predictor readily obtains from the “causes” equation (3):

Y ∗Z = E(Y ) = γ0Z

From this it follows that the Z predictor can be obtained from the regression functionof Y on the causal variables (Z). In the context of cross-section data where error termsare assumed uncorrelated across family units, the regression function constitutes theminimum square error predictor of Y using Z.The best linear predictor of Y using X, obtained subject to an unbiasedness

constraint (see the Appendix), can be expressed as:

Y ∗X = (β0Σ−1X β)−1β0Σ−1X X (11)

where ΣX is the covariance matrix of X given in expression (A1) in the Appendix.For the W predictor (the predictor combining X and Z information simultaneously)we obtain:

Y ∗W = σ²²β0Σ−1ν X + (1− σ²²β

0Σ−1ν β)γ0Z (12)

By substituting (5) into (11), we have that:

Y ∗X = (β0Σ−1X β)−1β0Σ−1X (βγ

0Z + β²+ u)

i.e.Y ∗X = γ0Z + η

5

Page 6: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

where η is white noise, viz E(η) = 0. Likewise for the W predictor :

Y ∗W = σ²²β0Σ−1ν (βγ

0Z + β²+ u) + (1− σ²²β0Σ−1ν β)γ0Z

which can be expressed as:Y ∗W = γ0Z + φ

where also E(φ) = 0. For both the X and W predictors we can then write thedecomposition:

Y ∗i = Y∗Z + white noise, i = X,W (13)

It is in this sense that the predictors are unbiased. It also follows that the X and Wpredictors will have higher variances than the Z predictor.Going back to equation (12) we can express Y ∗W as a weighted average of Y ∗X and

Y ∗Z :Y ∗W = θY ∗X + (1− θ)Y ∗Z where θ = σ²²β

0Σ−1ν β (14)

It is shown in the Appendix that Y ∗W dominates either of Y ∗X and Y ∗Z in terms ofmean squared error, and that 0 < θ < 1. The intuition for this result follows fromthe fact that θ is increasing in σ²² (equation (A7)). This implies that in order tominimise prediction mean-square error it is necessary to pool the information aboutY contained in Y ∗X and Y

∗Z and this is precisely what the Y ∗W achieves. As σ²² goes to

zero, (that is, as Z accounts for more of the variation in Y ) θ goes to zero, so that Y ∗Wrelies on Z variables. Conversely, as σ²² rises, θ increases and more weight is beingplaced on X variables in Y ∗W . Note finally that while the X and Z predictors can becompared in variance terms (see (13)), they cannot be ranked in terms of predictionmean squared error (see equations (A9) and (A10) of the Appendix.Given that the variance of the Z predictor underestimates the true variance of per-

manent income, this raises the possibility that the Z predictor is worse at predictingthe permanent incomes of individuals at the extremes, so that it may be of limiteduse in identifying the households with chronically low standards of living. Since thevariance of permanent income lies somewhere between the variance of current incomeand that of the Z predictor2 there may be grounds for believing that the Z predic-tor may be more precise at predicting middle ranges of the distribution of Y , butthat magnitudes of prediction errors may be larger at the tails. It is perhaps therethat using the X variables adds most information about the permanent incomes ofhouseholds. It is in this sense focussing our attention on unbiasedness and efficiencycriteria exclusively may not be appropriate.There is yet another way in which the superiority of the W predictor over the Z

and X predictors may be advocated. In the case of living standard analysis, wheretypically long run income is not observed, we believe that a predictor of permanentincome should ideally be chosen in a way as to exhaust all the sample informationon the underlying latent variable. In technical terms, a statistic which contains as

2This follows from reading the reduced form equations (5).

6

Page 7: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

much information about permanent income as the data itself, is said to be sufficient(Bartholomew, 1984). While sufficient statistics do not exist for all inferential prob-lems, the concept is useful for identifying approaches to the prediction of permanentincome which do not meet this particular criterion. Thus, for instance, a researcherwho possesses data on a households consumption expenditure and annual earnings,and chooses to predict permanent income using consumption expenditure only (onthe grounds of lower noise) will not be using a sufficient statistic for the unobservedvariable. Unless, earnings contain no other information about permanent income thanthe information already available in consumption, the approach of predicting perma-nent income using its least noisy indicator cannot be recommended on sufficiencygrounds. Furthermore, for data from low income countries, where measurement er-rors may be high, the need to exploit all the available information on long run incomeis all the more necessary.Since both X and Z contain information on Y , a statistic which does not use

both sets of variables cannot exhaust all the available sample information on Y .Predictors based on X without Z, or on Z excluding X, do not exhaust all thesample information on Y , and thus will fail to meet the sufficiency requirement. Onthe basis of this sufficiency result, it is clearly preferable to employ Y ∗W as a predictorof permanent income as opposed to Y ∗X or Y

∗Z . In essence, the Z predictor throws away

the welfare relevant information contained in the correlates of permanent income (i.e.current income and consumption) which on a priori grounds would be thought tobe highly relevant to the prediction of permanent income. These considerations leadus to select the W predictor as our preferred indicator of permanent income. Thispredictor effectively balances the contribution of X1 (current income), X2 (currentconsumption) and Z information to prediction according to the degree to which theyare correlated with unobserved permanent income (Y ).This does not imply the X predictor is useless for practical work. First of all,

the researcher or policy maker may not have the necessary data on the Z variablesto make a prediction about the family’s long run income. Observing current incomeand expenditure may also be at times less costly. In our empirical application sectionwe show that the weights on income and consumption in the X predictor tend to fallwithin the 0.45− 0.55. As a first hand approximation one could therefore compute apredictor of permanent income as an average of current income and consumption.There is also the fact that the X predictor is based solely on correlates of perma-

nent income. It is known from previous studies on living standards that income andconsumption do not identify the same households as being in poverty (e.g. Glewweand van der Gaag, 1990; Chaudhuri and Ravallion, 1994; Blundell and Preston, 1998).On such basis the X predictor can offer a sensible compromise between relying onincome data on the one hand, versus consumption exclusively, in identifying house-holds with low standards of living. As shown in our application section, by predictingpermanent income using Z variables we do not necessarily identify a larger set of poorindividuals than that which is common to income and expenditure definitions. This

7

Page 8: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

is not an undesirable feature of Y ∗Z (and Y∗W ), but rather a statement of the fact that

poverty of productive resources or opportunities (Z space) need not be equivalent topoverty measured as a function of outcomes (i.e. X variables).

3 Empirical Application

In this section we estimate our theoretical model using rural household data fromthe worlds most populous country, China. The question of what determines livingstandards and which factors China’s explain achievements in constraining poverty andsupporting welfare in the countryside is one that has attracted intense scrutiny (seee.g. World Bank, 1992; Putterman, 1993; Burgess, 1997, 2001). However, it is onlyrecently that representative household expenditure has been released by the StateStatistical Bureau (SSB). As a result we know little about living standards in ruralChina. This would appear to be a serious gap in our knowledge given the numbersinvolved — roughly eight hundred million people were living in rural China in 1990,the year for which we have data. In this paper we exploit the theoretical frameworkdeveloped in Section 2 to gain insights into the measurement and determination ofliving standards in rural China. We focus on two provinces, Sichuan and Jiangsu,which differ markedly as regards economic growth and the development of markets,thus allowing us to contrast the factors affecting living standards. Taken togetherour empirical results serve to demonstrate the usefullness of the theoretical frameworkdeveloped in Section 2 as a practical tool for the analysis of living standards. Ourempirical application is divided into four subsections. In the first, we discuss dataand specification of the model. In the second, we contrast different indicators interms of their ability to proxy household permanent income. In the third we examinethe causes equation to gain insights into which factors play a prominent part in thedetermination of household permanent income in China. In the fourth, we examinethe performance of the different predictors of permanent income in the light of variousstatistical criteria.

3.1 Data and Model Specification

The data used in this paper are drawn from two provincial sub-samples of the RuralHousehold Sample Survey conducted by the State Statistical Bureau (SSB) of thePeople’s Republic of China (see Burgess andWang, 1995). Given scarcity of householddata on rural China during the transition period they are of considerable interest.We focus on data from two provinces which represent the two faces of modern ruralChina. Sichuan is poorer, less diversified and more agricultural, and is located inthe backward interior of China while Jiangsu is richer, has a high degree of ruralindustrialisation and is located on the fast growing coastal rim adjacent to Taiwan(see Table 1). If we rank the rural sectors of Chinese provinces according to percapita expenditure (PCE) in 1990, Jiangsu is located near the top of the distribution

8

Page 9: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

whilst Sichuan is located in the lower half of the distribution. Using our theoreticalframework we contrast we derive and contrast alternative proxies of permanent incomeand examine the processes generating permanent income in the these two provinces.As is evident from Table 1, sampling is multistage in design. One third of the

counties in a province are sampled, ten villages are drawn from each county and tenhouseholds from each village. What drew us to the data set is the fact that there areseparate, highly detailed series on both income and consumption which are collectedthroughout the course of the entire year on a daily basis.3 This feature is unusualin household data sets and allows us to contrast the performance of income andconsumption in identifying household permanent income. Treatment of householdproduction and consumption in the data is highly detailed and care has been takento ensure that home produced consumption is correctly imputed. The fact that dailydata entry in household log books is closely monitored by a resident village enumeratorand subjected to a rigorous system of cross-checks by SSB officials at different levelsis also likely to add to the robustness of the results (see Burgess and Wang, 1995).The Chinese data set contains a number of variables on the different asset stocks

within the household (e.g. landholding, grain stocks, savings, durable stocks, housingstock, and productive assets) as well as information on the occupational, educationaland demographic characteristics of household members (see Table 2). This allows usto implement our theory and model permanent income as a function of the human andnon-human capital of the household conditioned by its composition which controlsfor position in the life-cycle (see Deaton, 1992). This information also allows us toexamine the impact of a range of Z variables on permanent income. In this way weare able to build up a detailed picture of the processes generating permanent incomein the two provinces.We have chosen the Z variables to capture longer-run or pre-determined char-

acteristics of the household which might be driving permanent income. Several in-stitutional features specific to rural China make us more comfortable with viewingthese Z variables as determinants of Y . Strict family planning controls limit demo-graphic choices of households. The allocation of cultivable land is determined byvillage collectives and there is limited scope for households to influence the allocationthey receive.4 Primary and secondary education is compulsory and free. Locationalchoices are restricted by a household registration system which in 1990 severely lim-ited mobility.Table 2 sets out the basic specification of the model used to analyse living stan-

dards in Sichuan and Jiangsu. Following the structure set out in the theory sec-

3This makes the a priori theoretical case for preferring consumption over income weaker — incomeand consumption are pitted against each other on more equal terms than is the case when surveyperiods are shorter (see Deaton, 1997).

4Land allocation thus represents an important example of a lump-sum transfer. These transfersare lump-sum in the sense that they do not depend on actions of households or individuals (seeBurgess, 2001).

9

Page 10: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

tion we have two X indicators, per capita income (PCI) and per capita expendi-ture (PCE). We examine three alternative definitions of consumption (LNPCE1,LNPCE2, LNPCE3) arrived at by making different types of adjustment for thenoisy elements in total expenditure, namely durables and housing. Permanent incomeis modelled as a function of eleven Z variables which capture household composition,educational and occupational status, stock of physical assets, and community charac-teristics such as access to amenities (see (4)). The model, where possible, is specifiedin log terms to allow for the presence of non-linearities in the reduced form estimates.

3.2 Comparison of Welfare Indicators

Researchers, policy makers and administrators routinely face the problem of selectingan observable indicator of welfare from cross-sectional data. These indicators areexpected to convey information about the welfare of households well beyond thesurvey period. The two leading practical candidates in this respect are per capita (orequivalised) income and consumption (see Deaton, 1997; Blundell and Preston, 1998).The underlying problem is that these observed welfare measures (X) are imperfectproxies of unobservable long-term welfare (Y ) which is the object of interest if we aretrying to alleviate chronic as opposed to transient poverty.In some cases the choice between income and expenditure is dictated by the fact

that information is only collected on one variable.5 In the bulk of other cases thechoice is usually made by resorting to a priori arguments. In the developing countrysetting, consumption based measures are typically preferred for a variety of reasons.Based on the permanent income hypothesis it is argued that these represent smootherindicators of permanent income than current income in particular when data is col-lected over short periods (Deaton, 1997). For households which are unrestricted intheir opportunities to buffer their income variability, their short-term consumptionlevels will reveal their permanent income at that date. Incentives to underreportconsumption may also be less.These advantages are hypothetical and the choice between consumption and in-

come needs to be made with reference to the data at hand. It is therefore largelyan empirical issue as to which measure should be preferrred in a given setting. Thisdilemma has spawned a growing empirical literature on the choice of static indicatorsof permanent income in household data.6 The framework we have developed in thispaper allows us to rigorously and robustly compare income and consumption in termsof their ability to identify the permanent income of households. Moreover we are ableto examine how different definitions of consumption fare vis a vis income. We arealso able to examine whether determinants of permanent income (Z — e.g. landhold-ing) can outperform income and consumption in terms of identifying the permanent

5For example the National Sample Surveys in India only collect information on consumption.6See Glewwe, 1990; Glewwe and van der Gaag, 1990; Anand and Harris, 1993; Chaudhuri and

Ravallion; 1994 and Deaton, 1992, 1997.

10

Page 11: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

income of households. Not having to rely on a priori arguments in making thesecomparison represents a significant empirical advance.Calculation of the variances of the permanent and transitory components of ob-

served income and expenditure serves as a basis for choosing between them. Placing(3) into (1) and (2) and calculating variances we have that:

β2i var(Y ) + var(ui)

var(Xi)= 1 (15)

We can identify the variances of the permanent and transitory components of a givenindicator as:

var(ui)

var(Xi)= variance share of transitory component (16)

1− var(ui)

var(Xi)= variance share of permanent component

Given the specification of the model, these shares provide us with a precise andobservable measure of indicator noisiness.7

These decompositions also provide a basis for choosing between different defini-tions of the same welfare indicator. One can examine, for example, how correcting fornoisy elements of consumption (e.g. durables, housing) affect the performance of theconsumption indicator (PCE). These critical decisions are often made on the basisof somewhat arbitrary assumptions typically without reference to the data set beinganalysed. Finally, it is also possible to compare the relative correlations of Xs andZs with permanent income thus allowing us to examine the hypothesis that observedZs may be more appropriate (i.e. less noisy) indicators of long-run living standardsthan observed Xs. Given that some Zs may be easier to observe than Xs, this raisesthe possibility that Zs may be better measures on which to base the targeting ofresources and other redistributive schemes.Estimation results for our model are presented in Table 3. These suggest that

the choice between income and consumption is sensitive to which definition of con-sumption is used. Decomposition of the variance of these static welfare indicatorsinto transitory and permanent components brings out the comparisons most clearly(Table 4). Uncorrected measures like LNPCE1 which includes current spending onall goods and services (including noisy items such as housing and durables) are out-performed by LNPCI in both provinces. For intermediate measures like LNPCE2where only the flow of value from durables is imputed, income and consumption per-form almost equally well as each other, consumption slightly outperforming income inSichuan and vice versa in Jiangsu. Corrected measures such as LNPCE3, where theflows of value from both housing and durable stocks are imputed in place of (noisy)current expenditures, clearly outperform LNPCI.

7This type of decomposition analysis parallels that for earnings mobility (see Lillard and Willis,1978).

11

Page 12: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

The overall impression from Table 4 is that replacement of noisy elements ofcurrent consumption with imputed value flows leads to a lowering of the transitoryshare in overall variance and a corresponding rise in its performance vis a vis income.Thus, assuming that the causal equation (3) is well specified, corrected consumptionwill be preferred over income as a proxy of household permanent income. The results,however, cast doubt on the widespread belief that current consumption is alwaysa better proxy of permanent income than current income — it turns out that thiscomparison is highly sensitive to how noisy elements of consumption are treated.Our results are in line with Chaudhuri and Ravallion (1994) who do not find that thepreference for consumption-based measures is supported in Indian longitudinal data.8

Arguments based either on consumption smoothing or on the superior accuracy ofconsumption data do not seem to be well supported in either study. Instead theypoint to need the need to use some impartial framework, such as that developed inthis paper, as a basis for comparing across income and consumption on a case by casebasis.The performance of Xs versus Zs as indicators of permanent income can also

be examined by calculating correlation coefficients between X and Z variables andunobservable Y . The appropriate formula for the calculation of correlation coefficients(ζ) are:

ζXY =β(ρ2 + σ2²²)q

var(Xi)(ρ2 + σ2²²)where ρ2 = γ0ZZ 0γ (17)

ζZY =(ZZ 0γ)iq

var(Zi)(ρ2 + σ2²²)

where (ZZ 0γ)i is the ith entry of the vector ZZ 0γ. These results follow from equa-tions A1-A5 in the Appendix. Coefficients for the uncorrected consumption model(LNPCE1) are shown in Table 5.This comparison confirms that either income or consumption, on the whole, per-

form better than the various Z variables in identifying permanent income. Thus, inthis data, if a single welfare indicator needs to be chosen these results indicate thatit would be better to choose a money metric utility proxy (X measure) as opposedto a Z measure. This does not, however, imply that a X based approach to povertyidentification is preferable to a Z approach in all cases.

3.3 Determinants of Permanent Income

In our framework the Z factors determine permanent income (Y ) which in turn de-termines the welfare indicators (X) observed at a particular point in time (see Figure

8Comparing current income and consumption measures to six year means in the ICRISAT panel,Chaudhuri and Ravallion (1994) find that current income is a better indicator of chronic povertythan current consumption when poverty is defined with respect to mean income.

12

Page 13: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

1). By examining coefficients in the causes equation (4) we can therefore gain signif-icant insights into the determination of permanent income in Sichuan and Jiangsu.Permanent income is seen to depend on household composition, educational and occu-pational status, stocks of physical and monetary assets (i.e. housing, land, durables,productive assets and savings) and community characteristics such as access to ameni-ties (see Muellbauer, 1983; Glewwe, 1991; Deaton, 1992). The causal equation shouldbe viewed as a reduced form estimate of various structural relationships (e.g. assetreturns, agricultural production function, earnings function).9 Coefficients on the Zvariables tell us which factors may be important in determining permanent income(Y ) and its correlates (X). The Z variables we have chosen represent longer-run orpre-determined characteristics of the household. Coefficients in the causal equationshould therefore be interpreted as explaining variation in household welfare condi-tional on the stocks of human and physical capital that have been accumulated inthe past.Coefficients on the Z variables in the causal equation (4) are presented in Ta-

ble 3. Several clear patterns emerge. As would be expected, both household size(LNN) and the dependency ratio (CHILDP ) have significant negative effects onhousehold permanent income (Y ). The latter child effect appears to be stronger inSichuan which may reflect lower off-farm employment opportunities in this province.Increasing population on a fixed land resource may lead to a more negative impacton welfare than where excess labour can be absorbed into other activities. The edu-cational level of the household head (EDUHD) has a significant effect on Y whichis more pronounced in Jiangsu which may be due to off-farm employment oppor-tunities raising the returns to education. The proportion of the household labourforce in primary activities (PRIMP ) is negatively and significantly related to Y inboth provinces.10 This suggests that rural diversification into off-farm activities canhave a large positive impact on permanent income. Given small farm sizes expandingemployment into off-farm activities is the key means of improving welfare in ruralChina. Since the onset of rural reforms in 1978 it is the fact that structural change ofthis type has gone so much further in Jiangsu than Sichuan which explains why, onaverage, income and consumption levels are so much higher in this province. Housingcharacteristics, both per capita floor area (AREAPC) or the electrification dummy(ELEC) are both positively associated with Y . Housing, in particular, appears torepresent an important stock of wealth in Jiangsu. Stronger coefficients on the ELECvariable (which can also be viewed as a community characteristic) in Jiangsu mightalso reflect complementarity with off-farm production activities. This is an impor-tant finding as it suggests that expanding provision of rural infrastructure may be animportant means of encouraging diversification and raising long run living standards.

9Policies to support living standards through Z interventions would have to be based on a morerefined understanding of these structural relationships (i.e. the manner in which a household accu-mulates human and physical capital).10Primary activities here relates to agriculture, forestry and fisheries.

13

Page 14: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

As might be expected cultivable landholding per capita (CULTPC) has a significantand positive influence on permanent income confirming its central role in determiningthe permanent income of households. Unlike in other low income countries there isuniversal access to land in rural China. Village councils allocate land to householdson the basis of demographic composition.This implies that though the distributionof (per capita) land is egalitarian within villages there is significant variation acrossvillages (see Burgess, 2001). The results here thus point to the the centrality of accessto land to supporting living standards in rural China. Indeed rural poverty in Chinais typically viewed as being closely correlated with habitation on poor quality land inarid parts of the country. This is a central finding BWTV andWATCH which proxyfor the basic and luxury durable stocks held by the household are both significantlyand positively related to permanent income. This suggests that durables representan important stock of wealth in rural China. Per capita savings (DEPOSIT ) alsohave a significant role in the determination of permanent income.11 Interestinglythis effect is stronger in poorer Sichuan which might be reflective of farmers in thisprovince placing their surpluses in financial institutions whereas a larger number ofhouseholds in Jiangsu may be investing their surpluses directly in off-farm activitieswhich are more risky but have higher mean expected returns.12 A greater role forsavings in rural Sichuan may also reflect that they play a greater role as a consump-tion smoothing device in this province as households cannot rely as much on off-farmearning streams. Per capita holdings of productive assets (PASSETV ) also has apositive and significant impact on household welfare. Given that these assets arealmost entirely agricultural in nature it is not surprising that this effect is strongerin Sichuan.Not too much should be read into these causal equations as they represent the

reduced form estimates of a host of structural equations which are not directly ob-served. They do, however, offer broad directions in terms of identifying which factorsare important for the determination of rural living standards in China. Not all thesefactors can be thought of as policy dependent but a number of policy suggestionsdo follow from the analysis. The promotion of diversification, for example, throughderegulation or the provision of basic infrastructure (e.g. electricity, telecoms, roads)can help to raise long-run living standards in both provinces. Expansion of educationprogrammes will have a more significant impact in Jiangsu than Sichuan, thoughthis effect is likely to be contingent on the level of diversification. Providing close touniversal access to land is shown to be an important element in the support of rural

11In 1990 nearly all household savings would be held with government banks (i.e. there was amonopoly), therefore this measure should be relatively good measure of the level of monetary savingskept in formal financial institutions. It can also be an inverse measure of liquidity constraints.12Taken as a whole these findings are consistent with the analysis of demand patterns in rural

China which shows that particularly in Jiangsu, richer rural households invest their surpluses pri-marily in building up housing and durable stocks or in pooling resources with other households toinvest in off-farm rural enterprises (as opposed to placing these surpluses in low or negative interestbank accounts).

14

Page 15: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

living standards in both provinces. This confirms the central role that access to landplays in the determination of permanent income in rural China. Given small farmsizes, making access to land less egalitarian, for example, by altering the allocationmechanism or by introducing land markets, may have negative consequences for thelong term welfare of rural households.13 The liberalisation of financial markets whichallows rural households to benefit from rapid private sector growth by increasing thereturns on savings might also help to increase welfare. Finally, the results suggestthat the subsidisation of agricultural productive assets will have greater impact inSichuan than Jiangsu. It is thus possible through this crude analysis to pick up somecommon factors of importance in the two provinces as well as some key differences interms of what might be advisable in policy terms.

3.4 Prediction of Permanent Income

Prediction results using the formulae from Section 2 are shown in Table 6.14 Weemploy the normalisation β = 1 so that the predicted levels of permanent incomeare measured in the units of per capita income (yuan). Using this specification,different determinants (Z) and correlates (X) can thus be collapsed into a singlemoney metric predictor (Y ) which is comparable in magnitude to observed incomeand consumption.In Table 6, as a means of avoiding clutter, we restrict our attention to the model

using uncorrected consumption expenditure (LNPCE1).15 In order to produce un-biased estimates of Y which are comparable in levels to PCE and PCI, both the ZandW prediction equations contain constants. The regression constant γ0 is obtainedunder the condition that the regression line intersects the sample mean. In the caseof the Z predictor this constant (γ0) is given by:

γ0Z = X1 − γ0Z (18)

where bars denote sample means. In the case of the W predictor the size of thisconstant is scaled down to take into account the fact that the Xs are now playing arole in prediction:

γ0W =γ0Z

(1− σ²²β0Σ−1ν β)

(19)

Figures 2 and 3 which plot the non-parametric densities of the different predictorsalong with that of the observed income provide confirmation of the key results derivedin Section 2. First, all three predictors are shown to be unbiased and are centred

13In particular, where off-farm labour markets remain relatively undeveloped.14For detail on the derivation of the different predictors and on criteria for comparing across them

see Section 2 and the Appendix.15Models using different definitions of consumption, where noisy housing and durable expenditures

have been corrected for, produce very similar results except that the relative weighting of incomeversus consumption changes.

15

Page 16: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

around the mean of LNPCI. Second, it is clear that in both provinces all threepredictors have a lower variance than either observed consumption or income. Third,the Z predictor is shown to have the lowest variance. Though there is an increase invariance in moving from the Z to the W predictor there is a compensating gain interms of the information set being covered which results in a reduction in predictedmean square error. Because the W predictor utilises X information, it is capable ofexplaining some of the residual variance that is left unaccounted for by sole use ofZ information and this may have value in terms of identifying the long-run incomestatus of households. The proof that the W predictor dominates both the X and Wpredictors in terms of prediction mean squared error is shown in the Appendix.This gain in information (sufficiency), can be demonstrated by first ranking house-

holds according to the different predictors and then creating two way quintile fre-quency tables which give some idea of the degree of agreement between the differentrankings (Table 7). The weight of households on the diagonal expressed as a fractionof total sample size provides us with a measure of the degree of agreement betweentwo predictors in terms of the identification of the welfare status of households. Thedegree of disagreement in welfare rankings between the X and Z predictors is large inboth provinces. Only 37−38% of household are classified to belong to the same quin-tile according to these criteria. Table 7 demonstrates that there is greater agreementon welfare rankings between the W predictor and either the X or the Z predictorthan there is between the X and the Z predictors. The W predictor, by spanningboth outcome (X) and opportunity (Z) space, thus forces agreement on the relativewelfare rankings and should be preferred if we believe that both these spaces containinformation which is relevant to the prediction of household permanent income (Y ).A similar analysis can be carried out to compare static welfare indicators (PCI, PCE)

and the various predictors (Table 8). As can be seen from Table 8 the degree ofagreement between observed income and consumption is fairly low; less than 50% ofhouseholds are identified as being in the same quintile. The use of the X predictorsubstantially increases the degree of agreement on the welfare status of households.In cases where there is substantial disagreement according to income and consump-tion criteria, the X predictor thus has real value in steering a middle route betweenusing either current income or consumption as is standard practice. Interestingly, inTable 8 we see that using the Z predictor leads to a reduction in agreement as regardswelfare identification. This result is consistent with Table 7 and suggests that theranking of households is substantially different when they are ranked in outcome (X)space as opposed to opportunity (Z) space. The subset identified as “poor” accordingto these two sets of criteria is likely to be substantially different irrespective of wherethe poverty line is drawn. Implementation of the W predictor leads to an improve-ment in agreement on the welfare ranking of households. Where there is uncertaintyregarding the true welfare ranking, theW predictor effectively balances the contribu-tions of these types of information in prediction according to their correlation withthe unobserved variable of interest (Y ). In this way the W predictor makes fullest

16

Page 17: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

use of welfare relevant information in a given data set.

4 Conclusions

The framework we have developed allows us to make greater use of information con-tained in household surveys to identify the permanent income of households. Thecentral conclusion of the paper must be that it is possible to construct welfare mea-sures which are more informationally efficient and which perform better than ob-served income or consumption in terms of proxying for household permanent income.Moreover using our theoretical framework we are able to compare the performanceof different predictors and to demonstrate that the predictor which combines infor-mation on correlates and determinants of permanent income outperforms predictorsbased either on correlates or determinants. We not only provide the theory as to whythis is the case but also demonstrate the validity of these findings using 1990 ruralhousehold data from China. These findings have important implications for povertyand inequality analysis — if new informationally more efficient welfare measures canbe constructed then these should be used in preference to current income and con-sumption in situations where permanent income is the variable of interest for policymakers.The framework we develop also allows us to compare the performance of various

commonly used welfare indicators in terms of identifying permanent income. Theresults of this analysis in the case of rural China are not straightforward and suggestthat consumption cannot be assumed to be a better indicator or permanent incomein a low income setting. In rural China preference for consumption based welfaremeasures rests on careful corrections being made for the noisy durable and housingelements in expenditure. This analysis makes clear that selection of welfare indicatorsneeds to be made based on empirical analysis as opposed to on a priori reasoning.Where the costs of continuous household surveys are prohibitive, by making more

intensive use of welfare relevant data in a given household survey, predicted welfaremeasures of the types suggested here can play a useful role in providing a betterapproximation of permanent income than is obtainable using observed income orconsumption. This in turn can result in savings in welfare programmes if use of thisnew welfare indicator (e.g. W predictor) results in the needy being more correctlyidentified. If cross-sectional data is going to be used in this fashion then carefulthought needs to be put into survey design, in particular as regards ensuring that acomplete and standardised set of Xs and Zs is gathered.Our framework has also proven to be useful for looking at the determination of liv-

ing standards. In the case of rural China what this analysis has shown is that variousZ factors have had a great deal of purchase in terms of improving the long-run statusof households. Factors such as access to cultivable land and agricultural productiveassets, access to basic infrastructure (e.g. electricity), economic diversification andfinancial savings are all shown to play an important role in the determination of liv-

17

Page 18: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

ing standards. Interesting differences emerge in the causal equation between the twoprovinces. The analytical framework we have developed thus allows us to examinea richer set of policy influences on living standards than is standard. This is rele-vant as institutional features which characterize developing countries limit the scopefor redistribution through monetary transfers due to administrative and logistic con-straints and problems associated with monitoring and identifying needy individuals(see Burgess and Stern, 1993).Given these constraints, our results for rural China point to the possibility that

ex ante redistribution of opportunity, via such mechanisms as guaranteeing accessto land and education may be more effective means of supporting long-run livingstandards than ex post redistribution of income. The framework we have developedthus allows us to contrast what may be termed as the Z as opposed to X basedapproach to poverty and to identify which Z channels of influence will be effective inraising long-run living standards.Our empirical application is of particular interest as welfare support in East Asia

as a whole has tended to be based on Z based measures such as land redistribution,universal primary schooling, rural diversification and widening of access to credit.And there is evidence that high growth in the region was built on these Z basedmeasures (see Rodrik, 1995). At the core of the analysis is a suggestion that fromthe perspective of living standard support and growth promotion it may be moreeffective to redistribute opportunity as opposed to resources per se. In future workwe plan to delve deeper into complex set of relationships that determine the long-run status of the household living standards. More exact analysis of the Z-basedapproach to living standard protection and promotion in rural China is likely toproduce interesting lessons for other developing countries.

References

[1] ABUL NAGA, R.H. (1994). “Identifying the Poor: A Multiple Indicator Ap-proach”, Distributional Analysis Research Programme Discussion Paper No. 9,LSE.

[2] ABUL NAGA, R.H. (1996). “Prediction and Sufficiency in the Model of FactorAnalysis”, discussion paper, DEEP, University of Lausanne.

[3] ANAND, S. and HARRIS, C. (1990). “Food and Standard of Living: An Analysisbased on Sri Lankan Data” in Dreze, J. and Sen, A. (eds) The Political Economyof Hunger, Vol. 1 (Oxford, Oxford University Press).

[4] ATTFIELD, C.L.F. (1980). “Testing the Assumptions of the Permanent IncomeModel”, Journal of the American Statictical Association, 75, 32-36.

18

Page 19: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

[5] BARTHOLOMEW, D. (1984). “The Foundations of Factor Analysis”,Biometrika.

[6] BHALLA, S. (1979). “Measurement Errors and the Permanent Income Hypoth-esis: Evidence from Rural India”, American Economic Review, 69, 295-307.

[7] BROWNE, M.W. (1977). “Generalised Least Squares Estimators in the Analysisof Covariance Structures” in Aigner D. and Goldberger, A. (eds) Latent Variablesin Socioeconomic Models (Amsterdam: North Holland)

[8] BURGESS, R. (1997). “Welfare Analysis in Rural China”, D.Phil, University ofOxford.

[9] BURGESS, R. (2001). “Land Distribution and Welfare in Rural China” (mimeo,Deparment of Economics, London School of Economics).

[10] BURGESS, R. and STERN, N.H. (1993). “Taxation and Development”, Journalof Economic Literature, 31, 762-830.

[11] BURGESS, R. and WANG, P. (1995). “Chinese Rural Household ExpenditureAnalysis”, EF 13, Programme of Research into Economic Transformation andPublic Finance, STICERD, London School of Economics, Houghton Street, Lon-don WC2A 2AE.

[12] BLUNDELL, R. and PRESTON, I. (1998). “Consumption Inequality and IncomeUncertainty”, Quarterly Journal of Economics, 113, 603-640.

[13] CHAUDHURI, S. and RAVALLION, M. (1994). “How Well do Static IndicatorsIdentify the Chronically Poor”, Journal of Public Economics, 53, 367-394.

[14] DEATON, A. (1992). Understanding Consumption (Oxford: Clarendon Press).

[15] DEATON, A. (1997). The Analysis of Household Surveys: MicroeconometricAnalysis for Development Policy (Oxford: Oxford University Press).

[16] DEATON, A. and MUELLBAUER, J. (1980). Economics and Consumer Be-haviour (Cambridge: Cambridge University Press).

[17] FREIDMAN, M. (1957). A Theory of the Consumption Function (Princeton:Princeton Univ Press).

[18] FREIDMAN, M. and KUZNETS, S. (1945). Income from Independent Profes-sional Practice (New York: National Bureau of Economic Research)

[19] GLEWWE, P. (1990). “Efficient Allocation of Transfers to the Poor”, LivingStandards Measurement Working Paper No. 70, World Bank, Washington D.C.

19

Page 20: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

[20] GLEWWE, P. (1991): “Investigating the Determinants of Household Welfare inCote d’Ivoire”, Journal of Development Economics, 35, 307-337.

[21] GLEWWE, P. and VAN DER GAAG, J. (1990). “Identifying the Poor in De-veloping Countries: Do Different Definitions Matter?”, World Development, 18,803-814.

[22] GOLDBERGER, A. (1962). “Best Linear Unbiased Prediction in the GeneralisedLinear Model”, Journal of the American Statistical Association, 57, 369-375.

[23] GOLDBERGER, A. (1972). “Maximum Likelihood Estimation of RegressionsContaining Unobservable Independent Variables”, International Economic Re-view, 13, 1-15.

[24] JORESKOG, K. and GOLDBERGER, A. (1975). “Estimation of A Model withMultiple Indicators and Multiple Causes of a Single Latent Variable”, Journalof the American Statistical Association, 70, 631-639.

[25] LILLARD, L. and WILLIS, R. (1978). “Dynamic Aspects of Earning Mobility”,Econometrica, 46.

[26] MUELLBAUER, J. (1983). “The Measurement of Long-Run Living Standardswith a Permanent Income Model” (mimeo, Nuffield College, Oxford).

[27] MUSGROVE, P. (1979). “Permanent Income and Consumption in Urban SouthAmerica”, American Economic Review, 69, 598-609.

[28] PUTTERMAN, L. (1993). Continuity and Change in China’s Rural Development(New York, Oxford University Press).

[29] RAO, C.R. and TOUTENBERG, J. (1995). Linear Models, Least Squares andAlternatives (New York: Springer)

[30] RODRIK, D. (1995), “Getting Interventions Right: How South Korea and Tai-wan Grew Rich”, Economic Policy, 20.

[31] SINGH, I., SQUIRE, L. and STRAUSS, J. (1986). Agricultural Household Mod-els: Extensions, Applications and Policy (Baltimore: Johns Hopkins UniversityPress).

[32] VAN DE WALLE, D. and NEAD, K. (1995). Public Spending and the Poor:Theory and Evidence (Oxford, Oxford University Press).

[33] WORLD BANK, (1992). “China: Strategies for Reducing Poverty in the 1990s”,World Bank Country Study, World Bank, Washington D.C.

20

Page 21: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

[34] ZELLNER, A. (1962). “An Efficient Method of Estimating Seemingly UnrelatedRegressions and Tests for Aggregation Bias”, Journal of the American StaticticalAssociation, 70, 631-639.

[35] ZELLNER, A. (1970). “Estimation of Regression Relationships containing Un-observable Variables”, International Economic Review, 11, 441-454.

21

Page 22: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

5 Appendix

In this appendix we derive the best linear predictors Y ∗X and Y ∗W subject to unbi-asedness constraints. We also show that Y ∗W has the lowest MSE. Throughout thesection we make use of the following results.

ΣX = ββ0(ρ2 + σ²²) + Ωwhere ρ2 = γ0ZZ 0γ (A1)

E(XZ 0) = βγ0ZZ 0 (A2)

E(ZY 0) = ZZ 0γ (A3)

E(XY ) = β(ρ2 + σ²²) (A4)

var(Y ) = ρ2 + σ²² (A5)

5.1 The X predictor

The best linear predictor of Y using X variables obtained subject to a sample un-biasedness constraint is a function a0X with coefficients vector a chosen to minimisethe Lagrangean:16

L = E[Y − a0X]2 − λE[a0X − Y ]The constraint can be written as:

E(a0βY − Y ) = 0i.e. a0β − 1 = 0

Minimisation of L is thus equivalent to minimisation of L0:

L0 = E[Y Y 0 − 2a0XY + a0XX 0a]− 2λ(a0β − 1)Taking first derivatives with respect to a, we get:

E[−XY +XX 0a]− λβ = 0

i.e. a = Σ−1X [λβ +E(XY )]

Using (A4) we obtain:a = (λ+ ρ2 + σ²²)Σ

−1X β (*)

Pre-multiplying (*) by β0 noting that β0a = 1, we have:

1 = (λ+ ρ2 + σ²²)β0Σ−1X β

i.e. λ = (β0Σ−1X β)− ρ2 − σ²²)

and substituting for λ in (*) we get :

a = (β0Σ−1X β)−1Σ−1X β

i.e. Y ∗X = (β0Σ−1X β)−1β0Σ−1X X

16See Goldberger (1962) and Abul Naga (1996).

22

Page 23: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

5.2 The W predictor

We wish to choose a vector b such that:

E[b0W − Y ]2

is minimum subject to the unbiasedness condition for random variables:

E[b0W − Y ] = 0 (UC)

Write the W predictor as:Y ∗W = b01X + b

02Z

We have:Y ∗W = b01(βγ

0Z + β²+ u) + b02Z

thus the unbiasedness condition is equivalent to:

E[(b01βγ0Z + b01β²+ b

01u) + (b

02Z)− (γ0Z + ²)] = 0 (UC’)

where we have made use of the causal equation Y = γ0Z+ε. (UC’) is thus equivalentto:

b01βγ0 + b02 − γ0 = 0 (UC”)

On the basis of (UC”) we have replaced b02 by γ0 − b01βγ0. Our problem can now bestated as one of choosing a vector b1 to minimise the MSE criterion:

E[b01X + (γ0 − b01βγ0)Z − Y ]2

Taking first derivatives, first order conditions imply:

E[XX 0b1 +XZ 0γ − 2βγ0ZX 0b1 −XY − βγ0ZZ 0γ + ββ0b1γ0ZZ 0γ + Y Z 0γβ] = 0

Using (A1)-(A4) above, we obtain:

b1 = Σ−1ν βσ²²

Substituting for b1 in (UC”) we obtain the following expression for b2:

b2 = (1− β0Σ−1ν βσ²²)γ

and therefore:Y ∗W = (σ²²β

0Σ−1ν )X + (1− σ²²β0Σ−1ν β)γ0Z

23

Page 24: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

5.3 Comparison of Mean Square Errors of the Three Predic-tors

Let θ be a scalar such that

σ²²β0Σ−1ν = θ(β0Σ−1X β)−1β0Σ−1X (A6)

i.e. θ is the scalar which maps the coefficients of Y ∗X on those of X variables in Y ∗W .Post multiplying (A6) by β, we obtain

σ²²β0Σ−1ν β = θ

It follows that Y ∗W can be written as in (26).Using theorem A.18 of Rao and Toutenburg (1995: 291) we can invert Σν to

obtain:

Σ−1ν = Ω−1 − σ²²Ω−1ββ0Ω−1

1 + σ²²β0Ω−1β

from which it follows that:

θ = σ²²β0Σ−1ν β =

σ²²β0Ω−1β

1 + σ²²β0Ω−1β

(A7)

i.e. 0 < θ < 1. The W predictor is the best combination of X and Z variables whichminimises MSE subject to the unbiasedness condition (UC”). In particular, it willdominate any other convex combination of Y ∗X and Y

∗Z in the MSE sense.

It is instructive to consider the two limiting cases of Y ∗W , that is when θ → 0 andwhen θ → 1. From (A7) it can be seen that as σ²² →∞, θ → 1, and that as σ²² → 0,θ → 0. That is, the more Z accounts for the variation in Y , the higher will be theweight on Z variables in Y ∗W .Computations of MSEs give us the following expressions:

MSE(Y ∗W ) = (1− θ)2[σ²²(σ²²β0Ω−1β + 1)] (A8)

MSE(Y ∗X) = (β0Ω−1β)−1 (A9)

MSE(Y ∗Z ) = σ²² (A10)

These relations are used to establish that MSE(Y ∗X) > MSE(Y ∗W ), and thatMSE(Y ∗Z ) > MSE(Y

∗W ).

24

Page 25: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Table 1: Sample Characteristics, Rural Sectors, 1990

Sichuan

Jiangsu

Rural PCE (yuan)

569

953

Rural PCI (yuan)

504

883

Rural industry/ rural output (%)

26.9

60.4

Location

Central inland

East coastal

Climate

Subtropical

Subtropical

Main food crop

Rice

Rice

Household size

4.35

4.15

Sample size counties [villages]

<household> (persons)

54 [538]

<5380> (23416)

34 [336]

<3364> (13920)

Source: SSB Rural Household Surveys. China Statistical Yearbook (1991)

Page 26: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Table 2: Model Specification Variable

Variable name

Definition

Y

Unobservable logged per capita permanent income

X1

LNPCI

Logged per capita disposable income

X2

LNPCE1

Logged per capita expenditure including current durable and housing expenditures

LNPCE2

Logged per capita expenditure including imputed rent from durable stock and current housing expenditures

LNPCE3

Logged per capita expenditure including imputed rents from durable and housing stock

Z1

LNN

Logged household size

Z2

CHILDP

Proportion of children in household size

Z3

EDUHD

Educational status of household head

Z4

PRIMP

Proportion of household labour force in primary occupation (agriculture, fisheries, forestry)

Z5

AREAPC

Per capita housing floor area

Z6

ELEC

Dummy for whether house electrified

Z7

CULTPC

Per capita cultivable land

Z8

BWTV

Per capita black and white TVs

Z9

WATCH

Per capita watches

Z10

DEPOSIT

Per capita value of savings

Z11

PASSETV

Per capita value of productive assets

Notes: The source of all the variables are the Rural Household Sample Surveys for Sichuan and Jiangsu for 1990. All per capita terms are arrived at by dividing through by household size. In the corrected measures of consumption, housing is imputed at 6% of house value, whilst durables are imputed at 12% of the current value of the household durable stock.

Page 27: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Table 3: Model Estimation Results

SICHUAN 1990

JIANGSU 1990

Var Coef

LNPCE1

LNPCE2

LNPCE3

LNPCE1

LNPCE2

LNPCE3

LNPCI

β1

1.000

1.000

1.000

1.000

1.000

1.000

LNPCE

β2

0.966

0.985

1.001

0.956

0.979

1.040

LNN

γ1

-0.100 (7.486)

-0.105 (8.046)

-0.133

(11.425)

-0.109 (5.351)

-0.110 (5.582)

-0.119 (7.579)

CHILDP

γ2

-0.240 (8.834)

-0.236 (8.896)

-0.279

(11.825)

-0.224 (5.884)

-0.207 (5.626)

-0.196 (6.687)

EDUHD

γ3

0.002

(2.770)

0.002

(2.853)

0.002

(3.108)

0.008

(6.500)

0.008

(6.618)

0.008

(8.127)

PRIMP

γ4

-0.231 (9.897)

-0.226 (9.925)

-0.241

(11.906)

-0.247 (8.663)

-0.244 (8.848)

-0.181 (8.227)

AREAPC

γ5

0.006

(13.390)

0.006

(13.830)

0.006

(16.160)

0.009

(17.992)

0.009

(18.585)

0.011

(25.276)

ELEC

γ6

0.070

(5.089)

0.070

(5.260)

0.052

(4.395)

0.157

(7.163)

0.157

(7.405)

0.147

(8.710)

CULTPC

γ7

0.086

(11.420)

0.081

(11.108)

0.071

(10.858)

0.129

(12.576)

0.118

(11.866)

0.076

(9.636)

BWTV

γ8

0.640

(17.863)

0.618

(17.640)

0.605

(19.242)

0.363

(7.065)

0.368

(7.409)

0.373

(9.382)

WATCH

γ9

0.294

(17.447)

0.304

(18.430)

0.306

(20.644)

0.436

(15.071)

0.449

(15.997)

0.398

(17.415)

DEPOSIT

γ10

0.040

(14.877)

0.040

(15.176)

0.029

(12.529)

0.012

(7.539)

0.011

(7.496)

0.009

(7.288)

PASSETV

γ11

0.017

(9.192)

0.017

(9.303)

0.019

(11.574)

0.007

(3.829)

0.006

(3.782)

0.008

(5.840)

ω11

0.039

0.043

0.051

0.075

0.084

0.107

ω22

0.039

0.034

0.011

0.113

0.093

0.027

AGFI1

0.977

0.977

0.957

0.979

0.973

0.949

Notes: Absolute t statistics in parenthesis. Different columns refer to estimation results using different definitions of consumption. For variable definitions see Table 2. AGFI refers to the adjusted goodness of fit index.

1AGFI is the adjusted goodness of fit of index.

Page 28: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Table 4: Decomposition of Variance of Static Welfare Indicators

SICHUAN

JIANGSU Welfare indicator

Transitory share

Permanent share

Transitory share

Permanent share

LNPCI

LNPCE1

0.236 0.249

0.764 0.751

0.277 0.370

0.723 0.630

LNPCI

LNPCE2

0.261 0.224

0.739 0.776

0.297 0.327

0.703 0.673

LNPCI

LNPCE3

0.315 0.090

0.685 0.910

0.377 0.124

0.623 0.876

Notes: Different rows refer to estimation results using different definitions of consumption. For variable definitions see Table 2.

Table 5: Correlation Coefficients between X and Z Variables and Permanent Income Variable

Coefficient

SICHUAN 1990

JIANGSU 1990

LNPCI

ζXY

0.874

0.858

LNPCE1

ζXY

0.865

0.793

LNN

ζZY

-0.241

-0.275

CHILDP

ζZY

-0.167

-0.237

EDUHD

ζZY

0.045

0.142

PRIMP

ζZY

-0.031

-0.181

AREAPC

ζZY

0.375

0.543

ELEC

ζZY

0.157

0.246

CULTPC

ζZY

0.201

0.121

BWTV

ζZY

0.399

0.349

WATCH

ζZY

0.441

0.531

DEPOSIT

ζZY

0.307

0.248

PASSETV

ζZY

0.254

0.180

Notes: For variable definitions see Table 2.

Page 29: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Table 6: Model Prediction Results

SICHUAN 1990

JIANGSU 1990

X-pred

Z-pred

W-pred

X-pred

Z-pred

W-Pred

LNPCI

0.517

0.251

0.623

0.245

LNPCE

0.500

0.243

0.394

0.155

LNN

-0.100

-0.051

-0.109

-0.066

CHILDP

-0.240

-0.123

-0.224

-0.135

EDUHD

0.002

0.001

0.008

0.005

PRIMP

-0.231

-0.119

-0.247

-0.150

AREAPC

0.006

0.003

0.009

0.005

ELEC

0.070

0.036

0.157

0.095

CULTPC

0.086

0.044

0.129

0.078

BWTV

0.640

0.329

0.363

0.220

WATCH

0.294

0.151

0.436

0.264

DEPOSIT

0.040

0.021

0.012

0.007

PASSETV

0.017

0.009

0.007

0.004

Table 7: Diagonal Weights - Quintile Crosstabs for Predictors

Crosstab variables

SICHUAN

JIANGSU

Yx/Yz

37.47%

38.05%

Yx/Yw

67.34%

59.66%

Yz/Yw

51.30%

59.93%

Table 8: Diagonal Weights - Quintile Crosstabs for Static Indicators and Predictors

Crosstab variables

SICHUAN

JIANGSU

PCI/PCE1

47.06%

43.9%

PCI/Yx

68.46%

70.4%

PCE1/Yx

66.47%

58.3%

PCI/Yz

35.17%

36.7%

PCE1/Yz

37.16%

36.4%

PCI/Yw

57.64%

52.8%

PCE1/Yw

58.44%

50.4%

Page 30: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

YZ

XX

12

β 1β

γ

2

ε

uu

12

Figu

re 1

: Pat

h D

iagr

am fo

r the

MIM

IC M

odel

Page 31: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Figu

re 2

: Den

sity

Plo

ts fo

r Si

chua

n

Page 32: Prediction and Determination of Household Permanent Incomeecon.lse.ac.uk/staff/rburgess/wp/mimic2.pdf · 2002. 2. 18. · income (see Attfield, 1980; Bhalla, 1979; Musgrove, 1979;

Figu

re 3

: Den

sity

Plo

ts fo

r Ji

angs

u