Normative comparisons of distributions of one attribute.


Basic framework

We assume here that the social states that are visible to the evaluator (economist, philosopher, policy maker) are described by lists of numbers (one number for every individual).

The number is supposed to represent the quantity of some attribute that is cardinally meaningful.

Best example: income. Other examples (not as good): life expectancy, schooling.

Notice the distinction between what is visible to the analyst and what may be normatively important.

Formally, we compare objects such as x = (x1,…,xm), a distribution of the attribute between individuals 1,…,m, and y = (y1,…,yn), a distribution of the attribute between individuals 1,…,n.

x and y differ by:

1) The number and identity of the individuals

2) The size of the cake that is distributed (efficiency)

3) The sharing of the cake given its size (equity, equality)

3a) The amount of the cake held by those who get the least (tightly connected to 3)

We want to compare income distributions in a way that captures each of these features.

Differing number of individuals

The handling of this difference is easy. It is done by the so-called Dalton principle of replication:

(x1,…,xm) ~ (x1,…,x1, x2,…,x2, …, xm,…,xm), where each xi is repeated k times.

“Nothing important is changed from an ethical point of view if a given distribution is replicated k times”

This principle enables one to compare distributions with differing numbers of individuals.

Indeed, suppose I want to compare distribution x containing m individuals with distribution y containing n individuals. Replicate x n times and y m times:

(x1,…,xm) ~ (x1,…,x1, x2,…,x2, …, xm,…,xm), each xi repeated n times (nm individuals)

(y1,…,yn) ~ (y1,…,y1, y2,…,y2, …, yn,…,yn), each yi repeated m times (nm individuals)

If I can compare these two replicates of x and y then, by transitivity, I can compare x and y.

Hence, we shall assume from now on that distributions have the same number (n) of individuals.
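The replication argument can be sketched in a few lines of Python. This is only an illustration: the function names `replicate` and `to_common_size` and the two small distributions are assumptions made for the example, not part of the course material.

```python
def replicate(dist, k):
    """Return the k-fold replication of dist: every income is repeated k times."""
    return [income for income in dist for _ in range(k)]


def to_common_size(x, y):
    """Replicate x len(y) times and y len(x) times.

    Both results then contain len(x) * len(y) individuals, so they can be
    compared by any criterion defined for a fixed population size.
    """
    return replicate(x, len(y)), replicate(y, len(x))


x = [1, 5]        # m = 2 individuals (illustrative numbers)
y = [2, 3, 4]     # n = 3 individuals (illustrative numbers)
x_rep, y_rep = to_common_size(x, y)
print(x_rep)      # each income of x appears 3 times
print(y_rep)      # each income of y appears 2 times
```

Both replicated lists have nm = 6 entries, which is what allows the rest of the analysis to assume a common population size.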

Important principle: anonymity

The name (identity) of the individuals does not matter.

Underlying this principle is the assumption that individuals do not differ in dimensions other than income.

Formally, x ~ Ax for every n × n permutation matrix A.

Recall that an n × n square matrix A is a permutation matrix if it satisfies Aij ∈ {0,1} for every i, j = 1,…,n and

Ai1 + … + Ain = A1j + … + Anj = 1 for every i, j

For every x, x(.) denotes its ordered permutation:

x(.) = Ax for some permutation matrix A such that x(i) ≤ x(i+1) for all i = 1,…,n-1

Efficiency

Let μ(x) = (Σi xi)/n. μ(x) (the mean of x) is a natural (and widely used) measure of the “size of the cake”.

What principles should a measure of “efficiency” satisfy?

1) Weak (Pareto) efficiency: x ≥ y ⟹ x is weakly better than y

2) Strong (Pareto) efficiency: x > y ⟹ x is strictly better than y

1a) Anonymous (Suppes) weak efficiency: x(.) ≥ y(.) ⟹ x is weakly better than y

2a) Anonymous (Suppes) strong efficiency: x(.) > y(.) ⟹ x is strictly better than y
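A small sketch may help to see how the Pareto and Suppes criteria differ. The helper names and the two-person distributions below are illustrative assumptions; the only idea taken from the text is that Suppes dominance is Pareto dominance applied to the ordered vectors.

```python
def weakly_dominates(a, b):
    """Component-wise test a >= b for two vectors of the same length."""
    if len(a) != len(b):
        raise ValueError("both distributions must have the same size")
    return all(ai >= bi for ai, bi in zip(a, b))


def pareto_weakly_better(x, y):
    """Weak Pareto criterion: compare individual by individual."""
    return weakly_dominates(x, y)


def suppes_weakly_better(x, y):
    """Weak Suppes criterion: compare the ordered vectors x(.) and y(.)."""
    return weakly_dominates(sorted(x), sorted(y))


x = [3, 10]   # illustrative numbers
y = [9, 2]
print(pareto_weakly_better(x, y))   # individual 1 has less in x than in y
print(suppes_weakly_better(x, y))   # (3, 10) versus (2, 9) once ordered
```

In this example x is not Pareto better than y, because the first individual loses, but it is Suppes better once names are ignored.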

Pareto and Suppes efficiency ?

[Figure: the (x1, x2) plane with the 45° line, the point (y1,y2) and its permutation, and the regions labelled “Pareto better” and “Pareto & Suppes better”.]

Efficiency

μ verifies both Pareto and Suppes efficiency.

The notions of efficiency captured by the Pareto and Suppes principles are strong in the sense that they are not required to hold “ceteris paribus” with respect to inequality.

For example, (1,1000) is better than (1,100) for both principles.

Yet, in some sense (to be made precise soon), (1,100) is “more equal” than (1,1000).

There are other (weaker) notions of efficiency that restrict the scope of the principle to situations that are “identical” in terms of equality.

Efficiency

1) For all real numbers t > 1, tx(.) is better than x (increasing all incomes in a common proportion t does not affect inequality if inequality is taken to be a “relative” concept).

2) For all strictly positive real numbers a, x(.) + a is better than x (adding a euros to everyone does not affect inequality if inequality is an “absolute” concept).

Both principles are obviously satisfied by μ:

μ(tx(.)) = tμ(x) > μ(x) for every number t > 1 (when μ(x) > 0)

μ(x(.) + a) = a + μ(x) > μ(x) for every strictly positive number a

Equality

What do we mean by “equalizing”?

Basic idea: reducing pair-wise discrepancies in incomes without “throwing cake away”.

Pigou-Dalton notion of transfer: x(.) has been obtained from z(.) by a bilateral Pigou-Dalton transfer if there are i and j ∈ {1,…,n} and a strictly positive real number Δ such that x(h) = z(h) for all h ≠ i, j and x(i) = z(i) + Δ ≤ z(j) - Δ = x(j)

[Illustration: on the income line, z(i) and z(j) move towards each other to x(i) and x(j).]

Equality

x(.) has been obtained from z(.) by a finite sequence of Pigou-Dalton transfers (x(.) PD z(.)) if there exists a sequence of distributions {zt(.)}, with t = 0,…,T, such that z0(.) = x(.), zT(.) = z(.) and zt(.) has been obtained from zt+1(.) by a bilateral Pigou-Dalton transfer for t = 0,…,T-1

Example: x(.) = (2,4,8,11,16) and z(.) = (1,5,7,10,18)

z3(.) = (1,5,7,10,18)

z2(.) = (2,4,7,10,18)

z1(.) = (2,4,8,10,17)

z0(.) = (2,4,8,11,16)

If x(.) PD z(.), then distribution x is unambiguously more equal than distribution z because it results from a finite sequence of clearly “equalizing” elementary operations.
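The example sequence can be checked mechanically. The sketch below is a hedged illustration: the function name `pigou_dalton_transfer` and the 0-based positions are assumptions of the example, while the definition it enforces (a strictly positive amount moved from a richer to a poorer position without reversing their ranking) is the one stated above.

```python
def pigou_dalton_transfer(z, i, j, delta):
    """Move delta from the richer position j to the poorer position i.

    z is treated as an ordered distribution and i, j are 0-based positions.
    The transfer must satisfy z(i) + delta <= z(j) - delta.
    """
    z = sorted(z)
    if delta <= 0:
        raise ValueError("the transferred amount must be strictly positive")
    if z[i] + delta > z[j] - delta:
        raise ValueError("the transfer would reverse the ranking of i and j")
    x = list(z)
    x[i] += delta
    x[j] -= delta
    return sorted(x)


z3 = [1, 5, 7, 10, 18]
z2 = pigou_dalton_transfer(z3, 0, 1, 1)   # 1 unit from the 2nd to the 1st position
z1 = pigou_dalton_transfer(z2, 2, 4, 1)   # 1 unit from the 5th to the 3rd position
z0 = pigou_dalton_transfer(z1, 3, 4, 1)   # 1 unit from the 5th to the 4th position
print(z2, z1, z0)
```

Each call leaves the total income unchanged, which is the “no cake thrown away” part of the definition.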

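Pigou-Dalton transfer and T-transforms

An n × n matrix A is a T-transform if there exist i and j and a number λ ∈ [0,1] such that:

Aii = Ajj = 1-λ

Aij = Aji = λ

Ahh = 1 for all h ≠ i, j

Ahk = 0 for all other entries (h ≠ k and (h,k) different from (i,j) and (j,i))

Let us visualize what a T-transform is.

An example of a T-transform (rows and columns i and j are the only ones that differ from the identity matrix):

$$
A=\begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & & \vdots & & \vdots\\
0 & \cdots & 1-\lambda & \cdots & \lambda & \cdots & 0\\
\vdots & & \vdots & \ddots & \vdots & & \vdots\\
0 & \cdots & \lambda & \cdots & 1-\lambda & \cdots & 0\\
\vdots & & \vdots & & \vdots & \ddots & \vdots\\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{pmatrix}
$$

The entries 1-λ are in positions (i,i) and (j,j); the entries λ are in positions (i,j) and (j,i).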

A Pigou-Dalton transfer is a T-transform

Indeed, if x and z are two distributions of income for which x(.) = Az(.) for some T-transform A, then:

x(.) = (z(1),…,z(i-1), (1-λ)z(i)+λz(j), z(i+1),…,z(j-1), λz(i)+(1-λ)z(j), z(j+1),…,z(n))

If x(.) is obtained from z(.) by a Pigou-Dalton transfer, then

x(.) = (z(1),…,z(i-1), z(i)+Δ, z(i+1),…,z(j-1), z(j)-Δ, z(j+1),…,z(n)) for some Δ ∈ (0, [z(j)- z(i)]/2]

which is a T-transform if λ = Δ/[z(j)- z(i)], since (1-λ)z(i)+λz(j) = z(i)+λ[z(j)- z(i)] = z(i)+Δ.
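As a hedged numerical check of this correspondence, the sketch below builds a T-transform with λ = Δ/[z(j) − z(i)] and applies it to an ordered vector. The helper names `t_transform` and `apply_matrix` are illustrative assumptions, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def t_transform(n, i, j, lam):
    """n x n identity matrix modified on rows/columns i and j (0-based)."""
    A = [[Fraction(1) if r == c else Fraction(0) for c in range(n)]
         for r in range(n)]
    A[i][i] = A[j][j] = 1 - lam
    A[i][j] = A[j][i] = lam
    return A


def apply_matrix(A, z):
    """Matrix-vector product A z."""
    return [sum(a * v for a, v in zip(row, z)) for row in A]


z = [1, 5, 7, 10, 18]
i, j, delta = 0, 1, 1                      # transfer 1 from position 2 to position 1
lam = Fraction(delta, z[j] - z[i])         # lambda = delta / (z(j) - z(i))
A = t_transform(len(z), i, j, lam)
x = apply_matrix(A, z)
print(x)                                   # entries are exact fractions
```

The rows and columns of A all sum to one, which anticipates the notion of a bistochastic matrix.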

Bistochastic matrices

An n × n matrix A is bistochastic if it satisfies:

Aij ∈ [0,1] for all i, j

Σj Aij = Σi Aij = 1 for all i and j

A permutation matrix is a bistochastic matrix. So is a T-transform.

If x = Ay for some bistochastic matrix A, then x can be said to be “more equal” than y. Indeed, premultiplying a vector of numbers by a bistochastic matrix reduces the discrepancies between the numbers without changing the mean.

Indeed, if x = Ay for some bistochastic matrix A, then

$$
\mu(x)=\frac{1}{n}\sum_{i=1}^{n}x_i=\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}A_{ij}y_j=\frac{1}{n}\sum_{j=1}^{n}y_j\sum_{i=1}^{n}A_{ij}=\frac{1}{n}\sum_{j=1}^{n}y_j=\mu(y)
$$

Some results

Result 1: x(.) PD y(.) if and only if x = Ay for some bistochastic matrix A

Result 2: x = Ay for some bistochastic matrix A if and only if x(.) = T1T2…TKy(.) for some finite sequence of T-transforms.

Lorenz dominance and Pigou-Dalton

There is a nice test to check if x(.) PD y(.)

For any distribution x and number k = 1,…,n define Lk(x) by:

$$
L_k(x)=\sum_{i=1}^{k}x_{(i)}
$$

For any two distributions of income x and y with the same mean, define the Lorenz binary relation L by:

x L y ⟺ Lk(x) ≥ Lk(y) for all k

L is a quasi-ordering (reflexive and transitive)

Theorem (Hardy-Littlewood-Polya): Let x and y be two distributions such that μ(x) = μ(y). Then x PD y ⟺ x L y

Example

Has (2,6,6,15,16) been obtained from (1,5,7,14,18) by a finite sequence of Pigou-Dalton transfers?

Let us do the Lorenz test and draw the Lorenz curves.

The Lorenz curve shows the points (k, Lk(x)) for any distribution x.

Lorenz curves (horizontal axis: position k; vertical axis: cumulated income)

position k                    1    2    3    4    5
Lk for (1,5,7,14,18)          1    6   13   27   45
Lk for (2,6,6,15,16)          2    8   14   29   45

The Lorenz curve of (2,6,6,15,16) is nowhere below that of (1,5,7,14,18). Hence (2,6,6,15,16) has been obtained from (1,5,7,14,18) by a finite sequence of Pigou-Dalton transfers (find the sequence!!)
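The Lorenz test of this example is easy to automate. The following is a minimal sketch, with function names chosen for the illustration; it only implements the partial sums Lk and the comparison "Lk(x) ≥ Lk(y) for all k" for two distributions with the same size and the same total.

```python
from itertools import accumulate


def lorenz_sums(x):
    """Cumulated incomes L_1(x), ..., L_n(x) of the ordered distribution."""
    return list(accumulate(sorted(x)))


def lorenz_dominates(x, y):
    """True if L_k(x) >= L_k(y) for every k (same size and same mean required)."""
    if len(x) != len(y) or sum(x) != sum(y):
        raise ValueError("the Lorenz test needs equal sizes and equal means")
    return all(a >= b for a, b in zip(lorenz_sums(x), lorenz_sums(y)))


x = [2, 6, 6, 15, 16]
y = [1, 5, 7, 14, 18]
print(lorenz_sums(x))
print(lorenz_sums(y))
print(lorenz_dominates(x, y))
```

By the Hardy-Littlewood-Polya theorem, a positive answer means that a sequence of Pigou-Dalton transfers from y to x exists, without exhibiting it.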

Principle of diminishing transfers

A bilateral Pigou-Dalton transfer is a clearly equalizing operation.

But it remains silent about the ranking of some distributions.

Consider for example (2,4,6,10) and (1,5,7,9).

To go from (1,5,7,9) to (2,4,6,10) we have made a progressive transfer of 1 from individual 2 to individual 1 and a regressive transfer of the same amount from individual 3 to individual 4.

Can it be said that (2,4,6,10) is more equal than (1,5,7,9)?

Yes, if one adheres to the Foster & Shorrocks (1987) principle of diminishing transfers.

The key transformation involved in this principle is the composite transfer.

Composite transfer ?

x(.) has been obtained from z(.) by a composite transfer if there are h, i, j and k ∈ {1,…,n} and a strictly positive real number Δ such that

1) x(g) = z(g) for all g ≠ h, i, j and k

2) x(h) = z(h) + Δ ≤ z(i) - Δ = x(i)

3) z(j) - Δ = x(j)

4) x(k) = z(k) + Δ

5) z(i) – z(h) = x(k) – x(j) > 0

6) z(h) < x(j)

A Composite transfer

Is a combination of a progressive transfer in the lower tail of the distribution with a regressive transfer of the same magnitude in the upper tail of the distribution.

Does not affect the variance of the distribution of income

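A Composite transfer does not affect the variance

Suppose x has been obtained from z by a composite transfer of Δ (x and z are assumed to have the same mean μ).

Let us calculate the variance V(x):

$$
\begin{aligned}
V(x) &= \frac{1}{n}\Big[\sum_{g\neq h,i,j,k}[z_{(g)}-\mu]^2+[z_{(h)}+\Delta-\mu]^2+[z_{(i)}-\Delta-\mu]^2+[z_{(j)}-\Delta-\mu]^2+[z_{(k)}+\Delta-\mu]^2\Big]\\
&= \frac{1}{n}\Big[\sum_{g\neq h,i,j,k}[z_{(g)}-\mu]^2+[z_{(h)}-\mu]^2+2\Delta[z_{(h)}-\mu]+\Delta^2+[z_{(i)}-\mu]^2-2\Delta[z_{(i)}-\mu]+\Delta^2\\
&\qquad\quad+[z_{(j)}-\mu]^2-2\Delta[z_{(j)}-\mu]+\Delta^2+[z_{(k)}-\mu]^2+2\Delta[z_{(k)}-\mu]+\Delta^2\Big]\\
&= \frac{1}{n}\Big[\sum_{g}[z_{(g)}-\mu]^2+2\Delta[z_{(h)}-\mu]+\Delta^2-2\Delta[z_{(i)}-\mu]+\Delta^2-2\Delta[z_{(j)}-\mu]+\Delta^2+2\Delta[z_{(k)}-\mu]+\Delta^2\Big]\\
&= \frac{1}{n}\Big[\sum_{g}[z_{(g)}-\mu]^2+2\Delta[z_{(h)}-\mu+\Delta/2]-2\Delta[z_{(i)}-\mu-\Delta/2]-2\Delta[z_{(j)}-\mu-\Delta/2]+2\Delta[z_{(k)}-\mu+\Delta/2]\Big]\\
&= \frac{1}{n}\Big[\sum_{g}[z_{(g)}-\mu]^2-2\Delta\big[z_{(i)}-\Delta/2-[z_{(h)}+\Delta/2]\big]+2\Delta\big[z_{(k)}+\Delta/2-[z_{(j)}-\Delta/2]\big]\Big]\\
&= \frac{1}{n}\Big[\sum_{g}[z_{(g)}-\mu]^2-2\Delta[z_{(i)}-z_{(h)}-\Delta]+2\Delta[z_{(k)}-z_{(j)}+\Delta]\Big]\\
&= \frac{1}{n}\sum_{g}[z_{(g)}-\mu]^2=V(z)
\end{aligned}
$$

The last equality holds because z(i) – z(h) = x(k) – x(j) and x(k) – x(j) = z(k) – z(j) + 2Δ, so that the two bracketed terms cancel.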
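The variance-preservation property can be checked numerically on the pair (1,5,7,9) and (2,4,6,10) used earlier. The sketch below is illustrative only: `composite_transfer` and `variance` are hypothetical helper names, positions are 0-based, and the function simply enforces conditions 2) to 6) of the definition before returning the new distribution.

```python
from fractions import Fraction


def variance(x):
    """Population variance (division by n), computed exactly."""
    n = len(x)
    mu = Fraction(sum(x), n)
    return sum((Fraction(v) - mu) ** 2 for v in x) / n


def composite_transfer(z, h, i, j, k, delta):
    """Progressive transfer of delta from i to h, regressive one from j to k."""
    x = list(z)
    x[h] += delta
    x[i] -= delta
    x[j] -= delta
    x[k] += delta
    gap_low, gap_high = z[i] - z[h], x[k] - x[j]
    if delta <= 0 or x[h] > x[i] or gap_low != gap_high or gap_low <= 0 or z[h] >= x[j]:
        raise ValueError("the conditions of a composite transfer are not met")
    return x


z = [1, 5, 7, 9]
x = composite_transfer(z, 0, 1, 2, 3, 1)
print(x, variance(z), variance(x))
```

Both variances are equal, as the derivation above predicts, while the two Lorenz curves cross.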

A larger class of variance-preserving composite transfers

In a composite transfer, the amount transferred (from the rich to the poor at the bottom, and from the poor to the rich at the top) is the same.

We may want to consider a broader class of transfers that avoid this constraint but that keep the requirement that the variance be unaffected.

x(.) has been obtained from z(.) by a variance-preserving composite transfer if there are h, i, j and k ∈ {1,…,n} and strictly positive real numbers Δ1 and Δ2 such that

1) x(g) = z(g) for all g ≠ h, i, j and k

2) x(h) = z(h) + Δ1 ≤ z(i) - Δ1 = x(i)

3) x(j) = z(j) - Δ2

4) x(k) = z(k) + Δ2

5) z(i) – z(h) > 0 and x(k) – x(j) > 0

6) V(x) = V(z)

A stronger notion of « equalization »

x(.) has been obtained from z(.) by a finite sequence of Pigou-Dalton transfers and/or variance-preserving composite transfers (x(.) COMP z(.)) if there exists a sequence of distributions {zt(.)}, with t = 0,…,T, such that z0(.) = x(.), zT(.) = z(.) and zt(.) has been obtained from zt+1(.) by either a bilateral Pigou-Dalton transfer or a variance-preserving composite transfer for t = 0,…,T-1

Example: x(.) = (2,4,6,12,16) and z(.) = (1,5,7,9,18)

z2(.) = (1,5,7,9,18)

z1(.) = (2,4,6,10,18) (variance-preserving composite transfer)

z0(.) = (2,4,6,12,16) (Pigou-Dalton transfer)

Empirical test of this stronger notion of equalization

There is a nice test to check if x(.) COMP y(.)

Theorem (Foster & Shorrocks (1987)): Let x and y be two distributions such that μ(x) = μ(y). Then x COMP y ⟺ there exists a k ∈ {1,…,n} such that Lj(x) ≥ Lj(y) for all j ≤ k (with at least one inequality strict) and Ll(x) ≤ Ll(y) for all l > k and V(x) ≤ V(y)

In words, if μ(x) = μ(y), x COMP y if and only if the variance of x is weakly smaller than the variance of y and the Lorenz curve of x starts above that of y and crosses that of y at most once.

Measuring inequality

Now that we have defined what it means for a distribution to be “more equal” than another, we may want to measure the inequality of a distribution x by a single number I(x).

I(x) ≥ I(y) means “inequality is no smaller in x than in y”

We shall assume for the sake of this course that any income distribution lies in ℝⁿ₊

I may verify I(λx) = I(x) for any strictly positive real number λ and every x ∈ ℝⁿ₊ (relative)

I may verify I(x+λ) = I(x) for any strictly positive real number λ and every x ∈ ℝⁿ₊ (absolute)

We will restrict attention to relative indices.

Properties of inequality indices

Schur-convexity: I(x) ≥ I(A.x) for every n × n bistochastic matrix A and every x in ℝⁿ₊

Strict Schur-convexity: I(x) > I(A.x) for every n × n bistochastic matrix A that is not a permutation matrix

Schur-convexity guarantees (Hardy-Littlewood-Polya) that I will be (weakly or strictly) sensitive to Pigou-Dalton transfers.

Hence, for any Schur-convex I, x(.) PD y(.) ⟹ I(x) ≤ I(y) (or I(x) < I(y) if Schur-convexity is strict)

It also entails the property of symmetry.

Symmetry: I(x) = I(x(.)) for every distribution x ∈ ℝⁿ₊

We could also require I to be sensitive to variance-preserving composite transfers.

x(.) COMP y(.) ⟹ I(x) ≤ I(y) (Variance-Preserving Composite Transfers (VPCT) sensitivity)

Examples of inequality indices: Interquartile ratio

Interquartile ratio IIq(x): for any fraction q ∈ [0,1], IIq(x) is defined by

$$
I^{Iq}(x)=1-\frac{\sum_{i\le qn}x_{(i)}}{\sum_{j\ge (1-q)n}x_{(j)}}
$$

This index measures (negatively) the ratio of the total income held by the poorest fraction q of the population to the total income held by the richest fraction q of the population.

This index is weakly (but not strictly) Schur-convex. It does not satisfy VPCT sensitivity.

Examples of inequality indices: Coefficient of variation

Coefficient of variation ICV(x):

$$
I^{CV}(x)=\frac{\sqrt{V(x)}}{\mu(x)}
$$

This index ranks income distributions with the same mean as per their variance; the absolute version of this index is the standard deviation.

This index is strictly Schur-convex but it violates VPCT sensitivity.

Examples of inequality indices: Coefficient of variation of the logarithm

Coefficient of variation of the logarithm ICVLOG(x):

[Formula not recoverable from the transcript; its visible ingredients are the squared deviations of ln xi from the mean logarithm (Σi ln xi)/n, averaged over the n individuals.]

Comes from applied work where the distribution of income is often assumed to be log-normal (the distribution of the logarithm of income is normal).

This index is not Schur-convex and therefore violates the Pigou-Dalton principle.

Examples of inequality indices: Theil

Theil index IT(x):

$$
I^{T}(x)=\frac{1}{n}\sum_{i}\frac{x_i}{\mu(x)}\ln\frac{x_i}{\mu(x)}
$$

This relative index is very closely related to the entropy index of diversity discussed earlier.

It is strictly Schur-convex and verifies VPCT sensitivity.

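Examples of inequality indices: Atkinson-Kolm

Atkinson-Kolm index (for a parameter ε > 0) IAK(x):

$$
I^{AK}_{\varepsilon}(x)=
\begin{cases}
1-\Big[\dfrac{1}{n}\sum_{i}\Big(\dfrac{x_i}{\mu(x)}\Big)^{1-\varepsilon}\Big]^{1/(1-\varepsilon)} & \text{for } \varepsilon\neq 1\\[2ex]
1-\Big[\prod_{i}\dfrac{x_i}{\mu(x)}\Big]^{1/n} & \text{otherwise}
\end{cases}
$$

This relative index is strictly Schur-convex and verifies VPCT sensitivity.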

Examples of inequality indices: Generalized entropy family

Generalized entropy family of indices (for a parameter c) IGEc(x):

$$
I^{GE}_{c}(x)=
\begin{cases}
\dfrac{1}{n}\,\dfrac{1}{c(c-1)}\sum_{i}\Big[\Big(\dfrac{x_i}{\mu(x)}\Big)^{c}-1\Big] & \text{for } c\neq 0,1\\[2ex]
\dfrac{1}{n}\sum_{i}\dfrac{x_i}{\mu(x)}\ln\dfrac{x_i}{\mu(x)} & \text{for } c=1\\[2ex]
\dfrac{1}{n}\sum_{i}\ln\dfrac{\mu(x)}{x_i} & \text{for } c=0
\end{cases}
$$

This family of relative indices contains several indices seen so far:

For c = 2, we have IGEc(x) = [ICV(x)]²/2

For c = 1, we have IGEc(x) = IT(x)

For 0 < c < 1, we have IGEc(x) = [1/c(c-1)][[1 - IAK(x)]^c - 1], where IAK is the Atkinson-Kolm index with parameter 1-c

All members of this family are strictly Schur-convex.

Only those members for which c < 2 satisfy VPCT sensitivity.
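The three relations between the generalized entropy family and the earlier indices can be verified numerically. The sketch below assumes strictly positive incomes and uses the textbook forms of the Theil, Atkinson-Kolm and coefficient-of-variation indices; the function names and the test distribution are illustrative choices.

```python
import math


def mean(x):
    return sum(x) / len(x)


def coefficient_of_variation(x):
    mu = mean(x)
    return math.sqrt(sum((v - mu) ** 2 for v in x) / len(x)) / mu


def theil(x):
    mu = mean(x)
    return sum((v / mu) * math.log(v / mu) for v in x) / len(x)


def atkinson_kolm(x, eps):
    mu, n = mean(x), len(x)
    if eps == 1:
        return 1 - math.exp(sum(math.log(v / mu) for v in x) / n)
    return 1 - (sum((v / mu) ** (1 - eps) for v in x) / n) ** (1 / (1 - eps))


def generalized_entropy(x, c):
    mu, n = mean(x), len(x)
    if c == 0:
        return sum(math.log(mu / v) for v in x) / n
    if c == 1:
        return theil(x)
    return sum((v / mu) ** c - 1 for v in x) / (n * c * (c - 1))


x = [1, 5, 7, 14, 18]          # illustrative, strictly positive incomes
c = 0.5
ge_from_ak = ((1 - atkinson_kolm(x, 1 - c)) ** c - 1) / (c * (c - 1))
print(generalized_entropy(x, 2), coefficient_of_variation(x) ** 2 / 2)
print(generalized_entropy(x, c), ge_from_ak)
```

Each printed pair should agree up to floating-point rounding.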

Examples of inequality indices: Gini

Gini coefficient IG(x):

$$
I^{G}(x)=1-\frac{2}{n^{2}\mu(x)}\sum_{i}\Big[\sum_{h<i}x_{(h)}+\frac{x_{(i)}}{2}\Big]
$$

Example of inequality indices: Gini

[Figure: relative Lorenz curve. Horizontal axis: relative position (1/n, 2/n, …, n/n = 1). Vertical axis: cumulated fraction of income, with ordinates x(1)/nμ(x), [x(1)+x(2)]/nμ(x), …, 1. The diagonal is the line of perfect equality; the area under it is 1/2.]

The area under the relative Lorenz curve is the sum of the areas of the triangle and of the trapezoids lying under each of its n segments:

$$
\frac{x_{(1)}}{2n^{2}\mu(x)}
\;+\;\frac{1}{n^{2}\mu(x)}\Big[x_{(1)}+\frac{x_{(2)}}{2}\Big]
\;+\;\dots\;
=\;\frac{1}{n^{2}\mu(x)}\sum_{i}\Big[\sum_{h<i}x_{(h)}+\frac{x_{(i)}}{2}\Big]
$$

Hence, the Gini coefficient is twice the value of the area between the perfect equality line and the (relative) Lorenz curve.

But this coefficient can be written (and interpreted) differently:

$$
\begin{aligned}
I^{G}(x) &= 1-\frac{2}{n^{2}\mu(x)}\sum_{i}\Big[\sum_{h<i}x_{(h)}+\frac{x_{(i)}}{2}\Big]\\
&= 1-\frac{1}{n^{2}\mu(x)}\sum_{i}\big(2(n-i)+1\big)x_{(i)}\\
&= \frac{1}{2n^{2}\mu(x)}\sum_{i,j}|x_i-x_j|
\end{aligned}
$$

Second line: the Gini coefficient is (one minus) the weighted average of the relative income of people, with the weight being inversely related to the ranking of people in the income distribution.

Third line: there is yet another interpretation of the Gini coefficient. It is the sum of all income differences between people (relative to total income).
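That the three readings of the Gini coefficient coincide can be checked on any distribution. The sketch below is a hedged illustration that uses the standard forms of the three expressions (Lorenz-area form, rank-weighted form, pairwise-difference form); the function names and the example vector are assumptions, and exact fractions avoid rounding issues.

```python
from fractions import Fraction


def gini_lorenz_area(x):
    """One minus twice the area under the relative Lorenz curve."""
    xs, n, total = sorted(x), len(x), sum(x)
    cumulated, trapezoids = Fraction(0), Fraction(0)
    for v in xs:
        trapezoids += cumulated + Fraction(v) / 2   # sum_{h<i} x(h) + x(i)/2
        cumulated += v
    return 1 - 2 * trapezoids / (n * total)          # n^2 * mu(x) = n * total


def gini_rank_weights(x):
    """One minus the rank-weighted average of relative incomes."""
    xs, n, total = sorted(x), len(x), sum(x)
    weighted = sum((2 * (n - i) + 1) * v for i, v in enumerate(xs, start=1))
    return 1 - Fraction(weighted, n * total)


def gini_pairwise(x):
    """Sum of all absolute income differences over 2 n^2 mu(x)."""
    n, total = len(x), sum(x)
    return Fraction(sum(abs(a - b) for a in x for b in x), 2 * n * total)


x = [1, 5, 7, 14, 18]   # illustrative integer incomes
print(gini_lorenz_area(x), gini_rank_weights(x), gini_pairwise(x))
```

The rank weights (2(n − i) + 1)/n² sum to one, which is why the second form reads as a weighted average of relative incomes.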

Examples of inequality indices: The single parameter Gini family

Single parameter family IGδ(x):

$$
I^{G}_{\delta}(x)=1-\frac{1}{n^{2}\mu(x)}\sum_{i}\big[(n-i+1)^{\delta}-(n-i)^{\delta}\big]x_{(i)}
$$

δ ≥ 1 is a parameter that reflects the inequality aversion of the index.

δ = 2 gives the standard Gini coefficient.

δ = 1 gives the constant index (n-1)/n.

Which (relative) inequality index should we use ?

Let us try to answer this question axiomatically.

Let us propose desirable properties that an inequality index should satisfy and see whether we can identify precisely the class of inequality indices that satisfy these properties.

Desirable properties of an inequality index

We always assume Schur-convexity (and therefore symmetry).

Continuity: For every number n > 0 of people, and every pair of distributions of income x and y between n people such that I(x) < I(y), there exists a strictly positive real number ε such that I(x′) < I(y) for all income distributions x′ ∈ ℝⁿ₊ such that |x′i - xi| < ε for all i

Relative invariance: For every number n > 0 of people, and every pair of distributions of income x and y between n people, I(x) ≥ I(y) ⟺ I(λx) ≥ I(λy) for all strictly positive real numbers λ (trivially satisfied by any relative index)

Absolute invariance: For every number n > 0 of people, and every pair of distributions of income x and y between n people, I(x) ≥ I(y) ⟺ I(x+λ) ≥ I(y+λ) for all strictly positive real numbers λ (trivially satisfied by any absolute index)

Desirable properties of an inequality index

Shorrocks Group Decomposition: For every set N = {1,…,n} of n individuals, for every partition of N into subsets A and B of N such that A ∩ B = ∅ and A ∪ B = N, I(x) = I(xA) + I(xB) + I(μ(xA),μ(xB)) (where, for every C ⊆ N with #C = c, xC is the vector in ℝᶜ₊ defined by xCi = xi for all i ∈ C)

Convenient if the index is to be applied to subgroups of a given population.

Another important notion: the Equally Distributed Equivalent Income

Take an inequality index I: ∪n ℝⁿ₊ → ℝ and consider any distribution of income x between n individuals.

EDEII(x) is the number such that I(EDEII(x),…,EDEII(x)) = I(x)

EDEII(x) is the amount of income which, if given to everyone, gives an income distribution that is just as good as x for the inequality index I.

EDEII(x) may not exist (it will not if I is a relative index).

Recursivity: For any n, and all x ∈ ℝⁿ₊, I(x) = I(x(1),EDEII(x(2),…,x(n)),…,EDEII(x(2),…,x(n)))

Increasingness: For all positive real numbers a and b such that a > b, I(a,…,a) > I(b,…,b) (violated by relative indices; efficiency notion)

Result 1: Theorem (Shorrocks (1984)): A relative and Schur-convex index I: ∪n ℝⁿ₊ → ℝ is continuous and satisfies the Shorrocks Group Decomposition axiom if and only if, for every x ∈ ∪n ℝⁿ₊, I(x) = IGEc(x) for some real number c ≥ 0

Hence generalized entropy indices are the only continuous, Schur-convex and relative indices that satisfy the Shorrocks group decomposition axiom.

If one wants VPCT sensitivity as well, then one needs to restrict attention to the smaller class of such indices for which c < 2

Result 2: Theorem (Donaldson and Weymark (1980), Bossert (1990)): A Schur-convex index I: ∪n ℝⁿ₊ → ℝ is continuous and satisfies recursivity, increasingness, Dalton replication invariance as well as relative and absolute invariance if and only if there exists a real number δ ≥ 0 such that I(x) = IGδ(x) for every x ∈ ∪n ℝⁿ₊

Poverty

Poverty is an issue that is the source of intense feelings.

A person is considered poor if the income of this person is below a certain threshold t: the poverty line.

Defining and measuring poverty amounts therefore to:

1) Identifying the poor (and therefore fixing the poverty line)

2) Measuring poverty, given the identification of the poor

Let us consider each of these in turn.

Identifying the poor

It is admittedly difficult to fix a poverty line that separates the poor from the non-poor. It is always difficult to « draw the line ».

Two approaches exist: absolute and relative.

Absolute approach: the poverty line is the amount of income that someone needs (given prices) to achieve a minimal nutritional, clothing, housing, educational, etc. objective.

Absolute approach: the poverty line is independent of the characteristics of the income distribution to which it applies.

Example: $1/day for the World Bank millennium objective (based on the cheapest way in India to get 2400 calories per day).

Relative approach: the poverty line depends upon the distribution to which it is applied (for instance one half of the median income).

Measuring poverty

The most widely used class of poverty measures (given a poverty line) is the Foster, Greer and Thorbecke (FGT, 1984) class.

This class is parameterized by a positive real number a. Given such a number a and a poverty line t, the FGT measure of poverty in a distribution of income x, denoted Pa(t,x), is defined by:

$$
P^{a}(t,x)=\sum_{i=1}^{n}\big[\max(t-x_i,0)\big]^{a}
$$

This class contains two widely used measures of poverty that correspond, respectively, to a = 0 and a = 1

Headcount poverty

a = 0 corresponds to headcount poverty (counting the number (or the fraction) of poor in the population)

Indeed, for a = 0, the formula for the FGT class of indices writes, under the convention that 0⁰ = 0:

$$
P^{0}(t,x)=\sum_{i=1}^{n}\big[\max(t-x_i,0)\big]^{0}=\#\{i: x_i<t\}
$$

This measure of poverty is widely used.

It is not very sensitive to the intensity of poverty.

Poverty gap

a = 1 corresponds to the poverty gap.

The poverty gap of an income distribution for a given poverty line is the minimal amount of money that is needed to eliminate poverty totally.

Indeed, for a = 1, the formula for the FGT class of indices writes:

$$
P^{1}(t,x)=\sum_{i=1}^{n}\max[t-x_i,0]
$$

This measure of poverty is also widely used.

It is more sensitive to the intensity of poverty.

One sometimes uses the square of the poverty gap (by fixing a = 2).

Contrasting these poverty measures

For a given poverty line, these poverty measures give very different evaluations of poverty.

For instance, suppose the poverty line is 6 and consider the following three distributions:

x = (5,5,5,7), y = (3,3,7,7) and z = (1,7,7,7)

x has more poor than y, which has more poor than z.

Poverty gap is 6 in y, 5 in z and only 3 in x.

Square of poverty gap is 25 in z, 18 in y and 3 in x.
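The three rankings above can be reproduced with a short function. This is a minimal sketch of the unnormalised FGT sum as used in the example; the name `fgt` is an assumption, and the convention 0⁰ = 0 is implemented by summing only over the strictly poor, since Python itself evaluates `0 ** 0` to 1.

```python
def fgt(t, x, a):
    """Unnormalised FGT measure: sum of (t - x_i)^a over individuals with x_i < t.

    Restricting the sum to the strictly poor implements the convention 0^0 = 0
    needed for the headcount case a = 0.
    """
    return sum((t - xi) ** a for xi in x if xi < t)


line = 6
distributions = {"x": [5, 5, 5, 7], "y": [3, 3, 7, 7], "z": [1, 7, 7, 7]}
for name, dist in distributions.items():
    # headcount (a = 0), poverty gap (a = 1), squared gap (a = 2)
    print(name, fgt(line, dist, 0), fgt(line, dist, 1), fgt(line, dist, 2))
```

The headcount ranks x as the poorest, the poverty gap ranks y as the poorest, and the squared gap ranks z as the poorest.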

Dominance rankings of distributions of one attribute

We would like now to put together the various notions that we have seen (efficiency, equality, poverty) so as to rank distributions of income in a way that commands wide support.

We would like to connect our rankings of income distributions to firm ethical theories (welfarist or not)

There are several beautiful dominance theorems that enable us to do this

1st order (efficiency) dominance

The following four statements (that apply to two income distributions x and y) are equivalent:

1) #{i: xi < t} ≤ #{i: yi < t} for all poverty lines t.

2) x Suppes dominates y

3) W(x) ≥ W(y) for all increasing and symmetric functions W: ℝⁿ₊ → ℝ (ethical robustness for non-welfarist ethics)

4) F(u(x1),…,u(xn)) ≥ F(u(y1),…,u(yn)) for all Pareto inclusive and symmetric welfarist functionals F and all individual utility functions u: ℝ₊ → ℝ that are increasing in income (ethical robustness for welfarist ethics)
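The equivalence between statements 1) and 2) can be illustrated numerically. In the sketch below, the function names and the example vectors are assumptions; the headcount test only evaluates poverty lines at the observed incomes and at one line above the maximum, which is enough because the headcount is a step function that changes only at those points.

```python
def headcount(t, x):
    """Number of individuals with income strictly below the poverty line t."""
    return sum(1 for xi in x if xi < t)


def suppes_dominates(x, y):
    """Weak Suppes dominance: compare the ordered vectors position by position."""
    return all(a >= b for a, b in zip(sorted(x), sorted(y)))


def headcount_dominates(x, y):
    """True if x never has more poor than y, whatever the poverty line."""
    lines = sorted(set(x) | set(y))
    lines.append(lines[-1] + 1)      # one line above every observed income
    return all(headcount(t, x) <= headcount(t, y) for t in lines)


pairs = [([3, 10], [9, 2]), ([1, 10], [2, 3]), ([4, 4, 9], [4, 3, 9])]
for x, y in pairs:
    print(x, y, suppes_dominates(x, y), headcount_dominates(x, y))
```

On each pair, the two tests return the same answer, as the theorem states.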

1st order (efficiency) dominance

First order dominance of a distribution over another is particularly robust ethically.

Indeed, the class of ethical judgments (welfarist or not) that agree with the Suppes dominance ranking is very large.

It is also nice to know that a distribution Suppes dominates another if and only if the number of poor is lower in the dominating distribution than in the dominated one for all poverty lines.

Anyone pursuing an objective of poverty reduction for a given poverty line, with poverty measured by the headcount, will therefore be pleased to see a Suppes improvement.

Suppes dominance is often useful in practice to rank income distributions.

Consider the ordered vectors of income of the OECD countries discussed in the beginning of the course.

[Figure: disposable income (vertical axis, 0 to 80000) by individual rank (horizontal axis, 1 to 10) for India, Spain, Italy, UK, Australia, US and France.]

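Some empirical Suppes dominations

[Figure: Suppes dominance chart over Switzerland, US, UK, Australia, Canada, Austria, France, Germany, Sweden, Italy, Spain, Portugal and India. The dominance arrows of the chart are not recoverable from the transcript.]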

Inequality dominance

Consider now two distributions x and y with the same mean. The following 5 statements are equivalent:

1) x PD y (one can go from y to x by a finite sequence of Pigou-Dalton transfers)

2) P1(t,x) ≤ P1(t,y) for all poverty lines t (Poverty gap dominance)

3) Lk(x) ≥ Lk(y) for all k = 1,…,n (Lorenz dominance)

4) W(x) ≥ W(y) for all Schur-concave functions W: ℝⁿ₊ → ℝ (ethical robustness for non-welfarist ethics)

5) F(u(x1),…,u(xn)) ≥ F(u(y1),…,u(yn)) for all Pareto inclusive and Schur-concave welfarist functionals F and all individual utility functions u: ℝ₊ → ℝ concave in income (ethical robustness for welfarist ethics)

Inequality dominance

Concerns only distributions with the same mean income

Notice that in the ethical statements 4 and 5, the property of concavity of individual utility function (welfarism) or of Schur concavity of the social evaluation function has replaced the property of increasingness.

What if we combine inequality and efficiency dominance ?

We can define robust rankings of distributions if we define the following (efficiency) elementary operation

Income increment

x(.) has been obtained from z(.) by an income increment if there is an individual i and a strictly positive real number Δ such that x(h) = z(h) for all h ≠ i and x(i) = z(i) + Δ

x(.) has been obtained from z(.) by an income increment if an individual has received some amount of money “from the sky” (manna ?)

x(.) has been obtained from z(.) by a finite sequence of Pigou-Dalton transfers and/or increments (x(.) IPD z(.)) if there exists a sequence of distributions {zt(.)}, with t = 0,…,T, such that z0(.) = x(.), zT(.) = z(.) and zt(.) has been obtained from zt+1(.) by either a bilateral Pigou-Dalton transfer or an increment for t = 0,…,T-1

An example: Canada and Australia

The distribution of income in Australia in 1998 could have been obtained from that of Canada in the same year by a finite sequence of increments and Pigou-Dalton transfers (at least if the two income distributions are aggregated by deciles).

Let us see this.

[Figures: income by decile (ranks 1 to 10, vertical axis 0 to 60000) for Australia and Canada; a zoom on the three lowest deciles (vertical axis 0 to 12000) showing a Pigou-Dalton transfer between deciles 3 and 1; and the full chart again, showing a set of income increments.]

Second-order dominance

The following 5 statements are equivalent:

1) x IPD y (one can go from y to x by a finite sequence of Pigou-Dalton transfers and/or increments)

2) P1(t,x) ≤ P1(t,y) for all poverty lines t (Poverty gap dominance)

3) Lk(x) ≥ Lk(y) for all k = 1,…,n (Generalized Lorenz dominance)

4) W(x) ≥ W(y) for all Schur-concave and increasing functions W: ℝⁿ₊ → ℝ (ethical robustness for non-welfarist ethics)

5) F(u(x1),…,u(xn)) ≥ F(u(y1),…,u(yn)) for all Pareto inclusive and Schur-concave welfarist functionals F and all individual utility functions u: ℝ₊ → ℝ increasing and concave in income (ethical robustness for welfarist ethics)

Second-order dominance

Except for statement 2), the equivalence looks very much like that established for two distributions with the same mean.

Mild difference for statements 4 and 5, where the requirement that W and u (respectively) be increasing is added to the requirement that they be (Schur-)concave.

Hence the class of ethical functions over which unanimity is looked for is smaller.

We talk about Generalized Lorenz domination because the Lorenz curves that are appealed to in statement 3) do not have the same mean (and therefore do not “end” at the same point) as is the case with usual (non-generalized) Lorenz curves (with the ending point being often normalized at 1).
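Statements 2) and 3) can be compared on a small example in which the two distributions do not have the same mean. The sketch below is illustrative: the function names and the vectors are assumptions, and the poverty gap test checks only the observed incomes, which suffices here because both gap functions are piecewise linear in t with kinks at incomes and share the same slope above the highest income.

```python
from itertools import accumulate


def generalized_lorenz_dominates(x, y):
    """L_k(x) >= L_k(y) for all k, without requiring equal means."""
    if len(x) != len(y):
        raise ValueError("both distributions must have the same size")
    return all(a >= b for a, b in zip(accumulate(sorted(x)), accumulate(sorted(y))))


def poverty_gap(t, x):
    """P1(t, x): total amount needed to lift every poor individual up to t."""
    return sum(max(t - xi, 0) for xi in x)


def poverty_gap_dominates(x, y):
    """P1(t, x) <= P1(t, y) at every kink, hence at every poverty line t."""
    return all(poverty_gap(t, x) <= poverty_gap(t, y) for t in set(x) | set(y))


x = [3, 5, 8]     # illustrative numbers, mean 16/3
y = [1, 6, 7]     # illustrative numbers, mean 14/3
print(generalized_lorenz_dominates(x, y), poverty_gap_dominates(x, y))
print(all(a >= b for a, b in zip(sorted(x), sorted(y))))   # Suppes test
```

Here x dominates y at the second order although it does not Suppes dominate it, since the second-ranked individual has less in x than in y.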

Here is the second order ranking of our countries

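Generalized Lorenz dominance chart

[Figure: Generalized Lorenz dominance chart over Switzerland, US, UK, Australia, Canada, Austria, France, Germany, Sweden, Italy, Spain, Portugal and India. The dominance arrows of the chart are not recoverable from the transcript.]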

Comments on 2nd order dominance

The ranking is driven to a large extent by 1st order dominance.

According to some, this is symptomatic of an excessive weight given to efficiency as opposed to equality.

Indeed, the notion of efficiency which underlies 1st order dominance is very strong (it is not defined ceteris paribus with respect to equality).

What happens if we weaken the notion of efficiency to proportional increases in income only ?

Proportional increase in income

x(.) has been obtained from z(.) by a proportional increase in incomes if there is a real number t greater than 1 such that x(h) = tz(h) for all h

x(.) has been obtained from z(.) by a proportional increase in incomes if all individuals have seen their income increased in proportion t

x(.) has been obtained from z(.) by a finite sequence of Pigou-Dalton transfers and/or proportional increases in income (x(.) PIPD z(.)) if there exists a sequence of distributions {zt(.)}, with t = 0,…,T, such that z0(.) = x(.), zT(.) = z(.) and zt(.) has been obtained from zt+1(.) by either a bilateral Pigou-Dalton transfer or a proportional increase in all incomes for t = 0,…,T-1

A result by Shorrocks

The following 3 statements are equivalent:

1) x PIPD y (one can go from y to x by a finite sequence of Pigou-Dalton transfers and/or proportional increases in income)

2) lk(x) ≥ lk(y) for all k = 1,…,n where, for every distribution z, lk(z) = Lk(z)/nμ(z), and μ(x) ≥ μ(y) (relative Lorenz and mean dominance)

3) W(x) ≥ W(y) for all Schur-concave functions W: ℝⁿ₊ → ℝ satisfying W(tx) > W(x) for all real numbers t > 1 (ethical robustness for non-welfarist ethics)

Comments on this result

No statement on poverty or on welfarist dominance.

Implementable criterion: requires both mean dominance and relative Lorenz dominance.

Relative Lorenz criterion: compare the share of total income held by the k poorest individuals (no matter what k is).

This criterion is very incomplete.

Here are the relative Lorenz curves for some of our countries.

[Figure: relative Lorenz curves (cumulated share of income, 0 to 1, by decile rank 1 to 10) for Australia, France, Germany, Italy, Spain, Sweden, UK, US, India and Canada.]

Mean income country ranking

Country        Mean income
US             28 265
Switzerland    26 693
Australia      21 998
Canada         19 962
UK             19 812
Austria        19 194
France         18 158
Germany        17 661
Sweden         15 168
Italy          13 522
Spain          13 221
Portugal       11 501
India           1 885

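Efficiency-equality dominance chart

[Figure: efficiency-equality dominance chart over Sweden, US, India, Australia, Canada, Austria, Germany, France, Switzerland, Italy, Spain and Portugal, with an annotation “non comparable to any other country”. The dominance arrows and the country to which the annotation applies are not recoverable from the transcript.]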

TO BE COMPLETED
