Quantile Regression: A Gentle Introduction

Roger Koenker

University of Illinois, Urbana-Champaign

5th RMetrics Workshop, Meielisalp: 28 June 2011


Overview of the Course

Some Basics: What, Why and How?

Inference and Quantile Treatment Effects

Nonparametric Quantile Regression

Quantile Autoregression

Risk Assessment and Choquet Portfolios

Course outline, lecture slides, an R FAQ, and even some proposed exercises can all be found at:

http://www.econ.uiuc.edu/~roger/courses/RMetrics.

A somewhat more extensive set of lecture slides can be found at:

http://www.econ.uiuc.edu/~roger/courses/LSE.


Boxplots of CEO Pay

[Figure: boxplots of CEO annual compensation in millions by firm market value in billions; both axes on log scales, from 0.1 to 100.]

Motivation

What the regression curve does is give a grand summary for the averages of the distributions corresponding to the set of x's. We could go further and compute several different regression curves corresponding to the various percentage points of the distributions and thus get a more complete picture of the set. Ordinarily this is not done, and so regression often gives a rather incomplete picture. Just as the mean gives an incomplete picture of a single distribution, so the regression curve gives a correspondingly incomplete picture for a set of distributions.

Mosteller and Tukey (1977)


Motivation

Francis Galton, in a famous passage defending the "charms of statistics" against its many detractors, chided his statistical colleagues

[who] limited their inquiries to Averages, and do not seem to revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of a native of one of our flat English counties, whose retrospect of Switzerland was that, if the mountains could be thrown into its lakes, two nuisances would be got rid of at once.

Natural Inheritance, 1889

Univariate Quantiles

Given a real-valued random variable, X, with distribution function F, we define the τth quantile of X as

$$Q_X(\tau) = F_X^{-1}(\tau) = \inf\{x \mid F(x) \ge \tau\}.$$

This definition follows the usual convention that F is CADLAG and Q is CAGLAD, as illustrated in the following pair of pictures.

[Figure: left panel, a distribution function F(x); right panel, the corresponding quantile function Q(τ).]
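As a quick illustration of the definition, a base-R sketch; `quantile(..., type = 1)` implements exactly this inverse-CDF convention:

```r
# tau-th quantile as inf{x : F(x) >= tau}, via the empirical CDF
set.seed(1)
x <- rnorm(100)
tau <- 0.25
Fhat <- ecdf(x)                    # a CADLAG step function
xs <- sort(x)
xs[min(which(Fhat(xs) >= tau))]    # infimum over the sample points
quantile(x, tau, type = 1)         # the inverse-CDF sample quantile: same value
```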


Univariate Quantiles

Viewed from the perspective of densities, the τth quantile splits the area under the density into two parts: one with area τ below the τth quantile and the other with area 1 − τ above it:

[Figure: a density f(x) with area τ to the left of the τth quantile and area 1 − τ to the right.]

Two Bits Worth of Convex Analysis

A convex function ρ and its subgradient ψ:

[Figure: the check function ρτ(u), piecewise linear with slope τ − 1 for u < 0 and slope τ for u > 0, and its subgradient ψτ(u).]

The subgradient of a convex function f(u) at a point u consists of all the possible "tangents." Sums of convex functions are convex.


Population Quantiles as Optimizers

Quantiles solve a simple optimization problem:

$$\alpha(\tau) = \operatorname*{argmin}_{\alpha} \, E\,\rho_\tau(Y - \alpha)$$

Proof: Let $\psi_\tau(u) = \rho_\tau'(u)$, so differentiating with respect to $\alpha$:

$$0 = \int_{-\infty}^{\infty} \psi_\tau(y - \alpha)\,dF(y) = (\tau - 1)\int_{-\infty}^{\alpha} dF(y) + \tau\int_{\alpha}^{\infty} dF(y) = (\tau - 1)F(\alpha) + \tau(1 - F(\alpha)),$$

implying $\tau = F(\alpha)$ and thus $\alpha = F^{-1}(\tau)$.
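A numerical sanity check of this characterization in base R; the sample average stands in for the population expectation, and `optimize()` is just one convenient minimizer:

```r
rho <- function(u, tau) u * (tau - (u < 0))   # the check function rho_tau
set.seed(2)
y <- rexp(1e5)                                # F = Exp(1), so F^{-1}(tau) = qexp(tau)
tau <- 0.75
obj <- function(a) mean(rho(y - a, tau))      # sample analogue of E rho_tau(Y - a)
optimize(obj, interval = range(y))$minimum    # close to ...
qexp(tau)                                     # ... the true 0.75 quantile, -log(0.25)
```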

Sample Quantiles as Optimizers

For sample quantiles, replace F by $\hat F_n$, the empirical distribution function. The objective function becomes a polyhedral convex function whose derivative is monotone increasing (as convexity requires); in effect the gradient simply counts observations above and below and weights the counts by τ and 1 − τ.

[Figure: left panel, the piecewise-linear objective R(x); right panel, its step-function derivative R′(x).]
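The counting interpretation of the gradient can be made concrete with a small base-R sketch (the sample size and τ here are arbitrary choices):

```r
# derivative of R(a) = sum(rho_tau(y_i - a)): points below a get weight 1 - tau,
# points above a get weight -tau; the result is a nondecreasing step function of a
set.seed(3)
y <- rnorm(19); tau <- 0.5
Rprime <- function(a) (1 - tau) * sum(y < a) - tau * sum(y > a)
a <- seq(min(y), max(y), length.out = 401)
plot(a, sapply(a, Rprime), type = "s", ylab = "R'(a)")
abline(h = 0, v = quantile(y, tau, type = 1), lty = 2)  # zero crossing at the median
```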

Conditional Quantiles: The Least Squares Meta-Model

The unconditional mean solves

$$\mu = \operatorname*{argmin}_m E(Y - m)^2$$

The conditional mean $\mu(x) = E(Y|X = x)$ solves

$$\mu(x) = \operatorname*{argmin}_m E_{Y|X=x}(Y - m(X))^2.$$

Similarly, the unconditional τth quantile solves

$$\alpha_\tau = \operatorname*{argmin}_a E\,\rho_\tau(Y - a)$$

and the conditional τth quantile solves

$$\alpha_\tau(x) = \operatorname*{argmin}_a E_{Y|X=x}\,\rho_\tau(Y - a(X))$$


Computation of Linear Regression Quantiles

Primal formulation as a linear program: split the residual vector into positive and negative parts and sum with appropriate weights:

$$\min\{\tau \mathbf{1}^\top u + (1 - \tau)\mathbf{1}^\top v \mid y = Xb + u - v,\ (b, u, v) \in \mathbb{R}^p \times \mathbb{R}_+^{2n}\}$$

Dual formulation as a linear program:

$$\max\{y^\top d \mid X^\top d = (1 - \tau)X^\top \mathbf{1},\ d \in [0, 1]^n\}$$

Solutions are characterized by an exact fit to p observations. Let $h \in H$ index p-element subsets of $\{1, 2, \ldots, n\}$; then primal solutions take the form:

$$\hat\beta = \hat\beta(h) = X(h)^{-1}y(h)$$
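The exact-fit property is easy to observe with the quantreg package (whose default `rq()` method is the Barrodale-Roberts simplex variant); a minimal sketch:

```r
library(quantreg)
set.seed(4)
n <- 50
x <- runif(n)
y <- 1 + 2 * x + rt(n, df = 3)
f <- rq(y ~ x, tau = 0.5)    # p = 2 parameters: intercept and slope
sum(abs(resid(f)) < 1e-8)    # exactly p observations are fit exactly (in general position)
```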

Least Squares from the Quantile Regression Perspective

Exact fits to p observations:

$$\hat\beta = \hat\beta(h) = X(h)^{-1}y(h)$$

OLS is a weighted average of these $\hat\beta(h)$'s:

$$\hat\beta_{OLS} = (X^\top X)^{-1}X^\top y = \sum_{h \in H} w(h)\hat\beta(h), \qquad w(h) = |X(h)|^2 \Big/ \sum_{h \in H} |X(h)|^2$$

The determinants $|X(h)|$ are the (signed) volumes of the parallelepipeds formed by the columns of the matrices $X(h)$. In the simplest bivariate case, we have

$$|X(h)|^2 = \begin{vmatrix} 1 & x_i \\ 1 & x_j \end{vmatrix}^2 = (x_j - x_i)^2,$$

so pairs of observations that are far apart are given more weight.
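This weighted-average identity is exact in the bivariate case and can be verified directly; a base-R sketch:

```r
# OLS slope as a weighted average of all pairwise slopes,
# with weights proportional to |X(h)|^2 = (x_j - x_i)^2
set.seed(5)
n <- 20
x <- rnorm(n); y <- 1 + 2 * x + rnorm(n)
h <- combn(n, 2)
b <- (y[h[2, ]] - y[h[1, ]]) / (x[h[2, ]] - x[h[1, ]])  # slope through each pair
w <- (x[h[2, ]] - x[h[1, ]])^2                          # squared determinant weights
sum(w * b) / sum(w)                                     # identical to ...
coef(lm(y ~ x))[2]                                      # ... the OLS slope
```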


Quantile Regression: The Movie

Bivariate linear model with iid Student t errors

Conditional quantile functions are parallel in blue

100 observations indicated in blue

Fitted quantile regression lines in red.

Intervals for τ ∈ (0, 1) for which the solution is optimal.


Quantile Regression in the iid Error Model

[Figure sequence: eleven frames of the scatterplot of y against x with the fitted quantile regression line for successive solutions; the τ-intervals of optimality are [0.069, 0.096], [0.138, 0.151], [0.231, 0.240], [0.300, 0.300], [0.374, 0.391], [0.462, 0.476], [0.549, 0.551], [0.619, 0.636], [0.704, 0.729], [0.768, 0.798], and [0.919, 0.944].]

Virtual Quantile Regression II

Bivariate quadratic model with heteroscedastic χ² errors

Conditional quantile functions drawn in blue

100 observations indicated in blue

Fitted quadratic quantile regression lines in red

Intervals of optimality for τ ∈ (0, 1).


Quantile Regression in the Heteroscedastic Error Model

[Figure sequence: eleven frames of the scatterplot of y against x with the fitted quadratic quantile regression curve for successive solutions; the τ-intervals of optimality are [0.048, 0.062], [0.179, 0.204], [0.261, 0.261], [0.304, 0.319], [0.414, 0.417], [0.499, 0.507], [0.581, 0.582], [0.633, 0.635], [0.685, 0.685], [0.730, 0.733], and [0.916, 0.925].]

Conditional Means vs. Medians

[Figure: scatterplot with the mean (least squares) and median (quantile regression) fits superimposed.]

Minimizing absolute errors for median regression can yield something quite different from the least squares fit for mean regression.

Equivariance of Regression Quantiles

Scale Equivariance: For any $a > 0$, $\hat\beta(\tau; ay, X) = a\hat\beta(\tau; y, X)$ and $\hat\beta(\tau; -ay, X) = -a\hat\beta(1 - \tau; y, X)$

Regression Shift: For any $\gamma \in \mathbb{R}^p$, $\hat\beta(\tau; y + X\gamma, X) = \hat\beta(\tau; y, X) + \gamma$

Reparameterization of Design: For any $p \times p$ matrix $A$ with $|A| \ne 0$, $\hat\beta(\tau; y, XA) = A^{-1}\hat\beta(\tau; y, X)$

Robustness: For any diagonal matrix $D$ with nonnegative elements, $\hat\beta(\tau; y, X) = \hat\beta(\tau; y + D\hat u, X)$, where $\hat u = y - X\hat\beta(\tau; y, X)$ denotes the vector of residuals
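These properties are exact, not asymptotic, so they hold to machine precision; a quantreg sketch checking the first two:

```r
library(quantreg)
set.seed(6)
n <- 40
x <- rnorm(n); y <- 1 + x + rnorm(n)
tau <- 0.3; a <- 2; g <- c(0.5, -1)
b <- coef(rq(y ~ x, tau = tau))
# scale equivariance
ya <- a * y
all.equal(coef(rq(ya ~ x, tau = tau)), a * b, check.attributes = FALSE)
# regression shift
y2 <- drop(y + cbind(1, x) %*% g)
all.equal(coef(rq(y2 ~ x, tau = tau)), b + g, check.attributes = FALSE)
```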


Equivariance to Monotone Transformations

For any monotone function h, conditional quantile functions $Q_Y(\tau|x)$ are equivariant in the sense that

$$Q_{h(Y)|X}(\tau|x) = h(Q_{Y|X}(\tau|x)),$$

in contrast to conditional mean functions, for which, generally,

$$E(h(Y)|X) \ne h(E(Y|X)).$$

Examples:

$h(y) = \min\{0, y\}$: Powell's (1985) censored regression estimator.

$h(y) = \mathrm{sgn}\{y\}$: Rosenblatt's (1957) perceptron, Manski's (1975) maximum score estimator.
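A quick numerical illustration of the contrast in base R (the order statistics of h(Y) are just h applied to the order statistics of Y):

```r
set.seed(7)
y <- rnorm(500); tau <- 0.9
quantile(exp(y), tau, type = 1) - exp(quantile(y, tau, type = 1))  # exactly 0
mean(exp(y)) - exp(mean(y))   # far from 0: means are not equivariant
```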

Beyond Average Treatment Effects

Lehmann (1974) proposed the following general model of treatment response:

"Suppose the treatment adds the amount ∆(x) when the response of the untreated subject would be x. Then the distribution G of the treatment responses is that of the random variable X + ∆(X) where X is distributed according to F."

Lehmann QTE as a QQ-Plot

Doksum (1974) defines ∆(x) as the "horizontal distance" between F and G at x, i.e.

$$F(x) = G(x + \Delta(x)).$$

Then ∆(x) is uniquely defined as

$$\Delta(x) = G^{-1}(F(x)) - x.$$

This is the essence of the conventional QQ-plot. Changing variables so that τ = F(x), we have the quantile treatment effect (QTE):

$$\delta(\tau) = \Delta(F^{-1}(\tau)) = G^{-1}(\tau) - F^{-1}(\tau).$$

Lehmann-Doksum QTE

[Figure: distribution functions F and G, with the QTE as the horizontal distance between them.]

Lehmann-Doksum QTE

[Figure: three rows of panels, for a location shift, a scale shift, and a location-and-scale shift; each row shows the distribution functions, the densities, and the resulting QTE δ(u).]

An Asymmetric Example

[Figure: distribution functions, densities, and QTE for an asymmetric example.]

Treatment shifts the distribution from right skewed to left skewed, making the QTE U-shaped.

QTE via Quantile Regression

The Lehmann QTE is naturally estimable by

$$\hat\delta(\tau) = \hat G_n^{-1}(\tau) - \hat F_m^{-1}(\tau)$$

where $\hat G_n$ and $\hat F_m$ denote the empirical distribution functions of the treatment and control observations. Consider the quantile regression model

$$Q_{Y_i}(\tau|D_i) = \alpha(\tau) + \delta(\tau)D_i$$

where $D_i$ denotes the treatment indicator and $Y_i = h(T_i)$, e.g. $Y_i = \log T_i$, which can be estimated by solving

$$\min_{(\alpha,\delta)} \sum_{i=1}^n \rho_\tau(y_i - \alpha - \delta D_i)$$
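In quantreg this is a one-line fit; a sketch on simulated two-sample data (the location-scale shift below is an arbitrary choice for illustration):

```r
library(quantreg)
set.seed(8)
n <- 200
D <- rep(0:1, each = n)
y <- c(rnorm(n), 1 + 2 * rnorm(n))    # treatment shifts location and scale
taus <- c(0.1, 0.25, 0.5, 0.75, 0.9)
fit <- rq(y ~ D, tau = taus)
coef(fit)["D", ]                      # estimated delta(tau), the QTE
# agrees (up to interpolation conventions) with the two-sample estimate:
quantile(y[D == 1], taus, type = 1) - quantile(y[D == 0], taus, type = 1)
```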

Engel’s Food Expenditure Data

[Figure: scatterplot of food expenditure against household income with fitted quantile regression lines.]

Engel Curves for Food: This figure plots data taken from Engel's (1857) study of the dependence of households' food expenditure on household income. Seven estimated quantile regression lines for τ ∈ {.05, .1, .25, .5, .75, .9, .95} are superimposed on the scatterplot. The median τ = .5 fit is indicated by the blue solid line; the least squares estimate of the conditional mean function is indicated by the red dashed line.
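A sketch that reproduces a figure of this kind with quantreg, which ships the Engel data:

```r
library(quantreg)
data(engel)   # 235 Belgian working-class households: income, foodexp
taus <- c(0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95)
plot(foodexp ~ income, data = engel, cex = 0.5, col = "grey",
     xlab = "Household Income", ylab = "Food Expenditure")
for (t in taus)
  abline(rq(foodexp ~ income, tau = t, data = engel), col = "blue")
abline(lm(foodexp ~ income, data = engel), col = "red", lty = 2)
```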

Engel's Food Expenditure Data

[Figure: the same Engel curves, zoomed in to the lower-income region of the sample (household income roughly 400 to 1000).]

A Model of Infant Birthweight

Reference: Abrevaya (2001), Koenker and Hallock (2001)

Data: June, 1997, Detailed Natality Data of the US. Live, singleton births, with mothers recorded as either black or white, aged 18-45, and residing in the U.S. Sample size: 198,377.

Response: Infant Birthweight (in grams)

Covariates:
Mother's Education
Mother's Prenatal Care
Mother's Smoking
Mother's Age
Mother's Weight Gain
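A hedged sketch of fitting such a model with quantreg; the data frame `natality` and all variable names below are hypothetical stand-ins, since the extract of the 1997 natality file is not assumed to be available here:

```r
library(quantreg)
# 'natality' is a hypothetical data frame standing in for the June 1997 sample;
# the variable names are illustrative, not the actual file's field names
taus <- 1:9 / 10
fit <- rq(birthweight ~ boy + married + black + age + I(age^2) +
            ed.hs + ed.col + smoker + cigsday + gain + I(gain^2),
          tau = taus, data = natality)
plot(summary(fit))   # coefficient paths across tau, as in the next two figures
```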

Quantile Regression Birthweight Model I

[Figure: estimated quantile regression coefficient paths plotted against τ ∈ (0, 1) for: Intercept, Boy, Married, Black, Mother's Age, Mother's Age², High School, Some College.]

Quantile Regression Birthweight Model II

[Figure: estimated quantile regression coefficient paths plotted against τ ∈ (0, 1) for: College, No Prenatal, Prenatal Second, Prenatal Third, Smoker, Cigarettes/Day, Mother's Weight Gain, Mother's Weight Gain².]

Marginal Effect of Mother's Age

[Figure: four panels of estimated marginal effects plotted against the quantile τ.]

Marginal Effect of Mother's Weight Gain

[Figure: four panels of the estimated weight-gain effect plotted against mother's age.]

Daily Temperature in Melbourne: AR(1) Scatterplot

[Figure: scatterplot of today's maximum temperature against yesterday's maximum temperature.]

Daily Temperature in Melbourne: Nonlinear QAR(1) Fit

[Figure: the same scatterplot with superimposed fitted curves from a nonlinear QAR(1) model.]
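A hedged sketch of a QAR(1) fit in quantreg; `temp` below is a simulated stand-in, since the Melbourne series is not assumed to be at hand, and the fit here is linear rather than the nonlinear fit shown:

```r
library(quantreg)
set.seed(9)
temp <- 20 + arima.sim(list(ar = 0.6), n = 1000)  # placeholder for the Melbourne series
y    <- temp[-1]
ylag <- temp[-length(temp)]
taus <- c(0.05, 0.25, 0.5, 0.75, 0.95)
fit  <- rq(y ~ ylag, tau = taus)                  # linear QAR(1): Q_y(tau | ylag)
coef(fit)
plot(ylag, y, cex = 0.4, col = "grey")
for (j in seq_along(taus)) abline(coef = coef(fit)[, j], col = "blue")
```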

Conditional Densities of Melbourne Daily Temperature

[Figure: estimated conditional densities of today's maximum temperature given yesterday's maximum temperature of 11, 16, 21, 25, 30, and 35 degrees.]

Review

Least squares methods of estimating conditional mean functions were developed for, and promote the view that,

Response = Signal + iid Gaussian Measurement Error

In fact the world is rarely this simple. Quantile regression is intended to expand the regression window, allowing us to see a wider vista.