Transcript of slides: thanos/BNP2019_talk.pdf

Prologos Quantile Mixture Regression Data Application Epilogos

Bayesian Quantile Mixture Regression

Athanasios Kottas

Department of Statistics, University of California, Santa Cruz

Joint work with Yifei Yan (Amazon, New York)

12th International Conference on Bayesian Nonparametrics, Oxford, U.K., June 24-28, 2019



Quantile regression

• Quantile regression quantifies the relationship between a set of quantiles of the response distribution and covariates, thus providing a more complete explanation of the response distribution.

• Practically important alternative to traditional mean regression models, with a growing literature in terms of methods and applications.

• Single-quantile regression formulation: y_i = h(x_i) + ε_i, i = 1, ..., n.

  • Response observations y_i, with covariate vectors x_i.
  • ε_i i.i.d. (given parameters) from an error distribution with density f_p(ε) and p-th quantile equal to 0, i.e., ∫_{-∞}^{0} f_p(ε) dε = p.
  • h(x) is the quantile regression function, e.g., h(x) = x′β for a linear quantile regression model.

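As a quick numerical illustration of the formulation above (a sketch, assuming Python with numpy; the shifted-exponential error is an illustrative choice, not from the talk): an error distribution whose p-th quantile is 0 makes h(x) the conditional p-th quantile of y.

```python
import numpy as np

rng = np.random.default_rng(7)
p, n = 0.25, 100_000

# Error with p-th quantile equal to 0: shift Exp(1) by its p-quantile, -log(1-p),
# so that P(eps <= 0) = p.
eps = rng.exponential(1.0, size=n) + np.log(1 - p)

beta, x = 1.5, 2.0            # linear case: h(x) = x * beta
y = x * beta + eps            # y_i = h(x_i) + eps_i

# the conditional p-th quantile of y at this x is h(x) = 3.0
print(np.quantile(y, p))      # close to 3.0
```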



Estimation and modeling methods

• Classical nonparametric estimation: point estimates for quantile regression coefficients β through optimization of the check loss function, min_β ∑_{i=1}^{n} ρ_p(y_i − x_i′β), where ρ_p(u) = u{p − 1(u < 0)} (Koenker, 2005).

  • Least absolute deviations criterion, min_β ∑_{i=1}^{n} |y_i − x_i′β|, for p = 0.5.

• Bayesian modeling and inference:

  • Nonparametric priors for the error distribution (Kottas & Gelfand, 2001; Hanson & Johnson, 2002; Kottas & Krnjajic, 2009; Reich et al., 2010)
  • Joint estimation of multiple quantile regression curves (Tokdar & Kadane, 2012; Reich & Smith, 2013; Yang & Tokdar, 2017)
  • Nonparametric inference through density regression (Taddy & Kottas, 2010)
  • Parametric modeling based on the asymmetric Laplace (AL) distribution (Yu & Moyeed, 2001; Tsionas, 2003; Kozumi & Kobayashi, 2011)

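The check loss and its optimization can be sketched as follows (assuming numpy; for the intercept-only case the minimizer is the empirical p-th quantile, so a coarse grid search suffices — function names are illustrative):

```python
import numpy as np

def check_loss(u, p):
    """Koenker's check loss: rho_p(u) = u * (p - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (p - (u < 0))

def fit_quantile_intercept(y, p, grid):
    """Intercept-only quantile regression: minimize sum_i rho_p(y_i - b) over b.
    The minimizer is (approximately) the empirical p-th quantile of y."""
    losses = [check_loss(y - b, p).sum() for b in grid]
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(0)
y = rng.normal(size=5000)
b_hat = fit_quantile_intercept(y, 0.9, np.linspace(-3, 3, 1201))
print(b_hat)  # tracks np.quantile(y, 0.9)
```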



Objectives

• Quantile mixture regression: modeling for the response distribution through weighted mixtures of quantile regression components.

  • Common regression function (common β in linear regression) for all mixture components.
  • Synthesize information from multiple parts of the response distribution for estimation and selection of covariate effects.
  • Different from simultaneous quantile regression, the goal is to obtain a combined estimate of the predictive effect of each covariate.

• Mixtures of K distributions, each parameterized in terms of the p_k-th quantile, for k = 1, ..., K.

  • Fixed p_k: mixtures of generalized AL distributions (more flexible skewness and tail behavior than the AL, when p_k is fixed).
  • Random p_k: mixtures of AL distributions.




Modeling framework

• Quantile mixture regression model:

  f(y | x) = ∑_{k=1}^{K} ω_k f_{p_k}(y | μ_{p_k} + x′β, θ)

• Continuous response y, covariate vector x.

• The p_k-th quantile for density f_{p_k} is given by μ_{p_k} + x′β (or, more generally, by μ_{p_k} + h(x)).

• 0 < p_1 < ... < p_K < 1 are (possibly random) probabilities associated with quantiles μ_{p_k} + x′β.

• Restriction μ_{p_1} < ... < μ_{p_K} yields the ordering constraint for the quantiles → facilitates identifiability → connects mixture weights with quantiles.

• Additional scale/skewness/tail parameters θ (may be component specific).

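The mixture density can be evaluated directly. A minimal sketch assuming numpy, with AL components as one concrete choice of f_{p_k}, component-specific scales σ_k standing in for θ, and a common β (all names illustrative):

```python
import numpy as np

def al_density(y, p, mu, sigma):
    """AL density with p-th quantile at mu: p(1-p)/sigma * exp(-rho_p(y-mu)/sigma)."""
    u = np.asarray(y, dtype=float) - mu
    return p * (1 - p) / sigma * np.exp(-u * (p - (u < 0)) / sigma)

def quantile_mixture_density(y, x, beta, p, mu, w, sigma):
    """f(y | x) = sum_k w_k f_{p_k}(y | mu_{p_k} + x'beta, sigma_k),
    with a common regression vector beta across components."""
    loc = mu + x @ beta                                    # component quantiles
    dens = np.array([al_density(y, p[k], loc[k], sigma[k]) for k in range(len(p))])
    return w @ dens

x = np.array([1.0, -0.5])
beta = np.array([0.3, 0.2])
p = np.array([0.25, 0.50, 0.75])
mu = np.array([-1.0, 0.0, 1.0])        # ordered: mu_{p_1} < ... < mu_{p_K}
w = np.array([0.3, 0.4, 0.3])
sigma = np.array([1.0, 1.0, 1.0])

ys = np.linspace(-40, 40, 40001)
f = quantile_mixture_density(ys, x, beta, p, mu, w, sigma)
print(f.sum() * (ys[1] - ys[0]))       # integrates to ~1
```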



Connection with quantile regression methods

• Different from simultaneous quantile regression, which jointly models different covariate effects for different quantiles (β_{p_k} instead of β).

• Similar in spirit to composite quantile regression from the classical literature (Zou & Yuan, 2008): for a collection of K specified quantile levels, 0 < p_1 < ... < p_K < 1, the CQR estimator is

  (b_1, ..., b_K, β) = argmin_{b_1,...,b_K,β} ∑_{k=1}^{K} { ∑_{i=1}^{n} ρ_{p_k}(y_i − b_k − x_i′β) }

• b_k ≡ b_{p_k}, for k = 1, ..., K: intercepts for the specified quantile levels.

• Variable selection method with properties that include consistency in selection and asymptotic normality, as n → ∞.

• But, only an optimization algorithm ... it does not involve/correspond to a probabilistic model.

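The CQR criterion can be minimized numerically. A sketch assuming numpy, for a single covariate: the intercepts b_k are profiled out (given β, the optimal b_k is the empirical p_k-quantile of the residuals), and the common slope is found by grid search. This is not Zou & Yuan's algorithm, just a simple way to evaluate the same objective:

```python
import numpy as np

def check(u, p):
    """Check loss rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (u < 0))

def cqr_fit_1d(y, x, ps, beta_grid):
    """Composite quantile regression (Zou & Yuan, 2008) for one covariate:
    profile out the K intercepts and grid-search the common slope beta."""
    best = (np.inf, None, None)
    for beta in beta_grid:
        r = y - x * beta
        b = np.quantile(r, ps)                 # optimal intercepts given beta
        loss = sum(check(r - b[k], p).sum() for k, p in enumerate(ps))
        if loss < best[0]:
            best = (loss, beta, b)
    return best[1], best[2]

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
y = 2.0 * x + 0.5 * rng.normal(size=n)          # true common slope 2.0

ps = [0.25, 0.50, 0.75]
beta_hat, b_hat = cqr_fit_1d(y, x, ps, np.linspace(1.0, 3.0, 401))
print(beta_hat)                                  # near 2.0
```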



Two modeling scenarios

• We envision two modeling scenarios (with fixed K throughout):

  • fixed p_k: for settings where one expects specific parts of the response distribution (e.g., the right tail) to be informed by the covariates;
  • the fixed p_k model can also be used when such information is not available, based on a set of equally spaced p_k spanning the unit interval;
  • random p_k: specify only the total number of quantiles, and let the data inform the full configuration of quantile components (both p_k and μ_{p_k}).

• Mixture components need to be flexible distributions parameterized in terms of quantiles.

  • For the random p_k model, the AL distribution is sufficiently flexible.
  • But not so when the quantile level is fixed → generalized AL distribution.




Asymmetric Laplace distribution

• Asymmetric Laplace (AL) density:

  f_p^{AL}(y | μ, σ) = {p(1 − p)/σ} exp{−(1/σ) ρ_p(y − μ)}, y ∈ R,

  where ρ_p(u) = u{p − 1(u < 0)}, σ > 0 is a scale parameter, p ∈ (0, 1), and μ ∈ R corresponds to the p-th quantile, ∫_{-∞}^{μ} f_p^{AL}(y | μ, σ) dy = p.

• For μ = x′β, maximizing the likelihood w.r.t. β under an AL response distribution corresponds to minimizing the check loss function over β.

• This property, along with an effective mixture representation, renders the AL a popular choice in quantile regression modeling.

• But, the fixed-p AL distribution is very restrictive:

  • the skewness of the error distribution is fully determined by fixing p;
  • the mode of the error density is at 0 for any p.

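The quantile property of the AL can be checked through its closed-form CDF (a numerical sketch assuming numpy; the piecewise CDF below follows by integrating the density on each side of μ):

```python
import numpy as np

def al_pdf(y, p, mu, sigma):
    """f_p^AL(y | mu, sigma) = p(1-p)/sigma * exp(-rho_p(y - mu)/sigma)."""
    u = np.asarray(y, dtype=float) - mu
    return p * (1 - p) / sigma * np.exp(-u * (p - (u < 0)) / sigma)

def al_cdf(y, p, mu, sigma):
    """Piecewise closed form obtained by integrating al_pdf:
    F(y) = p*exp((1-p)(y-mu)/sigma) for y <= mu, else 1 - (1-p)*exp(-p(y-mu)/sigma)."""
    u = np.asarray(y, dtype=float) - mu
    return np.where(u <= 0,
                    p * np.exp((1 - p) * u / sigma),
                    1 - (1 - p) * np.exp(-p * u / sigma))

# F(mu) = p: the p-th quantile of the AL is mu, for any p and sigma
for p in (0.1, 0.5, 0.9):
    print(p, float(al_cdf(0.0, p, 0.0, 2.0)))
```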



Generalized AL distribution

• Construction motivated by the AL mixture representation:

  f_p^{AL}(y | μ, σ) = ∫_{R+} N(y | μ + σA(p)z, σ²B(p)z) Exp(z | 1) dz,

  where A(p) = (1 − 2p)/{p(1 − p)} and B(p) = 2/{p(1 − p)}.

• Replace the normal kernel with a skew-normal density:

  (2/ω) φ((y − ξ)/ω) Φ(α(y − ξ)/ω) = ∫_{R+} N(y | ξ + ταs, τ²) N⁺(s | 0, 1) ds,

  where α ∈ R is the skewness parameter, τ = ω(1 + α²)^{−1/2}, and N⁺(0, 1) denotes the standard normal distribution truncated over R+.

• Generalized asymmetric Laplace (GAL) density:

  ∫∫_{R+×R+} N(y | μ + σαs + σA(p)z, σ²B(p)z) Exp(z | 1) N⁺(s | 0, 1) dz ds

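The normal-exponential representation gives a trivial sampler, which can be used to confirm numerically that the p-th quantile of the implied AL sits at μ (a Monte Carlo sketch assuming numpy, with A(p) and B(p) as defined on the slide):

```python
import numpy as np

def sample_al_via_mixture(n, p, mu, sigma, rng):
    """AL draws via the representation y | z ~ N(mu + sigma*A(p)*z, sigma^2*B(p)*z),
    z ~ Exp(1), with A(p) = (1-2p)/(p(1-p)) and B(p) = 2/(p(1-p))."""
    A = (1 - 2 * p) / (p * (1 - p))
    B = 2 / (p * (1 - p))
    z = rng.exponential(1.0, size=n)
    return mu + sigma * A * z + np.sqrt(sigma ** 2 * B * z) * rng.normal(size=n)

rng = np.random.default_rng(11)
y = sample_al_via_mixture(200_000, 0.75, 1.0, 0.5, rng)
print(np.quantile(y, 0.75))   # close to mu = 1.0
```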



Generalized AL distribution

• GAL density:

  f(y | p, μ, σ, α) = ∫∫_{R+×R+} N(y | μ + σαs + σA(p)z, σ²B(p)z) Exp(z | 1) N⁺(s | 0, 1) dz ds

• When α = 0, the GAL density reduces to the AL density.

• Integrating over z → generalized inverse-Gaussian density → integrating over s → (complex) closed-form expression for f(y | p, μ, σ, α).

• Hierarchical mixture representation of the GAL density suffices for study of model properties and for model fitting.

• p₀-th quantile, for p₀ ∈ (0, 1)? Setting γ = {I(α > 0) − p}|α|,

  ∫_{-∞}^{μ} f(y | p, μ, σ, γ) dy = p g(γ), with g(γ) = 2Φ(−|γ|) exp(γ²/2).

  • g(γ) is increasing in R−, and decreasing in R+;
  • for any γ, unique solution for p such that ∫_{-∞}^{μ} f(y | p, μ, σ, γ) dy = p₀.

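A quick numerical check of g and the stated monotonicity (a sketch using Python's math module; since g depends on γ only through |γ|, increasing on R− is equivalent to decreasing on R+):

```python
import math

def g(gamma):
    """g(gamma) = 2 * Phi(-|gamma|) * exp(gamma^2 / 2), with Phi the N(0,1) CDF."""
    cdf = 0.5 * math.erfc(abs(gamma) / math.sqrt(2.0))  # Phi(-|gamma|)
    return 2.0 * cdf * math.exp(gamma * gamma / 2.0)

print(g(0.0))                    # 1.0: gamma = 0 recovers the AL quantile level p
print(g(1.5) == g(-1.5))         # True: g depends on gamma only through |gamma|
print(g(0.5) > g(1.0) > g(2.0))  # True: decreasing on R+ (so increasing on R-)
```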



Generalized AL distribution

• Reparameterize (p, α) to (p₀, γ) → for any fixed p₀ ∈ (0, 1), the p₀-th quantile of f(y | p₀, μ, σ, γ) ≡ f_{p₀}(y | μ, σ, γ) is μ.

• Quantile-fixed (fixed p₀) GAL density:

  f_{p₀}(y | μ, σ, γ) = ∫∫_{R+×R+} N(y | μ + σC|γ|s + σAz, σ²Bz) Exp(z | 1) N⁺(s | 0, 1) dz ds

  • A, B and C are all functions of γ (and p₀).

• Y has density f_{p₀}(μ, σ, γ) if and only if (Y − μ)/σ has density f_{p₀}(0, 1, γ) → μ is a location parameter (the p₀-th quantile) and σ is a scale parameter.

• γ ∈ (L, U), where L is the negative root of g(γ) = 1 − p₀ and U is the positive root of g(γ) = p₀.

• γ is a shape parameter that allows for more flexible densities than the AL.

11 / 28
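The interval (L, U) can be recovered numerically from g(γ) = 2Φ(−|γ|) exp(γ²/2), since g decreases from g(0) = 1 toward 0 as |γ| grows, so each root is bracketed. A sketch using SciPy's brentq; the bracket endpoints ±50 and the function name gamma_range are pragmatic choices of mine:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def g(gamma):
    # g(gamma) = 2 * Phi(-|gamma|) * exp(gamma^2 / 2), in log space for stability
    return np.exp(np.log(2.0) + norm.logcdf(-abs(gamma)) + gamma**2 / 2.0)

def gamma_range(p0):
    """(L, U) for the quantile-fixed GAL: L is the negative root of g = 1 - p0,
    and U is the positive root of g = p0."""
    U = brentq(lambda x: g(x) - p0, 1e-8, 50.0)
    L = brentq(lambda x: g(x) - (1.0 - p0), -50.0, -1e-8)
    return L, U

# Agrees with the ranges quoted in the density figure on the next slide.
L, U = gamma_range(0.5)
assert abs(L + 1.09) < 0.02 and abs(U - 1.09) < 0.02
L, U = gamma_range(0.75)
assert abs(L + 2.90) < 0.02 and abs(U - 0.39) < 0.02
```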


Page 38: Bayesian Quantile Mixture Regression

[Figure omitted: three panels of density curves over ε, with legends γ = 0, −0.04, 4, 8 (p0 = 0.05); γ = 0, −0.8, −0.4, 0.6 (p0 = 0.5); and γ = 0, −1.5, −0.5, 0.2 (p0 = 0.75).]

Figure: Quantile-level-fixed GAL(µ = 0, σ = 1, γ) densities, with p0 = 0.05 (γ ∈ (−0.07, 15.90)), p0 = 0.5 (γ ∈ (−1.09, 1.09)), and p0 = 0.75 (γ ∈ (−2.90, 0.39)).

12 / 28

Page 39: Bayesian Quantile Mixture Regression

Quantile regression with regularization

• The extension of the AL to the GAL distribution, viewed from the check loss function perspective.

• Marginalizing over the zi yields π(β, γ, σ, s1, ..., sn | data), and hence the posterior full conditional for β:

  π(β | γ, σ, s1, ..., sn, data) ∝ π(β) exp{ −(1/σ) Σ_{i=1}^{n} ρp(yi − xi'β − σH(γ)si) }

• H(γ) = γ g(γ) / {g(γ) − |p0 − I(γ < 0)|}.

• p = I(γ < 0) + {[p0 − I(γ < 0)] / g(γ)}.

• p0 is the probability associated with the quantile modeled through xi'β.

• For γ = 0 (AL errors), this reduces to the check loss function with p = p0.

13 / 28
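A direct transcription of H(γ) and of the check loss, assuming the standard definition ρp(u) = u{p − I(u < 0)}, which the slide uses without restating:

```python
import numpy as np
from scipy.stats import norm

def g(gamma):
    # g(gamma) = 2 * Phi(-|gamma|) * exp(gamma^2 / 2), in log space for stability
    return np.exp(np.log(2.0) + norm.logcdf(-abs(gamma)) + gamma**2 / 2.0)

def H(gamma, p0):
    """H(gamma) = gamma * g(gamma) / {g(gamma) - |p0 - I(gamma < 0)|}."""
    ind = 1.0 if gamma < 0 else 0.0
    return gamma * g(gamma) / (g(gamma) - abs(p0 - ind))

def check_loss(u, p):
    """Standard check loss rho_p(u) = u * (p - I(u < 0)) (assumed definition)."""
    u = np.asarray(u, dtype=float)
    return u * (p - (u < 0))

# gamma = 0 recovers plain check-loss quantile regression: the adjustment vanishes.
assert np.isclose(H(0.0, 0.25), 0.0)
assert np.isclose(check_loss(-2.0, 0.5), 1.0) and np.isclose(check_loss(3.0, 0.25), 0.75)
```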


Page 42: Bayesian Quantile Mixture Regression

Quantile regression with regularization

• Adjusted loss function: Σ_{i=1}^{n} ρp(yi − xi'β − σH(γ)si).

• The positive-valued latent variables si can be viewed as response-specific weights, adjusted by the real-valued coefficient H(γ), which is fully specified through the shape parameter γ.

• The real-valued, response-specific terms σH(γ)si reflect, in the estimation of β, the effect of outlying observations relative to the AL distribution.

• Different versions of regularized quantile regression arise under different priors for β, working with AL errors (Li et al., 2010).

• Lasso-regularized quantile regression: hierarchical Laplace prior, π(β | σ, λ) = ∏_j 0.5 λ σ⁻¹ exp(−λ σ⁻¹ |βj|).

• A broader framework for exploring regularization: adjust the loss function (through the response distribution) in addition to the penalty term (through the prior for the regression coefficients).

14 / 28
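To make the lasso connection concrete: up to an additive constant, the negative log of the Laplace prior above is the ℓ1 penalty (λ/σ) Σ|βj|. A small sketch (the parameter values are illustrative, not taken from the talk):

```python
import numpy as np

def laplace_log_prior(beta, sigma, lam):
    """log pi(beta | sigma, lambda) for the hierarchical Laplace (lasso) prior:
    a product of Laplace densities 0.5 * (lam/sigma) * exp(-(lam/sigma)|beta_j|)."""
    beta = np.asarray(beta, dtype=float)
    return np.sum(np.log(0.5 * lam / sigma) - (lam / sigma) * np.abs(beta))

beta = np.array([3.0, -1.5, 0.0, 2.0])   # illustrative coefficient vector
sigma, lam = 1.0, 0.8                     # illustrative scale and penalty
lp = laplace_log_prior(beta, sigma, lam)

# -log prior = lasso penalty (lam/sigma) * ||beta||_1, plus a beta-free constant.
penalty = (lam / sigma) * np.abs(beta).sum()
const = beta.size * np.log(0.5 * lam / sigma)
assert np.isclose(lp, const - penalty)
```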


Page 45: Bayesian Quantile Mixture Regression

GAL mixture model (fixed pk)

• Use GAL densities for the mixture components:

  f(y | x) = Σ_{k=1}^{K} ωk fpk(y | µpk + x′β, σ, γpk)

• Specify the values 0 < p1 < ... < pK < 1.

• Conditional truncated normal priors for µp1 < ... < µpK.

• Rescaled Beta priors for the γpk (default: uniform).

• Hierarchical Laplace prior for β: π(β | σ, λ) = ∏_{j=1}^{d} (λ/(2σ)) exp(−(λ/σ) |βj|).

• Mixture weights defined through increments of a c.d.f. G (on (0, pK)):

  ω1 = G(p1), ωk = G(pk) − G(pk−1), k = 2, ..., K,

  where G is assigned a Dirichlet process prior.

• MCMC: one set of latent variables for the mixture, two sets for the GAL densities; M-H steps for the γpk, Gibbs sampling for all other parameters.

15 / 28
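One way to simulate the weight construction: for G ~ DP(α, G0), the increments of G over a fixed partition are jointly Dirichlet distributed, with parameters α times the G0-masses of the intervals. The sketch below assumes a uniform base measure G0 on (0, pK) and a concentration α = 2; both are illustrative choices of mine, not specifications from the talk:

```python
import numpy as np

def dp_mixture_weights(p, alpha, rng):
    """One draw of omega_1 = G(p_1), omega_k = G(p_k) - G(p_{k-1}), using the
    Dirichlet distribution of DP increments over the partition
    {(0, p_1], (p_1, p_2], ..., (p_{K-1}, p_K]} of (0, p_K)."""
    p = np.asarray(p, dtype=float)
    edges = np.concatenate(([0.0], p))
    base_mass = np.diff(edges) / p[-1]   # G0 = Uniform(0, p_K) interval probabilities
    return rng.dirichlet(alpha * base_mass)

rng = np.random.default_rng(0)
p = np.arange(1, 10) / 10.0              # p_k = k/10, k = 1, ..., 9
w = dp_mixture_weights(p, alpha=2.0, rng=rng)
assert w.shape == (9,) and np.isclose(w.sum(), 1.0) and np.all(w >= 0)
```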


Page 52: Bayesian Quantile Mixture Regression

Synthetic data examples

• Different data sets simulated from yi = xi'β + εi, where:

  • β = (3, 1.5, 0, 0, 2, 0, 0, 0)′;

  • the xi are generated independently from an N8(0, Σ) distribution, with (i, j)-th covariance element 0.5^|i−j|, for 1 ≤ i, j ≤ 8.

• Different scenarios for the error distribution:

  • a mixture of three AL components (to highlight the benefits of the GAL mixture kernel), with n = 600;

  • normal and skew-normal distributions, with n = 500 in each case.

16 / 28
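The covariate and response generation can be sketched as follows for the normal-errors scenario; the unit error scale is an assumption, since the slide does not state the error variance:

```python
import numpy as np

rng = np.random.default_rng(42)

beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
idx = np.arange(8)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])   # (i, j)-th element 0.5^{|i-j|}

n = 500                                              # normal-errors scenario
X = rng.multivariate_normal(np.zeros(8), Sigma, size=n)
eps = rng.standard_normal(n)                         # N(0, 1) errors; scale assumed
y = X @ beta + eps

assert Sigma.shape == (8, 8) and np.isclose(Sigma[0, 7], 0.5**7)
assert X.shape == (500, 8) and y.shape == (500,)
```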


Page 54: Bayesian Quantile Mixture Regression

[Figure omitted: two posterior predictive density panels, (a) GAL mixture and (b) AL mixture, each overlaying the population truth, the data truth, and the posterior mean.]

Figure: Synthetic data from a mixture of three AL components. Posterior mean and 95% interval estimates for the error density under the GAL and AL mixture models. In both cases, pk = k/10, for k = 1, ..., 9.

17 / 28

Page 55: Bayesian Quantile Mixture Regression

[Figure omitted: two posterior predictive density panels, (a) normal errors and (b) skew-normal errors, each overlaying the population truth, the data truth, and the posterior mean.]

Figure: Synthetic data from normal and skew-normal distributions. Posterior mean and 95% interval estimates for the error density under the GAL mixture model (with fixed pk = k/10, for k = 1, ..., 9).

18 / 28

Page 56: Bayesian Quantile Mixture Regression

[Figure omitted: panel (a) prior and posterior for the mixture weights G(k/K) − G((k − 1)/K); panel (b) weighted mixture component densities at p = 0.1, 0.2, ..., 0.9.]

Figure: Synthetic data from normal distribution. Prior and posterior for the mixture weights, and posterior mean and 95% interval estimates for the error density weighted components, under the GAL mixture model (with fixed pk = k/10, for k = 1, ..., 9).

19 / 28

Page 57: Bayesian Quantile Mixture Regression

[Figure omitted: panel (a) prior and posterior for the mixture weights G(k/K) − G((k − 1)/K); panel (b) weighted mixture component densities at p = 0.1, 0.2, ..., 0.9.]

Figure: Synthetic data from skew-normal distribution. Prior and posterior for the mixture weights, and posterior mean and 95% interval estimates for the error density weighted components, under the GAL mixture model (with fixed pk = k/10, for k = 1, ..., 9).

20 / 28

Page 58: Bayesian Quantile Mixture Regression

AL mixture model (random pk)

• Mixture with AL components:

  f(y | x) = Σ_{k=1}^{K} ωk f^AL_pk(y | µpk + x′β, σ)

• Random 0 < p1 < ... < pK < 1, generated from a Poisson process on (0, 1) conditioning on K (uniform, subject to the monotonicity restriction).

• Priors for the µp1 < ... < µpK, for β, and for the mixture weights similar to before.

• Illustration (and comparison with the GAL mixture) using synthetic data from a skew-normal distribution (n = 200).

21 / 28
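Conditional on K, the points of a homogeneous Poisson process on (0, 1) are distributed as K i.i.d. Uniform(0, 1) draws; sorting them gives the ordered quantile levels, so this prior can be sampled directly:

```python
import numpy as np

def random_quantile_levels(K, rng):
    """Draw 0 < p_1 < ... < p_K < 1: conditional on K points, a homogeneous
    Poisson process on (0, 1) reduces to K i.i.d. uniforms, ordered."""
    return np.sort(rng.uniform(0.0, 1.0, size=K))

rng = np.random.default_rng(1)
p = random_quantile_levels(5, rng)
assert p.shape == (5,) and np.all(np.diff(p) > 0) and 0.0 < p[0] and p[-1] < 1.0
```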


Page 62: Bayesian Quantile Mixture Regression

[Figure omitted: two posterior predictive density panels, (a) AL mixture with random pk and (b) GAL mixture with fixed pk, each overlaying the population truth, the data truth, and the posterior mean.]

Figure: Synthetic data from skew-normal distribution. Posterior mean and 95% interval estimates for the error density under the AL mixture with K = 5 and random pk, and the GAL mixture with fixed pk = {0.1, 0.25, 0.5, 0.75, 0.9}.

22 / 28

Page 63: Bayesian Quantile Mixture Regression

[Figure omitted: posterior predictive error density, plus five weighted-component panels with point estimates p1 = 0.24, w1 = 0.23; p2 = 0.35, w2 = 0.27; p3 = 0.45, w3 = 0.24; p4 = 0.56, w4 = 0.17; p5 = 0.71, w5 = 0.09.]

Figure: Synthetic data from skew-normal distribution. Posterior mean and 95% interval estimates for the error density (top left panel) and for the error density weighted components, under the AL mixture model with K = 5 and random pk.

23 / 28

Page 64: Bayesian Quantile Mixture Regression

Boston housing data example

• Realty price data from the Boston area, with n = 506 observations:

  • response: log-transformed median value of owner-occupied housing, in USD 1000;

  • 15 predictors, including: per capita crime rate (CRIM), nitric oxides concentration (parts per 10 million) per town (NOX), average number of rooms per dwelling (RM), index of accessibility to radial highways per town (RAD), full-value property-tax rate per USD 10,000 per town (TAX), transformed African American population proportion (B), and percentage of lower-status population (LSTAT).

• Similar inferences under the random-pk AL mixture and fixed-pk GAL mixture models (both with K = 9).

• Based on posterior predictive criteria (LPML), both models outperform single-quantile regression models (with GAL errors) at essentially any fixed quantile level.

24 / 28
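LPML is the sum of log conditional predictive ordinates (CPO). A common Monte Carlo estimator, assumed here since the talk does not detail its computation, takes CPO_i as the harmonic mean of the likelihood f(yi | θ^(m)) over posterior draws, evaluated in log space for stability:

```python
import numpy as np

def lpml(loglik):
    """LPML = sum_i log CPO_i, from an (M draws x n observations) matrix of
    pointwise log-likelihoods; CPO_i is the harmonic-mean estimator
    [ (1/M) sum_m 1/f(y_i | theta_m) ]^{-1}, computed via log-sum-exp."""
    loglik = np.asarray(loglik, dtype=float)
    M = loglik.shape[0]
    neg = -loglik                          # work with -log f(y_i | theta_m)
    mx = neg.max(axis=0)                   # shift for numerical stability
    log_cpo = np.log(M) - (mx + np.log(np.exp(neg - mx).sum(axis=0)))
    return log_cpo.sum()

# Degenerate check: identical draws give CPO_i = f(y_i | theta),
# so LPML is just the sum of the pointwise log-likelihoods.
assert np.isclose(lpml(np.full((10, 4), -1.3)), -5.2)
```

Larger (less negative) LPML indicates better out-of-sample predictive performance, which is the sense in which the mixture models outperform the single-quantile fits.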


Page 67: Bayesian Quantile Mixture Regression

[Figure omitted: panel (a) posterior predictive error density; panel (b) weighted mixture component densities at p = 0.12, 0.22, 0.31, 0.38, 0.46, 0.54, 0.65, 0.76, 0.87.]

Figure: Boston housing data. Under the random-pk AL mixture model, posterior mean and 95% interval estimates for: (a) the error density; (b) the error density weighted components.

25 / 28

Page 68:

Figure: Boston housing data. Posterior mean and 95% interval estimates for βj, j = 1, . . . , 15, covering the predictors LON, LAT, CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, and LSTAT, under the fixed-pk GAL mixture model and the random-pk AL mixture model.

26 / 28

Page 69:

Summary

• A mixture model to:

• integrate information from multiple parts of the response distribution to inform estimation of covariate effects;

• identify the most relevant parts of the response distribution through mixture weights associated with different quantiles.

• Two modeling scenarios: mixtures of GAL/AL distributions with fixed/random quantile levels.

• Applications to ROC estimation with covariates, extending the mixture model to incorporate stochastic ordering for the response distributions associated with the infected and non-infected groups.
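The fixed-pk AL scenario can be sketched directly, since the asymmetric Laplace density with its p-th quantile anchored at zero has the closed form f_p(ε) = p(1−p)/σ · exp{−ρ_p(ε/σ)}, with check loss ρ_p(u) = u(p − 1[u < 0]). A minimal sketch follows; the grid of quantile levels, the equal weights, and the common scale σ are illustrative choices, not the talk's fitted values, and the GAL kernel (which adds a shape parameter) is omitted:

```python
import numpy as np

def al_density(eps, p, sigma=1.0):
    """Asymmetric Laplace density whose p-th quantile is fixed at zero:
    f_p(eps) = p(1-p)/sigma * exp(-rho_p(eps/sigma)),
    with check loss rho_p(u) = u * (p - 1[u < 0])."""
    u = np.asarray(eps) / sigma
    rho = u * (p - (u < 0))
    return p * (1.0 - p) / sigma * np.exp(-rho)

def al_mixture_density(eps, p_levels, weights, sigma=1.0):
    """Error density as a weighted mixture of AL kernels, one per fixed
    quantile level p_k (the fixed-p_k modeling scenario)."""
    return sum(w * al_density(eps, p, sigma)
               for w, p in zip(weights, p_levels))

# illustrative settings: K = 9 equally spaced levels, equal weights
p_levels = np.linspace(0.1, 0.9, 9)
weights = np.full(9, 1.0 / 9)

# each AL kernel puts probability p_k below zero, so the mixture puts
# sum_k w_k * p_k below zero (0.5 for this symmetric illustration)
grid = np.linspace(-200.0, 0.0, 40001)
mass_below_zero = np.sum(
    al_mixture_density(grid, p_levels, weights)) * (grid[1] - grid[0])
```

The random-pk scenario instead treats the p_k as unknowns with a prior, letting the data determine which quantile levels carry weight.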

27 / 28


Page 72:

• Acknowledgment: funding from NSF under award SES 1631963.

MANY THANKS!

28 / 28