Bayesian shrinkage approach in variable selection for...

29
Bayesian shrinkage approach in variable selection for mixed effects models Mingan Yang GGI Statistics Conference, Florence, 2015 Bayesian Variable Selection June 22-26, 2015 Mingan Yang Bayesian shrinkage approach in variable selection for mixed

Transcript of Bayesian shrinkage approach in variable selection for...

Page 1: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Bayesian shrinkage approach in variableselection for mixed effects models

Mingan Yang

GGI Statistics Conference, Florence, 2015Bayesian Variable Selection

June 22-26, 2015

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 2: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Outline

1 Introduction

2 model

3 Simulation

4 Application

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 3: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Outline

1 Introduction

2 model

3 Simulation

4 Application

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 4: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Variable selection problem draws considerable attention.Penalized regression with shrinkage estimation andsimultaneous variable selection seems promisingFrequentist approaches: Tibshirani (1996), Fan and Li(2001), Zou (2006), Zou and Li (2008) etc.Bayesian approaches: Park and Casella (2008), Hans(2009), Carvalho et. al. (2009, 2010), Griffin and Brown(2007), and Armagan et al. (2013)None proposed for mixed effects models (LME). Evenmore challenging for LME coupled with random effects.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 5: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Linear mixed effects (LME) model:

yi = Xiβ + Ziζi + εi

ζi ∼ N(0,Ψ)

εi ∼ N(0, σ2Λ)

Challenge: standard methods such as AIC, BIC, Bayesfactors etc. do not work well different number of randomeffects. (4p models for p × 1 fixed and random vector)Very few approaches proposed so far: Chen and Dunson(2003), Cai and Dunson (2006), Kinney and Dunson(2007), Bondell et al. (2010), Ibrahim et al. (2010), Yang(2012, 2013).

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 6: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Some current methods for random effects

Chen and Dunson (2003): Stochastic search variableselection (SSVS) approach to the LMM.Kinney and Dunson (2007):Application of dataaugmentation algorithm, parameter expansion (Gelman,2004, 2005) and model approximation for the GLMM.Ibrahim et. al. (2010):Maximum penalized likelihood (MPL)and smoothly clipped absolute deviation (SCAD) andALASSOBondell et. al. (2010): adaptive LASSO and EMCai and Dunson (2006):Taylor series expansion for GLMM.All these models are parametric ones.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 7: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

We use Bayesian shrinkage priors for joint selection ofboth fixed and random effectsDesirable properties of such priors: spikes at zero, studentt-like tails, simple characterization as a scale mixture ofnormals to facilitate computatonSuch shrinkage priors can shrink small coefficients to zerowithout over-shrink large coefficients.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 8: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Outline

1 Introduction

2 model

3 Simulation

4 Application

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 9: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Shrinkage priors for fixed effects

Stochastic search variable selection (SSVS, George andMcCulloch 1993): Introduce latent variable J, βJ

k = (βk , jk = 1)′

We use generalized double Pareto, horse shoe prior, andnormal-exponential-gamma priors for the fixed effects.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 10: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Stochastic search variable selection:SSVS

Proposed by George and McCulloch (1993), originally forsubset selection in linear regression, searching for modelswith highest posterior probabilitiesstarting with full models containing all candidate predictorschoosing mixture priors that allow predictors to drop byzeroing the coefficientsrunning Gibbs sampler with conditional conjugacy from theposterior distribution.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 11: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Mixed effects

β = (β1, · · · , βp)′, J = (j1, · · · , jp)′

jk = 0 indicates βk = 0jk 6= 0 for βk 6= 0

We take the Zellner g-prior (Zellner and Siow, 1980)βJ ∼ N(0, σ2(X J′

X J)−1/g), g ∼ Gamma(.) and Bernoullifor jkGenerally ζ ∼ N(0,Ω). Application of Choleskydecomposition Ω = ΛΓΓ′Λ where Λ is a positive diagonalmatrix with diagonal elements λl , l = 1, · · · ,q proportionalto the random effects standard deviations, and Γ is a lowertriangular matrix with diagonal elements equal to 1.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 12: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

The generalized double Pareto (GDP) density is defined asf (x |a,b) = 1/(2a)(1 + |x |/(ab))−b+1, x ∼ GDP(a,b).The GDP resembles the Laplace density with similarsparse shrinkage properties, also has Cauchy-like tails,which avoids over-shrinkageHowever the posterior form is very complicated and theparameters are not in closed forms. Instead, it can beexpressed as a scale mixture of normal distribution.x ∼ N(0, τ), τ ∼ Exp(λ2/2), and λ ∼ Ga(α, η), thenx ∼ GDP(a = η/b,b).

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 13: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

β jk ∼ N(0, σ2/gκk ), κk ∼ exp(ϕ2

k/2), ϕk ∼ Gamma(ν1, ϑ1),

β jk ∼ N(0, σ2/gκ2

k ), κk ∼ Ca+(0, ν2), ν2 ∼ Ca+(0, ϑ2),

β jk ∼ N(0, σ2/gκk ), κk ∼ exp(ς2/2), ς2 ∼ Gamma(ν3, ϑ3),

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 14: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Model specification

yij = X ′ijβ + Z ′ijζi + εij , V (ζi) = Ω

i = 1, · · · ,n, j = 1, · · · ,ni

Xij : q × 1

Cholesky decomposition

yij = X ′ijβ + Z ′ijΛΓbi + εij , ΛΓV (bi)Γ′Λ′ = Ω

Λ : Diag(λ1, · · · , λq)

Γ: lower triangular matrix q × q with diagonal element 1.Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 15: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Prior for random effects

The random effect with the generalized shrinkage prior isspecified as follows:

bik ∼ Gk

Gk ∼ DP(αGk0)

Gk0 ∼ GDP(1.0,1.0)

The horse shoe prior and normal-exponential-gamma priorcan be specified similarly.Λ : Diag(λ1, · · · , λq) Γ: lower triangular matrix q × q withdiagonal element 1.λk ∼ pk0δ0 + (1− pk0)N+(0, φ2

k ), φ2k ∼ IG(1/2,1/2)

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 16: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Parameter expansion (PX)

Liu and Wu (1999), Gelman (2004, 2006)y |θ, z ∼ N(θ + z,1), z|θ ∼ N(0,D)θ unknown parameter, z missingy |θ, α,w ∼ N(θ − α + w ,1), w |θ, α ∼ N(α,D)z −→ α,w

PX breaks association, speeds up convergence.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 17: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Outline

1 Introduction

2 model

3 Simulation

4 Application

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 18: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

yij = Xijβ + Zijζi + εij , εij ∼ N(0, σ2)

Xij = (1, xij1, · · · , xij15)′, xijk ∼ Unif (−2,2)

zijk = xijk , k = 1, · · · ,4

Case 1:%i ∼ N(0,Ω),Ω =

0.00 0.00 0.00 0.000.00 0.9 0.48 0.060.00 0.48 0.40 0.100.00 0.06 0.10 0.10

.

β = (β1, · · · , β16)′, β1 = β2 = β3 = 1.5, βk = 0, k = 4, ...,16Case 2:q = 8, (%i1, · · · %i4)′ ∼ N(0,Ω1),Ω = Ω1 same as Case 1

Case 3:β4 = · · · = β9 = 0.01,everything else same as Case 2.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 19: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Case Method PF PR PM MSE S.E.1 GDPP 1.00 0.63 0.63 0.12 0.18

HSPP 1.00 0.64 0.63 0.11 0.19NEGP 0.98 0.59 0.58 0.13 0.19KDNP 0.86 0.57 0.46 0.18 0.25

2 GDPP 1.00 0.57 0.55 0.11 0.011HSPP 0.97 0.56 0.54 0.12 0.013NEGP 0.96 0.54 0.53 0.13 0.014KDNP 0.82 0.53 0.44 0.16 0.39

3 GDPP 1.00 0.60 0.60 0.18 0.10HSPP 0.99 0.61 0.59 0.20 0.11NEGP 0.93 0.58 0.56 0.26 0.15KDNP 0.40 0.55 0.23 0.35 0.24

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 20: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

captionComparisons of parameters estimates in the simulationstudy. S.E. is the standard error of the MSE.PM is thepercentage of times the true model was selected, PF and PRare the percentage of times the correct fixed and randomeffects were selected respectively.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 21: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Table: Models with highest posterior probability in the simulationstudy for Case 3: ∗ is the true model

Model GDPP HSPP NEGP KDNPxij1, xij2, xij3, zij2, zij3, z∗

ij4 0.62 0.61 0.59 0.23xij1, xij2, xij3, zij1, zij2, zij3, zij4 0.37 0.38 0.39 0.15xij1, xij2, xij3, xij6, xij9, zij2, zij3, zij4 0.082xij1, xij2, xij3, xij5, xij7, xij9, zij2, zij3, zij4 0.060xij1, xij3, xij5, zij2, zij3, zij4 0.051xij1, xij2, xij3, xij9, zij2, zij3, zij4 0.031xij1, xij2, xij3, xij6, zij1, zij2, zij3, zij4 0.027xij1, xij2, xij3, xij8, xij9, zij2, zij3, zij4 0.025xij1, xij2, xij3, xij8, zij1, zij2, zij3, zij4 0.023xij1, xij2, xij3, xij8, zij2, zij3, zij4 0.022

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 22: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

The shrinkage approaches provide better results than theKDNP.Based on extensive simulations, we find that the KDNP isnot stable and suffers from numerical issues and failseasily.Of all the approaches, GDPP and HSPP perform betterthan the others in variable selection and parameterestimation. GDPP can be implemented in a more efficientGibbs sampling algorithm than HSPP, which usesMetropolis-Hasting algorithm.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 23: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Outline

1 Introduction

2 model

3 Simulation

4 Application

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 24: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

A bioassay study conducted to identify the in vivoestrogenic responses to chemicals.Data consisted of 2681 female rats, experimented by 19participating labs in 8 countries from OECD.2 protocols on immature female rats with administration ofdoses by oral gavage (protocol A) and subcutaneousinjection (protocol B).2 other protocols on adult female rats with oral gavage(protocol C) and subcutaneous injection (protocol D).Each protocol was comprised of 11 groups, each with 6rats. 1 untreated group, 1 control group, 7 groupsadministered with varied doses of ethinyl estradiol (EE)and ZM.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 25: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Goal: how the uterus weights interacts with the in vivoactivity of chemical compounds under differentexperimental conditions.There is concerns about heterogeneity across differentlabs.Kannol et. al (2001) analyzed the data with simpleparametric methods.The number of possible competing models is over 4,000.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 26: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Table: Top 10 models with highest posterior probabilities by GDPP,HSPP and NEGP for real data

Model GDPP HSPP NEGPxij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4 0.098 0.093 0.072xij1, xij3, xij5, xij6, zij1, zij2 0.063 0.049 0.097xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij4 0.047 0.039 0.038xij1, xij3, xij4, xij5, xij6, zij1, zij4 0.038 0.023 0.033xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij6 0.026xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij6 0.023xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij5 0.020 0.018 0.034xij1, xij3, xij4, xij5, xij6, zij1 0.020 0.013 0.046xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij3, zij4 0.020 0.025xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij5 0.019xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij6 0.015xij1, xij3, xij5, xij6, zij1, zij3, zij4, zij5 0.032xij1, xij3, xij4, xij5, xij6, zij1, zij2 0.028xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij4, zij5 0.015xij1, xij3, xij5, xij6, zij1, zij3, zij4 0.078xij1, xij3, xij5, xij6, zij1, zij4 0.055xij1, xij3, xij5, xij6, zij1, zij2, zij4 0.046

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 27: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

Table: Comparisons of parameter estimates in the real data study

Parameter GDPP HSPP NEGP LMERβ1 3.61a(3.31, 3.76)b 3.60a(3.35, 3.87)b 3.60a(3.28, 3.96)b 3.61a(3.16, 4.10)c

β2 NAd NAd NAd 0.06 (0.02, 0.12)β3 1.21 (1.04, 1.47) 1.18 (1.06, 1.53) 1.22 (0.97, 1.49) 1.20 (1.03, 1.69)β4 1.20 (0.89, 1.64) 1.21 (0.87, 1.70) 1.19 (0.79, 1.68) 1.27 (0.85, 1.64)β5 0.14 (0.13, 0.16) 0.14 (0.13, 0.16) 0.14 (0.13, 0.16) 0.14 (0.13, 0.15)β6 -0.48 (-0.57, -0.35) -0.47 (-0.56, -0.39) -0.47(-0.59, -0.24) -0.49 (-0.57, -0.37)a Meanb 95% credible intervalc 95% confidence intervald Not selected in the model

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 28: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

We try to fit the model with KDNP but it fails with too manynumerical issues.Fixed effect of protocol B is not selected for any model, butthe corresponding random effects is, which indicatesheterogeneity.Both fixed and random effects of protocol C and D areselected in the leading models.The fixed effects of EE dose and ZM dose are selected inthe leading models, but seldom for the correspondingrandom effects.

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models

Page 29: Bayesian shrinkage approach in variable selection for ...theory.fi.infn.it/SMIC/confRM/Talks/poster_MinganYang.pdf · Introduction model Simulation Application Some current methods

Introductionmodel

SimulationApplication

References

Chen and Dunson Biometrics, 2003Cai and Dunson Biometrics, 2006Kinney and Dunson Biometrics, 2007Bondell et al. Biometrics, 2010Ibrahim et al. Biometrics, 2010Yang, M, Computational Statistics and Data Analysis, 2012Yang, M, Biometrical Journal, 2013Yang, M, Under review, 2015

Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models