Bayesian shrinkage approach in variable selection for mixed effects models
Mingan Yang
GGI Statistics Conference, Florence, 2015
Bayesian Variable Selection
June 22-26, 2015
Mingan Yang Bayesian shrinkage approach in variable selection for mixed effects models
Outline
1 Introduction
2 Model
3 Simulation
4 Application
1 Introduction
The variable selection problem has drawn considerable attention.
Penalized regression with shrinkage estimation and simultaneous variable selection seems promising.
  Frequentist approaches: Tibshirani (1996), Fan and Li (2001), Zou (2006), Zou and Li (2008), etc.
  Bayesian approaches: Park and Casella (2008), Hans (2009), Carvalho et al. (2009, 2010), Griffin and Brown (2007), and Armagan et al. (2013).
None has been proposed for linear mixed effects (LME) models, where selection is even more challenging because of the random effects.
Linear mixed effects (LME) model:
$y_i = X_i\beta + Z_i\zeta_i + \varepsilon_i$
$\zeta_i \sim N(0, \Psi)$
$\varepsilon_i \sim N(0, \sigma^2\Lambda)$
Challenge: standard methods such as AIC, BIC, and Bayes factors do not work well when candidate models differ in the number of random effects ($4^p$ candidate models for $p \times 1$ fixed and random effects vectors).
Very few approaches have been proposed so far: Chen and Dunson (2003), Cai and Dunson (2006), Kinney and Dunson (2007), Bondell et al. (2010), Ibrahim et al. (2010), Yang (2012, 2013).
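The $4^p$ count can be checked by brute force. The sketch below assumes, as an illustration only, that each of the $p$ predictors is independently in or out of the fixed part and in or out of the random part; the function name `count_models` is hypothetical.

```python
from itertools import product

def count_models(p):
    """Count candidate models when each of p predictors may enter
    the fixed part and, separately, the random part."""
    fixed_subsets = list(product([0, 1], repeat=p))    # 2^p inclusion patterns
    random_subsets = list(product([0, 1], repeat=p))   # 2^p inclusion patterns
    return len(fixed_subsets) * len(random_subsets)    # 2^p * 2^p = 4^p

for p in (1, 2, 6):
    print(p, count_models(p), 4 ** p)
```

Already at p = 6 the count is 4096, which is why exhaustive comparison by information criteria becomes impractical.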
Some current methods for random effects
Chen and Dunson (2003): stochastic search variable selection (SSVS) approach to the LMM.
Kinney and Dunson (2007): application of a data augmentation algorithm, parameter expansion (Gelman, 2004, 2005), and model approximation for the GLMM.
Ibrahim et al. (2010): maximum penalized likelihood (MPL) with smoothly clipped absolute deviation (SCAD) and ALASSO.
Bondell et al. (2010): adaptive LASSO and EM.
Cai and Dunson (2006): Taylor series expansion for the GLMM.
All these models are parametric.
We use Bayesian shrinkage priors for joint selection of both fixed and random effects.
Desirable properties of such priors: spikes at zero, Student t-like tails, and a simple characterization as a scale mixture of normals to facilitate computation.
Such shrinkage priors can shrink small coefficients to zero without over-shrinking large coefficients.
2 Model
Shrinkage priors for fixed effects
Stochastic search variable selection (SSVS, George and McCulloch 1993): introduce a latent variable $J$, with $\beta^J_k = (\beta_k,\ j_k = 1)'$, the coefficients whose indicator $j_k$ equals 1.
We use generalized double Pareto, horseshoe, and normal-exponential-gamma priors for the fixed effects.
Stochastic search variable selection (SSVS)
Proposed by George and McCulloch (1993), originally for subset selection in linear regression, searching for models with the highest posterior probabilities by
  starting with full models containing all candidate predictors,
  choosing mixture priors that allow predictors to drop out by zeroing their coefficients,
  running a Gibbs sampler with conditional conjugacy from the posterior distribution.
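The three steps above can be sketched for ordinary linear regression. This is a minimal illustration, not the sampler used in the talk: it assumes a known noise standard deviation `sigma`, a two-component normal mixture prior (a narrow "spike" with s.d. `tau` and a wide "slab" with s.d. `c * tau`), and coordinate-wise Gibbs updates. All names and tuning values are hypothetical.

```python
import math
import random

def normal_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def ssvs_gibbs(X, y, sigma=1.0, tau=0.1, c=10.0, prior_incl=0.5,
               n_iter=600, burn_in=100, seed=0):
    """Return posterior inclusion frequencies for each column of X."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    incl = [1] * p                       # start from the full model
    resid = list(y)                      # y - X beta, with beta = 0 initially
    xtx = [sum(X[i][k] ** 2 for i in range(n)) for k in range(p)]
    counts = [0] * p
    for it in range(n_iter):
        for k in range(p):
            # Add back the current contribution of beta_k to the residual.
            for i in range(n):
                resid[i] += X[i][k] * beta[k]
            # Conjugate normal update for beta_k given its indicator.
            prior_sd = c * tau if incl[k] else tau
            prec = xtx[k] / sigma ** 2 + 1.0 / prior_sd ** 2
            mean = sum(X[i][k] * resid[i] for i in range(n)) / sigma ** 2 / prec
            beta[k] = rng.gauss(mean, 1.0 / math.sqrt(prec))
            for i in range(n):
                resid[i] -= X[i][k] * beta[k]
            # Bernoulli update for the indicator given beta_k.
            slab = prior_incl * normal_pdf(beta[k], c * tau)
            spike = (1.0 - prior_incl) * normal_pdf(beta[k], tau)
            incl[k] = 1 if rng.random() < slab / (slab + spike) else 0
        if it >= burn_in:
            for k in range(p):
                counts[k] += incl[k]
    return [cnt / (n_iter - burn_in) for cnt in counts]

# Toy data: only the first of three predictors has a nonzero coefficient.
data_rng = random.Random(1)
X = [[data_rng.uniform(-2, 2) for _ in range(3)] for _ in range(200)]
y = [1.5 * row[0] + data_rng.gauss(0, 1) for row in X]
probs = ssvs_gibbs(X, y)
print([round(pr, 2) for pr in probs])
```

The inclusion frequency should be close to 1 for the first predictor and well below one half for the two noise predictors; the exact values depend on the seed and the assumed prior scales.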
Mixed effects
$\beta = (\beta_1, \dots, \beta_p)'$, $J = (j_1, \dots, j_p)'$
$j_k = 0$ indicates $\beta_k = 0$; $j_k \neq 0$ indicates $\beta_k \neq 0$.
We take the Zellner g-prior (Zellner and Siow, 1980): $\beta^J \sim N(0, \sigma^2 (X^{J\prime} X^J)^{-1}/g)$, with $g \sim \mathrm{Gamma}(\cdot)$ and a Bernoulli prior for each $j_k$.
Generally $\zeta \sim N(0, \Omega)$. We apply the Cholesky decomposition $\Omega = \Lambda\Gamma\Gamma'\Lambda$, where $\Lambda$ is a positive diagonal matrix with diagonal elements $\lambda_l$, $l = 1, \dots, q$, proportional to the random effects standard deviations, and $\Gamma$ is a lower triangular matrix with diagonal elements equal to 1.
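To make the decomposition $\Omega = \Lambda\Gamma\Gamma'\Lambda$ concrete, the sketch below computes $\Lambda$ and $\Gamma$ from an ordinary Cholesky factor $L$ (with $LL' = \Omega$) by setting $\lambda_l = L_{ll}$ and $\Gamma = \Lambda^{-1}L$. It assumes $\Omega$ is positive definite; the function name and the example matrix layout are illustrative choices.

```python
import math

def modified_cholesky(omega):
    """Return (lam, gamma) such that omega = diag(lam) gamma gamma' diag(lam),
    with gamma unit lower triangular. Requires omega positive definite."""
    q = len(omega)
    L = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(omega[i][i] - s)
            else:
                L[i][j] = (omega[i][j] - s) / L[j][j]
    lam = [L[i][i] for i in range(q)]
    gamma = [[L[i][j] / lam[i] for j in range(q)] for i in range(q)]
    return lam, gamma

def rebuild(lam, gamma):
    """Recompute omega from lam and gamma as a consistency check."""
    q = len(lam)
    return [[lam[i] * lam[j] * sum(gamma[i][k] * gamma[j][k] for k in range(q))
             for j in range(q)] for i in range(q)]

omega = [[0.90, 0.48, 0.06],
         [0.48, 0.40, 0.10],
         [0.06, 0.10, 0.10]]
lam, gamma = modified_cholesky(omega)
print([round(v, 4) for v in lam])
print(rebuild(lam, gamma))
```

Writing the covariance this way means that setting a single $\lambda_l$ to zero zeroes the whole $l$-th row and column of $\Omega$, which is what lets a prior on $\lambda_l$ switch a random effect on or off.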
The generalized double Pareto (GDP) density is defined as $f(x \mid a, b) = \frac{1}{2a}\left(1 + \frac{|x|}{ab}\right)^{-(b+1)}$, $x \sim GDP(a, b)$.
The GDP resembles the Laplace density, with similar sparse shrinkage properties, and also has Cauchy-like tails, which avoid over-shrinkage.
However, the posterior form is very complicated and the parameters are not available in closed form. Instead, the GDP can be expressed as a scale mixture of normal distributions:
if $x \sim N(0, \tau)$, $\tau \sim \mathrm{Exp}(\lambda^2/2)$, and $\lambda \sim \mathrm{Ga}(\alpha, \eta)$, then $x \sim GDP(a = \eta/\alpha,\ b = \alpha)$.
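The scale-mixture representation can be checked by simulation. The sketch below draws from the normal / exponential / gamma hierarchy and compares the empirical value of $P(|x| \le 1)$ with the value implied by a GDP density with exponent $-(b+1)$, namely $1 - (1 + 1/(ab))^{-b}$. It assumes the gamma and exponential distributions are parameterized by rate; the function names and the choice $\alpha = \eta = 3$ are illustrative.

```python
import math
import random

def sample_gdp_mixture(alpha, eta, rng):
    """One draw of x from x ~ N(0, tau), tau ~ Exp(rate lam^2/2),
    lam ~ Gamma(shape alpha, rate eta)."""
    lam = rng.gammavariate(alpha, 1.0 / eta)   # second argument is the scale
    tau = rng.expovariate(lam ** 2 / 2.0)      # argument is the rate
    return rng.gauss(0.0, math.sqrt(tau))

def gdp_prob_abs_below(x, a, b):
    """P(|X| <= x) for X ~ GDP(a, b), obtained by integrating the density."""
    return 1.0 - (1.0 + x / (a * b)) ** (-b)

rng = random.Random(42)
alpha, eta = 3.0, 3.0
draws = [sample_gdp_mixture(alpha, eta, rng) for _ in range(100_000)]
empirical = sum(abs(x) <= 1.0 for x in draws) / len(draws)
analytic = gdp_prob_abs_below(1.0, a=eta / alpha, b=alpha)
print(round(empirical, 3), round(analytic, 3))
```

Because every layer of the hierarchy is a standard distribution, each full conditional in a Gibbs sampler stays tractable, which is the computational point of the representation.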
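GDP: $\beta^J_k \sim N\!\left(0, \tfrac{\sigma^2}{g}\kappa_k\right)$, $\kappa_k \sim \mathrm{Exp}(\varphi_k^2/2)$, $\varphi_k \sim \mathrm{Gamma}(\nu_1, \vartheta_1)$
Horseshoe: $\beta^J_k \sim N\!\left(0, \tfrac{\sigma^2}{g}\kappa_k^2\right)$, $\kappa_k \sim \mathrm{Ca}^{+}(0, \nu_2)$, $\nu_2 \sim \mathrm{Ca}^{+}(0, \vartheta_2)$
NEG: $\beta^J_k \sim N\!\left(0, \tfrac{\sigma^2}{g}\kappa_k\right)$, $\kappa_k \sim \mathrm{Exp}(\varsigma^2/2)$, $\varsigma^2 \sim \mathrm{Gamma}(\nu_3, \vartheta_3)$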
Model specification
$y_{ij} = X_{ij}'\beta + Z_{ij}'\zeta_i + \varepsilon_{ij}$, $V(\zeta_i) = \Omega$
$i = 1, \dots, n$, $j = 1, \dots, n_i$
$X_{ij}$: $q \times 1$
Cholesky decomposition
$y_{ij} = X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i + \varepsilon_{ij}$, $\Lambda\Gamma V(b_i)\Gamma'\Lambda' = \Omega$
$\Lambda = \mathrm{Diag}(\lambda_1, \dots, \lambda_q)$
$\Gamma$: $q \times q$ lower triangular matrix with diagonal elements equal to 1.
Prior for random effects
The random effect with the generalized shrinkage prior is specified as follows:
$b_{ik} \sim G_k$
$G_k \sim DP(\alpha G_{k0})$
$G_{k0} \sim GDP(1.0, 1.0)$
The horseshoe prior and normal-exponential-gamma prior can be specified similarly.
$\Lambda = \mathrm{Diag}(\lambda_1, \dots, \lambda_q)$; $\Gamma$: $q \times q$ lower triangular matrix with diagonal elements equal to 1.
$\lambda_k \sim p_{k0}\,\delta_0 + (1 - p_{k0})\,N^{+}(0, \phi_k^2)$, $\phi_k^2 \sim IG(1/2, 1/2)$
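The zero-inflated prior on $\lambda_k$ can be simulated directly, which shows how it places positive probability on dropping a random effect. The sketch assumes $IG(1/2, 1/2)$ means shape 1/2 and scale 1/2, so that $\phi_k^2$ is the reciprocal of a gamma draw with shape 1/2 and rate 1/2; under that reading the nonzero part is marginally half-Cauchy, whose median is 1. The value $p_{k0} = 0.5$ and the function name are hypothetical.

```python
import math
import random
import statistics

def sample_lambda(p0, rng):
    """One draw from p0 * delta_0 + (1 - p0) * N+(0, phi2), phi2 ~ IG(1/2, 1/2)."""
    if rng.random() < p0:
        return 0.0                                  # point mass: effect excluded
    phi2 = 1.0 / rng.gammavariate(0.5, 2.0)         # gamma with shape 1/2, scale 2
    return abs(rng.gauss(0.0, math.sqrt(phi2)))     # half-normal given phi2

rng = random.Random(7)
draws = [sample_lambda(0.5, rng) for _ in range(100_000)]
zero_frac = sum(d == 0.0 for d in draws) / len(draws)
nonzero_median = statistics.median(d for d in draws if d > 0.0)
print(round(zero_frac, 3), round(nonzero_median, 3))
```

In a posterior sample, the fraction of draws with $\lambda_k = 0$ plays the role of the exclusion probability for the $k$-th random effect.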
Parameter expansion (PX)
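Liu and Wu (1999), Gelman (2004, 2006)
Original model: $y \mid \theta, z \sim N(\theta + z, 1)$, $z \mid \theta \sim N(0, D)$, with $\theta$ the unknown parameter and $z$ missing.
Expanded model: $y \mid \theta, \alpha, w \sim N(\theta - \alpha + w, 1)$, $w \mid \theta, \alpha \sim N(\alpha, D)$.
$z \longrightarrow (\alpha, w)$
PX breaks association and speeds up convergence.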
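A short check, not on the original slide, that the expanded model leaves the observed-data model unchanged: integrating out the missing data gives the same distribution for $y$ in both forms, so $\alpha$ acts only as a working parameter. The derivation assumes the unit-variance error $e$ is independent of $z$ and $w$, and uses the substitution $w = z + \alpha$.

```latex
\begin{align*}
\text{original:}\quad & y = \theta + z + e,\quad z \sim N(0, D),\quad e \sim N(0, 1)
  \;\Longrightarrow\; y \mid \theta \sim N(\theta,\ 1 + D),\\
\text{expanded:}\quad & y = \theta - \alpha + w + e,\quad w \sim N(\alpha, D)
  \;\Longrightarrow\; y \mid \theta, \alpha \sim N(\theta - \alpha + \alpha,\ 1 + D) = N(\theta,\ 1 + D).
\end{align*}
```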
3 Simulation
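$y_{ij} = X_{ij}'\beta + Z_{ij}'\zeta_i + \varepsilon_{ij}$, $\varepsilon_{ij} \sim N(0, \sigma^2)$
$X_{ij} = (1, x_{ij1}, \dots, x_{ij15})'$, $x_{ijk} \sim \mathrm{Unif}(-2, 2)$
$z_{ijk} = x_{ijk}$, $k = 1, \dots, 4$
Case 1: $\zeta_i \sim N(0, \Omega)$,
$$\Omega = \begin{pmatrix} 0.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.90 & 0.48 & 0.06 \\ 0.00 & 0.48 & 0.40 & 0.10 \\ 0.00 & 0.06 & 0.10 & 0.10 \end{pmatrix}$$
$\beta = (\beta_1, \dots, \beta_{16})'$, $\beta_1 = \beta_2 = \beta_3 = 1.5$, $\beta_k = 0$, $k = 4, \dots, 16$
Case 2: $q = 8$, $(\zeta_{i1}, \dots, \zeta_{i4})' \sim N(0, \Omega_1)$, with $\Omega_1$ the same as $\Omega$ in Case 1.
Case 3: $\beta_4 = \dots = \beta_9 = 0.01$; everything else is the same as in Case 2.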
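A data set for Case 1 could be generated roughly as follows. Several quantities are not stated on the slide and are assumed here for illustration only: the number of subjects (`n_subjects=50`), the number of observations per subject (`n_obs=5`), and the error standard deviation (`sigma=1.0`). The sketch also reads $z_{ijk} = x_{ijk}$ literally, taking $Z_{ij}$ to be the first four simulated covariates. Since the first row and column of $\Omega$ are zero, the first random effect is fixed at 0 and the rest are drawn from the remaining 3 x 3 block.

```python
import math
import random

# Lower-right 3x3 block of Omega from Case 1.
OMEGA_BLOCK = [[0.90, 0.48, 0.06],
               [0.48, 0.40, 0.10],
               [0.06, 0.10, 0.10]]

def cholesky(m):
    """Lower triangular L with L L' = m, for positive definite m."""
    q = len(m)
    L = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(m[i][i] - s) if i == j else (m[i][j] - s) / L[j][j]
    return L

def simulate_case1(n_subjects=50, n_obs=5, sigma=1.0, seed=0):
    """Return a list of (subject, x, z, zeta, y) tuples."""
    rng = random.Random(seed)
    beta = [1.5, 1.5, 1.5] + [0.0] * 13
    L = cholesky(OMEGA_BLOCK)
    rows = []
    for i in range(n_subjects):
        u = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # First component has variance 0, so it is exactly 0.
        zeta = [0.0] + [sum(L[r][c] * u[c] for c in range(3)) for r in range(3)]
        for _ in range(n_obs):
            covs = [rng.uniform(-2.0, 2.0) for _ in range(15)]
            x = [1.0] + covs                 # intercept plus 15 covariates
            z = covs[:4]                     # assumed: z_ijk = x_ijk, k = 1..4
            mean = (sum(b * v for b, v in zip(beta, x))
                    + sum(e * v for e, v in zip(zeta, z)))
            rows.append((i, x, z, zeta, mean + rng.gauss(0.0, sigma)))
    return rows

rows = simulate_case1()
print(len(rows), len(rows[0][1]), len(rows[0][2]))
```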
Case  Method  PF    PR    PM    MSE   S.E.
1     GDPP    1.00  0.63  0.63  0.12  0.18
      HSPP    1.00  0.64  0.63  0.11  0.19
      NEGP    0.98  0.59  0.58  0.13  0.19
      KDNP    0.86  0.57  0.46  0.18  0.25
2     GDPP    1.00  0.57  0.55  0.11  0.011
      HSPP    0.97  0.56  0.54  0.12  0.013
      NEGP    0.96  0.54  0.53  0.13  0.014
      KDNP    0.82  0.53  0.44  0.16  0.39
3     GDPP    1.00  0.60  0.60  0.18  0.10
      HSPP    0.99  0.61  0.59  0.20  0.11
      NEGP    0.93  0.58  0.56  0.26  0.15
      KDNP    0.40  0.55  0.23  0.35  0.24
Table: Comparisons of parameter estimates in the simulation study. S.E. is the standard error of the MSE. PM is the percentage of times the true model was selected; PF and PR are the percentages of times the correct fixed and random effects were selected, respectively.
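The three selection rates can be computed from per-replicate selections as sketched below. The replicate data here are made up purely to illustrate the definitions and are not results from the talk; the function name is hypothetical.

```python
def selection_rates(selected, true_fixed, true_random):
    """Return (PF, PR, PM) over a list of (fixed_set, random_set) selections."""
    n = len(selected)
    tf, tr = frozenset(true_fixed), frozenset(true_random)
    fixed_ok = [frozenset(f) == tf for f, _ in selected]
    random_ok = [frozenset(r) == tr for _, r in selected]
    pf = sum(fixed_ok) / n
    pr = sum(random_ok) / n
    pm = sum(f and r for f, r in zip(fixed_ok, random_ok)) / n
    return pf, pr, pm

# Hypothetical selections from four simulation replicates.
replicates = [
    ({1, 2, 3}, {2, 3, 4}),       # both parts correct
    ({1, 2, 3}, {1, 2, 3, 4}),    # one extra random effect
    ({1, 2, 3, 9}, {2, 3, 4}),    # one extra fixed effect
    ({1, 2, 3}, {2, 3, 4}),       # both parts correct
]
pf, pr, pm = selection_rates(replicates, {1, 2, 3}, {2, 3, 4})
print(pf, pr, pm)  # 0.75 0.75 0.5
```

Since the true model requires both parts to be right, PM can never exceed the smaller of PF and PR, which matches the pattern in the table.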
Table: Models with highest posterior probability in the simulation study for Case 3: * is the true model

Model                                                   GDPP   HSPP   NEGP   KDNP
xij1, xij2, xij3, zij2, zij3, zij4 *                    0.62   0.61   0.59   0.23
xij1, xij2, xij3, zij1, zij2, zij3, zij4                0.37   0.38   0.39   0.15
xij1, xij2, xij3, xij6, xij9, zij2, zij3, zij4                               0.082
xij1, xij2, xij3, xij5, xij7, xij9, zij2, zij3, zij4                         0.060
xij1, xij3, xij5, zij2, zij3, zij4                                           0.051
xij1, xij2, xij3, xij9, zij2, zij3, zij4                                     0.031
xij1, xij2, xij3, xij6, zij1, zij2, zij3, zij4                               0.027
xij1, xij2, xij3, xij8, xij9, zij2, zij3, zij4                               0.025
xij1, xij2, xij3, xij8, zij1, zij2, zij3, zij4                               0.023
xij1, xij2, xij3, xij8, zij2, zij3, zij4                                     0.022
The shrinkage approaches provide better results than the KDNP.
Based on extensive simulations, we find that the KDNP is not stable, suffers from numerical issues, and fails easily.
Of all the approaches, GDPP and HSPP perform better than the others in variable selection and parameter estimation. GDPP can be implemented with a more efficient Gibbs sampling algorithm than HSPP, which uses a Metropolis-Hastings algorithm.
4 Application
A bioassay study was conducted to identify in vivo estrogenic responses to chemicals.
The data consist of 2681 female rats, tested by 19 participating labs in 8 countries from the OECD.
Two protocols used immature female rats, with doses administered by oral gavage (protocol A) and subcutaneous injection (protocol B).
Two other protocols used adult female rats, with oral gavage (protocol C) and subcutaneous injection (protocol D).
Each protocol comprised 11 groups, each with 6 rats: 1 untreated group, 1 control group, 7 groups administered with varied doses of ethinyl estradiol (EE) and ZM.
Goal: determine how uterus weight interacts with the in vivo activity of chemical compounds under different experimental conditions.
There are concerns about heterogeneity across different labs.
Kanno et al. (2001) analyzed the data with simple parametric methods.
The number of possible competing models is over 4,000.
Table: Top 10 models with highest posterior probabilities by GDPP, HSPP and NEGP for real data

Model: GDPP HSPP NEGP
xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4: 0.098 0.093 0.072
xij1, xij3, xij5, xij6, zij1, zij2: 0.063 0.049 0.097
xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij4: 0.047 0.039 0.038
xij1, xij3, xij4, xij5, xij6, zij1, zij4: 0.038 0.023 0.033
xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij6: 0.026
xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij6: 0.023
xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij5: 0.020 0.018 0.034
xij1, xij3, xij4, xij5, xij6, zij1: 0.020 0.013 0.046
xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij3, zij4: 0.020 0.025
xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij5: 0.019
xij1, xij3, xij4, xij5, xij6, zij1, zij3, zij4, zij6: 0.015
xij1, xij3, xij5, xij6, zij1, zij3, zij4, zij5: 0.032
xij1, xij3, xij4, xij5, xij6, zij1, zij2: 0.028
xij1, xij3, xij4, xij5, xij6, zij1, zij2, zij4, zij5: 0.015
xij1, xij3, xij5, xij6, zij1, zij3, zij4: 0.078
xij1, xij3, xij5, xij6, zij1, zij4: 0.055
xij1, xij3, xij5, xij6, zij1, zij2, zij4: 0.046
Table: Comparisons of parameter estimates in the real data study

Parameter  GDPP                     HSPP                     NEGP                     LMER
β1         3.61^a (3.31, 3.76)^b    3.60^a (3.35, 3.87)^b    3.60^a (3.28, 3.96)^b    3.61^a (3.16, 4.10)^c
β2         NA^d                     NA^d                     NA^d                     0.06 (0.02, 0.12)
β3         1.21 (1.04, 1.47)        1.18 (1.06, 1.53)        1.22 (0.97, 1.49)        1.20 (1.03, 1.69)
β4         1.20 (0.89, 1.64)        1.21 (0.87, 1.70)        1.19 (0.79, 1.68)        1.27 (0.85, 1.64)
β5         0.14 (0.13, 0.16)        0.14 (0.13, 0.16)        0.14 (0.13, 0.16)        0.14 (0.13, 0.15)
β6         -0.48 (-0.57, -0.35)     -0.47 (-0.56, -0.39)     -0.47 (-0.59, -0.24)     -0.49 (-0.57, -0.37)
a Mean
b 95% credible interval
c 95% confidence interval
d Not selected in the model
We tried to fit the model with KDNP, but it failed with too many numerical issues.
The fixed effect of protocol B is not selected in any model, but the corresponding random effect is, which indicates heterogeneity.
Both the fixed and random effects of protocols C and D are selected in the leading models.
The fixed effects of EE dose and ZM dose are selected in the leading models, but the corresponding random effects seldom are.
References
Chen and Dunson, Biometrics, 2003
Cai and Dunson, Biometrics, 2006
Kinney and Dunson, Biometrics, 2007
Bondell et al., Biometrics, 2010
Ibrahim et al., Biometrics, 2010
Yang, M., Computational Statistics and Data Analysis, 2012
Yang, M., Biometrical Journal, 2013
Yang, M., under review, 2015