Applied Bayesian Inference, KSU, April 29, 2012

§❺ Metropolis-Hastings sampling and general MCMC approaches for GLMM
Robert J. Tempelman
Genetic linkage example…again

Recall the plant genetic linkage analysis problem, with cell probabilities

  (2+θ)/4,  (1−θ)/4,  (1−θ)/4,  θ/4

so that the likelihood for counts y = (y1, y2, y3, y4) is

  L(θ|y) ∝ (2+θ)^y1 (1−θ)^(y2+y3) θ^y4

Suppose a flat constant prior (p(θ) ∝ 1) was used. Then

  p(θ|y) = L(θ|y) / ∫ L(θ|y) dθ
Suppose the posterior density is not recognizable
• Additionally, suppose there is no clear data augmentation strategy.
• Several solutions exist, e.g., adaptive rejection sampling (not discussed here). One recourse is the Metropolis-Hastings algorithm, in which one generates a MCMC chain of random variates using a candidate (or proposal) density function q(θ′, θ″).
• θ′: where you're at now, at the current MCMC cycle. θ″: proposed value for the next MCMC cycle.
Metropolis-Hastings
• Say the MCMC chain is currently at value θ[t−1] from cycle t−1.
• Draw a proposed value θ* from the candidate density for cycle t.
• Accept the move from θ[t−1] to θ[t] = θ* with probability:

  α(θ[t−1], θ*) = min{ [p(θ*|y) q(θ*, θ[t−1])] / [p(θ[t−1]|y) q(θ[t−1], θ*)], 1 }  if p(θ[t−1]|y) q(θ[t−1], θ*) > 0
                = 1  otherwise

– If the move is rejected, set θ[t] = θ[t−1].

Good readable reference: Chib and Greenberg (1995).
How to compute this ratio "safely"
• Always use logarithms when evaluating ratios!
• Compute

  log [ p(θ*|y) q(θ*, θ[t−1]) / ( p(θ[t−1]|y) q(θ[t−1], θ*) ) ]
    = log p(θ*|y) + log q(θ*, θ[t−1]) − log p(θ[t−1]|y) − log q(θ[t−1], θ*)

• Once you compute this, back-transform: exp(log(·)).
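The log-space recipe above can be sketched in a few lines. This is an illustrative Python version (not the course's IML code); `log_post` and `log_q` are hypothetical stand-ins for the log posterior kernel and the log candidate density, with `log_q(a, b)` read as log q(a, b), the log density of proposing b from a:

```python
import math

def mh_accept_prob(log_post, log_q, theta_curr, theta_prop):
    """MH acceptance probability computed safely in log space."""
    log_ratio = (log_post(theta_prop) + log_q(theta_prop, theta_curr)
                 - log_post(theta_curr) - log_q(theta_curr, theta_prop))
    # back-transform; capping the log ratio at 0 keeps exp() from overflowing
    return min(1.0, math.exp(min(log_ratio, 0.0)))
```

With a symmetric candidate the q terms cancel and only the posterior ratio remains, which is the random walk simplification shown later in these slides.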
Back to the plant genetics example
Recall y1 = 1997, y2 = 906, y3 = 904, y4 = 32.
Let's use N(μ = 0.0357, σ² = 3.6338 × 10⁻⁵) as the candidate-generating density (based on a likelihood approximation).
1. Determine a starting value (i.e., 0th cycle) θ[0].
2. For t = 1, …, m (number of MCMC cycles):
  a) Generate θ* from q(θ[t−1], θ*) = N(0.0357, 3.6338 × 10⁻⁵)
  b) Generate U from a Uniform(0,1) distribution
  c) If U < α(θ[t−1], θ*) then set θ[t] = θ*, else set θ[t] = θ[t−1]
• Note that this is an independence chain algorithm: q(θ[t−1], θ*) = q(θ*)
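The full algorithm is short enough to sketch end to end. The following Python version (a stand-in for the IML code posted online, using the same data and normal candidate) works entirely in log space; the burn-in length and chain length here are arbitrary choices for illustration:

```python
import math
import random

# Linkage data and normal candidate (likelihood approximation) from the slides
y1, y2, y3, y4 = 1997, 906, 904, 32
MU, SIG2 = 0.0357, 3.6338e-5

def log_post(t):
    """Log posterior kernel under a flat prior (i.e., the log likelihood)."""
    if not 0.0 < t < 1.0:
        return -math.inf
    return y1 * math.log(2 + t) + (y2 + y3) * math.log(1 - t) + y4 * math.log(t)

def log_q(t):
    """Log normal candidate density, up to a constant (independence chain)."""
    return -0.5 * (t - MU) ** 2 / SIG2

def independence_metropolis(m, theta0, rng):
    draws, theta = [], theta0
    for _ in range(m):
        prop = rng.gauss(MU, math.sqrt(SIG2))
        # candidate does not depend on the current state, so the q terms swap
        log_alpha = log_post(prop) + log_q(theta) - log_post(theta) - log_q(prop)
        if math.log(rng.random()) < min(0.0, log_alpha):
            theta = prop
        draws.append(theta)
    return draws

draws = independence_metropolis(20000, 0.0357, random.Random(1))
post_mean = sum(draws[2000:]) / len(draws[2000:])  # discard 2000 as burn-in
```

The posterior mean lands near the 0.0366 summary reported a few slides later.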
Independence chain Metropolis
• Used when the candidate does not depend on θ[t−1], i.e., q(θ[t−1], θ*) = q(θ*).
• The acceptance probability becomes:

  α(θ[t−1], θ*) = min{ [p(θ*|y) q(θ[t−1])] / [p(θ[t−1]|y) q(θ*)], 1 }  if p(θ[t−1]|y) q(θ*) > 0
                = 1  otherwise

• However, in spite of this "independence" label, there is still serial autocorrelation between the samples.
• IML code online. Generate output for 9000 draws after 1000 burn-in samples. Save every 10th.
Monitoring MH acceptance rates over cycles for the genetic linkage example
• Average MH acceptance rates (for every 10 cycles)
[Figure: ALPHASAV (average acceptance rate, 0.0–1.0) plotted against CYCLE (1000–10000)]
Many acceptance rates are close to 1! Is this good? NO.
Intermediate acceptance ratios (0.25–0.5) are optimal for MH mixing.
How to optimize Metropolis acceptance ratios
• Recall q(θ[t−1], θ*) = N(μ, σ²) with μ = 0.0357, σ² = 3.6338 × 10⁻⁵.
• Suggestion: use q(θ[t−1], θ*) = N(μ, cσ²) and modify c (during burn-in) so that MH acceptance rates are intermediate:
– Increase c → decrease acceptance rates
– Decrease c → increase acceptance rates
"Tuning" the MH sampler: my strategy
• Every 10 MH cycles for the first half of burn-in, assess the following:
– if the average acceptance rate > 0.80, set c = 1.2c;
– if the average acceptance rate < 0.20, set c = 0.7c;
– otherwise leave c alone.
• SAS PROC MCMC has a somewhat different strategy.
• Let's rerun the same PROC IML code, but with this modification.
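The tuning rule above is a one-line decision per check. A minimal Python sketch of that rule, with the thresholds taken from this slide (function name is my own):

```python
def tune_scale(c, accept_rate):
    """One tuning update for the proposal-variance multiplier c.

    Applied every 10 cycles during the first half of burn-in:
    widening the proposal (larger c) lowers the acceptance rate,
    narrowing it (smaller c) raises the acceptance rate.
    """
    if accept_rate > 0.80:
        return 1.2 * c   # accepting too often -> widen proposals
    if accept_rate < 0.20:
        return 0.7 * c   # rejecting too often -> narrow proposals
    return c             # intermediate rate -> leave c alone
```

PROC MCMC's built-in adaptation follows a different (but similar in spirit) schedule.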
Average acceptance ratio versus cycle (during burn-in)
[Figure: tuning constant c (C_CHG, 1–7) plotted against cycle (CYCLE_SCALE, 0–4000)]
One should finish the tuning process not much later than halfway through burn-in.
Monitoring MH acceptance rates over cycles
• Average MH acceptance rates (every 10 cycles) post burn-in (16000 cycles)
[Figure: ALPHASAV (average acceptance rate, 0.0–1.0) plotted against CYCLE (4000–20000)]
Posterior density of θ

Analysis Variable: theta
Mean    Median  Std Dev  5th Pctl  95th Pctl
0.0366  0.0366  0.0064   0.0265    0.0471
Random walk Metropolis sampling
• More common than independence chain Metropolis (especially when proposals based on the likelihood function are not plausible).
• The proposal density is chosen to be symmetric in θ* and θ[t−1], i.e., q(θ[t−1], θ*) = q(θ*, θ[t−1]).
• Example: generate a random variate δ from N(0, cσ²) and add it to the previous cycle value θ[t−1] to generate θ* = θ[t−1] + δ; this is the same as sampling from

  q(θ[t−1], θ*) = (1/√(2πcσ²)) exp( −(θ* − θ[t−1])² / (2cσ²) )
Random walk Metropolis (cont'd)
• Because of the symmetry of q(θ[t−1], θ*) in θ[t−1] and θ*, i.e., because q(θ[t−1], θ*) = q(θ*, θ[t−1]), the MH acceptance ratio simplifies:

  α(θ[t−1], θ*) = min{ p(θ*|y) / p(θ[t−1]|y), 1 }  if p(θ[t−1]|y) > 0
                = 1  otherwise
Back to the example
• Start again with s2 = 0.00602 and a starting value for θ[t−1] at t = 1.
• Generate a proposed value from

  q(θ[t−1], θ*) = (1/√(2πcσ²)) exp( −(θ* − θ[t−1])² / (2cσ²) )

– i.e., generate δ from N(0, cσ²) and add it to θ[t−1].
• Accept with probability

  α(θ[t−1], θ*) = min{ p(θ*|y) / p(θ[t−1]|y), 1 }

• Tune c for intermediate acceptance rates during burn-in.
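A random walk version of the sampler differs from the independence chain only in the proposal and the simplified (posterior-only) acceptance ratio. An illustrative Python sketch on the same linkage target, with an arbitrary fixed step scale standing in for a tuned cσ²:

```python
import math
import random

# Linkage data from the example
y1, y2, y3, y4 = 1997, 906, 904, 32

def log_post(t):
    """Log posterior kernel under a flat prior (log likelihood)."""
    if not 0.0 < t < 1.0:
        return -math.inf
    return y1 * math.log(2 + t) + (y2 + y3) * math.log(1 - t) + y4 * math.log(t)

def random_walk_metropolis(m, theta0, step_sd, rng):
    draws, theta, accepted = [], theta0, 0
    for _ in range(m):
        prop = theta + rng.gauss(0.0, step_sd)  # symmetric proposal
        # symmetric q cancels, leaving only the posterior ratio (in logs)
        if math.log(rng.random()) < min(0.0, log_post(prop) - log_post(theta)):
            theta = prop
            accepted += 1
        draws.append(theta)
    return draws, accepted / m

draws, rate = random_walk_metropolis(20000, 0.0357, 0.015, random.Random(7))
post_mean = sum(draws[2000:]) / len(draws[2000:])
```

In a full implementation, `step_sd` would be adapted during burn-in with the tuning rule from the earlier slide rather than fixed in advance.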
What about "canned" software?
• WinBUGS
• AD Model Builder
• Various R packages (e.g., MCMCglmm)
• SAS PROC MCMC
– Will demonstrate shortly; it functions a bit like PROC NLMIXED (no CLASS statement).
• They all work fine.
– But sometimes they don't recognize conjugacy in priors,
• i.e., they can't distinguish between conjugate and non-conjugate (Metropolis) sampling,
• so they often default to Metropolis (PROC MCMC: random walk Metropolis).
Recall the old split plot in time example
• Recall the "bunny" example from earlier.
– We used PROC GLIMMIX and MCMC (SAS PROC IML) to analyze the data.
– Our MCMC implementation involved recognizable FCD.
• Split plot in time assumption.
– Other alternatives?
• Marginal versus conditional specifications on CS
• AR(1)
• Others?
– Some FCD are not recognizable.
• Metropolis updates necessary.
• Let's use SAS PROC MCMC.
First create the dummy variables using PROC TRANSREG (PROC MCMC does not have a CLASS statement).
(Dataset called 'recodedsplit'; the dummy columns form part of the full-rank X matrix, referenced via &_trgind.)

Obs _NAME_ Intercept trt1 trt2 time1 time2 time3 trt1time1 trt1time2 trt1time3 trt2time1 trt2time2 trt2time3 trt time y trtrabbit
1 -0.3 1 1 0 1 0 0 1 0 0 0 0 0 1 1 -0.3 1_1
2 -0.2 1 1 0 0 1 0 0 1 0 0 0 0 1 2 -0.2 1_1
3 1.2 1 1 0 0 0 1 0 0 1 0 0 0 1 3 1.2 1_1
4 3.1 1 1 0 0 0 0 0 0 0 0 0 0 1 4 3.1 1_1
5 -0.5 1 1 0 1 0 0 1 0 0 0 0 0 1 1 -0.5 1_2
6 2.2 1 1 0 0 1 0 0 1 0 0 0 0 1 2 2.2 1_2
7 3.3 1 1 0 0 0 1 0 0 1 0 0 0 1 3 3.3 1_2
8 3.7 1 1 0 0 0 0 0 0 0 0 0 0 1 4 3.7 1_2
9 -1.1 1 1 0 1 0 0 1 0 0 0 0 0 1 1 -1.1 1_3
10 2.4 1 1 0 0 1 0 0 1 0 0 0 0 1 2 2.4 1_3
SAS PROC MCMC ("conditional" specification)

data _null_;
  call symputx('seed', 8723);
  call symputx('nvar', 12);
run;

proc mcmc data=recodedsplit outpost=ksu.postsplit propcov=quanew seed=&seed
          nmc=400000 thin=10 monitor=(beta1-beta&nvar sigmae sigmag);
  array covar[&nvar] intercept &_trgind;  /* fixed effects dummy variables */
  array beta[&nvar];                      /* fixed effects */
  parms sige 1;                  * residual sd;
  parms sigg 1;                  * random effect sd;
  parms (beta1-beta&nvar) 1;     * starting values;
  prior beta: ~ normal(0, var=1e6);
  /* prior beta: ~ general(0); could also do this */
  prior sige ~ general(0, lower=0);  /* Gelman prior */
  prior sigg ~ general(0, lower=0);  /* Gelman prior */

Annotations:
– outpost=: where to save the MCMC samples
– propcov=quanew: Metropolis implementation strategy
– thin=10: save how often?
– nmc=400000: total number of samples after burn-in
– NBI = 1000 (default number of burn-in cycles)
– Priors: β ~ N(0, 10⁶); p(σe) ∝ constant; p(σu) ∝ constant
SAS PROC MCMC (conditional specification)

  beginnodata;
    sigmae = sige*sige;   /* sigma_e^2 = (sige)^2 */
    sigmau = sigg*sigg;   /* sigma_u^2 = (sigg)^2 */
  endnodata;
  call mult(covar, beta, mu);                           /* mu_i = x_i'beta */
  random u ~ normal(0, var=sigmau) subject=trtrabbit;   /* u_i ~ N(0, sigma_u^2) */
  model y ~ normal(mu + u, var=sigmae);  /* y_i | u_i ~ N(x_i'beta + z_i'u, sigma_e^2) */
run;
PROC MCMC output

Parameters
Block  Parameter  Sampling Method  Initial Value  Prior Distribution
1      sige       N-Metropolis     1.0000         general(0, lower=0)
2      sigg       N-Metropolis     1.0000         general(0, lower=0)
3      beta1      N-Metropolis     1.0000         normal(0, var=1e6)
       beta2                       1.0000         normal(0, var=1e6)
       beta3                       1.0000         normal(0, var=1e6)
       beta4                       1.0000         normal(0, var=1e6)
       beta5                       1.0000         normal(0, var=1e6)
       beta6                       1.0000         normal(0, var=1e6)
       beta7                       1.0000         normal(0, var=1e6)
       beta8                       1.0000         normal(0, var=1e6)
       beta9                       1.0000         normal(0, var=1e6)
       beta10                      1.0000         normal(0, var=1e6)
       beta11                      1.0000         normal(0, var=1e6)
       beta12                      1.0000         normal(0, var=1e6)

Random Effects Parameters
Parameter  Subject    Levels  Prior Distribution
u          trtrabbit  15      normal(0, var=sigmau)
Posterior Summaries

Parameter   N      Mean     Std Dev   25%      50%      75%
beta1 40000 0.2178 0.3910 -0.0434 0.2199 0.4823
beta2 40000 2.3706 0.5528 2.0007 2.3707 2.7360
beta3 40000 -0.2079 0.5524 -0.5761 -0.2063 0.1545
beta4 40000 -0.8958 0.5086 -1.2292 -0.8967 -0.5616
beta5 40000 0.0139 0.5066 -0.3172 0.0115 0.3501
beta6 40000 -0.6407 0.5006 -0.9753 -0.6429 -0.3033
beta7 40000 -1.9340 0.7151 -2.4049 -1.9339 -1.4548
beta8 40000 -1.2282 0.7134 -1.7030 -1.2309 -0.7548
beta9 40000 -0.0719 0.7071 -0.5445 -0.0763 0.3993
beta10 40000 0.3055 0.7127 -0.1721 0.3011 0.7832
beta11 40000 -0.5411 0.7097 -1.0132 -0.5395 -0.0682
beta12 40000 0.5758 0.7033 0.1095 0.5748 1.0406
sigmae 40000 0.6314 0.1478 0.5266 0.6124 0.7148
sigmau 40000 0.1276 0.1465 0.0285 0.0850 0.1748
Compare to conditional model results from § 82,84
Effective Sample Sizes

Parameter  ESS     Autocorrelation Time  Efficiency
beta1 4285.7 9.3334 0.1071
beta2 5778.0 6.9229 0.1444
beta3 5171.1 7.7353 0.1293
beta4 5639.7 7.0926 0.1410
beta5 3900.5 10.2550 0.0975
beta6 3901.6 10.2522 0.0975
beta7 4197.4 9.5297 0.1049
beta8 6248.7 6.4013 0.1562
beta9 6857.7 5.8329 0.1714
beta10 2890.5 13.8385 0.0723
beta11 6647.5 6.0173 0.1662
beta12 5563.2 7.1902 0.1391
sigmae 6173.6 6.4792 0.1543
sigmau 1364.3 29.3186 0.0341
LSMEANS USING PROC MIXED
trt Least Squares Means
trt Estimate Standard Error
1 1.4000 0.2135
2 -0.2900 0.2135
3 -0.1600 0.2135
time Least Squares Means
time Estimate Standard Error
1 -0.5000 0.2100
2 0.3667 0.2100
3 0.4667 0.2100
4 0.9333 0.2100
trt*time Least Squares Means
trt time Estimate Standard Error
1 1 -0.2400 0.3638
1 2 1.3800 0.3638
1 3 1.8800 0.3638
1 4 2.5800 0.3638
2 1 -0.5800 0.3638
2 2 -0.5200 0.3638
2 3 -0.06000 0.3638
2 4 5.5E-15 0.3638
3 1 -0.6800 0.3638
3 2 0.2400 0.3638
3 3 -0.4200 0.3638
3 4 0.2200 0.3638
"Least-squares means" using output from PROC MCMC

Marginal means
Variable  Mean      Median    Std Dev
TRT1      1.399202  1.399229  0.241373
TRT2     -0.2857   -0.28771   0.238766
TRT3     -0.16286  -0.16038   0.241136
TIME1    -0.50001  -0.50024   0.225171
TIME2     0.362834  0.365804  0.226114
TIME3     0.466009  0.465563  0.224869
TIME4     0.938682  0.937432  0.223448

Cell means
Variable   Mean      Median    Std Dev
TRT1TIME1 -0.24151  -0.24036   0.395506
TRT1TIME2  1.374094  1.373212  0.390686
TRT1TIME3  1.875858  1.873671  0.388689
TRT1TIME4  2.588362  2.585974  0.388577
TRT2TIME1 -0.58048  -0.58151   0.387481
TRT2TIME2 -0.51727  -0.51545   0.385221
TRT2TIME3 -0.05497  -0.05467   0.389197
TRT2TIME4  0.0099    0.008927  0.389475
TRT3TIME1 -0.67805  -0.67985   0.393538
TRT3TIME2  0.231677  0.2315    0.395277
TRT3TIME3 -0.42287  -0.41975   0.38795
TRT3TIME4  0.217785  0.219946  0.390986
Compare to Gibbs sampling results from § 85
Posterior densities of σu² and σe²
[Figure: posterior density plots of σu² and σe²]
Both are bounded below by zero, by definition.
The Marginal Model Specification (type=CS)
• SAS PROC MIXED code:

title "Marginal Model: Compound Symmetry using PROC MIXED";
proc mixed data=ear;
  class trt time rabbit;
  model temp = trt time trt*time / solution;
  repeated time / subject=rabbit(trt) type=cs rcorr;
  lsmeans trt*time;
run;
• Now the residual covariance matrix for rabbit i is

  R_k(i) = [ σu²+σe²   σu²       σu²       σu²     ]
           [ σu²       σu²+σe²   σu²       σu²     ]
           [ σu²       σu²       σu²+σe²   σu²     ]
           [ σu²       σu²       σu²       σu²+σe² ]

  with intraclass correlation ρ = σu² / (σu² + σe²).

• To ensure R is p.s.d.:

  −1/(nt − 1) < ρ < 1

  where nt is the number of repeated measures per rabbit.
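The p.s.d. bound follows because R = σe²·I + σu²·J has only two distinct eigenvalues. A small Python sketch (illustrative, not part of the course code) that builds the CS matrix and checks the bound through those closed-form eigenvalues:

```python
def cs_matrix(sig2_u, sig2_e, nt):
    """Compound-symmetry covariance: sig2_e on top of a constant sig2_u."""
    return [[sig2_e + sig2_u if i == j else sig2_u for j in range(nt)]
            for i in range(nt)]

def cs_eigenvalues(sig2_u, sig2_e, nt):
    """R = sig2_e*I + sig2_u*J has eigenvalues
    sig2_e + nt*sig2_u (once) and sig2_e (nt-1 times)."""
    return [sig2_e + nt * sig2_u] + [sig2_e] * (nt - 1)

def is_psd(sig2_u, sig2_e, nt):
    """Non-negative eigenvalues <=> rho >= -1/(nt-1)."""
    return all(ev >= 0 for ev in cs_eigenvalues(sig2_u, sig2_e, nt))
```

With nt = 4 (the rabbit example), the boundary case ρ = −1/3 corresponds to σu² = −σe²/4, matching the &lbound1 = −1/3 used in the PROC MCMC code a few slides later.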
Need to format the data differently

Obs trt time trtrabbit first last y
1 1 1 1_1 1 0 -0.3
2 1 2 1_1 0 0 -0.2
3 1 3 1_1 0 0 1.2
4 1 4 1_1 0 1 3.1
5 1 1 1_2 1 0 -0.5
6 1 2 1_2 0 0 2.2
7 1 3 1_2 0 0 3.3
8 1 4 1_2 0 1 3.7
9 1 1 1_3 1 0 -1.1
10 1 2 1_3 0 0 2.4
data=recodedsplit1
I'll keep the covariates in a different file too.

Obs Intercept trt1 trt2 time1 time2 time3 trt1time1 trt1time2 trt1time3 trt2time1 trt2time2 trt2time3
1 1 1 0 1 0 0 1 0 0 0 0 0
2 1 1 0 0 1 0 0 1 0 0 0 0
3 1 1 0 0 0 1 0 0 1 0 0 0
4 1 1 0 0 0 0 0 0 0 0 0 0
5 1 1 0 1 0 0 1 0 0 0 0 0
6 1 1 0 0 1 0 0 1 0 0 0 0
7 1 1 0 0 0 1 0 0 1 0 0 0
8 1 1 0 0 0 0 0 0 0 0 0 0
9 1 1 0 1 0 0 1 0 0 0 0 0
10 1 1 0 0 1 0 0 1 0 0 0 0
data=covariates
PROC MCMC (jointmodel)

data a; run;   /* this data step is a little silly, but it is required */

/* PROC MCMC WITH COMPOUND SYMMETRY ASSUMPTION */
title1 "Bayesian inference on compound symmetry";
proc mcmc jointmodel data=a outpost=ksu.postcs propcov=quanew seed=&seed
          nmc=400000 thin=10;
  array covar[1]/nosymbols;
  array data[1]/nosymbols;
  array first1[1]/nosymbols;
  array last1[1]/nosymbols;
  array beta[&nvar];
  array mu[&nrec];
  array ytemp[&nrep];
  array mutemp[&nrep];
  array VCV[&nrep,&nrep];

The jointmodel option implies that each observation's contribution to the likelihood function is NOT conditionally independent.
  begincnst;
    rc = read_array("recodedsplit1", data, "y");
    rc = read_array("recodedsplit1", first1, "first");
    rc = read_array("recodedsplit1", last1, "last");
    rc = read_array("covariates", covar);
  endcnst;

  parms sige .25;    * residual sd;
  parms intrcl .3;   * intraclass correlation;
  parms (beta1-beta&nvar) 1;
  beginnodata;
    prior beta: ~ normal(0, var=1e6);
    prior sige ~ general(0, lower=0);   /* Gelman prior */
    prior intrcl ~ general(0, lower=&lbound1, upper=.999);
    sigmae = sige*sige;
    sigmag = intrcl*sigmae;
    call fillmatrix(VCV, sigmag);
    do i = 1 to &nrep;
      VCV[i,i] = sigmae;
    end;
    call mult(covar, beta, mu);
  endnodata;
  ljointpdf = 0;

• &lbound1 = −1/3 (lower bound on the CS correlation when blocksize = 4)
  do irec = 1 to &nrec;
    if (first1[irec] = 1) then counter = 0;
    counter = counter + 1;
    ytemp[counter] = data[irec];
    mutemp[counter] = mu[irec];
    if (last1[irec] = 1) then do;
      ljointpdf = ljointpdf + lpdfmvn(ytemp, mutemp, VCV);
    end;
  end;
  model general(ljointpdf);
run;
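The first/last indicator flags mark where each rabbit's block of repeated measures begins and ends; each completed block contributes one multivariate-normal term to the joint log likelihood. A minimal Python sketch of just the grouping logic (function and variable names are my own, not SAS syntax):

```python
def group_records(values, first, last):
    """Collect consecutive records into per-subject blocks using the
    first/last indicator flags, as the PROC MCMC do-loop does; in the
    real sampler, each finished block feeds one joint MVN lpdf term."""
    blocks, current = [], []
    for v, f, l in zip(values, first, last):
        if f == 1:
            current = []            # new subject: reset the buffer
        current.append(v)
        if l == 1:
            blocks.append(current)  # block complete
    return blocks

# e.g., the first two rabbits (1_1 and 1_2), 4 repeated measures each
y     = [-0.3, -0.2, 1.2, 3.1, -0.5, 2.2, 3.3, 3.7]
first = [1, 0, 0, 0, 1, 0, 0, 0]
last  = [0, 0, 0, 1, 0, 0, 0, 1]
blocks = group_records(y, first, last)
```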
PROC MCMC Posterior Summaries

Parameter  N      Mean     Std Dev   25%      50%      75%
sige 40000 0.8643 0.1040 0.7921 0.8528 0.9225
intrcl 40000 0.1736 0.1453 0.0679 0.1599 0.2661
beta1 40000 0.2267 0.3909 -0.0313 0.2298 0.4869
beta2 40000 2.3553 0.5523 1.9916 2.3491 2.7140
beta3 40000 -0.2290 0.5536 -0.5965 -0.2327 0.1388
beta4 40000 -0.8982 0.5012 -1.2320 -0.8984 -0.5682
beta5 40000 0.0185 0.4937 -0.3080 0.0204 0.3433
beta6 40000 -0.6505 0.4985 -0.9830 -0.6529 -0.3221
beta7 40000 -1.9185 0.7058 -2.3900 -1.9170 -1.4498
beta8 40000 -1.2292 0.7038 -1.6901 -1.2329 -0.7667
beta9 40000 -0.0599 0.7024 -0.5232 -0.0555 0.4045
beta10 40000 0.3204 0.7087 -0.1426 0.3182 0.7891
beta11 40000 -0.5386 0.7072 -0.9975 -0.5438 -0.0748
beta12 40000 0.5890 0.7025 0.1227 0.5945 1.0596
PROC MIXED vs PROC MCMC

PROC MIXED — Covariance Parameter Estimates
Cov Parm  Subject      Estimate  Std Error  Z Value  Pr Z
CS        rabbit(trt)  0.08336   0.09910    0.84     0.4002
Residual               0.5783    0.1363     4.24     <.0001

PROC MCMC
Variable  Median    Std Dev   Minimum   Maximum
sigmau2   0.110874  0.15354  -0.34127   5.535211
sigmae2   0.592512  0.152743  0.246462  1.870365
Posterior marginal densities for σu² and σe² under the marginal model
Notice how much of the posterior density of σu² is concentrated to the left of 0!
Potential "ripple effect" on inferences on K′β (Stroup and Littell, 2002) relative to the conditional specification?
First-order autoregressive model (type=AR(1))
• SAS PROC MIXED code:

title "Marginal Model: AR(1) using PROC MIXED";
proc mixed data=ear;
  class trt time rabbit;
  model temp = trt time trt*time / solution;
  repeated time / subject=rabbit(trt) type=AR(1) rcorr;
  lsmeans trt*time;
run;

CORRECTION!
Specifying VCV for AR(1)
• Note

  R_k(i) = σ² [ 1    ρ    ρ²   ρ³ ]
              [ ρ    1    ρ    ρ² ]
              [ ρ²   ρ    1    ρ  ]
              [ ρ³   ρ²   ρ    1  ]

• It might be easier to specify the inverse directly:

  R_k(i)⁻¹ = 1/(σ²(1−ρ²)) [ 1    −ρ     0      0  ]
                          [ −ρ   1+ρ²   −ρ     0  ]
                          [ 0    −ρ     1+ρ²   −ρ ]
                          [ 0    0      −ρ     1  ]

Especially for large R_k(i).
Example MCMC code provided online.
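The tridiagonal inverse is easy to verify numerically. An illustrative Python sketch (plain lists, no linear-algebra library) that builds the AR(1) correlation matrix, its closed-form inverse, and checks that the product is the identity:

```python
def ar1_corr(rho, n):
    """AR(1) correlation matrix: entry (i, j) is rho**|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def ar1_inv(rho, n):
    """Closed-form tridiagonal inverse of the AR(1) correlation matrix:
    1/(1-rho^2) times {1 at the two corners, 1+rho^2 on the interior
    diagonal, -rho on the first off-diagonals}."""
    s = 1.0 / (1.0 - rho ** 2)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = s * (1.0 + rho ** 2 if 0 < i < n - 1 else 1.0)
        if i + 1 < n:
            inv[i][i + 1] = inv[i + 1][i] = -s * rho
    return inv

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

prod = matmul(ar1_corr(0.3, 4), ar1_inv(0.3, 4))  # should be the 4x4 identity
```

The banded inverse is what makes the direct specification cheap for large blocks: the joint Gaussian log density needs only R⁻¹ and its determinant, both available in closed form here.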
Variance Component Inference

PROC MIXED — Covariance Parameter Estimates
Cov Parm  Subject      Estimate  Std Error
AR(1)     rabbit(trt)  0.2867    0.1453
Residual               0.6551    0.141

MCMC
Variable  Median  Std Dev  5th Pctl  95th Pctl
rho       0.286   0.149    0.0313    0.52
sigmae2   0.706   0.178    0.501     1.056
An example of a "sticky" situation
• Consider a Poisson (count data) example, with simulated data from a split plot design:
– 4 whole plots per each of 3 levels of a whole plot factor;
– 3 subplots per whole plot → 3 levels of a subplot factor;
– whole plot variance: σw² = 0.50;
– overdispersion (G-side) variance, i.e., B*wholeplot variance: σe² = 1.00.
GLIMMIX code:

proc glimmix data=splitplot method=laplace;
  class A B wholeplot subject;
  model y = A|B / dist=poisson solution;
  random wholeplot(A) B*wholeplot(A);
  lsmeans A B A*B / e ilink;
run;
Inferences on variance components:
• PROC GLIMMIX
Covariance Parameter Estimates
Cov Parm Estimate Standard Error
wholeplot(A) 0.6138 0.3516
B*wholeplot(A) 0.9293 0.2514
Using PROC MCMC

proc mcmc data=recodedsplit outpost=postout propcov=quanew seed=9548
          nmc=400000 thin=10;
  array covar[&nvar] intercept &_trgind;
  array beta[&nvar];
  parms sigmau .5;
  parms sigmae .5;
  parms (beta1-beta&nvar) 1;
  prior beta: ~ normal(0, var=10E6);
  prior sigmae ~ igamma(shape=.1, scale=.1);
  prior sigmau ~ igamma(shape=.1, scale=.1);
  call mult(covar, beta, mu);
  random u ~ normal(0, var=sigmau) subject=plot;
  random e ~ normal(0, var=sigmae) subject=subject;
  lambda = exp(mu + u + e);
  model y ~ poisson(lambda);
run;

Model: uj ~ N(0, σu²); ei ~ N(0, σe²); λi = exp(xi′β + zi′u + ei); yi ~ Poisson(λi).
Priors: β ~ N(0, 10⁶ I); σu² ~ IG(0.1, 0.1); σe² ~ IG(0.1, 0.1).
Some output — Posterior Summaries

Parameter  N      Mean     Std Dev   25%      50%      75%
sigmag 40000 0.7947 0.5956 0.3891 0.6635 1.0324
sigmae 40000 1.4055 0.4559 1.0802 1.3285 1.6449
beta1 40000 6.6630 0.3811 6.4611 6.6790 6.9158
beta2 40000 -3.8229 0.8258 -4.3769 -3.8290 -3.2845
beta3 40000 -4.2165 0.8073 -4.7672 -4.2412 -3.7257
beta4 40000 -0.7618 0.4472 -1.0997 -0.8095 -0.4266
beta5 40000 -1.5901 0.6757 -2.1210 -1.5089 -1.1206
beta6 40000 -2.0756 0.7286 -2.5323 -2.0938 -1.6069
beta7 40000 0.7144 1.1396 -0.0554 0.7189 1.4600
beta8 40000 0.6214 1.1488 -0.1162 0.6336 1.3851
beta9 40000 2.4683 1.0499 1.8227 2.4922 3.1429
beta10 40000 1.9011 1.1083 1.2645 1.9517 2.6003
beta11 40000 -0.8063 0.8887 -1.4099 -0.8112 -0.2278
beta12 40000 1.3887 0.9450 0.6332 1.4562 2.0298
In the same ballpark as the PROC GLIMMIX solutions/VC estimates… but there is a PROBLEM →
Pretty slow mixing — Effective Sample Sizes

Parameter  ESS    Autocorrelation Time  Efficiency
sigmag 155.1 257.9 0.0039
sigmae 186.2 214.8 0.0047
beta1 43.0 931.1 0.0011
beta2 59.4 673.8 0.0015
beta3 61.8 646.8 0.0015
beta4 44.1 906.0 0.0011
beta5 42.5 940.4 0.0011
beta6 54.4 735.8 0.0014
beta7 62.5 639.9 0.0016
beta8 86.9 460.1 0.0022
beta9 58.6 682.1 0.0015
beta10 136.2 293.7 0.0034
beta11 53.7 745.5 0.0013
beta12 49.3 811.0 0.0012
From the SAS log file:
[log output not shown]
Too sticky! Solution? Thin even more than saving every 10th sample, and generate a lot more samples!
Hierarchical centering sampling advocated by SAS

proc mcmc data=recodedsplit outpost=postout propcov=quanew seed=234
          nmc=400000 thin=10;
  array covar[&nvar] intercept &_trgind;
  array beta[&nvar];
  array wp[16];
  parms wp: 0;
  parms sigmae .5;
  parms sigmag .5;
  parms (beta1-beta&nvar) 1;
  prior wp: ~ normal(0, var=sigmag);
  prior beta: ~ normal(0, var=10E6);
  prior sigmae ~ igamma(shape=.1, scale=.1);
  prior sigmag ~ igamma(shape=.1, scale=.1);
  call mult(covar, beta, mu);
  w = wp[plot] + mu;
  random llambda ~ normal(w, var=sigmae) subject=subject;
  lambda = exp(llambda);
  model y ~ poisson(lambda);
run;

Model: wi = xi′β + zi′u; log λi ~ N(wi, σe²); yi ~ Poisson(λi); uj ~ N(0, σu²).
Priors: β ~ N(0, 10⁶ I); σe² ~ IG(0.1, 0.1); σu² ~ IG(0.1, 0.1).
Faster mixing! Effective Sample Sizes

Parameter  ESS     Autocorrelation Time  Efficiency
wp1        497.2   80.4554               0.0124
wp2        621.5   64.3569               0.0155
wp3        336.4   118.9                 0.0084
wp4        669.9   59.7148               0.0167
wp5        967.1   41.3624               0.0242
wp6        1767.9  22.6263               0.0442
wp7        1160.7  34.4624               0.0290
wp8        1109.0  36.0701               0.0277
wp9        1275.3  31.3651               0.0319
wp10       717.9   55.7176               0.0179
wp11       1518.0  26.3512               0.0379
wp12       1223.3  32.6995               0.0306
wp13       583.9   68.5094               0.0146
wp14       606.2   65.9881               0.0152
wp15       674.1   59.3384               0.0169
wp16       799.2   50.0492               0.0200

Parameter  ESS     Autocorrelation Time  Efficiency
sigmae     3831.5  10.4397               0.0958
sigmag     825.1   48.4794               0.0206
beta1      850.1   47.0507               0.0213
beta2      1475.5  27.1103               0.0369
beta3      908.7   44.0188               0.0227
beta4      907.1   44.0954               0.0227
beta5      6352.5  6.2967                0.1588
beta6      4736.8  8.4446                0.1184
beta7      8021.8  4.9864                0.2005
beta8      4565.9  8.7606                0.1141
beta9      7303.8  5.4766                0.1826
beta10     8076.8  4.9525                0.2019
beta11     5080.2  7.8738                0.1270
beta12     4005.2  9.9870                0.1001
Natural next step
• Compute marginal/cell means as functions of the effects (β), just like before; i.e., k′β.
• Transform to the observed scale and look at the posterior distribution; naturally(?), exp(k′β).
• But that is a "conditional specification".
– Marginally, it might be something different…
Simple illustration of marginal versus conditional in the overdispersed Poisson
• If Yi ~ Poisson(exp(μ + ui)) with ui ~ N(0, σu²), then marginally

  E[Yi] = E_u[exp(μ + ui)] = exp(μ + σu²/2)

so we probably should look at the posterior density of this function instead for "population-averaged" inference.
• Conditionally on ui = 0:

  E[Yi | ui = 0] = exp(μ)   ("subject-specific" inference)

• Implications on which functions we look at for posterior distributions.
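The gap between the two targets is easy to see by simulation. An illustrative Python check (not course code) of the lognormal-mean identity E[exp(μ + u)] = exp(μ + σu²/2) against the conditional value exp(μ), with arbitrary μ and σu²:

```python
import math
import random

def marginal_mean_mc(mu, sig2_u, n, rng):
    """Monte Carlo estimate of E[exp(mu + u)], u ~ N(0, sig2_u)."""
    sd = math.sqrt(sig2_u)
    return sum(math.exp(mu + rng.gauss(0.0, sd)) for _ in range(n)) / n

mu, sig2_u = 1.0, 0.5
mc = marginal_mean_mc(mu, sig2_u, 200_000, random.Random(42))
closed_form = math.exp(mu + sig2_u / 2)   # population-averaged mean
conditional = math.exp(mu)                # subject-specific mean (u = 0)
```

The marginal (population-averaged) mean always exceeds the conditional (subject-specific) one whenever σu² > 0, which is exactly why the choice of function matters for posterior inference.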
Enough with your probit link!
• I WANT TO DO MCMC ON A LOGISTIC MIXED EFFECTS MODEL.
– I'm an odd(s ratio) kind of guy/girl.
– OK, fine. See the worked-out example for PROC MCMC:
• Fang Chen. 2011. The RANDOM statement and more: moving on with PROC MCMC. SAS Global Forum 2011. http://support.sas.com/rnd/app/papers/abstracts/334-2011.html
Other SAS procedures doing Bayesian/MCMC inference?
• Yes, but primarily for fixed effects models:
– PROC GENMOD, LIFEREG, PHREG.
– A greater need might be for mixed model versions.
• PROC MIXED has some Bayesian MCMC capabilities for simple variance component models (i.e., not repeated measures).
Repeated measures in generalized linear mixed models
• The G-side versus R-side conundrum.
• In classical GLMM analyses (PROC GLIMMIX, GENMOD), the R-side process cannot be simulated; the model is "vacuous" (Walt Stroup).
• So take the G-side route.
– This would be easy to analyze using MCMC if underlying liabilities were augmented (a multivariate normal cdf is needed otherwise).