Dirichlet process mixtures (DPMs) are an active research area. Dirichlet mixtures are it!
The flexibility of DPM models has supported their huge popularity in a wide variety of application areas.
DPM models are general and can be argued to have less structure.
Double Dirichlet process mixtures add a degree of structure, possibly at the expense of some flexibility, but possibly with better interpretability in some cases.
We discuss applications (and limitations) of these semiparametric double mixtures
We compare fit-prediction duality with competing models
Other DP extensions
Double Dirichlet process mixtures are a subclass of dependent Dirichlet process mixtures (MacEachern 1999, ...).
Double DP mixtures are different from hierarchical Dirichlet processes (Teh et al. 2006).
A double DPM is simply a pair of independent DPMs.
Motivating Example 1
Luminex measurements on two biomarker proteins from n = 156 patients: IL-1β protein and C-reactive protein.
The biological effects of these two proteins are thought to be not (totally) overlapping.
$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \Sigma_i \right), \quad i = 1, \dots, n$$
Two Biomarkers (y1 and y2)
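Usual DP mixture of normals (Ferguson 1983, ...)
$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \Sigma_i \right), \quad i = 1, \dots, n$$
$$(\mu_i, \Sigma_i) \sim \text{i.i.d. } G, \quad i = 1, \dots, n, \qquad \mu_i = (\mu_{1i}, \mu_{2i})$$
$$G \sim \mathrm{DP}(\alpha G_0)$$
$$G_0 = N_2(\mu_0, \Sigma_0) \times \text{Inv-Wishart}$$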
Questions:
Should we model the two biomarkers jointly?
Should we cluster the patients based on both biomarkers jointly?
The biomarkers may operate somewhat independently.
Double DP mixtures
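$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \begin{pmatrix} \sigma_{1i}^2 & \rho\,\sigma_{1i}\sigma_{2i} \\ \rho\,\sigma_{1i}\sigma_{2i} & \sigma_{2i}^2 \end{pmatrix} \right), \quad i = 1, \dots, n$$
$$(\mu_{1i}, \sigma_{1i}^2) \sim \text{i.i.d. } G_1, \quad i = 1, \dots, n, \qquad G_1 \sim \mathrm{DP}(\alpha_1 G_{01}), \qquad G_{01} = N(\mu_{01}, \sigma_{01}^2) \times \text{Inv-Gamma}$$
$$(\mu_{2i}, \sigma_{2i}^2) \sim \text{i.i.d. } G_2, \quad i = 1, \dots, n, \qquad G_2 \sim \mathrm{DP}(\alpha_2 G_{02}), \qquad G_{02} = N(\mu_{02}, \sigma_{02}^2) \times \text{Inv-Gamma}$$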
Equicorrelation: corr(y1i, y2i) is assumed to be the same for all i = 1, ..., n.
Clustering based on biomarker 1 and clustering based on biomarker 2 can be different.
Motivating Example 2: Interrater Agreement
Agreement between 2 Raters (Melia and Diener-West 1994)
Each rater provides an ordinal rating on a scale of 1-5 (lowest to highest invasion) of the extent to which the tumor has invaded the eye, n = 885.

| Rater 2 (rows) / Rater 1 (columns) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 291 | 74 | 1 | 1 | 1 |
| 2 | 186 | 256 | 7 | 7 | 3 |
| 3 | 2 | 4 | 0 | 2 | 0 |
| 4 | 3 | 10 | 1 | 14 | 2 |
| 5 | 1 | 7 | 1 | 8 | 3 |
Interrater agreement
Kottas, Muller, and Quintana (2005) analyzed these data using a flexible DP mixture of bivariate ordinal probit models, which modeled the unstructured joint probabilities
prob(Rater 1 = j and Rater 2 = k), j = 1, ..., 5, k = 1, ..., 5.
One way to quantify interrater agreement is to measure departure from the structured model of independence.
We consider a (mixture of) double DP mixtures model here, which provides separate DP structures for the two raters. We then measure "agreement" from this model.
Motivating Example 3
Mixed model for longitudinal data
$$y_{ij} = X_{ij}\beta + Z_{ij} b_i + \varepsilon_{ij}, \quad \text{or} \quad y_i = (y_{i1}, \dots, y_{ip}) \sim N(X_i\beta + Z_i b_i, \Sigma_i)$$
It is common to assume (Bush and MacEachern 1996)
$$b_i \sim G, \quad G \sim \mathrm{DP}$$
Modeling the error covariance Σ_i or the error variance (if Σ_i = diag(σ²_i)) extends the normal distribution assumption to normal scale mixtures (t, logistic, ...)
$$\sigma_i^2 \sim G, \quad G \sim \mathrm{DP}$$
Putting the two together
One way to combine these two structures is
$$(b_i, \sigma_i^2) \sim G, \quad G \sim \mathrm{DP}$$
Do we expect the random effects b_i appearing in the model for the mean and the error variances to cluster similarly?
The error variance model is often used to extend the distributional assumption.
$$b_i \sim G_1, \quad G_1 \sim \mathrm{DP}(\alpha_1 G_{01})$$
$$\sigma_i^2 \sim G_2, \quad G_2 \sim \mathrm{DP}(\alpha_2 G_{02})$$
Double DPM
I will discuss the fitting, applicability, flexibility, and limitations of such double semiparametric mixtures.
I will also compare these models with usual DP models via predictive model comparison criteria.
Dirichlet process
The Dirichlet process is a probability measure on the space of distributions (probability measures) G.
G ~ DP(α G0), where G0 is a probability measure and α > 0 is the precision parameter.
The Dirichlet process assigns positive mass to every open set of probabilities on support(G0).
Conjugacy: if Y1, ..., Yn ~ i.i.d. G and the prior is π(G) = DP(α G0), then the posterior is (G | Y) ~ DP(α G0 + n Fn), where Fn is the empirical distribution.
Polya Urn Scheme
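One consequence of the conjugacy result is that the posterior mean of G is a weighted average of the base measure and the empirical distribution, E[G(A) | Y] = (α G0(A) + n Fn(A)) / (α + n). The following is a minimal sketch of that calculation for sets of the form A = (-∞, t]. It assumes a standard normal G0, and the helper names `normal_cdf` and `posterior_mean_cdf` are hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(t, mean=0.0, sd=1.0):
    """CDF of N(mean, sd^2); stands in for the base measure G0 (an assumption)."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def posterior_mean_cdf(t, data, alpha, base_cdf=normal_cdf):
    """E[G((-inf, t]) | data] under a DP(alpha * G0) prior."""
    n = len(data)
    # n * Fn(t) is just the number of observations at or below t.
    empirical_count = sum(1 for y in data if y <= t)
    return (alpha * base_cdf(t) + empirical_count) / (alpha + n)


# Toy data (made up for illustration only).
value = posterior_mean_cdf(0.0, [0.1, 0.5, -1.0], alpha=2.0)
```

As α grows the estimate moves toward G0, and as n grows it moves toward the empirical distribution.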
Stick breaking and discreteness
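G ~ DP(α G0) implies G is almost surely discrete:
$$G = \sum_{j=1}^{\infty} q_j\,\delta_{\theta_j^{*}} \quad \text{with prob. 1 (Sethuraman and Tiwari 1982)}, \qquad \theta_1^{*}, \theta_2^{*}, \dots \sim \text{i.i.d. } G_0$$
Stick-breaking representation
$$q_1 = w_1, \quad w_1 \sim \mathrm{Beta}(1, \alpha)$$
$$q_2 = w_2 (1 - w_1), \quad w_2 \sim \mathrm{Beta}(1, \alpha)$$
$$q_3 = w_3 (1 - w_2)(1 - w_1), \quad w_3 \sim \mathrm{Beta}(1, \alpha)$$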
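The stick-breaking construction can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the sampler used in the talk: the series is cut off after a finite number of atoms, G0 is taken to be standard normal, and the function names are hypothetical.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Return the first num_atoms weights q_1, ..., q_M and the unbroken remainder."""
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(num_atoms):
        w = rng.betavariate(1.0, alpha)  # w_j ~ Beta(1, alpha)
        weights.append(w * remaining)    # q_j = w_j * prod_{l<j} (1 - w_l)
        remaining *= 1.0 - w
    return weights, remaining


def draw_truncated_dp(alpha, num_atoms, base_sampler, rng):
    """Draw atoms i.i.d. from G0 and pair them with stick-breaking weights."""
    weights, remaining = stick_breaking_weights(alpha, num_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(num_atoms)]
    return atoms, weights, remaining


rng = random.Random(0)
atoms, weights, remaining = draw_truncated_dp(
    alpha=2.0,
    num_atoms=50,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # G0 = N(0, 1), an assumption
    rng=rng,
)
```

The returned `remaining` is the mass left unassigned by the truncation; it shrinks quickly as the number of atoms grows.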
Bayes estimate from DP
The discrete nature of a random G from a DP leads to some disturbing features, such as this result from Diaconis and Freedman (1986).
Location model: y_i = θ + ε_i, i = 1, ..., n; θ has prior π(θ), such as a normal prior; ε_1, ..., ε_n ~ i.i.d. G.
G ~ DP(α G0), symmetrized, with G0 = Cauchy or t-distribution.
Then the posterior mean is an inconsistent estimate of θ.
Dirichlet process mixtures (DPM)
$$Y_i \sim f(y_i \mid \theta_i, x_i, \beta), \quad i = 1, \dots, n \quad \text{(such as a normal density)}$$
f is a known parametric distribution; x_i are possible covariates; β are other parameters.
$$\theta_1, \dots, \theta_n \sim \text{i.i.d. } G, \qquad G \sim \mathrm{DP}(\alpha G_0)$$
If we marginalize over θ_i, we obtain a semiparametric mixture
$$Y_i \sim \int f(y_i \mid \theta, x_i, \beta)\, dG(\theta)$$
where the mixing distribution G is random and follows DP(α G0).
DPM - clusters
Since G is almost surely discrete, θ_1, ..., θ_n form clusters:
θ_1 = θ_5 = θ_8 = θ_1^unique
θ_2 = θ_3 = θ_4 = θ_6 = θ_7 = θ_2^unique
etc.
The number of clusters, and the clusters themselves, are random.
$$Y_i \sim f(y_i \mid \theta_i, x_i, \beta), \qquad \theta_1, \dots, \theta_n \sim \text{i.i.d. } G, \qquad G \sim \mathrm{DP}(\alpha G_0)$$
DPM – MCMC
The Polya urn/marginalized sampler (Escobar 1994, Escobar & West 1995) samples θ_i one at a time from
π(θ_i | θ_{-i}, data)
Improvements, known as collapsed samplers, are proposed in MacEachern (1994, 1998), where, instead of sampling θ_i, only the cluster memberships of the θ_i are sampled.
For non-conjugate DPMs (where the sampling density f(y_i | θ_i) and the base measure G0 are not conjugate), various algorithms have been proposed.
Finite truncation and Blocked Gibbs
$$\theta_1, \dots, \theta_n \sim G = \sum_{j=1}^{M} q_j\,\delta_{\theta_j^{*}} \quad \text{instead of the DP, } G = \sum_{j=1}^{\infty} q_j\,\delta_{\theta_j^{*}}$$
$$Y_i \sim f(y_i \mid \theta_i, x_i, \beta)$$
With this finite truncation, it is now a finite mixture model with a stick-breaking structure on the q_j.
(θ_1, ..., θ_n) and (q_1, ..., q_M) can be updated in blocks (instead of one at a time as in the Polya urn sampler), which may provide better mixing.
Comments
In each iteration, the Polya urn/marginal sampler cycles through each observation and, for each, assigns its membership among a new cluster and the existing clusters.
The Polya urn sampler is also not straightforward to implement in non-linear (non-conjugate) problems or when the sample size n may not be fixed.
For the blocked sampler, on the other hand, the choice of the truncation level M is not well understood.
Model comparison in DPM models
Basu and Chib (2003) developed a method for computing Bayes factors/marginal likelihoods for DPMs.
This provided a framework for quantitative comparison of DPMs with competing parametric and semi/nonparametric models.
Log-marginal likelihoods and log-Bayes factors for a longitudinal AIDS trial (n = 467) with random coefficients having distribution G. Diagonal entries are log-marginal likelihoods; entries in parentheses are log-Bayes factors of the row model against the column model.

| | DPM | Student-t | Normal |
|---|---|---|---|
| DPM | -3477 | (76) | (62) |
| Student-t | | -3553 | (-14) |
| Normal | | | -3539 |
Marginal likelihood of DPM
Based on the basic marginal identity (Chib 1995):
log-posterior(θ) = log-likelihood(θ) + log-prior(θ) - log-marginal
log-marginal = log-likelihood(θ*) + log-prior(θ*) - log-posterior(θ*)
The posterior ordinate of the DPM is evaluated via prequential conditioning as in Chib (1995).
The likelihood ordinate of the DPM is evaluated from a (collapsed) sequential importance sampler.
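The log-Bayes factors reported for the AIDS trial are simply differences of the reported log-marginal likelihoods. A small sketch of that arithmetic, using only the three log-marginal values quoted in the slides (the function and variable names are hypothetical):

```python
# Log-marginal likelihoods as reported in the slides.
log_marginal = {"DPM": -3477.0, "Student-t": -3553.0, "Normal": -3539.0}


def log_bayes_factor(model_a, model_b, table=log_marginal):
    """Log-Bayes factor of model_a against model_b."""
    return table[model_a] - table[model_b]


pairs = [("DPM", "Student-t"), ("DPM", "Normal"), ("Student-t", "Normal")]
log_bfs = {pair: log_bayes_factor(*pair) for pair in pairs}
```

The three differences reproduce the parenthesized entries 76, 62, and -14.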
Double Dirichlet process mixtures (DDPM)
$$Y_i \sim f(y_i \mid \psi_i, \phi_i, x_i, \beta), \quad i = 1, \dots, n \quad \text{(such as a normal density)}$$
f is a known parametric distribution; x_i are possible covariates; β are other parameters.
$$\psi_1, \dots, \psi_n \sim \text{i.i.d. } G_\psi, \qquad G_\psi \sim \mathrm{DP}(\alpha_\psi G_{0\psi})$$
$$\phi_1, \dots, \phi_n \sim \text{i.i.d. } G_\phi, \qquad G_\phi \sim \mathrm{DP}(\alpha_\phi G_{0\phi})$$
Marginalization yields a double semiparametric mixture
$$Y_i \sim \iint f(y_i \mid \psi, \phi, x_i, \beta)\, dG_\psi(\psi)\, dG_\phi(\phi)$$
where the mixing distributions G_ψ and G_φ are random.
Two Biomarkers case: y1 and y2
$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \begin{pmatrix} \sigma_{1i}^2 & \rho\,\sigma_{1i}\sigma_{2i} \\ \rho\,\sigma_{1i}\sigma_{2i} & \sigma_{2i}^2 \end{pmatrix} \right), \quad i = 1, \dots, n$$
$$(\mu_{1i}, \sigma_{1i}^2) \sim \text{i.i.d. } G_1, \quad i = 1, \dots, n, \qquad G_1 \sim \mathrm{DP}(\alpha_1 G_{01}), \qquad G_{01} = N(\mu_{01}, \sigma_{01}^2) \times \text{Inv-Gamma}$$
$$(\mu_{2i}, \sigma_{2i}^2) \sim \text{i.i.d. } G_2, \quad i = 1, \dots, n, \qquad G_2 \sim \mathrm{DP}(\alpha_2 G_{02}), \qquad G_{02} = N(\mu_{02}, \sigma_{02}^2) \times \text{Inv-Gamma}$$
A simpler model: normal means only
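We generate n = 50 means μ_i = (ψ_i, φ_i) and then observations (y_i1, y_i2) from this double DPM model:
$$\begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \psi_i \\ \phi_i \end{pmatrix}, \Sigma \right), \quad i = 1, \dots, n$$
$$\psi_1, \dots, \psi_n \sim \text{i.i.d. } G_\psi, \qquad \phi_1, \dots, \phi_n \sim \text{i.i.d. } G_\phi$$
$$G_\psi \sim \mathrm{DP}(\alpha_\psi G_{0\psi}), \qquad G_\phi \sim \mathrm{DP}(\alpha_\phi G_{0\phi})$$
Σ has a Wishart prior.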
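A data-generating step of this kind can be sketched with two independent Polya urns, one per mean component. Everything numeric below is an assumption made for illustration (the precision parameters, the normal base measures, and an identity covariance in place of a Wishart-distributed Σ); the slides do not report the values actually used.

```python
import random


def polya_urn(n, alpha, base_sampler, rng):
    """Draw theta_1..theta_n from a DP(alpha * G0) prior with G integrated out."""
    draws = []
    for i in range(n):
        # New value from G0 with prob alpha / (alpha + i); otherwise copy an earlier draw.
        if rng.random() < alpha / (alpha + i):
            draws.append(base_sampler(rng))
        else:
            draws.append(rng.choice(draws))
    return draws


def simulate_double_dp(n, alpha_psi, alpha_phi, rng):
    """Independent DP draws for the two mean components, then normal observations."""
    psi = polya_urn(n, alpha_psi, lambda r: r.gauss(0.0, 3.0), rng)   # assumed G0 for psi
    phi = polya_urn(n, alpha_phi, lambda r: r.gauss(-4.0, 3.0), rng)  # assumed G0 for phi
    # Identity covariance is an assumption; the slides put a Wishart prior on Sigma.
    y = [(rng.gauss(a, 1.0), rng.gauss(b, 1.0)) for a, b in zip(psi, phi)]
    return psi, phi, y


rng = random.Random(1)
psi, phi, y = simulate_double_dp(n=50, alpha_psi=1.0, alpha_phi=1.0, rng=rng)
num_psi = len(set(psi))                # clusters in the first component
num_phi = len(set(phi))                # clusters in the second component
num_pairs = len(set(zip(psi, phi)))    # clusters of the bivariate mean
```

Because the two urns are independent, the bivariate means fall on a grid formed by the two sets of distinct values, so `num_pairs` can be as large as `num_psi * num_phi`.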
Double DPM
[Figure: two panels. Left: plot of the generated means mu (labels phi and psi). Right: the observations y, with y[,2] plotted against y[,1].]
[Figure: two plots of y and mu with clusters in different symbols, y[,2] against y[,1]. Left: single DPM in the bivariate mean vector. Right: double DPM in mean components.]
Model fitting
We fitted the Double DPM and the Bivariate DPM models to these data.
The Double DPM model can be fit by a two-stage Polya urn sampler or a two-stage blocked Gibbs sampler.
“Collapsing” can become more difficult.
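$$\text{Average MSE}_d = \frac{1}{n} \sum_{i=1}^{n} \left( E_{post}(\mu_{id}) - \mu_{id}^{true} \right)^2, \quad d = 1, 2$$
$$\text{Average MAD}_d = \frac{1}{n} \sum_{i=1}^{n} \left| E_{post}(\mu_{id}) - \mu_{id}^{true} \right|, \quad d = 1, 2$$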
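Given MCMC output, both summaries reduce to averaging the draws per subject and comparing with the known generating means. A minimal sketch for one component d, assuming the draws are stored as a list of per-iteration lists (a hypothetical layout) and using toy numbers:

```python
def posterior_means(draws):
    """draws[m][i] is the m-th MCMC draw of mu_id for subject i (component d fixed)."""
    num_draws = len(draws)
    n = len(draws[0])
    return [sum(draw[i] for draw in draws) / num_draws for i in range(n)]


def average_mse_mad(draws, true_values):
    """Average squared and absolute error of the posterior means against the truth."""
    means = posterior_means(draws)
    errors = [m - t for m, t in zip(means, true_values)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mad = sum(abs(e) for e in errors) / n
    return mse, mad


# Two draws for two subjects; posterior means are [2.0, 2.0].
mse, mad = average_mse_mad([[1.0, 2.0], [3.0, 2.0]], [1.0, 4.0])
```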

| Data generated from | Fitted model | MSE y1 | MSE y2 | MAD y1 | MAD y2 |
|---|---|---|---|---|---|
| Double DP | Double DP | 0.99 | 1.32 | 0.72 | 0.92 |
| Double DP | Bivariate DP in (y1,y2) | 1.02 | 4.29 | 0.83 | 1.73 |
| Bivariate DP | Double DP | 0.98 | 1.29 | 0.81 | 0.94 |
| Bivariate DP | Bivariate DP in (y1,y2) | 0.98 | 1.08 | 0.75 | 0.81 |
Wallace (asymmetric) criterion for comparing two clusters/partitions
Let S be the number of mean pairs which are in the same cluster in a MCMC posterior draw and also in the true clustering.
Let nk, k=1,..K be the number of means in cluster Ck in the MCMC draw.
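Then the Wallace asymmetric criterion for comparing these two clusterings is
$$\frac{S}{\sum_{k=1}^{K} n_k (n_k - 1)/2}$$
Average Wallace criterion:

| Fitted model | Data from Double DP: y1 | Data from Double DP: y2 | Data from Bivariate DP: y1 | Data from Bivariate DP: y2 |
|---|---|---|---|---|
| Double DP | 0.89 | 0.42 | 0.66 | 0.48 |
| Bivariate DP in (y1,y2) | 0.72 | 0.24 | 0.62 | 0.62 |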
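The criterion can be computed directly from two label vectors by counting pairs. The sketch below follows the definition given above (S over the number of co-clustered pairs in the MCMC draw); the function name and the choice to return NaN when the draw has no co-clustered pairs are assumptions.

```python
from itertools import combinations


def wallace_criterion(draw_labels, true_labels):
    """S / sum_k n_k (n_k - 1) / 2 for one MCMC draw against the true clustering."""
    n = len(draw_labels)
    same_in_both = 0  # S: pairs together in the draw and in the truth
    same_in_draw = 0  # equals sum_k n_k (n_k - 1) / 2 over clusters of the draw
    for i, j in combinations(range(n), 2):
        if draw_labels[i] == draw_labels[j]:
            same_in_draw += 1
            if true_labels[i] == true_labels[j]:
                same_in_both += 1
    if same_in_draw == 0:
        return float("nan")  # every cluster in the draw is a singleton
    return same_in_both / same_in_draw


# Draw clusters {0,1,2},{3,4}; true clusters {0,1},{2,3,4}.
value = wallace_criterion([0, 0, 0, 1, 1], [0, 0, 1, 1, 1])
```

Averaging this value over MCMC draws gives the "average Wallace" summary.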
Measurements on two biomarker proteins by Luminex panels
[Figure: scatter plot of CRP against IL1-beta.]
• Frozen paraffin-embedded tissues, pre and post surgery
• Luminex panel
• Nodal involvement
Two biomarker proteins
The bivariate DPM
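$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \Sigma_i \right), \quad i = 1, \dots, n$$
$$(\mu_i, \Sigma_i) \sim \text{i.i.d. } G, \quad i = 1, \dots, n, \qquad G \sim \mathrm{DP}(\alpha G_0), \qquad G_0 = N_2(\mu_0, \Sigma_0) \times \text{Inv-Wishart}$$
vs the Double DPM
$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N_2\!\left( \begin{pmatrix} \mu_{1i} \\ \mu_{2i} \end{pmatrix}, \begin{pmatrix} \sigma_{1i}^2 & \rho\,\sigma_{1i}\sigma_{2i} \\ \rho\,\sigma_{1i}\sigma_{2i} & \sigma_{2i}^2 \end{pmatrix} \right), \quad i = 1, \dots, n$$
$$(\mu_{1i}, \sigma_{1i}^2) \sim \text{i.i.d. } G_1, \quad i = 1, \dots, n, \qquad G_1 \sim \mathrm{DP}(\alpha_1 G_{01}), \qquad G_{01} = N(\mu_{01}, \sigma_{01}^2) \times \text{Inv-Gamma}$$
$$(\mu_{2i}, \sigma_{2i}^2) \sim \text{i.i.d. } G_2, \quad i = 1, \dots, n, \qquad G_2 \sim \mathrm{DP}(\alpha_2 G_{02}), \qquad G_{02} = N(\mu_{02}, \sigma_{02}^2) \times \text{Inv-Gamma}$$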
µpred
[Figure: trace plots (0 to 10000) of mu.pred[1] and mu.pred[2] under the Double DP and Bivariate DP models.]
ypred
[Figure: predictive density estimates of the two components of y.pred under the Double DP and Bivariate DP models.]
ypred
[Figure: bivariate predictive density of (y.pred[1], y.pred[2]) under the Double DP and Bivariate DP models.]
[Figure: log CPO by observation, Double DP vs Bivariate DP.]
log CPO_i = log f(y_i | y_{-i})
LPML = Σ_i log f(y_i | y_{-i})
Double DP = -1498.67
Bivariate DP = -1533.01
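The slides do not say how the CPO values were estimated. One common Monte Carlo approach, shown here only as a sketch, takes the harmonic mean of the likelihood ordinates f(y_i | parameters) across MCMC draws, computed on the log scale for stability. The input layout and function names are hypothetical.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def log_cpo(loglik_draws, i):
    """loglik_draws[m][i] = log f(y_i | parameters at MCMC draw m)."""
    # CPO_i is estimated by the harmonic mean of the likelihood ordinates,
    # so log CPO_i = -log( mean_m exp(-loglik) ).
    negated = [-draw[i] for draw in loglik_draws]
    return -log_mean_exp(negated)


def lpml(loglik_draws):
    """Log pseudo-marginal likelihood: the sum of log CPO_i over observations."""
    n = len(loglik_draws[0])
    return sum(log_cpo(loglik_draws, i) for i in range(n))


# Toy input: one observation, likelihood ordinates 0.5 and 0.25 at two draws.
toy = lpml([[math.log(0.5)], [math.log(0.25)]])
```

A larger LPML indicates better predictive fit, which is the sense in which the Double DP value above is preferred.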
Model comparison
I prefer to use marginal likelihood/ Bayes factor for model comparison.
The DIC (Deviance Information Criterion), as proposed in Spiegelhalter et al. (2002), can be problematic for missing data/random-effects/mixture models.
Celeux et al. (2006) proposed many different DICs for missing data models.
DIC3
I have earlier considered DIC3 (Celeux et al. 2006, Richardson 2002) in missing data and random effects models, which is based on the observed likelihood
$$DIC_3 = -4\, E_{\theta}\!\left[ \log p(y_{obs} \mid \theta) \mid y_{obs} \right] + 2 \log E_{\theta}\!\left[ p(y_{obs} \mid \theta) \mid y_{obs} \right],$$
$$\text{where } p(y_{obs} \mid \theta) = \int p(y_{obs}, \psi, \phi \mid \theta)\, d(\psi, \phi)$$
The integration over the latent parameters often has to be obtained numerically.
This is difficult in the present problem.
DIC9
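I am proposing to use DIC9, which is similar to DIC3 but is based on the conditional likelihood p(y_obs | ψ, φ, θ):
$$DIC_9 = -4\, E_{\psi,\phi,\theta}\!\left[ \log p(y_{obs} \mid \psi, \phi, \theta) \mid y_{obs} \right] + 2 \log E_{\psi,\phi,\theta}\!\left[ p(y_{obs} \mid \psi, \phi, \theta) \mid y_{obs} \right]$$

| Model | DIC9 |
|---|---|
| Double DP | 3018.3 |
| Bivariate DP | 3050.3 |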
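A criterion of the form -4 E[log p] + 2 log E[p] can be estimated from MCMC output once the conditional log-likelihood of the full data has been stored at each draw. The sketch below assumes exactly that input (one total log-likelihood per draw) and replaces both posterior expectations by averages over draws; it illustrates the formula and is not the implementation behind the reported values.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def dic_from_draws(total_loglik_draws):
    """-4 * mean(log p) + 2 * log(mean(p)), with means taken over MCMC draws."""
    mean_loglik = sum(total_loglik_draws) / len(total_loglik_draws)
    log_mean_lik = log_mean_exp(total_loglik_draws)
    return -4.0 * mean_loglik + 2.0 * log_mean_lik


# Toy input: likelihood values 0.5 and 0.25 at two draws.
toy_dic = dic_from_draws([math.log(0.5), math.log(0.25)])
```

When the log-likelihood is constant across draws at a value L, the expression reduces to -2L; variation across draws increases the value, which acts as the complexity penalty.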
Convergence rate results: Ghosal and Van Der Vaart (2001)
Normal location mixtures
Model: Y_i ~ i.i.d. p(y) = ∫ φ(y − μ) dG(μ), i = 1, ..., n, where φ is the standard normal density
G ~ DP(α G0), G0 is Normal
Truth: p0(y) = ∫ φ(y − μ) dF(μ)
Ghosal and Van Der Vaart (2001): Under some regularity conditions,
Hellinger distance (p, p0) → 0 "almost surely"
at the rate of (log n)^(3/2)/√n
Ghosal and Van Der Vaart (2001): results contd.
Bivariate DP location-scale mixture of normals
Y_i ~ i.i.d. p(y) = ∫ φ_σ(y − μ) dH(μ, σ), i = 1, ..., n; H ~ DP(α H0)
Ghosal and Van Der Vaart (2001): If H0 is Normal × {a compactly supported distribution}, then the convergence rate is
(log n)^(7/2)/√n
Double DP location-scale mixture of normals
Y_i ~ i.i.d. p(y) = ∫∫ φ_σ(y − μ) dG_1(μ) dG_2(σ), i = 1, ..., n
G_1 ~ DP(α_1 G_01), G_2 ~ DP(α_2 G_02)
Ghosal and Van Der Vaart (2001): If G_01 is Normal, G_02 is compactly supported, and the true density
p0(y) = ∫∫ φ_σ(y − μ) dF_1(μ) dF_2(σ) is also a double mixture, then
Hellinger distance (p, p0) → 0 at the rate of (log n)^(3/2)/√n
Interrater data
Agreement between 2 Raters (Melia and Diener-West 1994)
Each rater provides an ordinal rating on a scale of 1-5 (lowest to highest invasion) of the extent to which the tumor has invaded the eye, n = 885.

| Rater 2 (rows) / Rater 1 (columns) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 291 | 74 | 1 | 1 | 1 |
| 2 | 186 | 256 | 7 | 7 | 3 |
| 3 | 2 | 4 | 0 | 2 | 0 |
| 4 | 3 | 10 | 1 | 14 | 2 |
| 5 | 1 | 7 | 1 | 8 | 3 |
DPM multivariate ordinal model
Kottas, Muller and Quintana (2005)
$$P(\text{Rater 1} = j \text{ and Rater 2} = k \text{ for subject } i) = P(\gamma_{1,j-1} < Z_{1i} \le \gamma_{1,j},\ \gamma_{2,k-1} < Z_{2i} \le \gamma_{2,k})$$
$$Z_i = (Z_{1i}, Z_{2i}) \sim \text{i.i.d. } \int N_2(z \mid \mu, \Sigma)\, dG(\mu, \Sigma)$$
$$G \sim \mathrm{DP}(\alpha G_0)$$
Interrater agreement
The objective is to measure agreement between raters beyond what is possible by chance.
This is often measured by departure from independence, often specifically in the diagonals
Polychoric correlation of the latent bivariate normal Z has been used as a measure of association.
... of the latent bivariate normal mixtures?
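To make "departure from independence in the diagonals" concrete, the observed table can be compared with what independent raters with the same marginals would produce. The sketch below uses the n = 885 counts from the interrater table (rows are Rater 2, columns are Rater 1); the function name is hypothetical, and this is a descriptive calculation, not the model-based agreement measure developed in the talk.

```python
# Counts from the interrater table: rows are Rater 2, columns are Rater 1.
counts = [
    [291, 74, 1, 1, 1],
    [186, 256, 7, 7, 3],
    [2, 4, 0, 2, 0],
    [3, 10, 1, 14, 2],
    [1, 7, 1, 8, 3],
]


def agreement_summary(table):
    """Empirical cell proportions, observed agreement, and agreement under independence."""
    total = sum(sum(row) for row in table)
    size = len(table)
    cell = [[x / total for x in row] for row in table]
    row_marginal = [sum(row) for row in cell]
    col_marginal = [sum(cell[r][c] for r in range(size)) for c in range(size)]
    observed = sum(cell[d][d] for d in range(size))
    # Diagonal mass if the two ratings were independent with these marginals.
    chance = sum(row_marginal[d] * col_marginal[d] for d in range(size))
    return total, cell, observed, chance


total, cell, observed, chance = agreement_summary(counts)
```

The gap between `observed` and `chance` is the raw agreement beyond what independence would give.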
Latent class model (Agresti & Lang 1993)
C latent classes
$$p_{ijk} = P(\text{Rater 1} = j \text{ and Rater 2} = k \text{ for subject } i) = p_{cjk} \quad \text{if subject } i \text{ belongs to class } c,\ c = 1, \dots, C$$
Ratings of the two raters within a class are independent:
$$p_{cjk} = p_{c1j}\, p_{c2k}$$
$$p_{ijk} = \sum_{c=1}^{C} I(\xi_i = c)\, p_{c1j}\, p_{c2k}$$
where ξ_i denotes the latent class membership of subject i.
Mixtures of Double DPMs
For each latent class, we model pc1j and pc2k by two separate univariate ordinal probit DPM models
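$$p_{ijk} = \sum_{c=1}^{C} I(\xi_i = c)\, p_{c1j}\, p_{c2k}$$
$$p_{clj} = P(\gamma_{cl,j-1} < Z_{cli} \le \gamma_{cl,j}), \quad j = 1, \dots, 5, \quad l = 1, 2, \quad c = 1, \dots, C$$
$$Z_{cli} \sim \text{i.i.d. } \int N(z \mid \mu, \sigma^2)\, dG_{cl}(\mu, \sigma^2)$$
$$G_{cl} \sim \mathrm{DP}(\alpha_{cl}\, G_{0,cl})$$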
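For a given (discrete) realization of the mixing distribution, the category probabilities of a univariate ordinal probit mixture are weighted differences of normal CDFs at the cutoffs. A minimal sketch, assuming the realization is supplied as a finite list of (weight, mean, standard deviation) atoms and the cutoffs include -∞ and +∞; all names and numeric values are illustrative.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2), with explicit handling of infinite cutoffs."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ordinal_probs(cutoffs, atoms):
    """P(cutoffs[j-1] < Z <= cutoffs[j]) when Z follows a finite normal mixture.

    cutoffs: increasing list starting at -inf and ending at +inf.
    atoms:   list of (weight, mean, sd) with weights summing to one.
    """
    probs = []
    for lower, upper in zip(cutoffs[:-1], cutoffs[1:]):
        p = sum(
            w * (normal_cdf(upper, m, s) - normal_cdf(lower, m, s))
            for w, m, s in atoms
        )
        probs.append(p)
    return probs


# Five rating categories and a two-atom mixture (values chosen arbitrarily).
cutoffs = [-math.inf, -1.0, 0.0, 1.0, 2.0, math.inf]
atoms = [(0.7, -0.5, 1.0), (0.3, 1.5, 0.5)]
probs = ordinal_probs(cutoffs, atoms)
```

Multiplying the two raters' probability vectors within a latent class then gives the class-specific joint table under within-class independence.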
Computational issue
The "sample size" n_c in latent group c is not fixed. This causes a problem for the Polya urn/marginal sampler, which works with a fixed sample size.
Do, Muller, Tang (2005) suggested a solution to this problem by jointly sampling the latent θ_il = (μ_il, σ_il²) and the latent rating class membership ξ_i.
$$p_{ijk} = \sum_{c=1}^{C} I(\xi_i = c)\, p_{c1j}\, p_{c2k}$$
Estimated cell probabilities
For each cell, the first line is the observed proportion; the next two lines are two sets of model-based estimates, with intervals in parentheses. Rows are Rater 2, columns are Rater 1.

| Rater 2 | Entry | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 1 | observed | .3288 | .0836 | .0011 | .0011 | .0011 |
| 1 | estimate 1 | .3261 (.2945, .3590) | .0869 (.0696, .1078) | .0013 (.0001, .0044) | .0020 (.0003, .0056) | .0007 (0, .0027) |
| 1 | estimate 2 | .3286 (.3060, .3524) | .0821 (.0701, .09510) | .003 (.0008, .007) | .0022 (.0007, .0047) | .0009 (.0001, .0023) |
| 2 | observed | .2102 | .2893 | .0079 | .0079 | .0034 |
| 2 | estimate 1 | .2135 (.1858, .2428) | .2826 (.2521, .3141) | .0082 (.0031, .0152) | .0069 (.0022, .0143) | .0031 (.0007, .0075) |
| 2 | estimate 2 | .2098 (.1856, .2334) | .2853 (.2700, .2997) | .0076 (.0036, .0129) | .0103 (.0074, .0138) | .0033 (.0017, .0053) |
| 3 | observed | .0023 | .0045 | 0 | .0023 | 0 |
| 3 | estimate 1 | .0023 (.004, .0065) | .0055 (.0017, .0107) | .0016 (.0003, .0038) | .0022 (.004, .0062) | .0008 (.0, .0032) |
| 3 | estimate 2 | .0029 (.0009, .0068) | .006 (.0025, .0114) | .0003 (0, .0009) | .0015 (.0002, .0039) | .0004 (0, .0011) |
| 4 | observed | .0034 | .0113 | .0011 | .0158 | .0023 |
| 4 | estimate 1 | .0042 (.0012, .0094) | .0102 (.0042, .0187) | .0023 (.0004, .006) | .0143 (.0066, .024) | .0028 (.0006, .0069) |
| 4 | estimate 2 | .0031 (.001, .0059) | .0124 (.0092, .0157) | .0016 (.0002, .0044) | .0127 (.0089, .0168) | .0033 (.0014, .0057) |
| 5 | observed | .0011 | .0079 | .0011 | .009 | .0034 |
| 5 | estimate 1 | .0012 (.0001, .0041) | .0071 (.0026, .0140) | .0019 (.0003, .0054) | .0083 (.0034, .0153) | .0039 (.0009, .009) |
| 5 | estimate 2 | .0019 (.0006, .004) | .0086 (.0055, .0129) | .0011 (.0002, .0032) | .0086 (.0053, .0118) | .0026 (.0011, .0045) |
Marginal probability estimates
Latent Group 1

| Rater | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Rater B | 0.9374 | .0559 | .0049 | .0011 | .0008 |
| Rater A | 0.6691 | .3246 | .0036 | .0014 | .0012 |

| Rater A (rows) / Rater B (columns) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0.6261 | 0.03859 | 0.003273 | 0.00068 | 0.000473 |
| 2 | 0.3058 | 0.01675 | 0.001511 | 0.000326 | 0.000225 |
| 3 | 0.003405 | 0.000194 | 2.38E-05 | 6.81E-06 | 5.06E-06 |
| 4 | 0.001237 | 0.000121 | 4.37E-05 | 1.86E-05 | 1.79E-05 |
| 5 | 0.000882 | 0.000212 | 7.31E-05 | 3.12E-05 | 3.10E-05 |

Marginal probability estimates
Latent Group 2

| Rater | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Rater B | 0.1432 | .7432 | .0226 | .0706 | .0205 |
| Rater A | 0.1562 | .7139 | .0188 | .0661 | .0451 |

| Rater A (rows) / Rater B (columns) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0.02196 | 0.1264 | 0.002716 | 0.003709 | 0.001382 |
| 2 | 0.1107 | 0.5625 | 0.0138 | 0.02051 | 0.006414 |
| 3 | 0.002445 | 0.01194 | 0.000555 | 0.003084 | 0.000764 |
| 4 | 0.005113 | 0.02517 | 0.003294 | 0.02575 | 0.006747 |
| 5 | 0.002968 | 0.01718 | 0.002219 | 0.01753 | 0.005212 |
Joint mean covariance modeling
Trial with n=200 patients who had acute MI within 28 days of baseline and are depressed/low social support
Underwent 6 months of usual care (control) or individual and/or group-based cognitive behavioral counseling (treatment).
Response y = depression (Beck Depression Inventory) measured at 0, 182, 365, 548, 913, and 1278 days (but actually at irregular intervals)
Covariates: treatment, family history, age, sex, BMI, ...
Intermittent missing responses, missing covariates, ...
[Figure: Beck Depression Inventory against visit days.]
Model
Model for the mean
$$E(y_{ij}) = X_{ij}\beta + Z_{ij} b_i$$
where Z_ij b_i is a function of the visit time t_ij with random effects b_i = (b_0i, b_1i, b_2i).
Model for the covariance
$$y_i = (y_{i1}, \dots, y_{i6}) \sim N_6(\mu_i, \Sigma_i)$$
Pourahmadi (1999) and Pourahmadi and Daniels (2002) use a Cholesky decomposition of the covariance, which allows one to use a log-linear model for the variances and "linear regression" for the off-diagonal terms.
Modeling the covariance
We assume
$$\Sigma_i = \mathrm{diag}(\sigma_{i1}^2, \dots, \sigma_{i6}^2)$$
$$\log \sigma_{ij}^2 = X_{ij}\lambda + Z_{ij} b_i^{*}$$
$$Z_{ij} b_i^{*} = b_{0i}^{*} + b_{1i}^{*}\, t_{ij} + b_{2i}^{*}\, t_{ij}^2$$
[Figure: variance against visits 1 to 6.]
Mean and variance level random effects
$$b_i = (b_{0i}, b_{1i}, b_{2i}) \quad \text{and} \quad b_i^{*} = (b_{0i}^{*}, b_{1i}^{*}, b_{2i}^{*}), \quad i = 1, \dots, n$$
• A joint DPM for the two random effects together, which allows clustering at the patient level:
$$(b_i, b_i^{*}) \sim \text{i.i.d. } G, \quad G \sim \mathrm{DP}$$
• Or a double DPM, that is, independent DPMs separately for each of the two random effects, which allows separate clustering at the mean and variance levels:
$$b_i \sim \text{i.i.d. } G_1, \quad G_1 \sim \mathrm{DP}$$
$$b_i^{*} \sim \text{i.i.d. } G_2, \quad G_2 \sim \mathrm{DP}$$
Most frequentist and parametric Bayesian analyses use the latter independence among the mean and variance level random effects.
[Figure: log-likelihood against iterations for the Double DP and Bivariate DP models.]
[Figure: log CPO by observation, Double DP vs Bivariate DP.]
Fixed effects estimates

| Effect | Double DP | Bivariate DP | Normal |
|---|---|---|---|
| .slope | -1.12 (-2.39, -0.16) | -0.89 (-1.97, 0.24) | -0.26 (-.65, .28) |
| .change | -3.29 (-4.36, -1.77) | -2.98 (-4.4, -1.11) | -4.057 (-4.72, -3.43) |
| .slope | 0.18 (-.006, .38) | 0.19 (.014, .382) | -0.16 (-0.32, 0.05) |
| .quad | -0.005 (-0.027, 0.021) | -.006 (-.027, .013) | 0.033 (0.009, 0.051) |
Pseudo marginal likelihood
LPML = Σ_i log f(y_i | y_{-i})

| Model | LPML |
|---|---|
| Double DP | -2495.290 |
| Bivariate DP | -2503.916 |
| Normal | -2530.92 |
Summary
Double DP mixtures may add a level of structure to mixture modeling with DP.
They produce interesting "product-clustering".
They are applicable to specific problems that may benefit from this structure.