Missing Data
description
Transcript of Missing Data
![Page 1: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/1.jpg)
Missing DataMichael C. Neale
International Workshop on Methodology for Genetic Studies of Twins and Families Boulder CO 2006
Virginia Institute for Psychiatric and Behavioral GeneticsVirginia Commonwealth University
Vrije Universiteit Amsterdam
![Page 2: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/2.jpg)
Various forms of missing data
Twins volunteer to participate or not Twins are obtained from hospital records Attrition during longitudinal studies Equipment failure:
Systematic (if < value then scored zero) Random
Structured interviews Don’t bother asking question 2 if response to
Q1 is no Designs to increase statistical power
![Page 3: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/3.jpg)
Plan of Talk
1. Review likelihood & Mx facilities2. Conceptual view of ascertainment
1. Ordinal2. Continuous
3. Ascertainment in twin studies1. Ordinal2. Continuous
4. Missing data1. Systematic (if < value then scored zero)2. Random
5. Structured interviews1. Don’t bother asking question 2 if response to Q1 is
no
![Page 4: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/4.jpg)
Likelihood approach
Usual nice properties of ML remain
Flexible
Simple principle Consideration of possible outcomes Re-normalization
May be difficult to compute
Advantages & Disadvantages
![Page 5: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/5.jpg)
Maximum Likelihood Estimates
Asymptotically unbiased
Minimum variance of all asymptotically unbiased estimators
Invariant to transformations
a2 = (a)2
Have nice properties
^ ^
![Page 6: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/6.jpg)
Example: Two Coin Toss3 outcomes
HH HT/TH TTOutcome
0
0.5
1
1.5
2
2.5Frequency
Probability i = freq i / sum (freqs)
![Page 7: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/7.jpg)
Example: Two Coin Toss2 outcomes
HH HT/TH TTOutcome
0
0.5
1
1.5
2
2.5Frequency
Probability i = freq i / sum (freqs)
![Page 8: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/8.jpg)
Non-random ascertainment
Probability of observing TT globally 1 outcome from 4 = 1/4
Probability of observing TT if HH is not ascertained 1 outcome from 3 = 1/3
or 1/4 divided by 'ascertainment correction' of 3/4 = 1/3
Example
![Page 9: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/9.jpg)
Correcting for ascertainmentUnivariate continuous case; only subjects > t ascertained
0 1 2 3 4-1
-2
-3
-4
0
0.1
0.2
0.3
0.4
0.5
xi
likelihood
t
![Page 10: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/10.jpg)
Correcting for ascertainmentUnivariate continuous case; only subjects > t ascertained
0 1 2 3 4-1
-2
-3
-4
0
0.1
0.2
0.3
0.4
0.5
xi
likelihood
t
![Page 11: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/11.jpg)
Correcting for ascertainment
Without ascertainment, we compute
With ascertainment, the correction is
Dividing by the realm of possibilities
Does likelihood increase or decrease after correction?
pdf, (ij,ij), at observed value xidivided by:
-(ij,ij) dx = 1
t-(ij,ij) dx < 1
![Page 12: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/12.jpg)
Ascertainment in PracticeBG Studies
Studies of patients and controls
Patients and relatives Twin pairs with at least one affected
Single ascertainment pi 0 Complete ascertainment pi = 1 Incomplete 0 < pi <1
Linkage studies Affected sib pairs, DSP etc Multiple affected families
pi = probability that someone is ascertained given that they are affected
![Page 13: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/13.jpg)
Correction depends on model
1 Correction independent of model parameters: "sample weights"
2 Correction depends on model parameters: weights vary during optimization
In twin data almost always case 2 continuous data binary/ordinal data
![Page 14: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/14.jpg)
High correlation Concordant>t
tx
0 1
tx ty(x,y) dy dx
1
0
ty
![Page 15: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/15.jpg)
Medium correlation
tx
ty
0 1
tx ty(x,y) dy dx
1
0
![Page 16: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/16.jpg)
Low correlation
ty
0 1
tx ty(x,y) dy dx
1
0
tx
![Page 17: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/17.jpg)
Two approaches for twin data in Mx
Contingency table approach Automatic Limited to two variable case
Raw data approach Manual Multivariate Moderator / Covariates
![Page 18: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/18.jpg)
Contingency Table Case
Feed program contingency table as usual
Use -1 for frequency for non-ascertained cells
Correction for ascertainment handled automatically
Binary data
![Page 19: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/19.jpg)
At least one twin affected
tx
ty
1
0
0 1
1-- -(x,y) dy dxtx tyAscertainment Correction
![Page 20: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/20.jpg)
Ascertain iff twin 1 > t
tx
ty
1
0
0 1
ty -(x,y) dx dy ty (y) dy =
Twin 1
Twin 2
![Page 21: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/21.jpg)
Contingency Tables
Use -1 for cells not ascertained
Can be used for ordinal case
Need to start thinking about thresholds Supply estimated population values Estimate them jointly with model
![Page 22: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/22.jpg)
Mx SyntaxClassical Twin Study: Contingency Table
ftp://views.vcu.edu/pub/mx/examples/ncbook2/categor.mx
G1: Model parameters Data Calc NGroups=4 Begin Matrices; X Lower 1 1 Free Y Lower 1 1 Free Z Lower 1 1 Free W Lower 1 1 End Matrices; ! parameters are fixed by default, unless declared free Begin Algebra; A= X*X'; C= Y*Y'; E= Z*Z'; D= W*W'; End Algebra:End
![Page 23: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/23.jpg)
Mx SyntaxGroup 2
G2: young female MZ twin pairs Data Ninput=2 CTable 2 2 329 83 95 83 Begin Matrices= Group 1 T full 2 1 Free End Matrices; Covariances A+C+D+E | A+C+D _ A+C+D | A+C+D+E ; Thresholds T ; Options RSidualEnd
![Page 24: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/24.jpg)
Mx SyntaxGroup 3
G3: young female DZ twin pairs Data Ninput=2 CTable 2 2 201 94 82 63
Begin Matrices= Group 1 H Full 1 1 Q Full 1 1 T Full 2 1 Free End Matrices; Matrix H .5 Matrix Q .25 Start .6 All
Covariances A+C+D+E | H@A+C+Q@D _ H@A+C+Q@D | A+C+D+E / Thresholds T ; Options RSidual NDecimals=4End
![Page 25: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/25.jpg)
Mx SyntaxGroup 4
Group 4: constrain variance to 1 Constraint NI=1 Begin Matrices = Group 1 ; I unit 1 1 End Matrices;
Constraint I = A+C+E+D ; Option MultipleEnd Specify 2 t 8 8 Specify 3 t 8 8End
![Page 26: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/26.jpg)
Raw data approach
Correction not always necessary ML MCAR/MAR Prediction of missingness
Correct through weight formula
![Page 27: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/27.jpg)
What can we do with Mx?Normal theory likelihood function for raw data in Mx
j=1
ln Li = fi ln [ wj g(xi,ij,ij)]m
xi - vector of observed scores on n subjects
:ij - vector of predicted means
Gij - matrix of predicted covariances - functions of parameters
![Page 28: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/28.jpg)
Likelihood Function Itself
Example: Normal pdf
The guts of it
j=1
ln Li = fi ln [ wij g(xi,ij,ij)]m
g(xi,ij,ij) - likelihood function
![Page 29: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/29.jpg)
Normal distribution (ij,ij)Likelihood is height of the curve
0 1 2 3 4-1
-2
-3
-4
0
0.1
0.2
0.3
0.4
0.5
xi
likelihood
![Page 30: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/30.jpg)
Weighted mixture of modelsFinite mixture distribution
j=1
m
j = 1....m modelswij Weight for subject i model j
e.g., Segregation analysis
ln Li = fi ln [ wij g(xi,ij,ij)]
![Page 31: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/31.jpg)
Mixture of Normal Distributions Two normals, propotions w1 & w2, different means
But Likelihood Ratio not Chi-Squared - what is it?
0 1 2 3-1
-2
-3
-4
0
0.1
0.2
0.3
0.4
0.5
1
xi
g
2
w1 x l1
w2 x l2
![Page 32: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/32.jpg)
General Likelihood Function
Sample frequencies binary data Sometimes 'sample weights' Might also vary over model j
Finally the frequencies
fi - frequency of case i
j=1
ln Li = fi ln [ wj g(xi,ij,ij)]m
![Page 33: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/33.jpg)
General Likelihood Function
Model for Means can differ Model for Covariances can differ Weights can differ Frequencies can differ
Things that may differ over subjects
i = 1....n subjects (families)j=1
ln Li = fi ln [ wij g(xi,ij,ij)]m
![Page 34: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/34.jpg)
Raw Ordinal Data Syntax Read in ordinal file
May use frequency command to save space
Weight uses \mnor function \mnor(R_M_U_L_K)
R - covariance matrix (p x p) M - mean vector (1xp) U - upper threshold (1xp) L - lower threshold (1xp) K - indicator for type of integration in each dimension (1xp)
0: L=-4; 1: U=+4 2: Iu; 3: L=-4 U=4
L
![Page 35: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/35.jpg)
Mx SyntaxG1: Model parameters Data Calc NGroups=4 Begin Matrices; X Lower 1 1 Free Y Lower 1 1 Free Z Lower 1 1 Free W Lower 1 1 End Matrices; ! parameters are fixed by default, unless declared free Begin Algebra; A= X*X'; C= Y*Y'; E= Z*Z'; D= W*W'; End Algebra:End
![Page 36: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/36.jpg)
Mx SyntaxG2: MZ twin pairs Data Ninput=3 Ordinal File=mz.frq Labels T1 T2 Freq Definition Freq ; Begin Matrices= Group 1 T full 2 1 Free F full 1 1 ! Frequency End Matrices; Specify F Freq Covariances A+C+D+E | A+C+D _ A+C+D | A+C+D+E ; Thresholds T ; Frequency F; Options RSidualEnd
![Page 37: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/37.jpg)
Mx SyntaxG3: DZ twin pairs Data Ninput=3 Labels T1 T2 Freq Ordinal File=dz.frq Definition Freq ;
Begin Matrices= Group 1 H Full 1 1 Q Full 1 1 T Full 2 1 Free F full 1 1 ! Frequency End Matrices; Specify F Freq Matrix H .5 Matrix Q .25 Start .6 All
Covariances A+C+D+E | H@A+C+Q@D _ H@A+C+Q@D | A+C+D+E / Thresholds T ; Frequency F ; Options RSidual NDecimals=4End
![Page 38: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/38.jpg)
Mx Syntax
Group 4: constrain variance to 1 Constraint NI=1 Begin Matrices = Group 1 ; I unit 1 1 End Matrices;
Constraint I = A+C+E+D ; Option MultipleEnd Specify 2 t 8 9 Specify 3 t 8 9End
![Page 39: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/39.jpg)
Ascertainment additional commands
Why inverse of J and K?
Begin Algebra; M=(A+C+E|A+C_A+C|A+C+E); N=(A+C+E|h@A+C_h@A+C|A+C+E); J=I-\mnor(M_Z_T_T_Z); ! Z = [0 0] K=I-\mnor(N_Z_T_T_Z); ! DZ caseEnd Algebra;
Weight J~; ! for MZ groupWeight K~; ! DZ group
![Page 40: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/40.jpg)
Correcting for ascertainment
Multivariate selection: multiple integrals double integral for ASP four double integrals for EDAC
Use (or extend) weight formula
Precompute in a calculation groupunless they vary by subject
Linkage studies
![Page 41: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/41.jpg)
When ascertainment is NOT necessary
Various flavors of missing data mechanisms
MCAR: Missing completely at random
MAR: Missing at random
NMAR: Not missing at random
Little & Rubin Terminology
![Page 42: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/42.jpg)
Simulation: 3 types of missing data
Selrand: MCAR missingness function of independent random
variable
Selonx: MAR missingness predicted by other measured
variable in analysis + MCAR
Selony: NMAR missingness mechanism related to ''residual"
variance in dependent variable
![Page 43: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/43.jpg)
Method Simulate bivariate normal data X,Y
Sigma = 1 .5 .5 1
Mu = 0, 0
Make some variables missing Generate independent random normal variable, Z, if Z>0 then Y missing If X>0 then Y missing If Y>0 then Y missing
Estimate elements of Sigma & Mu
Constrain elements to population values 1,.5, 0 etc
Compare fit
Ideally, repeat multiple times and see if expected 'null' distribution emerges
![Page 44: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/44.jpg)
Method Check to see if estimates ‘look right’
Quantify ‘look right’
Test model with parameters fixed to population values
Look at chi-squared
Repeat!
Examine distribution of chi-squareds
![Page 45: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/45.jpg)
R simulation 'model'
1.00
A
1.00
C
1.00
E
S1 S2
sqrt(1-r) sqrt(r) sqrt(r) sqrt(1-r)
![Page 46: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/46.jpg)
R simulation script: selrand.Rr<-0.5 #Correlation (r)N<-1000 #Number of pairs of scoresoutfile<-"selrand.rec" #output file for the simulation
vals<-matrix(rnorm((N*3)),,3); #3 random normals per pair of observed scores
rootr<-sqrt(r); #Square root of rroot1mr<-sqrt(1-r); #Square root of (1-r)
#Data generation follows: pairs<-matrix(cbind(rootr*vals[,1]+root1mr*vals[,2],rootr*vals[,1] +root1mr*vals[,3]),,2)
# pairs now contains pairs of scores drawn from population # with unit variances and correlation r
![Page 47: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/47.jpg)
# Now to make some of the scores missing, using a loopfor (i in 1:N) { x<-rnorm(1,0,1) # x is an independent random number from N(0,1)if (x>0) pairs[i,2] <- NA}write.table(pairs,file=outfile,row.names=FALSE,na=".",col.names=FALSE)
R simulation script: selrand.R
![Page 48: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/48.jpg)
R simulation script: selonx.R
# Now to make some of the scores missing, using a loopfor (i in 1:N) { if (pairs[i,1]>0) pairs[i,2] <- NA}write.table(pairs,file=outfile,row.names=FALSE,na=".",col.names=FALSE)
# note how missingness of 2nd column of pairs depends on # value of 1st column
![Page 49: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/49.jpg)
R simulation script: selony.R
# Now to make some of the scores missing, using a loopfor (i in 1:N) { if (pairs[i,2]>0) pairs[i,2] <- NA}write.table(pairs,file=outfile,row.names=FALSE,na=".",col.names=FALSE)
# note how missingness of 2nd column of pairs depends on # value of 2nd column itself
![Page 50: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/50.jpg)
Mx ScriptRather basic, like Monday morning
Estimate pop cov matrix of X&Y, with Y observed iff X>0 Data ngroups=1 ninput=2 Rectangular file=selonx.rec Begin Matrices; a sy 2 2 free ! covariance of x,y m fu 1 2 free ! mean of x,y End Matrices; Means M / Covariance A / matrix a 1 .3 1 bound .1 2 a 1 1 a 2 2 option rs mu issatend
fix all matrix 1 a 1 .5 1matrix 1 m0 0end
![Page 51: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/51.jpg)
Mx Scripts & Data
Check output: Summary statistics (obs means) Estimated means & covariance matrices Difference in fit between estimated values and
population values
Interpretation?
F:\mcn\2006\sel
![Page 52: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/52.jpg)
ML estimation under different missingness mechanisms
Missingness mean x mean y var x cov xy var y LR Chisq
MCAR (rand) MLE
<sample>
MAR (on x) MLE
<sample>
NMAR (on y) MLE
<sample>
![Page 53: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/53.jpg)
ML estimation under different missingness mechanisms
Missingness mean x mean y var x cov xy var yLR
Chisq
MCAR (rand) MLE
-0.0116 -0.1 1.0505 0.4998 0.8769 6.492
sample -0.0116 -0.0919 1.0505 0.8839
MAR (on x) MLE
0.0048 0.0998 1.0084 0.4481 1.1025 5.768
sample 0.0014 0.4437 1.0084 0.9762
NMAR (on y) MLE
-0.0204 0.6805 0.9996 0.1356 0.2894 227.262
sample 0.0448 0.7373 0.9996 0.2851
![Page 54: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/54.jpg)
BG Studies: Screen + Examination
Bivariate analysis of screen & exam No ascertainment correction required Example: all pairs where at least one screens
positive are examined Works for continuous & ordinal
Undersampling: some proportion of pairs concordant negative for screen are also examined Ascertainment correction required Different correction for screen -- vs +-/-+/++
Only a subset, selected on basis of screen, are examined
![Page 55: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/55.jpg)
Other examples of necessarily missing data
Consider: Measures of abuse/dependence in non-users Age at onset in the healthy Symptoms of schizophrenia in non-
schizophrenics
How do we know whether, e.g., Drug use is related to drug abuse Age at onset predicts severity
![Page 56: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/56.jpg)
Two binary variables X and Y
Both X and Y observed XY Outcomes: 00, 01, 10, 11
But Y observed only if X=1 XY Outcomes 0?, 10, 11
Only three cell frequencies “0/1/2” No use / Use with no abuse / Use and abuse
No variance in use when abuse is observed
Amnesiac researcher, forgot to get data from relatives
![Page 57: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/57.jpg)
Two binary variables in twins
Twin 1No Yes
No
Yes
No Yes
No
Yes
No Yes
No
Yes
Twin 2
1 0 1 0
1 0 1 1
1 1 1 0
1 1 1 1
0 0 0 0
0 0 0 1
0 1 0 0
0 1 0 1
0 0 1 0
0 0 1 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 1 0 0
1 1 0 1
Enlightened researcher, good memory, has data from relatives
![Page 58: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/58.jpg)
Censored case Y iff X=1
Twin 1No Yes
No
Yes
No Yes
No
Yes
No Yes
No
Yes
Twin 2
0 ? 0 ?
0 ? 1 0
0 ? 1 1
1 0 0 ? 1 1 0 ?
1 0 1 0
1 0 1 1
1 1 1 0
1 1 1 1
![Page 59: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/59.jpg)
Can estimate parameters!
• Correlation for Initiation• Correlation for
Dependence• Correlation Twin 1
initiation with Twin 2 Dependence (“proxy”)
• Indirectly assess correlation between initiation and dependence
A Twin 1
A Twin 2
B Twin 1
B Twin 2
b
b
VA VB*
VA VB*
rB*rA
![Page 60: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/60.jpg)
Multivariate case: Stem and probe Stem = Yes
STEM P2 P3 P4 P5 P6
F1
l6l1
![Page 61: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/61.jpg)
Stem and probe itemsStem = No
Probes Missing
STEM P2 P3 P4 P5 P6
F1
l6l1
![Page 62: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/62.jpg)
Proxy information from cotwin identifies the model
STEM P2 P3 P4 P5 P6
F1
l6l1
STEM P2 P3 P4 P5 P6
F2
l6l1
R > 0
![Page 63: Missing Data](https://reader036.fdocuments.net/reader036/viewer/2022062322/568145db550346895db2dd1a/html5/thumbnails/63.jpg)
Conclusion
Be careful when designing studies with non-random ascertainment
Usually possible to correct
In principle, heritability should not change
In practice, it might