Guide to Handling Missing Information
• Contacting researchers
• Algebraic recalculations, conversions and approximations
• Imputation method (substituting missing data)
Imputation Method
- When recalculations are not possible (e.g. no standard deviation reported for a study)
- Use available data from other studies or other meta-analyses
a. Within-study imputation
b. Multiple imputations
Imputation Method
Within-study imputation
Method 1 (Means)
$$\widetilde{SD}_j = \bar{X}_j \, \frac{\sum_{i}^{k} SD_i}{\sum_{i}^{k} \bar{X}_i}$$
$\widetilde{SD}_j$ = Standard deviation (SD) for missing data from study j
$\bar{X}_j$ = Mean from study with missing SD
$\sum_{i}^{k} SD_i$ = Summation of all known SD from different studies
$\sum_{i}^{k} \bar{X}_i$ = Summation of means from different studies other than j
Assumptions
• Assumes the SD-to-mean ratio is on the same scale for all studies
- Experimental scales can differ tremendously between different taxonomic groups or experimental designs
$$\widetilde{SD}_j = \bar{X}_j \, \frac{\sum_{i}^{k} SD_i}{\sum_{i}^{k} \bar{X}_i}$$
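As a minimal sketch of Method 1, the Python snippet below scales the mean of the study with the missing SD by the ratio of summed known SDs to summed known means. The function name `impute_sd_from_means` and all numbers are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def impute_sd_from_means(mean_j, known_sds, known_means):
    """Impute a missing SD for study j from its mean (Method 1).

    known_sds and known_means come from the k studies that report both
    values; study j itself is excluded from these lists.
    """
    if not known_sds or len(known_sds) != len(known_means):
        raise ValueError("need matching, non-empty lists of SDs and means")
    total_mean = sum(known_means)
    if total_mean == 0:
        raise ValueError("sum of known means must be non-zero")
    # Ratio of summed SDs to summed means across the complete studies.
    sd_to_mean_ratio = sum(known_sds) / total_mean
    return mean_j * sd_to_mean_ratio


# Hypothetical data: three complete studies and one study with mean 20.0.
known_sds = [2.0, 3.0, 5.0]
known_means = [10.0, 15.0, 25.0]
sd_j = impute_sd_from_means(20.0, known_sds, known_means)
print(f"imputed SD for study j: {sd_j:.3f}")
```

Because the imputed value is just the study mean times a shared ratio, the sketch inherits the assumption listed above: it is only sensible when all studies are measured on comparable scales.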
Method 2 (Sample size)
- Regression techniques
- Study reports sample size but is missing the information needed to calculate the pooled SD (required for Hedges' d).
$$\tilde{s}_j = \alpha + \beta n_j$$
α = intercept and β = slope of the linear regression of s against n
$n_j$ = observed sample size of the study with missing data
Assumptions
• Assumes n (observed sample size of the study with missing data) is a good predictor of s.
$$\tilde{s}_j = \alpha + \beta n_j$$
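A minimal sketch of Method 2 follows, assuming α and β are estimated by ordinary least squares from the studies that report both s and n (the slides do not say which fitting procedure is intended). The function names `fit_line` and `impute_s_from_n` and the data are hypothetical.

```python
def fit_line(ns, ss):
    """Ordinary least-squares fit of s = alpha + beta * n."""
    k = len(ns)
    if k < 2 or k != len(ss):
        raise ValueError("need at least two (n, s) pairs of equal length")
    mean_n = sum(ns) / k
    mean_s = sum(ss) / k
    sxx = sum((n - mean_n) ** 2 for n in ns)
    if sxx == 0:
        raise ValueError("sample sizes must not all be identical")
    sxy = sum((n - mean_n) * (s - mean_s) for n, s in zip(ns, ss))
    beta = sxy / sxx
    alpha = mean_s - beta * mean_n
    return alpha, beta


def impute_s_from_n(n_j, ns, ss):
    """Predict the missing s for a study with sample size n_j."""
    alpha, beta = fit_line(ns, ss)
    return alpha + beta * n_j


# Hypothetical complete studies: sample sizes and their pooled SDs.
ns = [10, 20, 30, 40]
ss = [2.0, 3.0, 4.0, 5.0]
s_j = impute_s_from_n(25, ns, ss)
print(f"imputed s for n_j = 25: {s_j:.3f}")
```

The prediction is a single point estimate, so any uncertainty in the fitted α and β is ignored here.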
Method 3 (No. of studies)
$$\tilde{s}_j = \frac{\sum_{i}^{K} s_i \sqrt{n_i}}{K \sqrt{n_j}}$$
K = number of studies with complete information on s and n (sample size of individual study)
Method 4. Follmann et al. (1992), Furukawa et al. (2006)
$$\tilde{s}_j = \sqrt{\frac{\sum_{i}^{k} (n_i - 1)\,\sigma_i^2}{\sum_{i}^{k} (n_i - 1)}}$$
$\sigma^2$ = variance
n = sample size of individual study
Assumptions
• Some degree of homogeneity among the observed SD and X across studies
• Assumes information is missing at random and not due to reporting biases (non-random)
- Imputations retain their original units.
- Large variations among estimates will bias imputations.
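The pooled-SD imputation of Method 4 can be sketched as below: each known variance is weighted by its degrees of freedom (n - 1), and the square root of the weighted average is used as the imputed value. The function name `impute_pooled_sd` and the example values are hypothetical.

```python
import math


def impute_pooled_sd(ns, sds):
    """Impute a missing SD as the df-weighted pooled SD of complete studies."""
    if not ns or len(ns) != len(sds):
        raise ValueError("need matching, non-empty lists of n and SD")
    if any(n < 2 for n in ns):
        raise ValueError("each study needs a sample size of at least 2")
    # Numerator: sum of (n_i - 1) * variance_i over the complete studies.
    weighted_var = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    # Denominator: total degrees of freedom.
    total_df = sum(n - 1 for n in ns)
    return math.sqrt(weighted_var / total_df)


# Hypothetical complete studies.
ns = [11, 21]
sds = [2.0, 4.0]
print(f"imputed SD: {impute_pooled_sd(ns, sds):.4f}")
```

Larger studies dominate the pooled value, which is one reason the homogeneity assumption above matters.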
Multiple imputations
• Use random sampling approach
• Average repeated sampling for missing data
Overall imputed synthesis
Advantage of multiple imputations
• Variability is explicitly modeled, so an imputed value is not treated as a true observation
• e.g. a single imputation with $\tilde{s}_j = \alpha + \beta n_j$ does not account for the error associated with α or β.
Methods: Multiple imputations
• Various methods: use maximum likelihood or Bayesian models.
• Requires specialized software
• e.g. Hot deck
- To calculate pooled s when several SD values are missing
- Random sample of s drawn with replacement from possible s
- Process repeated with replacement from possible s
- Repeat until we get "m" complete data sets
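Methods: Hot deck
- Calculate the effect size $\delta_l$ for each of the m data sets
- Calculate the variance $\sigma^2(\delta_l)$ for each set
Pooled effect size
$$\bar{\delta} = \frac{\sum_{l=1}^{m} \delta_l}{m}$$
Variance
$$\sigma^2(\bar{\delta}) = \frac{\sum_{l=1}^{m} \sigma^2(\delta_l)}{m} + \left(1 + \frac{1}{m}\right) \frac{\sum_{l=1}^{m} (\delta_l - \bar{\delta})^2}{m - 1}$$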
Rubin and Schenker (1991)
If 30% data missing -> m = 3
If 50% data missing -> m = 5
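The hot deck procedure and the pooling formulas can be sketched as follows. This is an illustration under stated assumptions, not a full analysis: `hot_deck_fill` and `pool_imputations` are hypothetical helper names, missing SDs are marked with `None`, and the per-data-set effect sizes and variances passed to the pooling step are made-up numbers standing in for a real meta-analysis of each completed data set.

```python
import random


def hot_deck_fill(sds, rng):
    """Replace each missing SD (None) with a draw, with replacement,
    from the SDs that were actually observed."""
    observed = [s for s in sds if s is not None]
    if not observed:
        raise ValueError("at least one observed SD is required")
    return [s if s is not None else rng.choice(observed) for s in sds]


def pool_imputations(deltas, variances):
    """Combine m effect sizes and their variances across imputed data sets."""
    m = len(deltas)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 effect sizes with matching variances")
    pooled = sum(deltas) / m
    within = sum(variances) / m                                # mean variance
    between = sum((d - pooled) ** 2 for d in deltas) / (m - 1)  # spread of estimates
    return pooled, within + (1 + 1 / m) * between


# Build m = 3 completed data sets from hypothetical SDs.
rng = random.Random(0)  # fixed seed so the draws are reproducible
sds = [1.2, None, 0.9, None, 1.5]
completed = [hot_deck_fill(sds, rng) for _ in range(3)]

# Hypothetical effect sizes and variances, one pair per completed data set.
pooled, total_var = pool_imputations([0.40, 0.50, 0.60], [0.04, 0.05, 0.06])
print(f"pooled effect size: {pooled:.3f}, variance: {total_var:.4f}")
```

The second term of the variance grows when the m estimates disagree, which is how the extra uncertainty from imputation is carried into the final synthesis.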
Non-parametric analyses and bootstrapping
• Alternative to Hedges' d
• Using weighting scheme
• Does not require SD
• E.g. log response ratio
$$\ln R = \ln\left(\frac{\bar{X}_T}{\bar{X}_C}\right)$$
If sample size is available but no SD, use the inverse of a simplified estimate of variance:
$$\frac{1}{\sigma^2(\ln R)} = \frac{n_T \, n_C}{n_T + n_C}$$
T = treatment
C = control
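A minimal sketch of the log response ratio with sample-size weights is shown below. The function names and the tuple layout `(mean_t, mean_c, n_t, n_c)` are hypothetical, and the weighted mean is only one plausible use of the weighting scheme mentioned above.

```python
import math


def log_response_ratio(mean_t, mean_c):
    """lnR = ln(treatment mean / control mean); both means must be positive."""
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("lnR is only defined for positive means")
    return math.log(mean_t / mean_c)


def sample_size_weight(n_t, n_c):
    """Inverse of the simplified variance estimate: n_T * n_C / (n_T + n_C)."""
    return (n_t * n_c) / (n_t + n_c)


def weighted_mean_lnr(studies):
    """Weighted mean lnR over studies given as (mean_t, mean_c, n_t, n_c)."""
    weights = [sample_size_weight(n_t, n_c) for _, _, n_t, n_c in studies]
    lnrs = [log_response_ratio(m_t, m_c) for m_t, m_c, _, _ in studies]
    return sum(w * r for w, r in zip(weights, lnrs)) / sum(weights)


# Hypothetical studies with no reported SD.
studies = [(20.0, 10.0, 10, 10), (10.0, 10.0, 30, 30)]
print(f"weighted mean lnR: {weighted_mean_lnr(studies):.4f}")
```

No SD enters the calculation at any point, so studies that would otherwise be excluded can still contribute.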
Effects of Imputation
• No standardized method for imputation -> bias, e.g. Rubin and Schenker (1991)
• Appropriateness of imputed data can be evaluated using a sensitivity analysis
• Benefits despite potential bias
- Improved variance estimate (i.e. smaller CI) over exclusion
- May potentially improve representation of null studies