Guide to Handling Missing Information Contacting researchers Algebraic recalculations, conversions...

Guide to Handling Missing Information

• Contacting researchers

• Algebraic recalculations, conversions and approximations

• Imputation method (substituting missing data)

Imputation Method

- When recalculation is not possible, e.g. no standard deviation is reported for a study

- Use available data from other studies or from another meta-analysis

a. Within study imputation

b. Multiple imputations

Imputation Method

Within-study imputation

Method 1. (Means)

$\tilde{SD}_j = \bar{X}_j \left( \frac{\sum_{i}^{k} SD_i}{\sum_{i}^{k} \bar{X}_i} \right)$

$\tilde{SD}_j$ = imputed standard deviation (SD) for study j, whose SD is missing

$\bar{X}_j$ = mean from the study with the missing SD

$\sum_{i}^{k} SD_i$ = sum of all known SDs from the other studies

$\sum_{i}^{k} \bar{X}_i$ = sum of the means from the studies other than j

Assumptions

• Assumes the SD-to-mean ratio is on the same scale for all studies

- Experimental scales can differ tremendously between taxonomic groups or experimental designs

$\tilde{SD}_j = \bar{X}_j \left( \frac{\sum_{i}^{k} SD_i}{\sum_{i}^{k} \bar{X}_i} \right)$
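A minimal sketch of the means-based imputation, to make the arithmetic concrete. The function name `impute_sd_from_means` and the numbers are hypothetical and not taken from the slides; the code only applies the ratio of summed SDs to summed means described for Method 1.

```python
def impute_sd_from_means(mean_j, known_sds, known_means):
    """Impute a missing SD as mean_j * (sum of known SDs / sum of known means)."""
    if len(known_sds) != len(known_means) or not known_sds:
        raise ValueError("need matching, non-empty lists of SDs and means")
    total_mean = sum(known_means)
    if total_mean == 0:
        raise ValueError("sum of known means is zero; the ratio is undefined")
    return mean_j * sum(known_sds) / total_mean


# Hypothetical studies with complete information (same measurement scale).
known_means = [10.0, 20.0, 30.0]
known_sds = [2.0, 4.0, 6.0]

# Study j reports a mean of 15.0 but no SD.
sd_j = impute_sd_from_means(15.0, known_sds, known_means)
print(sd_j)
```

Because the result scales directly with the mean of study j, the sketch is only sensible when all studies are measured on comparable scales, which is the assumption stated above.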

Method 2. (Sample size)

- Regression techniques

- For a study that reports its sample size but is missing the information needed to calculate the pooled SD (required for Hedges' d)

$\tilde{s}_j = \alpha + \beta n_j$

α = intercept and β = slope of the linear regression of s on n, fitted to the studies with complete information

n_j = observed sample size of the study with missing data

Assumptions

• Assumes n (the observed sample size of the study with missing data) is a good predictor of s.
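The regression approach can be sketched as an ordinary least-squares fit of s on n followed by a prediction at n_j. The helper names `fit_line` and `impute_sd_from_n` and the data are hypothetical; the slides do not say which fitting procedure was used, so plain least squares is an assumption here.

```python
def fit_line(n_values, s_values):
    """Ordinary least-squares fit of s = alpha + beta * n; returns (alpha, beta)."""
    k = len(n_values)
    if k != len(s_values) or k < 2:
        raise ValueError("need at least two complete (n, s) pairs")
    mean_n = sum(n_values) / k
    mean_s = sum(s_values) / k
    sxx = sum((n - mean_n) ** 2 for n in n_values)
    if sxx == 0:
        raise ValueError("all sample sizes are identical; slope is undefined")
    sxy = sum((n - mean_n) * (s - mean_s) for n, s in zip(n_values, s_values))
    beta = sxy / sxx
    alpha = mean_s - beta * mean_n
    return alpha, beta


def impute_sd_from_n(n_j, alpha, beta):
    """Predict the missing s for a study with sample size n_j."""
    return alpha + beta * n_j


# Hypothetical studies with complete information on n and s.
complete_n = [10, 20, 30, 40]
complete_s = [3.0, 5.0, 7.0, 9.0]

alpha, beta = fit_line(complete_n, complete_s)
s_j = impute_sd_from_n(25, alpha, beta)
print(alpha, beta, s_j)
```

The prediction is a single value, so the uncertainty in the fitted α and β is not carried into the imputed s.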

Method 3. (Number of studies)

$\tilde{s}_j = \frac{\sum_{i}^{K} s_i \sqrt{n_i}}{K \sqrt{n_j}}$

K = number of studies with complete information on s and n (n = sample size of an individual study)

Method 4. Follmann et al. (1992), Furukawa et al. (2006)

$\tilde{s}_j = \frac{\sqrt{\sum_{i}^{k} (n_i - 1)\,\sigma_i^2}}{\sqrt{\sum_{i}^{k} (n_i - 1)}}$

σ² = variance; n = sample size of an individual study
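As an illustration of Method 4, the imputed s is the square root of the variances of the complete studies pooled with weights (n_i - 1). The function name `pooled_sd` and the example values are hypothetical.

```python
import math


def pooled_sd(sample_sizes, variances):
    """Pooled SD: sqrt( sum((n_i - 1) * var_i) / sum(n_i - 1) )."""
    if len(sample_sizes) != len(variances) or not sample_sizes:
        raise ValueError("need matching, non-empty lists of n and variance")
    numerator = sum((n - 1) * v for n, v in zip(sample_sizes, variances))
    denominator = sum(n - 1 for n in sample_sizes)
    if denominator <= 0:
        raise ValueError("total degrees of freedom must be positive")
    return math.sqrt(numerator / denominator)


# Hypothetical studies with reported variances.
sizes = [11, 21]
variances = [4.0, 9.0]

s_j = pooled_sd(sizes, variances)
print(s_j)
```

Larger studies contribute more to the pooled value, so one large study with an unusual variance can dominate the imputation.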

Assumptions

• Some degree of homogeneity among the observed SDs and means (X̄) across studies

• Assumes information is missing at random and not because of reporting bias (non-random)

- Imputations retain their original units.

- Large variation among estimates will bias imputations.

Multiple imputations

• Use a random sampling approach

• Average over repeated sampling for the missing data

• Overall imputed synthesis

Advantage of multiple imputations

• Variability is explicitly modeled, so the imputed value is not treated as a true observation

• e.g. single imputation with $\tilde{s}_j = \alpha + \beta n_j$ does not account for the error associated with α or β.

Methods: Multiple imputations

• Various methods: use maximum likelihood or Bayesian models.

• Requires specialized software

• e.g. Hot deck

- To calculate pooled s when several SD values are missing

- A random sample of s is drawn, with replacement, from the possible s

- The process is repeated, with replacement from the possible s

- Repeat until we get "m" complete data sets

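Methods: Hot deck

- Calculate the effect size δ_l for each of the m data sets

- Calculate the variance σ²(δ_l) for each data set

Pooled effect size

$\bar{\delta} = \frac{\sum_{l=1}^{m} \delta_l}{m}$

Variance

$\sigma^2(\bar{\delta}) = \frac{\sum_{l=1}^{m} \sigma^2(\delta_l)}{m} + \left(1 + \frac{1}{m}\right) \frac{\sum_{l=1}^{m} (\delta_l - \bar{\delta})^2}{m - 1}$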

Rubin and Schenker (1991)

- If 30% of data are missing -> m = 3

- If 50% of data are missing -> m = 5
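The hot-deck procedure and the pooling step can be sketched as follows. All names (`hot_deck_datasets`, `weighted_mean_effect`, `pool_rubin`) and the data are hypothetical. The per-data-set analysis here is a simple inverse-variance weighted mean, used only as a stand-in for whatever effect-size model the synthesis actually uses. The pooled variance is the average within-data-set variance plus (1 + 1/m) times the between-data-set variance.

```python
import random


def hot_deck_datasets(sds, m, seed=0):
    """Return m copies of `sds`, each None replaced by a donor SD drawn with replacement."""
    donors = [s for s in sds if s is not None]
    if not donors:
        raise ValueError("at least one observed SD is needed as a donor")
    rng = random.Random(seed)
    return [[s if s is not None else rng.choice(donors) for s in sds]
            for _ in range(m)]


def weighted_mean_effect(effects, sds, ns):
    """Stand-in analysis: inverse-variance weighted mean and its variance."""
    weights = [n / sd ** 2 for sd, n in zip(sds, ns)]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, 1.0 / total


def pool_rubin(estimates, variances):
    """Combine m estimates: mean effect, and within plus inflated between variance."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((d - pooled) ** 2 for d in estimates) / (m - 1)
    return pooled, within + (1 + 1 / m) * between


# Hypothetical studies; two of the four SDs are missing (None).
effects = [0.5, 0.8, 0.3, 0.6]
sds = [1.0, None, 1.5, None]
ns = [20, 25, 30, 15]
m = 5

results = [weighted_mean_effect(effects, filled, ns)
           for filled in hot_deck_datasets(sds, m)]
pooled_effect, pooled_variance = pool_rubin([r[0] for r in results],
                                            [r[1] for r in results])
print(pooled_effect, pooled_variance)
```

The between-data-set term is what separates this from single imputation: it adds the variability caused by not knowing the missing SDs to the final variance.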

Non-parametric analyses and bootstrapping

• Alternative to Hedges' d

• Uses a weighting scheme

• Does not require SD

• E.g. log response ratio

$\ln R = \ln\left(\frac{\bar{X}_T}{\bar{X}_C}\right)$

T = treatment, C = control

If the sample size is available but there is no SD, weight each study by

$\frac{n_T n_C}{n_T + n_C}$

which is the inverse of a simplified estimate of the variance σ²(ln R).
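A short sketch of the log response ratio with sample-size weights, for studies that report means and sample sizes but no SD. The function names and data are hypothetical, and the bootstrapping step mentioned in the slide title is not shown.

```python
import math


def log_response_ratio(mean_t, mean_c):
    """lnR = ln(mean_T / mean_C); both means must be positive."""
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("means must be positive to take the log ratio")
    return math.log(mean_t / mean_c)


def sample_size_weight(n_t, n_c):
    """Weight n_T * n_C / (n_T + n_C), used when no SD is available."""
    return n_t * n_c / (n_t + n_c)


def weighted_mean_lnr(studies):
    """studies: iterable of (mean_t, mean_c, n_t, n_c) tuples."""
    pairs = [(log_response_ratio(mt, mc), sample_size_weight(nt, nc))
             for mt, mc, nt, nc in studies]
    total_weight = sum(w for _, w in pairs)
    return sum(lnr * w for lnr, w in pairs) / total_weight


# Hypothetical studies: (treatment mean, control mean, n_T, n_C).
studies = [(20.0, 10.0, 10, 10), (15.0, 10.0, 20, 20)]
mean_lnr = weighted_mean_lnr(studies)
print(mean_lnr)
```

The weights depend only on sample size, so larger studies count for more even though their precision is not measured directly.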

Effects of Imputation

• No standardized method for imputation -> bias, e.g. Rubin and Schenker (1991)

• Appropriateness of imputed data can be evaluated using a sensitivity analysis

• Benefits despite potential bias

- Improved variance estimate (i.e. smaller CI) over exclusion

- May potentially improve representation of null studies
