Missing data & how to handle it.pptx

Transcript of Missing data & how to handle it.pptx

  • 7/24/2019 Missing data & how to handle it.pptx

    1/32

    Missing data & how to handle it

    Arooj Arshad

    PhD Scholar

  • 7/24/2019 Missing data & how to handle it.pptx

    2/32

    Goals

    Discuss ways to evaluate and understand missing data

    Discuss common missing data methods

    Know the advantages and disadvantages of common methods

    Treatment of the missing data

  • 7/24/2019 Missing data & how to handle it.pptx

    3/32

    Missing data can occur for many reasons:

    Participants can fail to respond to questions (legitimately or illegitimately; more on that later).

    Equipment and data collecting or recording mechanisms can malfunction.

    Subjects can withdraw from studies before they are completed.

    Data entry errors can occur.

  • 7/24/2019 Missing data & how to handle it.pptx

    4/32

    Difference between missing and legitimate missing data

  • 7/24/2019 Missing data & how to handle it.pptx

    5/32

    Methods for analyzing missing data require assumptions about the nature of the data and about the reasons for the missing observations that are often not acknowledged.

    When researchers use missing data methods without carefully considering the assumptions required of that method, they run the risk of obtaining biased and misleading results. Reviewing the stages of data collection, data preparation, data analysis, and interpretation of results will highlight the issues that researchers must consider in making a decision about how to handle missing data in their work.

  • 7/24/2019 Missing data & how to handle it.pptx

    6/32

    Point to remember

    All researchers should examine their data for missingness, and researchers wanting the best (i.e., the most replicable and generalizable) results from their research need to be prepared to deal with missing data in the most appropriate and desirable way possible.

  • 7/24/2019 Missing data & how to handle it.pptx

    7/32

    Missing Data Mechanisms

    Missing Completely at Random (MCAR)

    Probability of missing data on Y is unrelated to Y and X.

    Example: the reporting of income by the respondents, when nonresponse is unrelated to income or to any other variable.

    Checked with the help of Little's MCAR test.

    Missing at Random (MAR)

    Probability of missing data on Y is related to X, but not to Y itself once X is taken into account.

    Example: for really sick patients, clinicians may not draw blood for routine labs.

    Not Missing at Random (NMAR)

    Probability of missing data on Y depends on the value of Y itself.

    Example: respondents with high income are less likely to report income.
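
The three mechanisms can be made concrete with a small simulation. This is a minimal sketch, not part of the original slides: the variables `x` (age) and `y` (income), the missingness probabilities, and the helper `observed_mean` are all illustrative assumptions. It compares the mean of the observed `y` values with the mean of the full data under each mechanism.

```python
import random

random.seed(0)

# Hypothetical data: x = age, y = income (names and numbers are illustrative).
n = 10_000
x = [random.gauss(40, 10) for _ in range(n)]
y = [1000 + 50 * xi + random.gauss(0, 300) for xi in x]


def mean(values):
    return sum(values) / len(values)


def observed_mean(values, keep):
    """Mean of the values whose keep flag is True (the observed cases)."""
    return mean([v for v, k in zip(values, keep) if k])


# MCAR: every case has the same 30% chance of being missing.
keep_mcar = [random.random() > 0.3 for _ in range(n)]

# MAR: the chance of missing y depends only on the observed x.
keep_mar = [random.random() > (0.6 if xi > 40 else 0.1) for xi in x]

# NMAR: the chance of missing y depends on the value of y itself.
y_median = sorted(y)[n // 2]
keep_nmar = [random.random() > (0.6 if yi > y_median else 0.1) for yi in y]

full = mean(y)
m_mcar = observed_mean(y, keep_mcar)
m_mar = observed_mean(y, keep_mar)
m_nmar = observed_mean(y, keep_nmar)

print(f"full data: {full:8.1f}")
print(f"MCAR     : {m_mcar:8.1f}")
print(f"MAR      : {m_mar:8.1f}")
print(f"NMAR     : {m_nmar:8.1f}")
```

In this setup the observed mean stays close to the full-data mean only under MCAR. Under MAR the unadjusted mean is also off, but the bias can be corrected by using `x`; under NMAR it cannot be corrected from the observed data alone.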

  • 7/24/2019 Missing data & how to handle it.pptx

    8/32

    Missing Data Consequences

    Bias

    The estimate systematically deviates from the quantity of interest.

    No bias if the data are MCAR, but bias can occur when the data are not MCAR.

    Variance

    Missing data can sometimes lead to wrong standard errors.

    Wrong study conclusions about the relationship of variables to outcomes.

  • 7/24/2019 Missing data & how to handle it.pptx

    9/32

    Commonly-Used Missing Data Handling Methods

  • 7/24/2019 Missing data & how to handle it.pptx

    10/32

    Commonly-Used Missing Data Methods

    Deletion Methods: listwise/complete case deletion, pairwise deletion

    Single Imputation Methods: mean/mode substitution, dummy variable method, single regression

    Model-Based Methods: maximum likelihood, multiple imputation

  • 7/24/2019 Missing data & how to handle it.pptx

    11/32

    Deletion Method

  • 7/24/2019 Missing data & how to handle it.pptx

    12/32

    Listwise Deletion (Complete Case Analysis)

    Only analyze cases with complete data, dropping every case that has a missing value on any variable in the analysis.

    When a researcher is estimating a model, such as a linear regression, most statistical packages use listwise deletion by default.
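
A minimal sketch of listwise deletion on a handful of made-up survey records. The field names (`age`, `income`, `score`), the use of `None` for a missing value, and the helper `listwise_delete` are assumptions for illustration only.

```python
# Hypothetical survey records; None marks a missing value.
rows = [
    {"age": 25, "income": 3200, "score": 7},
    {"age": 31, "income": None, "score": 6},
    {"age": 47, "income": 5100, "score": None},
    {"age": 52, "income": 4800, "score": 9},
    {"age": None, "income": 2900, "score": 5},
]


def listwise_delete(records, variables):
    """Keep only the cases that are complete on every listed variable."""
    return [r for r in records if all(r[v] is not None for v in variables)]


complete = listwise_delete(rows, ["age", "income", "score"])
print(len(rows), "cases before,", len(complete), "after")
```

Each of the three incomplete cases is missing only one value, yet all of its observed values are discarded as well. In pandas, `DataFrame.dropna()` with default arguments has the same effect on a data frame.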

  • 7/24/2019 Missing data & how to handle it.pptx

    13/32

    Listwise Deletion (Complete Case Analysis)

    Advantages

    Ease of implementation.

    Comparability across analyses.

    Disadvantages

    Reduces statistical power (because it lowers n; a researcher cannot anticipate whether an adequate amount of data will remain for the analysis).

    Doesn't use all information.

    Estimates may be biased if the data aren't MCAR (complete case analysis assumes that the observed complete cases are a random sample of the originally targeted sample, or in Rubin's (1976) terminology, that the missing data are MCAR).

  • 7/24/2019 Missing data & how to handle it.pptx

    14/32

    Pairwise Deletion (Available Case Analysis)

    Analysis with all cases in which the variables of interest are present.

    Advantages

    Keeps as many cases as possible for each analysis.

    Uses all information possible with each analysis.

    Disadvantage

    Can't compare analyses because the sample is different each time.
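
The changing sample can be seen by counting the cases available for each pair of variables. This is a sketch on made-up records; the field names and the helper `pairwise_cases` are hypothetical.

```python
from itertools import combinations

# Hypothetical survey records; None marks a missing value.
rows = [
    {"age": 25, "income": 3200, "score": 7},
    {"age": 31, "income": None, "score": 6},
    {"age": 47, "income": 5100, "score": None},
    {"age": 52, "income": 4800, "score": 9},
    {"age": None, "income": 2900, "score": 5},
    {"age": 38, "income": 4100, "score": None},
]


def pairwise_cases(records, a, b):
    """Cases that are observed on both variables a and b."""
    return [r for r in records if r[a] is not None and r[b] is not None]


pair_n = {}
for a, b in combinations(["age", "income", "score"], 2):
    pair_n[(a, b)] = len(pairwise_cases(rows, a, b))
    print(f"{a} and {b}: n = {pair_n[(a, b)]}")

# For comparison: cases complete on all three variables (listwise deletion).
listwise_n = len([r for r in rows if None not in r.values()])
print("listwise: n =", listwise_n)
```

Every pair keeps more cases than listwise deletion would, but each pair is based on a different subset of respondents. pandas `DataFrame.corr()` also excludes missing values pair by pair.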

  • 7/24/2019 Missing data & how to handle it.pptx

    15/32

    Single Imputation Methods

  • 7/24/2019 Missing data & how to handle it.pptx

    16/32

    Single Imputation Methods

    Mean/mode substitution

    Dummy variable control

    Conditional mean substitution

  • 7/24/2019 Missing data & how to handle it.pptx

    17/32

    Mean/Mode Substitution

    Replace missing value with sample mean or mode.

    Run analyses as if complete case analysis.

    Advantages

    Can use complete case analysis methods.

    Disadvantages

    Reduces variability.

    Weakens covariance and correlation estimates in the data (because it ignores the relationship between variables).
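
Both disadvantages show up in a quick simulation. This is a sketch under assumed conditions (simulated data, 40% of `y` deleted completely at random); the helper `corr` and all variable names are illustrative.

```python
import random
import statistics

random.seed(1)

# Simulated data with a true correlation of about 0.8 between x and y.
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 0.6) for xi in x]


def corr(a, b):
    """Pearson correlation of two equally long lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5


# Delete about 40% of y completely at random (None marks a missing value).
y_missing = [yi if random.random() > 0.4 else None for yi in y]

# Mean substitution: fill every gap with the mean of the observed values.
y_mean = statistics.fmean(yi for yi in y_missing if yi is not None)
y_filled = [yi if yi is not None else y_mean for yi in y_missing]

var_full, var_filled = statistics.variance(y), statistics.variance(y_filled)
corr_full, corr_filled = corr(x, y), corr(x, y_filled)

print(f"variance:    full {var_full:.2f}, mean-filled {var_filled:.2f}")
print(f"correlation: full {corr_full:.2f}, mean-filled {corr_filled:.2f}")
```

Every imputed case sits exactly at the mean, so it adds nothing to the spread of `y` and carries no information about `x`.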

  • 7/24/2019 Missing data & how to handle it.pptx

    18/32

    Dummy Variable Adjustment

    Create an indicator for missing value (

  • 7/24/2019 Missing data & how to handle it.pptx

    19/32

    Regression Imputation

    Replaces missing values with the predicted score from a regression equation.

    Advantage:

    Uses information from observed data

    Disadvantages:

    Overestimates model fit and correlation estimates

    Weakens variance
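
A sketch of single regression imputation on simulated data, assuming a simple linear regression of `y` on a fully observed `x`. The helpers `fit_line` and `corr` and all variable names are illustrative, not taken from the slides.

```python
import random
import statistics

random.seed(2)

n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 0.6) for xi in x]

# Delete about 40% of y completely at random (None marks a missing value).
y_missing = [yi if random.random() > 0.4 else None for yi in y]


def fit_line(pairs):
    """Least-squares slope and intercept for y regressed on x."""
    mx = statistics.fmean(p[0] for p in pairs)
    my = statistics.fmean(p[1] for p in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    slope = sxy / sxx
    return slope, my - slope * mx


def corr(a, b):
    """Pearson correlation of two equally long lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5


# Fit the regression on the complete cases, then predict the missing y values.
complete = [(xi, yi) for xi, yi in zip(x, y_missing) if yi is not None]
slope, intercept = fit_line(complete)
y_filled = [yi if yi is not None else intercept + slope * xi
            for xi, yi in zip(x, y_missing)]

var_full, var_filled = statistics.variance(y), statistics.variance(y_filled)
corr_full, corr_filled = corr(x, y), corr(x, y_filled)

print(f"variance:    full {var_full:.2f}, regression-filled {var_filled:.2f}")
print(f"correlation: full {corr_full:.2f}, regression-filled {corr_filled:.2f}")
```

The imputed values fall exactly on the regression line, with no residual scatter, which is why the filled-in data look more strongly related and less variable than the real data.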

  • 7/24/2019 Missing data & how to handle it.pptx

    20/32

    Model-Based Methods

  • 7/24/2019 Missing data & how to handle it.pptx

    21/32

    Model-Based Methods

    Maximum likelihood using the EM algorithm

    Multiple imputation

    These methods share two assumptions: that the joint distribution of the data is multivariate normal, and that the missing data mechanism is ignorable.

  • 7/24/2019 Missing data & how to handle it.pptx

    22/32

    Identifies the set of parameter values that produces the highest log-likelihood.

    ML estimate: value that is most likely to have resulted in the observed data

    Conceptually, the process is the same with or without missing data

    Advantages:

    Uses full information (both complete cases and incomplete cases) to calculate the log likelihood

    Unbiased parameter estimates with MCAR/MAR data

    Disadvantages:

    SEs biased downward; can be adjusted by using the observed information matrix
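
A minimal EM sketch for one specific case, chosen here as an assumption rather than taken from the slides: two jointly normal variables, `x` fully observed and `y` missing at random depending on `x`. All names and numbers are illustrative. The E-step replaces each missing `y` and `y` squared with its conditional expectation given `x`; the M-step re-estimates the mean, variance, and covariance from those expected values.

```python
import random

random.seed(3)

# Simulated data: the true mean of y is 2.0.
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 0.8 * xi + random.gauss(0, 0.6) for xi in x]

# MAR: y is more likely to be missing when the observed x is high.
y_obs = [None if random.random() < (0.7 if xi > 0 else 0.1) else yi
         for xi, yi in zip(x, y)]

# x is complete, so its ML mean and variance are fixed.
mu_x = sum(x) / n
s_xx = sum((xi - mu_x) ** 2 for xi in x) / n

# Starting values for the y parameters come from the complete cases.
cc = [(xi, yi) for xi, yi in zip(x, y_obs) if yi is not None]
cc_mean = sum(yi for _, yi in cc) / len(cc)
mu_y = cc_mean
s_yy = sum((yi - mu_y) ** 2 for _, yi in cc) / len(cc)
s_xy = sum((xi - mu_x) * (yi - mu_y) for xi, yi in cc) / len(cc)

for _ in range(200):
    beta = s_xy / s_xx                    # slope of y on x
    resid_var = s_yy - s_xy ** 2 / s_xx   # variance of y given x
    sum_y = sum_yy = sum_xy = 0.0
    for xi, yi in zip(x, y_obs):
        if yi is None:
            # E-step: expected y and y squared given x and current parameters.
            ey = mu_y + beta * (xi - mu_x)
            eyy = ey ** 2 + resid_var
        else:
            ey, eyy = yi, yi ** 2
        sum_y += ey
        sum_yy += eyy
        sum_xy += xi * ey
    # M-step: update the parameters from the expected sufficient statistics.
    mu_y = sum_y / n
    s_yy = sum_yy / n - mu_y ** 2
    s_xy = sum_xy / n - mu_x * mu_y

print(f"complete-case mean of y: {cc_mean:.3f}")
print(f"EM / ML mean of y:       {mu_y:.3f}")
```

In this simulation the complete-case mean is pulled down because high-`x` cases are mostly missing, while the EM estimate stays near the true value of 2.0.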

  • 7/24/2019 Missing data & how to handle it.pptx

    23/32

    We can base estimation on the likelihood of the observed data.
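
Written out, the observed-data likelihood integrates the missing values out of the complete-data density. The notation here is an assumption, not taken from the slides: Y_obs and Y_mis are the observed and missing parts of the data, θ is the parameter vector, and f is the complete-data density. The expression supports inference about θ only when the missing data mechanism is ignorable.

```latex
L(\theta \mid Y_{\mathrm{obs}})
  \;\propto\; \int f\!\left(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta\right) dY_{\mathrm{mis}},
\qquad
\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \, \log L(\theta \mid Y_{\mathrm{obs}})
```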

  • 7/24/2019 Missing data & how to handle it.pptx

    24/32

    Multiple Imputation
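
A simplified sketch of the idea, not the procedure shown on the original slides: create `m` completed data sets by regression imputation with random noise, analyze each one, and pool the results with Rubin's rules. Refitting the regression on a bootstrap resample for each imputation is an assumption used here to approximate parameter uncertainty; all names and numbers are illustrative.

```python
import random
import statistics

random.seed(4)

# Simulated data: the true mean of y is 2.0; about 40% of y is missing (MCAR).
n = 1000
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 0.8 * xi + random.gauss(0, 0.6) for xi in x]
y_obs = [None if random.random() < 0.4 else yi for yi in y]

complete = [(xi, yi) for xi, yi in zip(x, y_obs) if yi is not None]


def fit_line(pairs):
    """Least-squares slope, intercept, and residual SD for y on x."""
    mx = statistics.fmean(p[0] for p in pairs)
    my = statistics.fmean(p[1] for p in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    slope = sum((a - mx) * (b - my) for a, b in pairs) / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in pairs)
    return slope, intercept, (sse / (len(pairs) - 2)) ** 0.5


m = 20
estimates, within = [], []
for _ in range(m):
    # Refit on a bootstrap resample so the imputation model varies across sets.
    boot = random.choices(complete, k=len(complete))
    slope, intercept, sd = fit_line(boot)
    # Impute each missing y as prediction plus random residual noise.
    filled = [yi if yi is not None
              else intercept + slope * xi + random.gauss(0, sd)
              for xi, yi in zip(x, y_obs)]
    estimates.append(statistics.fmean(filled))       # estimate: mean of y
    within.append(statistics.variance(filled) / n)   # its squared standard error

# Rubin's rules: pool the m estimates and their variances.
q_bar = statistics.fmean(estimates)        # pooled estimate
w_bar = statistics.fmean(within)           # within-imputation variance
b = statistics.variance(estimates)         # between-imputation variance
total_var = w_bar + (1 + 1 / m) * b

print(f"pooled mean of y: {q_bar:.3f}")
print(f"pooled SE:        {total_var ** 0.5:.3f}")
```

The between-imputation term is what separates this from single imputation: it carries the uncertainty about the imputed values into the pooled standard error.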

  • 7/24/2019 Missing data & how to handle it.pptx

    32/32

    References

    Allison, Paul D. CC