Partially missing at random and ignorable inferences for parameter subsets with missing data


Page 1: Partially missing at random and ignorable inferences for parameter subsets with missing data

Partially missing at random and ignorable inferences for parameter subsets with missing data

Roderick Little

Page 2: Partially missing at random and ignorable inferences for parameter subsets with missing data

Outline
• Survey Bayesics in three slides
• Inference with missing data: Rubin's (1976) paper on conditions for ignoring the missing-data mechanism
• Rubin's standard conditions are sufficient but not necessary: example
• Proposed definitions of MAR and ignorability for likelihood (and Bayes) inference for subsets of parameters
• Examples
• Joint work with Sahar Zangeneh

Graybill Conference: Partially Missing at Random 2

Page 3: Partially missing at random and ignorable inferences for parameter subsets with missing data

Calibrated Bayes
– Frequentists should be Bayesian
  • Bayes is optimal under assumed model
– Bayesians should be frequentist
  • We never know the model (and all models are wrong)
  • Inferences should have good repeated sampling characteristics
– Calibrated Bayes (e.g. Box 1980, Rubin 1984, Little 2012)
  • Inference based on a Bayesian model
  • Model chosen to yield inferences that are well-calibrated in a frequentist sense
  • Aim for posterior probability intervals that have (approximately) nominal frequentist coverage


Page 4: Partially missing at random and ignorable inferences for parameter subsets with missing data

Calibrated Bayes models for surveys should incorporate sample design features
– All models are wrong, some models are useful
  • Design-assisted: make the estimator more robust
  • Calibrated Bayes: make the model more robust; many models yield design-consistent estimates
– Models that ignore features like survey weights are vulnerable to misspecification
– But models can be successfully applied in survey setting, with attention to design features
  • Weighting, stratification, clustering
– Capture design weights as covariates in the prediction model (e.g. Gelman 2007)


Page 5: Partially missing at random and ignorable inferences for parameter subsets with missing data

Benefits of Bayes
• Unified approach to all problems
  – Avoids current approach -- “inferential schizophrenia”
• Not asymptotic
  – Propagates errors in estimating parameters
• Avoids frequentist pitfalls:
  – Conditions on ancillaries
  – Obeys likelihood principle



Page 7: Partially missing at random and ignorable inferences for parameter subsets with missing data

There are those who predict…

… and those who weight


Page 8: Partially missing at random and ignorable inferences for parameter subsets with missing data

Rubin (1976 Biometrika)
• Landmark paper (3700+ citations, after being rejected by many journals!)
  – RL wrote his first (11 page) referee report, and an obscure discussion
• Modeled the missing data mechanism by treating missingness indicators as random variables, assigning them a distribution
• Sufficient conditions under which missing data mechanism can be ignored for likelihood and frequentist inference about parameters
  – Focus here on likelihood, Bayes


Page 9: Partially missing at random and ignorable inferences for parameter subsets with missing data

Ignoring the mechanism

$D = (D_{obs}, D_{mis})$ = data with no missing values; $D_{obs}$ = observed, $D_{mis}$ = missing
$R$ = response indicator matrix

$$f_{D,R}(D, R \mid \theta, \psi) = f_D(D \mid \theta)\, f_{R|D}(R \mid D, \psi)$$

• Full likelihood:
$$L(\theta, \psi \mid D_{obs}, R) = \text{const.} \times \int f_D(D \mid \theta)\, f_{R|D}(R \mid D, \psi)\, dD_{mis}$$

• Likelihood ignoring mechanism:
$$L_{ign}(\theta \mid D_{obs}, R) = \text{const.} \times \int f_D(D \mid \theta)\, dD_{mis}$$

• Missing data mechanism can be ignored for likelihood inference when
$$L(\theta, \psi \mid D_{obs}, R) = L_{ign}(\theta \mid D_{obs}, R) \times L_{rest}(\psi \mid D_{obs}, R)$$


Page 10: Partially missing at random and ignorable inferences for parameter subsets with missing data

Rubin's sufficient conditions for ignoring the mechanism
• Missing data mechanism can be ignored for likelihood inference when
  – (a) the missing data are missing at random (MAR):
$$f_{R|D}(R \mid D_{obs}, D_{mis}, \psi) = f_{R|D}(R \mid D_{obs}, \psi) \quad \text{for all } D_{mis}, \psi$$
  – (b) distinctness of the parameters of the data model and the missing-data mechanism:
$$(\theta, \psi) \in \Omega_\theta \times \Omega_\psi; \quad \text{for Bayes, } \theta \text{ and } \psi \text{ a-priori independent}$$
• MAR is the key condition: without (b), inferences are valid but not fully efficient
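A short derivation may help show why MAR is the key condition. This is a sketch only, using the symbols $f_D$, $f_{R|D}$, $L_{ign}$ and $L_{rest}$ for the data model, the mechanism, and the two likelihood factors; it assumes the integrals exist. Under MAR the mechanism term does not depend on $D_{mis}$, so it can be taken outside the integral:

```latex
\begin{align*}
L(\theta,\psi \mid D_{obs}, R)
 &= \text{const.}\times\int f_D(D_{obs}, D_{mis} \mid \theta)\,
      f_{R|D}(R \mid D_{obs}, D_{mis}, \psi)\, dD_{mis} \\
 &= f_{R|D}(R \mid D_{obs}, \psi)\times \text{const.}\times
      \int f_D(D_{obs}, D_{mis} \mid \theta)\, dD_{mis}
      \qquad \text{(by MAR)} \\
 &= L_{rest}(\psi \mid D_{obs}, R)\times L_{ign}(\theta \mid D_{obs}, R).
\end{align*}
```

The first factor carries no information about $\theta$ when $\theta$ and $\psi$ are distinct, which is why likelihood inference for $\theta$ can then be based on $L_{ign}$ alone.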


Page 11: Partially missing at random and ignorable inferences for parameter subsets with missing data

“Sufficient for ignorable” is not the same as “ignorable”

• These definitions have come to define ignorability (e.g. Little and Rubin 2002)

• However, Rubin (1976) described (a) and (b) as the "weakest simple and general conditions under which it is always appropriate to ignore the process that causes missing data".

• These conditions are not necessary for ignoring the mechanism in all situations.

MAR + distinctness ⇒ ignorable

ignorable ⇏ MAR + distinctness


Page 12: Partially missing at random and ignorable inferences for parameter subsets with missing data

Example 1: Nonresponse with auxiliary data

$$D_{obs} = (D_{resp}, D_{aux})$$
$$D_{resp} = \{(y_{i1}, y_{i2}),\ i = 1, \dots, m\}, \qquad D_{aux} = \{y^*_{j1},\ j = 1, \dots, n\}$$

[Diagram: data matrix with columns $Y_1$ (auxiliary, not linked), $R$, $Y_1$, $Y_2$; $Y_1$ and $Y_2$ are missing (?) for nonrespondents]

$$(Y_{i1}, Y_{i2}) \sim_{ind} f(y_{i1}, y_{i2} \mid \theta)$$
$$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$$

• $D_{aux}$ includes the respondent values of $Y_1$, but we do not know which they are. ($D_{aux}$ may instead cover the whole population $N$.)
• Not MAR -- $y_{i1}$ missing for nonrespondents $i$
• But... mechanism is ignorable, does not need to be modeled:
  – Marginal distribution of $Y_1$ estimated from $D_{aux}$
  – Conditional of $Y_2$ given $Y_1$ estimated from $D_{resp}$
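The estimation strategy in this example can be illustrated with a small simulation. The sketch below is not taken from the talk: it assumes binary $Y_1$ and $Y_2$, and all probabilities, the sample size, and the function names `simulate_example1` and `combined_mean_y2` are illustrative choices. It combines the marginal of $Y_1$ from the unlinked auxiliary file with the conditional of $Y_2$ given $Y_1$ from respondents, and compares that with the respondent-only mean.

```python
import random


def simulate_example1(n=20000, seed=1):
    """Simulate binary (Y1, Y2) with response depending on Y1 only.

    All numeric settings are illustrative assumptions, not from the slides.
    """
    rng = random.Random(seed)
    resp = []  # linked (y1, y2) pairs, respondents only
    aux = []   # unlinked Y1 values for every sampled unit
    for _ in range(n):
        y1 = 1 if rng.random() < 0.4 else 0
        p_y2 = 0.7 if y1 == 1 else 0.2      # Pr(Y2 = 1 | Y1)
        y2 = 1 if rng.random() < p_y2 else 0
        p_resp = 0.8 if y1 == 1 else 0.3    # g(y1, psi): depends on Y1 only
        aux.append(y1)
        if rng.random() < p_resp:
            resp.append((y1, y2))
    rng.shuffle(aux)  # the auxiliary file carries no link to the respondents
    return resp, aux


def combined_mean_y2(resp, aux):
    """Estimate E[Y2] = sum_k Pr(Y1 = k) * E[Y2 | Y1 = k].

    Pr(Y1 = k) comes from the auxiliary data; E[Y2 | Y1 = k] from respondents.
    """
    total = 0.0
    for k in (0, 1):
        p_k = sum(1 for y1 in aux if y1 == k) / len(aux)
        y2_k = [y2 for y1, y2 in resp if y1 == k]
        total += p_k * (sum(y2_k) / len(y2_k))
    return total


resp, aux = simulate_example1()
true_mean = 0.4 * 0.7 + 0.6 * 0.2  # E[Y2] under the assumed settings
cc_mean = sum(y2 for _, y2 in resp) / len(resp)
combined = combined_mean_y2(resp, aux)
print(f"true {true_mean:.3f}  respondents only {cc_mean:.3f}  combined {combined:.3f}")
```

Under these assumed settings the respondent-only mean overstates $E[Y_2]$ because respondents over-represent $Y_1 = 1$, while the combined estimate does not need a model for the response mechanism.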


Page 13: Partially missing at random and ignorable inferences for parameter subsets with missing data

MAR, ignorability for parameter subsets
• MAR and ignorability are defined in terms of the complete set of parameters in the data model for D

• It would be useful to have a definition of MAR that applies to subsets of parameters, including parameters of substantive interest.

• A trivial example: It seems plausible that a nonignorable mechanism would be MAR for the parameters of distributions of variables that are not missing.


Page 14: Partially missing at random and ignorable inferences for parameter subsets with missing data

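MAR, ignorability for parameter subsets

$$\theta = (\theta_1, \theta_2)$$

Mechanism is partially MAR for likelihood inference about $\theta_1$, denoted P-MAR($\theta_1$), if:
$$L(\theta_1, \theta_2, \psi \mid D_{obs}, R) = L_{ign}(\theta_1 \mid D_{obs}, R) \times L_{rest}(\theta_2, \psi \mid D_{obs}, R) \quad \text{for all } \theta_1, \theta_2, \psi$$

Mechanism is IGN($\theta_1$) if P-MAR($\theta_1$) and $\theta_1$ and $(\theta_2, \psi)$ distinct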


Page 15: Partially missing at random and ignorable inferences for parameter subsets with missing data

MAR, ignorability for parameter subsets

Special case where $\theta_1 = \theta$

Mechanism is P-MAR($\theta$) if:
$$L(\theta, \psi \mid D_{obs}, R) = L_{ign}(\theta \mid D_{obs}, R) \times L_{rest}(\psi \mid D_{obs}, R) \quad \text{for all } \theta, \psi$$

A consequence of (but does not imply) Rubin's MAR condition

IGN($\theta$) if P-MAR($\theta$) and $\theta$ and $\psi$ distinct


Page 16: Partially missing at random and ignorable inferences for parameter subsets with missing data

Partial MAR given a function of mechanism

Harel and Schafer (2009) define a different kind of Partial MAR:

Mechanism is partially MAR given $g(R)$ if:
$$P(R \mid Y_{obs}, Y_{mis}, g(R), \theta, \psi) = P(R \mid Y_{obs}, g(R), \theta, \psi) \quad \text{for all } R, Y_{obs}, \theta, \psi$$

Here "partial" relates to the mechanism.
In my definition "partial" relates to the parameters.
These ideas seem quite distinct.


Page 17: Partially missing at random and ignorable inferences for parameter subsets with missing data

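Example 1: Auxiliary Survey Data

$$D = \{(y_{i1}, y_{i2}),\ i = 1, \dots, n\}$$
$$D_{obs} = (D_{resp}, D_{aux})$$
$$D_{resp} = \{(y_{i1}, y_{i2}),\ i = 1, \dots, m\}, \qquad D_{aux} = \{y^*_{j1},\ j = 1, \dots, n\}$$

[Diagram: data matrix with columns $Y_1$ (auxiliary, not linked), $R$, $Y_1$, $Y_2$; $Y_1$ and $Y_2$ are missing (?) for nonrespondents]

$$(Y_{i1}, Y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta)$$
$$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$$

$D_{aux}$ includes the respondent values of $Y_1$, but we do not know which they are.

Easy to show that mechanism is P-MAR($\theta$), and IGN($\theta$) if $\theta$, $\psi$ are distinct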


Page 18: Partially missing at random and ignorable inferences for parameter subsets with missing data

Ex. 2: MNAR Monotone Bivariate Data

$$D = \{(y_{i1}, y_{i2}),\ i = 1, \dots, n\}$$
$$D_{obs} = \{(y_{i1}, y_{i2}),\ i = 1, \dots, m \text{ and } y_{i1},\ i = m+1, \dots, n\}$$
$$(Y_{i1}, Y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta) = f(y_{i1} \mid \theta_1)\, f(y_{i2} \mid y_{i1}, \theta_2)$$
$$\Pr(r_{i2} = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, y_{i2}, \psi) \quad \text{(MNAR)}$$

[Diagram: data matrix with columns $M$, $Y_1$, $Y_2$; $Y_1$ fully observed, $Y_2$ missing (?) for incomplete cases]

COMMENT: Clearly, inference about parameters $\theta_1$ of the marginal distribution of $Y_1$ can ignore mechanism, since $Y_1$ has no missing values. In proposed definition, this mechanism is P-MAR($\theta_1$), and IGN($\theta_1$) if $\theta_1$ and $(\theta_2, \psi)$ distinct.

• Paper presents more interesting case with Y1, Y2 blocks of variables and missing data in each block
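The factorization behind the comment can be written out explicitly. The following is a sketch under the assumption of independent units, with the first $m$ cases complete and $y_2$ used as a dummy integration variable for the unobserved $Y_2$ values; the labels $L_{ign}$ and $L_{rest}$ follow the proposed P-MAR definition:

```latex
\begin{align*}
L(\theta_1,\theta_2,\psi \mid D_{obs}, R)
 &= \underbrace{\prod_{i=1}^{n} f(y_{i1} \mid \theta_1)}_{L_{ign}(\theta_1 \mid D_{obs}, R)} \\
 &\quad\times
   \underbrace{\prod_{i=1}^{m} f(y_{i2} \mid y_{i1}, \theta_2)\, g(y_{i1}, y_{i2}, \psi)
   \prod_{i=m+1}^{n} \int f(y_{2} \mid y_{i1}, \theta_2)\,
      \bigl[1 - g(y_{i1}, y_{2}, \psi)\bigr]\, dy_{2}}_{L_{rest}(\theta_2, \psi \mid D_{obs}, R)}
\end{align*}
```

The first factor involves only $\theta_1$ and the fully observed $Y_1$ values, so the mechanism is P-MAR($\theta_1$) even though it is MNAR for $\theta_2$.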


Page 19: Partially missing at random and ignorable inferences for parameter subsets with missing data

More generally…

$(Y_1, R^{(1)})$, $(Y_2, R^{(2)})$ blocks of incomplete variables, and
$$f(y_{i1}, y_{i2}, r_i^{(1)}, r_i^{(2)}) = f(y_{i1} \mid \theta_1)\, \Pr(r_i^{(1)} \mid y_{i1}, \psi_1) \times f(y_{i2} \mid y_{i1}, \theta_2)\, \Pr(r_i^{(2)} \mid r_i^{(1)}, y_{i1}, y_{i2}, \psi_2)$$

Assume:
$$\Pr(r_i^{(1)} \mid y_{i1}; \psi_1) = g_1(y_{i1,obs}, \psi_1) \quad \text{for all } y_{i1,mis}$$
$$\Pr(r_i^{(2)} \mid r_i^{(1)}, y_{i1}, y_{i2}; \psi_2) = g_2(r_i^{(1)}, y_{i1}, y_{i2}, \psi_2)$$

Mechanism is P-MAR($\theta_1$), IGN($\theta_1$) if $\theta_1$ and $(\theta_2, \psi_1, \psi_2)$ are distinct


Page 20: Partially missing at random and ignorable inferences for parameter subsets with missing data

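Ex. 3: Complete Case Analysis in Regression

$$D = \{(y_{i1}, y_{i2}),\ i = 1, \dots, n\}$$
$$D_{obs} = \{(y_{i1}, y_{i2}),\ i = 1, \dots, m\}, \qquad (Y_{i1}, Y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta)$$
$$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$$

[Diagram: data matrix with columns $R$, $Y_1$, $Y_2$; both $Y_1$ and $Y_2$ are missing (?) for incomplete cases]

Let $f(y_{i1}, y_{i2} \mid \theta) = f(y_{i1} \mid \theta_1)\, f(y_{i2} \mid y_{i1}, \theta_2)$. Then
$$L(\theta_1, \theta_2, \psi \mid D_{obs}, R) = \text{const.} \times L_1(\theta_2 \mid D_{obs}) \times L_2(\theta_1, \psi \mid D_{obs}, R),$$
where
$$L_1(\theta_2 \mid D_{obs}) = \prod_{i=1}^{m} f(y_{i2} \mid y_{i1}, \theta_2)$$

MNAR, but P-MAR($\theta_2$), and IGN($\theta_2$) if $\theta_2$, $(\theta_1, \psi)$ distinct

MNAR, but inference about parameters of conditional distribution of $Y_2$ given $Y_1$ based on complete cases is valid, ignoring the mechanism.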
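A small simulation can make the complete-case claim concrete. The sketch below is illustrative only: the linear-normal model, the logistic form of $g(y_{i1}, \psi)$, the coefficient values, and the helper names `simulate_complete_cases` and `ols` are all assumptions, not part of the talk. Missingness depends on $Y_1$ alone, and both variables are lost for incomplete cases.

```python
import math
import random


def simulate_complete_cases(n=20000, seed=2):
    """Return the complete cases from a sample where response depends on Y1.

    Assumed model: Y1 ~ N(0, 1), Y2 = 1 + 2*Y1 + N(0, 1) noise.
    """
    rng = random.Random(seed)
    complete = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 1.0 + 2.0 * y1 + rng.gauss(0.0, 1.0)
        # Assumed response mechanism g(y1, psi): logistic in y1 only.
        p_resp = 1.0 / (1.0 + math.exp(-(-0.5 + 1.5 * y1)))
        if rng.random() < p_resp:
            complete.append((y1, y2))  # otherwise both Y1 and Y2 are missing
    return complete


def ols(pairs):
    """Least-squares intercept and slope of the second coordinate on the first."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


cc = simulate_complete_cases()
intercept, slope = ols(cc)
cc_mean_y1 = sum(x for x, _ in cc) / len(cc)
print(f"CC intercept {intercept:.3f} (assumed truth 1), slope {slope:.3f} (assumed truth 2)")
print(f"CC mean of Y1 {cc_mean_y1:.3f} (assumed truth 0)")
```

In this setup the complete-case regression of $Y_2$ on $Y_1$ recovers the assumed coefficients, while the complete-case mean of $Y_1$ is clearly shifted, which matches the distinction between the conditional parameters and the marginal ones.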


Page 21: Partially missing at random and ignorable inferences for parameter subsets with missing data

Ex. 4: A normal pattern-mixture model

$$D_{obs} = \{(y_{i1}, y_{i2}),\ i = 1, \dots, m \text{ and } y_{i1},\ i = m+1, \dots, n\}$$
$$f(D, R_2 \mid \theta, \psi) = f_{D|R}(D \mid R_2, \theta)\, f_R(R_2 \mid \psi)$$
$$(y_{i1}, y_{i2} \mid r_{i2} = j, \theta) \sim_{ind} G(\mu^{(j)}, \Sigma^{(j)}),\ j = 0, 1, \qquad r_{i2} \sim_{ind} \text{Bern}(\psi)$$

Assume $\Pr(r_{i2} = 1 \mid y_{i1}, y_{i2}) = g(y_{i2})$, $g$ unknown (MNAR)

[Diagram: data matrix with columns $R_2$, $Y_1$, $Y_2$; $Y_2$ is missing (?) for incomplete cases]

COMMENT: Distribution of $Y_1$ given $Y_2$ and $R_2$ is independent of $R_2$, so it can be estimated from complete cases, ignoring the mechanism

$$L(\theta, \psi \mid D_{obs}, R_2) = \text{const.} \times L_1(\theta_1 \mid D_{obs}, R_2) \times L_2(\theta_2, \psi \mid D_{obs}, R_2),$$
where $\theta_1$ indexes the conditional distribution of $Y_1$ given $Y_2$ and
$$L_1(\theta_1 \mid D_{obs}) = \prod_{i=1}^{m} f(y_{i1} \mid y_{i2}, \theta_1)$$

MNAR, but P-MAR($\theta_1$), not IGN($\theta_1$) since $\theta_1$ and $\theta_2$ are not distinct


Page 22: Partially missing at random and ignorable inferences for parameter subsets with missing data

Ex. 5: Subsample ignorable likelihood

• Interest concerns parameters of regression of Y on (Z, X, W)
• Z complete, W and (X, Y) incomplete. W complete in P1.
• Division of covariates into W, X is based on following MNAR assumptions about the missing data mechanism:
  – Pr(W complete) = fn(W, X, Z) (not Y)
  – (X, Y) MAR in subsample with W fully observed (that is, P1)

Pattern  Z  W  X  Y
P1       √  √  ?  ?
P2       √  ?  ?  ?

Columns could be vectors; √ = fully observed; ? = observed or missing

This mechanism is P-MAR($\theta_1$); corresponding analysis is to apply an ignorable likelihood method, discarding data in P2 (Little and Zhang 2011)


Page 23: Partially missing at random and ignorable inferences for parameter subsets with missing data

Ex. 6: Auxiliary data, survey nonresponse

$$D = \{(y_{i1}, y_{i2}, y_{i3}),\ i = 1, \dots, n\}$$
$$D_{obs} = (D_{resp}, D_{aux})$$
$$D_{resp} = \{(y_{i1}, y_{i2}, y_{i3}),\ i = 1, \dots, r;\ (y_{i1}),\ i = r+1, \dots, n\}$$
$$D_{aux} = \{y^*_{j2},\ j = 1, \dots, N\}, \qquad N = \text{population size}$$

[Diagram: data matrix with columns $Y_2$ (auxiliary, not linked), $Y_1$, $Y_2$, $Y_3$; rows indexed 1..r..n..N; $Y_2$ and $Y_3$ are missing (?) for nonrespondents]

$$(Y_{i1}, Y_{i2}, Y_{i3}) \sim f(y_{i1}, y_{i2}, y_{i3} \mid \theta)$$
$$\Pr(m_i = 1 \mid y_{i1}, y_{i2}, y_{i3}, \psi) = g(y_{i1}, y_{i2}, \psi)$$

• NOT MAR -- $y_{i2}$ missing for nonrespondents
• But mechanism is P-MAR($\theta$) if $g(y_{i1}, y_{i2}, \psi)$ additive function of $(y_{i1}, y_{i2})$
  – Marginal of $Y_2$ from $D_{aux}$, marginal of $Y_1$ from $D_{resp}$
  – Conditional of $Y_3$ given $Y_1$, $Y_2$ from complete cases in $D_{resp}$


Page 24: Partially missing at random and ignorable inferences for parameter subsets with missing data

Simulation Study

$$[Y_1, Y_2, Y_3, M] = [Y_1, Y_2]\,[Y_3 \mid Y_1, Y_2]\,[M \mid Y_1, Y_2, Y_3]$$

• $[Y_1, Y_2] \sim$ multinomial
• $[Y_3 \mid Y_1, Y_2]$ generated as
$$\text{logit}\, \Pr(Y_3 = 1 \mid Y_1, Y_2) = 0.5 + \beta_1 Y_1 + \beta_2 Y_2 + \beta_{12}\, Y_1 * Y_2$$
• $[M \mid Y_1, Y_2]$ generated as
$$\text{logit}\, \Pr(M = 1 \mid Y_1, Y_2) = 0.5 + \alpha_1 Y_1 + \alpha_2 Y_2 + \alpha_{12}\, Y_1 * Y_2$$
• Each $\beta_j$, $\alpha_j$ set to zero or two (various combinations)
• $N$ = 100,000; $n$ = 200, 1000 and 10,000
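The data-generating design above could be coded along the following lines. This is a hedged sketch rather than the authors' simulation code: the function `generate_population`, the names `beta` and `alpha` for the two coefficient triples, the equal multinomial cell probabilities, and the coding of $Y_1$, $Y_2$ as 0/1 are all assumptions made for illustration. The slides do not say whether $M = 1$ denotes response or nonresponse, so `m` is kept as a plain indicator.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def generate_population(N, beta, alpha, cell_probs=(0.25, 0.25, 0.25, 0.25), seed=0):
    """Generate (y1, y2, y3, m) tuples for a finite population of size N.

    beta, alpha: (main effect of Y1, main effect of Y2, Y1*Y2 interaction)
    for the Y3 model and the M model respectively (assumed naming).
    cell_probs: assumed multinomial probabilities for (Y1, Y2) cells
    in the order (0,0), (0,1), (1,0), (1,1).
    """
    rng = random.Random(seed)
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    rows = []
    for _ in range(N):
        y1, y2 = rng.choices(cells, weights=cell_probs)[0]
        p3 = expit(0.5 + beta[0] * y1 + beta[1] * y2 + beta[2] * y1 * y2)
        y3 = 1 if rng.random() < p3 else 0
        pm = expit(0.5 + alpha[0] * y1 + alpha[1] * y2 + alpha[2] * y1 * y2)
        m = 1 if rng.random() < pm else 0
        rows.append((y1, y2, y3, m))
    return rows


# One illustrative combination of coefficients set to zero or two.
# A smaller N than the slides' 100,000 keeps the example quick.
population = generate_population(N=10000, beta=(2, 2, 0), alpha=(2, 0, 0), seed=3)
sample = random.Random(4).sample(population, 1000)
print(len(population), len(sample))
```

Setting the interaction entry of `alpha` to two instead of zero would give the scenario in which the indicator depends on the $Y_1 * Y_2$ interaction.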


Page 25: Partially missing at random and ignorable inferences for parameter subsets with missing data

Simulation Study: methods

CC: Complete Case estimates based on the responding units
M1: ML based on a logistic regression with interaction for Y3
M2: ML based on an additive logistic regression for Y3
NR: Weighting class estimates where nonresponse weights are obtained based on Y1
PS: Post-stratification weighted estimates based on Y2
NRPS: Adjust weights using both Y1 and Y2. For categorical variables, this method is equivalent to linear calibration regression, or generalized raking estimates
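As an illustration of the NR method, a weighting class estimate can be sketched as below. This is a generic textbook form, assumed here rather than taken from the study: each respondent in class $c$ receives weight $n_c / r_c$ (sampled units over respondents in that class), with classes formed from $Y_1$. The function name `weighting_class_mean` and the toy data are hypothetical.

```python
from collections import defaultdict


def weighting_class_mean(units):
    """Weighting class estimate of a mean.

    units: list of (cls, responded, y) tuples; y is ignored for nonrespondents.
    Each respondent in class c gets weight n_c / r_c.
    """
    n_c = defaultdict(int)  # sampled units per class
    r_c = defaultdict(int)  # respondents per class
    for cls, responded, _ in units:
        n_c[cls] += 1
        if responded:
            r_c[cls] += 1
    weighted_sum = 0.0
    weight_total = 0.0
    for cls, responded, y in units:
        if responded:
            w = n_c[cls] / r_c[cls]
            weighted_sum += w * y
            weight_total += w
    return weighted_sum / weight_total


# Toy data: class 0 has a 50% response rate, class 1 has full response.
units = [
    (0, True, 0), (0, True, 1), (0, False, None), (0, False, None),
    (1, True, 1), (1, True, 1), (1, True, 0), (1, True, 1),
]
estimate = weighting_class_mean(units)
unweighted = sum(y for _, responded, y in units if responded) / 6
print(estimate, unweighted)
```

In the toy data the weighted estimate gives the under-responding class its full share of the sample, so it differs from the unweighted respondent mean.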


Page 27: Partially missing at random and ignorable inferences for parameter subsets with missing data

Simulation: summary findings
• When response depends on Y1*Y2 interaction, all methods do poorly
• When data are MCAR, all methods do similarly well
• Model-based methods remove almost all the bias and perform better when response doesn't depend on Y1*Y2 interaction
• Qualitative patterns hold for different sample sizes


Page 28: Partially missing at random and ignorable inferences for parameter subsets with missing data

Frequentist inference
• Rubin's (1976) sufficient conditions for ignorability for frequentist inference were even stronger (essentially MCAR)
• These can be weakened too: for example, asymptotic frequentist inference based on ML and observed information matrix works under conditions given here
• Small sample inference seems more problematic


Page 30: Partially missing at random and ignorable inferences for parameter subsets with missing data

Summary
• Proposed definitions of partial MAR, ignorability for subsets of parameters
• Expands range of situations where missing data mechanism can be ignored
• Though, in some cases, MAR analysis entails a loss of information
  – How much is lost is an interesting question, varies by context


Page 31: Partially missing at random and ignorable inferences for parameter subsets with missing data

References

Harel, O. and Schafer, J. L. (2009). Partial and latent ignorability in missing data problems. Biometrika, 2009, 1-14.

Little, R. J. A. (1993). Pattern-mixture models for multivariate incomplete data. JASA, 88, 125-134.

Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data (2nd ed.). Wiley.

Little, R. J. and Zangeneh, S. Z. (2013). Missing at random and ignorability for inferences about subsets of parameters with missing data. University of Michigan Biostatistics Working Paper Series.

Little, R. J. and Zhang, N. (2011). Subsample ignorable likelihood for regression analysis with missing data. JRSSC, 60, 4, 591-605.

Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581-592.
