PEARL VS RUBIN (GELMAN)

An epic battle between the Rubin Causal Model school (Gelman et al.) and the Structural Causal Model school (Pearl et al.): a cursory overview

Dokyun Lee

Wednesday, November 30, 2011



WHO ARE THEY?

Don B. Rubin, Professor @ Harvard Statistics

Rubin Causal Model

Pioneer of observational studies

Causal inference in experiments and observational studies

Inference in sample surveys with nonresponse and in missing data problems

Advised by Cochran

VS

Judea Pearl, Professor @ UCLA Computer Science

Probabilistic approach to AI

Contributed to the development of Bayesian networks (belief propagation, graphical models, which subsume many statistical models such as Kalman filtering, Markov models, Ising models, etc.)

One of the first to mathematize causal modeling in the empirical sciences.

Developing a method of causal and counterfactual inference based on structural models


ACTUALLY INVOLVED

Judea Pearl, Professor @ UCLA Computer Science

Probabilistic approach to AI

Contributed to the development of Bayesian networks (belief propagation, graphical models, which subsume many statistical models such as Kalman filtering, Markov models, Ising models, etc.)

One of the first to mathematize causal modeling in the empirical sciences.

Developing a method of causal and counterfactual inference based on structural models

VS

Andrew Gelman, Professor @ Columbia Statistics

Advised by Don Rubin

Prominent Bayesian statistician with a famous blog

Some may already know him from the STAT 542 textbook.

Applies Bayesian analysis to political science


SOME POSTS TO GIVE YOU AN IDEA

Larry Wasserman @ CMU, author of “the purple book” (All of Statistics) and another book

Andrew Gelman @ Columbia University, author of many books including “Bayesian Data Analysis,” used in STAT 542

Also involved (some indirectly, some only through their papers): Philip Dawid, Jeff Wooldridge, Dehejia and Wahba, Imbens, Michael Sobel, and many more. Paul Rosenbaum, of course, is mentioned many times.

In total: 6 long blog entries, 91 comments by many leaders of the field, and many research letters and notes


SOME MORE BACKGROUND

“Causality” 2009 Second edition by Judea Pearl:

“The method of propensity score is based on a simple, yet ingenious, idea of purely statistical character [...]

“The condition was articulated in the cryptic language of potential outcome, stating that the set [X] must render [Z] “Strongly ignorable,” i.e., {Y_0,Y_1} ind [Z] |[X]. As stated several times in this book, the opacity of “ignorability” is the Achilles’ heel of the potential-outcome approach - no mortal can apply this condition to judge whether it holds even in simple problems, with all causal relationships correctly specified, let alone in partially specified problems that involve dozens of variables.”


THE BEGINNING

1: 2008: In a letter to the editor of Statistics in Medicine, Ian Shrier posed a question to Don Rubin.

“Is it possible that, asymptotically, the use of Propensity Scores (PS) methods may actually increase, not decrease, overall bias, compared with the crude, unadjusted estimate of a causal effect?”

2: Shrier, Sjölander, and Pearl sent three separate letters to Statistics in Medicine in which M-bias was explained and exemplified.

3: 2009 Rubin in response:

“To avoid conditioning on some observed covariates in the hope of obtaining an unbiased estimator because of phantom but complementary imbalances on unobserved covariates, is neither Bayesian nor scientifically sound but rather it is distinctly frequentist and nonscientific ad hocery.”

4: 2009, Judea Pearl “Myth, Confusion, and Science in Causal Analysis” 2009 Statistics in Medicine.

“Of course, Yes; the M-graph model presented by Shrier provides a simple such example [...] Rubin pleaded to be “puzzled” and “confused” by the terminology, by the example, and by graphs in general”


WHAT IS THIS M-BIAS?

Rests on “Berkson's paradox”

Two independent causes of a common effect become dependent when we observe the effect; information refuting one cause should make the other more likely.

e.g. outcome: late to Stat 921

one reason: woke up super late

second reason: had to save the world again

P(save world = “yes” | late = “yes”) != P(save world = “yes” | late = “yes”,woke up late = “no”)

Thus “save world” is not independent of “woke up late” given “late”

[Figure: woke up late → late ← saved world]
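
The inequality above can be checked with a small exact calculation. This is only a minimal sketch under made-up assumptions: the prior probabilities (0.3 for waking up late, 0.1 for saving the world) and the rule that you are late exactly when at least one cause occurs are hypothetical and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical priors for the two independent causes (assumptions, not from the slides).
P_WOKE_LATE = Fraction(3, 10)
P_SAVED_WORLD = Fraction(1, 10)


def joint():
    """Yield (woke_late, saved_world, late, probability) for every possible world."""
    for woke, saved in product([True, False], repeat=2):
        p_woke = P_WOKE_LATE if woke else 1 - P_WOKE_LATE
        p_saved = P_SAVED_WORLD if saved else 1 - P_SAVED_WORLD
        # Assumed mechanism: late if and only if at least one cause happened.
        yield woke, saved, (woke or saved), p_woke * p_saved


def prob_saved(condition):
    """P(saved world = yes | condition), computed by exact enumeration."""
    num = sum(p for w, s, l, p in joint() if s and condition(w, s, l))
    den = sum(p for w, s, l, p in joint() if condition(w, s, l))
    return num / den


p_given_late = prob_saved(lambda w, s, l: l)
p_given_late_not_woke = prob_saved(lambda w, s, l: l and not w)
p_given_late_and_woke = prob_saved(lambda w, s, l: l and w)

print("P(save | late)               =", p_given_late)
print("P(save | late, woke late=no) =", p_given_late_not_woke)
print("P(save | late, woke late=yes)=", p_given_late_and_woke)
```

Under these assumed numbers, ruling out "woke up late" raises the probability of "saved world" from 10/37 to 1, while confirming it drops the probability back to the prior of 1/10. The two causes are independent marginally but not once "late" is observed.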


“BAYES BALL” ALGORITHM
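
Bayes Ball decides whether two nodes of a DAG are d-separated given a set of observed nodes. As a rough sketch of that kind of check, the code below uses the standard "reachability along active trails" formulation found in graphical-models textbooks; it is an assumption that this matches what the slide showed. The function name `d_separated`, the edge-list representation, and the node names are all hypothetical.

```python
from collections import defaultdict


def d_separated(edges, x, y, given):
    """Return True if x and y are d-separated by `given` in the DAG `edges`.

    `edges` is a list of (parent, child) pairs.
    """
    parents, children = defaultdict(set), defaultdict(set)
    for a, b in edges:
        parents[b].add(a)
        children[a].add(b)
    given = set(given)

    # Phase 1: collect the observed nodes together with all of their ancestors.
    ancestors, stack = set(), list(given)
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents[node])

    # Phase 2: pass the "ball" along active trails starting from x.
    # "up" means the ball arrived from a child, "down" means from a parent.
    frontier, visited, reachable = [(x, "up")], set(), set()
    while frontier:
        node, direction = frontier.pop()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in given:
            reachable.add(node)
        if direction == "up" and node not in given:
            # Unobserved node entered from below: pass to parents and children.
            frontier.extend((p, "up") for p in parents[node])
            frontier.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in given:
                # Unobserved node entered from above: keep going down.
                frontier.extend((c, "down") for c in children[node])
            if node in ancestors:
                # Collider that is observed (or has an observed descendant):
                # the ball bounces back up to the parents.
                frontier.extend((p, "up") for p in parents[node])
    return y not in reachable


# The lateness example: two independent causes of a common effect.
LATENESS = [("woke_up_late", "late"), ("saved_world", "late")]
print(d_separated(LATENESS, "woke_up_late", "saved_world", []))
print(d_separated(LATENESS, "woke_up_late", "saved_world", ["late"]))
```

With nothing observed the two causes are d-separated; observing the collider `late` opens the path between them, which is the graphical counterpart of Berkson's paradox.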


SUPER SIMPLIFIED VERSION OF THEIR PHILOSOPHY

“Causality doesn’t come without manipulation”

“Granger Causality is not causality”

Don B. Rubin, Professor @ Harvard Statistics

What we’ve been learning all along in the course.

In the words of Gelman: “The research programme under which all causal inference problems can be framed in terms of potential outcomes”

VS

Judea Pearl, Professor @ UCLA Computer Science

Set up a causal model with graphical models and determine the relationship between the variables.

In the words of Gelman: “The research programme under which all causal inference problems can be framed in terms of graphs, colliders, the do operator, and the like”


ONE ISSUE RAISED AMONG MANY (BACK TO M-BIAS)

Rubin: Condition on all pre-treatment variables.

Pearl: Do not condition on every available variable, lest some pre-treatment covariates (e.g. colliders, as in the M-graph) introduce more bias.

Graphical models: helpful or not?
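
To make the disagreement concrete, here is a minimal simulation sketch of M-bias. Everything in it is an illustrative assumption rather than anything from the letters: a linear-Gaussian model, hypothetical unobserved causes `U1` and `U2`, a pre-treatment covariate `M` that is a collider between them, a true treatment effect `tau = 1`, and plain regression adjustment standing in for propensity-score adjustment.

```python
import random
from statistics import fmean


def simulate(n=50_000, tau=1.0, seed=0):
    """Draw (Z, M, Y) from an assumed M-graph: U1 -> Z, U1 -> M <- U2, U2 -> Y, Z -> Y."""
    rng = random.Random(seed)
    z, m, y = [], [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)   # unobserved, independent
        zi = u1 + rng.gauss(0, 1)                   # treatment
        mi = u1 + u2 + rng.gauss(0, 1)              # pre-treatment collider
        yi = tau * zi + u2 + rng.gauss(0, 1)        # outcome
        z.append(zi)
        m.append(mi)
        y.append(yi)
    return z, m, y


def cov(a, b):
    """Sample covariance (divides by n, which is fine at this sample size)."""
    ma, mb = fmean(a), fmean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def crude_slope(z, y):
    """Coefficient of Z when regressing Y on Z alone (no adjustment)."""
    return cov(z, y) / cov(z, z)


def adjusted_slope(z, m, y):
    """Coefficient of Z when regressing Y on both Z and M (two-regressor OLS)."""
    num = cov(z, y) * cov(m, m) - cov(m, y) * cov(z, m)
    den = cov(z, z) * cov(m, m) - cov(z, m) ** 2
    return num / den


z, m, y = simulate()
crude = crude_slope(z, y)
adjusted = adjusted_slope(z, m, y)
print(f"true effect: 1.0, crude: {crude:.3f}, adjusted for M: {adjusted:.3f}")
```

In this assumed model the population value of the crude estimate equals the true effect 1.0, whereas adjusting for `M` gives (2·3 − 2·1)/(2·3 − 1²) = 0.8, so conditioning on the pre-treatment collider adds bias that does not shrink with sample size. Whether such structures are common in practice is exactly what the two camps dispute.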


SOME PAPERS ON THIS ISSUE

Brookhart, M. Alan, Sebastian Schneeweiss, Kenneth J. Rothman, Robert J. Glynn, Jerry Avorn, and Til Stürmer. 2006. “Variable selection for propensity score models.” American Journal of Epidemiology 163 (June): 1149-1156.

Clarke, Kevin A., Brenton Kenkel, and Miguel R. Rueda. 2010. “Misspecification and the propensity score: when to leave out relevant pre-treatment variables.” Preprint.

Setoguchi, Soko, Sebastian Schneeweiss, M. Alan Brookhart, Robert J. Glynn, and E. Francis Cook. 2008. “Evaluating uses of data mining techniques in propensity score estimation: a simulation study.” Pharmacoepidemiology and Drug Safety 17 (6): 546-555.

etc


SOME CLAIMS AMONG COUNTLESS MANY

Judea Pearl:

The Rubin model is a particular case of the Pearl model.

The Rubin model is not explicit when it comes to the “ignorability” condition.

Andrew Gelman:

Judea Pearl may have proven the equivalence of the Rubin model and the Pearl model, but the assumptions are wrong or irrelevant for some real-world problems.


STRUCTURAL CAUSAL MODEL BOOKS

“Causality” by Judea Pearl (UCLA), 2009, Cambridge University Press.

“Targeted Learning” by Mark van der Laan (UC Berkeley) and Sherri Rose (Johns Hopkins), 2011, Springer Series in Statistics.
