Lecture9-PropensityScore - Pennsylvania State University · 2015. 8. 28. · propensity score...

8/28/15


Lecture 9: Propensity Score Analysis

Christopher S. Hollenbeak, PhD
Jane R. Schubart, PhD
The Outcomes Research Toolbox

Meta-analysis Homework - Stata

cd /Volumes/Hollenbeak/Teaching/Residents/OutcomesResearch/
clear
insheet using "blockers.csv"
generate net = nt-rt
generate nec = nc-rc
// Mantel-Haenszel (odds ratio)
metan rt net rc nec, or
// Peto
metan rt net rc nec, or peto
// DerSimonian-Laird
metan rt net rc nec, or random
// Mantel-Haenszel (relative risk)
metan rt net rc nec, rr

Meta-analysis Homework - R

library(rmeta)
setwd("/Volumes/Hollenbeak/Teaching/Residents/OutcomesResearch/Round4/")
dat1 <- read.csv("blockers.csv")

# meta.MH and meta.DSL take: n treated, n control, events treated, events control
# Mantel-Haenszel analysis (relative risk)
ma1 <- meta.MH(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="RR")
summary(ma1)
plot(ma1)

# Mantel-Haenszel analysis (odds ratio)
ma2 <- meta.MH(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="OR")
summary(ma2)
plot(ma2)

# DerSimonian-Laird analysis (random effects, odds ratio)
ma3 <- meta.DSL(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="OR")
summary(ma3)
plot(ma3)


R Code for Forest Plot from Scratch

library(rmeta)  # provides forestplot() and meta.colors()

# Text columns: study labels and formatted hazard ratios
tabletext <- cbind(
  c("Study", "Botteri 2013", "Sorenson 2013", "Ganz 2011", "Melhem 2011", "Powe 2010", NA, "Summary"),
  c("HR", "0.32 (0.12 - 0.88)", "0.91 (0.82 - 1.01)", "0.86 (0.57 - 1.31)", "0.52 (0.31 - 0.88)", "0.43 (0.2 - 0.93)", NA, "0.67 (0.39 - 1.13)"))

# Point estimates (m) with lower (l) and upper (u) confidence limits
m <- c(NA, 0.32, 1.3, 0.86, 0.52, 0.43, NA, 0.67)
l <- c(NA, 0.12, 1.11, 0.57, 0.31, 0.20, NA, 0.40)
u <- c(NA, 0.88, 1.52, 1.31, 0.88, 0.93, NA, 1.13)

forestplot(tabletext, m, l, u, zero=1,
  is.summary=c(TRUE, rep(FALSE, 5), TRUE, TRUE),
  align=c(1, 3),
  col=meta.colors(box="black", line="gray", summary="gray80", background="white"))

R Code for Forest Plot from Scratch

[Figure: forest plot of the hazard ratios (HR) for Botteri 2013, Sorenson 2013, Ganz 2011, Melhem 2011, Powe 2010, and the summary estimate, on an x-axis running from 0.2 to 1.4]

Background

•  The primary criticism of research that utilizes observational data is selection bias

•  Because treatments are not assigned in a random fashion (as in randomized trials) we can’t be sure that estimated treatment effects are due to treatment

•  There may be other variables that drive both treatment and outcomes


Example: PTCA after AMI

•  Suppose we are interested in knowing whether it is better to receive percutaneous transluminal coronary angioplasty (PTCA) after acute myocardial infarction (AMI)

•  A randomized trial would randomize some patients to receive PTCA, and some to be medically managed

•  But suppose all we have is observational data


Example: PTCA after AMI

•  Data are from PHC4 and include all discharges with an AMI diagnosis
•  We can identify PTCA using ICD-9 procedure codes
•  Compare outcomes of patients with PTCA to those without, whom we will say are medically managed


PTCA after AMI

•  Which has better outcomes?
•  PTCA has an odds ratio for death of 0.07!!
•  Does the use of PTCA account for all of the difference in mortality?
•  Would we get this same result in a randomized, controlled trial?
•  Probably not if patients are preferentially selected for PTCA
   –  Evidence of selection bias is imbalanced covariates
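The odds ratio quoted on this slide can be checked from the crude mortality rates that appear in the unmatched row of the psmatch2 output later in this lecture (about 1.3% for PTCA versus 15.7% for medical management). A minimal Python sketch; the function name is illustrative and not part of any package:

```python
def odds_ratio(p_treated, p_control):
    """Odds ratio of an event, treated versus control, from event probabilities."""
    odds_treated = p_treated / (1.0 - p_treated)
    odds_control = p_control / (1.0 - p_control)
    return odds_treated / odds_control

# Crude (unmatched) mortality rates from the psmatch2 output in this lecture
crude_or = odds_ratio(0.013345284, 0.157498189)
print(round(crude_or, 2))  # prints 0.07
```

This is the unadjusted comparison, so it inherits whatever selection bias is present in who receives PTCA.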


Imbalanced Covariates

•  We have to consider potential selection bias
   –  Who are the patients getting PTCA? Are they younger? Healthier? If so, not all of this effect should be attributed to the PTCA
•  Issue is called “imbalanced covariates”
   –  Hints at selection bias

•  Compare characteristics of groups


Covariates


Problem

•  Patients getting PTCA are younger and less sick than medically managed patients
•  No wonder their mortality is lower
   –  Not a fair comparison!

•  What would the mortality improvement be if PTCA patients were compared to similar medically managed patients?

Getting a Fair Comparison

•  Randomized trial
   –  Balanced covariates
•  Case-control matching
•  Propensity score methods

Randomized Controlled Trial

•  Not always possible or ethical
•  Can be very costly
•  Does not achieve matching, but balances all potential covariates


Case-Control Matching

•  Could match PTCA cases to medically managed cases
   –  Popular epidemiology approach
•  The number of matching categories grows exponentially with the number of covariates, for example:
   –  Age (10 categories) x
   –  Sex (2 categories) x
   –  Prior diagnoses (5 @ 2 categories each)
   → 10 x 2 x 2^5 = 640 potential matching groups from these variables alone
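A quick way to see how fast exact matching breaks down is to multiply out the category counts. The sketch below uses the three variables listed on this slide; the dictionary keys are illustrative names only.

```python
from math import prod

# Number of levels for each matching variable in the example
levels = {"age": 10, "sex": 2}
levels.update({f"prior_dx_{i}": 2 for i in range(1, 6)})  # five yes/no diagnoses

n_cells = prod(levels.values())
print(n_cells)  # prints 640

# Each additional yes/no covariate doubles the number of cells
n_cells_plus_one = n_cells * 2
print(n_cells_plus_one)  # prints 1280
```

With every added covariate the cells get sparser, so more cases are left without an exact match.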

Propensity Score Methods

•  Akin to a retrospective randomized controlled trial
•  Retrospectively matches cases to controls
•  Only compare patients with a similar propensity for treatment
•  Two Steps:
   –  1. Estimate propensity for treatment
   –  2. Match on propensity and compare matched groups

Step 1: Estimate Propensity Score

•  Use logistic regression to predict who gets treatment

•  Compute the predicted probability for each patient
   –  Predicted probability = Propensity Score

$$\Pr(y_i = 1) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$
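To make the logistic formula concrete, the sketch below turns fitted coefficients into a propensity score for one patient. The coefficients are a subset taken from the logistic regression output shown later in this lecture; the patient is hypothetical, and every covariate not listed is assumed to be 0 (the reference category).

```python
import math

def propensity_score(intercept, coefs, x):
    """Predicted probability of treatment from a fitted logistic model.

    1 / (1 + exp(-lp)) is algebraically the same as exp(lp) / (1 + exp(lp)).
    """
    linear_predictor = intercept + sum(coefs[name] * x.get(name, 0) for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Coefficients copied from the lecture's Stata output (subset only)
intercept = -0.2195265
coefs = {"age7585": -1.030738, "female": -0.2599268, "transfer": 1.078308}

# Hypothetical patient: woman aged 75-85 who was transferred in
patient = {"age7585": 1, "female": 1, "transfer": 1}
score = propensity_score(intercept, coefs, patient)
print(round(score, 3))
```

In practice the score comes straight from the fitted model's predicted probabilities; computing it by hand is only useful for seeing what the number means.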


Step 2: Match Cases and Controls


Step 2: Propensity Score Methods

•  There are a few options for the second step of the propensity score analysis
   1.  Propensity Score Matching: Match cases to controls on the propensity score
   2.  Stratified Analysis: Stratify results on quantiles (usually quintiles) of the propensity score
   3.  Kernel Matching: Weight all observations on some function of the propensity score
   4.  Regression: Include the propensity score as a covariate in a regression analysis
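As an illustration of option 2, the sketch below stratifies on quintiles of the propensity score and averages the within-stratum differences in mean outcome, weighting by stratum size. This is one simple way to do it, not necessarily the weighting used in any particular package, and the data are made up.

```python
import statistics
from bisect import bisect_right

def stratified_effect(rows, n_strata=5):
    """rows: list of (propensity_score, treated, outcome) tuples.

    Returns the size-weighted average of within-stratum differences
    (treated mean minus control mean). Strata lacking either group are skipped.
    """
    cuts = statistics.quantiles([r[0] for r in rows], n=n_strata)
    strata = {}
    for score, treated, outcome in rows:
        strata.setdefault(bisect_right(cuts, score), []).append((treated, outcome))
    weighted_sum, total = 0.0, 0
    for members in strata.values():
        y1 = [y for t, y in members if t]
        y0 = [y for t, y in members if not t]
        if not y1 or not y0:
            continue  # no comparison possible in this stratum
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        weighted_sum += diff * len(members)
        total += len(members)
    return weighted_sum / total

# Made-up data: 20 patients, 4 per quintile, same pattern in every quintile
pattern = [(1, 1), (1, 0), (0, 1), (0, 1)]  # (treated, died)
rows = [((i + 1) / 21, *pattern[i % 4]) for i in range(20)]
effect = stratified_effect(rows)
print(effect)
```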


Propensity Score Matching

•  Propensity score matching selects a matched group of controls, where matching is done on the propensity score

•  Instead of matching on numerous covariates, matches on a single variable


Options for Matching

•  k-Nearest neighbor
   –  Match k:1
   –  Select control(s) with closest propensity score
•  Caliper restriction
   –  Require that the nearest neighbor is no more than some distance from case
•  Sample with or without replacement
   –  With replacement means a patient may serve as a control for more than one case
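The three options combine naturally in a greedy 1:1 matcher. The sketch below is a simplified illustration, not the algorithm psmatch2 uses internally; patient ids and scores are made up. Because it is greedy and samples without replacement by default, the result can depend on the order in which cases are processed.

```python
def match_nearest_neighbor(treated, controls, caliper=None, replace=False):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    treated, controls: dicts mapping a patient id to a propensity score.
    Returns a dict mapping each matched treated id to its control id.
    """
    available = dict(controls)
    pairs = {}
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no control close enough: this case stays unmatched
        pairs[t_id] = c_id
        if not replace:
            del available[c_id]  # each control can be used only once
    return pairs

treated = {"t1": 0.62, "t2": 0.35, "t3": 0.90}
controls = {"c1": 0.60, "c2": 0.36, "c3": 0.33, "c4": 0.10}
pairs = match_nearest_neighbor(treated, controls, caliper=0.03)
print(pairs)  # t3 has no control within the caliper and is dropped
```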


Adequacy of Match

•  After matching, need to test whether covariates are balanced
•  Simple t tests and chi-square tests are adequate
•  Stata has built-in functions
•  If important covariates are still not balanced, consider
   –  Changing the matching method (shrink caliper, etc.)
   –  Another propensity score method


Propensity Score Matching

•  The outcome for a propensity score matching analysis is the average effect of treatment on the treated (ATT)
   –  The average treatment effect for just those individuals who were “treated”
•  This is estimated from the propensity score matched groups

$$\text{ATT} = E(y_1 \mid X, T = 1) - E(y_0 \mid X, T = 1)$$
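The second term in the ATT, the outcome treated patients would have had without treatment, is never observed; the matched controls stand in for it. A minimal sketch of the resulting estimate, using made-up pairs and outcomes (1 = died):

```python
def att(pairs, outcome):
    """Mean outcome of matched treated patients minus mean outcome of their matched controls.

    pairs: dict mapping treated id -> matched control id.
    outcome: dict mapping every patient id to the observed outcome.
    """
    treated_mean = sum(outcome[t] for t in pairs) / len(pairs)
    control_mean = sum(outcome[c] for c in pairs.values()) / len(pairs)
    return treated_mean - control_mean

# Hypothetical matched pairs and outcomes
pairs = {"t1": "c1", "t2": "c2", "t3": "c3", "t4": "c4"}
outcome = {"t1": 0, "t2": 0, "t3": 1, "t4": 0,
           "c1": 1, "c2": 0, "c3": 1, "c4": 0}
estimate = att(pairs, outcome)
print(estimate)  # prints -0.25
```

A negative value here means lower mortality among the treated than among their matched controls.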

Bootstrapping

•  Usual statistics after matching cannot be trusted
•  The standard errors (and therefore confidence intervals and p-values) are incorrect
   –  Don’t account for uncertainty of step 1 and step 2
•  Solution is to bootstrap the data
   –  Resamples the data (with replacement)
   –  Repeats step 1 and step 2
   –  Gives correct standard errors
   –  500 to 1000 iterations is preferred


Recapping

1.  Logistic regression to predict treatment
2.  Match cases to controls on the propensity score
3.  Compare matched cases and controls to make sure covariates are now balanced, adjust if necessary
4.  Compute ATT
5.  Bootstrap to get confidence intervals and p-value for ATT


Stata Code

•  Stata has a set of routines that automate the propensity score matching
   –  net search psmatch2 will allow you to install them
•  Command syntax is:
   –  psmatch2 depvar indvar1…indvar2, neighbor(k) caliper(x) noreplace out(outcome)
   –  psgraph
   –  pstest indvar1…indvar2
   –  bootstrap r(att), reps(N): psmatch2…


Example: PTCA versus Medical Management

•  Propensity score match
   –  1:1 nearest neighbor
   –  Caliper restriction of 0.03
   –  Sample without replacement
   –  Compare mortality rates
•  psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died)


Logistic regression                               Number of obs   =      30746
                                                  LR chi2(14)     =    9489.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -14678.604                       Pseudo R2       =     0.2443

------------------------------------------------------------------------------
        ptca |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
     age5565 |  -.2631597   .0491659    -5.35   0.000    -.3595231   -.1667964
     age6575 |  -.5231374   .0523722    -9.99   0.000    -.6257849   -.4204899
     age7585 |  -1.030738   .0554906   -18.58   0.000    -1.139498    -.921979
       age85 |   -2.05675   .0787066   -26.13   0.000    -2.211012   -1.902488
      female |  -.2599268    .029954    -8.68   0.000    -.3186356    -.201218
    nonwhite |  -.0228066   .0356434    -0.64   0.522    -.0926664    .0470533
         mq1 |   1.048819   .1870906     5.61   0.000     .6821279     1.41551
         mq2 |   .6223701   .1897966     3.28   0.001     .2503757    .9943646
         mq3 |  -.3106409   .1925283    -1.61   0.107    -.6879896    .0667077
         mq4 |  -1.492374   .2299972    -6.49   0.000     -1.94316   -1.041587
    emergent |  -.8131428   .0552582   -14.72   0.000    -.9214469   -.7048387
      urgent |  -.6004808   .0577716   -10.39   0.000     -.713711   -.4872505
    transfer |   1.078308   .0391937    27.51   0.000     1.001489    1.155126
          qw |   .8630298   .0291064    29.65   0.000     .8059823    .9200772
       _cons |  -.2195265   .1918366    -1.14   0.252    -.5955193    .1564663
------------------------------------------------------------------------------


----------------------------------------------------------------------------------------
        Variable     Sample |       Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
            died  Unmatched |    .013345284   .157498189  -.144152905   .003721816   -38.73
                        ATT |     .01809443   .094430308  -.076335878   .003821262   -19.98
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
 Untreated |         0     20,705 |    20,705
   Treated |     2,967      7,074 |    10,041
-----------+----------------------+----------
     Total |     2,967     27,779 |    30,746


Test Covariate Balance

•  pstest age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw


----------------------------------------------------------------------------
                        |       Mean               %reduct |     t-test
    Variable     Sample | Treated  Control   %bias  |bias| |      t    p>|t|
------------------------+----------------------------------+----------------
     age5565  Unmatched |  .23454    .1064    34.6         |  30.08   0.000
                Matched |  .23268   .21233     5.5    84.1 |   2.91   0.004
                        |                                  |
     age6575  Unmatched |  .24529   .20246    10.3         |   8.56   0.000
                Matched |  .32075   .29403     6.4    37.6 |   3.44   0.001
                        |                                  |
     age7585  Unmatched |  .18534   .36513   -41.1         | -32.62   0.000
                Matched |  .25163   .24908     0.6    98.6 |   0.35   0.727
                        |                                  |
       age85  Unmatched |  .03077   .24733   -65.9         | -48.45   0.000
                Matched |  .04368   .04227     0.4    99.3 |   0.41   0.678
                        |                                  |
      female  Unmatched |  .34439   .52664   -37.4         | -30.49   0.000
                Matched |  .41532   .40811     1.5    96.0 |   0.87   0.384
                        |                                  |
    nonwhite  Unmatched |   .2196   .18933     7.5         |   6.24   0.000
                Matched |  .25205   .22533     6.6    11.7 |   3.73   0.000
                        |                                  |
         mq1  Unmatched |  .38104   .10558    67.8         |  60.40   0.000
                Matched |  .25516   .27453    -4.8    93.0 |  -2.61   0.009
                        |                                  |
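The %bias column in the pstest output is a standardized difference in means. The sketch below assumes the usual definition (difference in means as a percentage of the square root of the average of the two group variances) and, for a 0/1 covariate, approximates each variance by p(1 - p). Under those assumptions it reproduces the unmatched age5565 row to rounding.

```python
import math

def binary_variance(p):
    """Variance of a 0/1 indicator with mean p."""
    return p * (1.0 - p)

def standardized_bias(mean_treated, mean_control, var_treated, var_control):
    """Standardized percentage bias between treated and control means."""
    pooled_sd = math.sqrt((var_treated + var_control) / 2.0)
    return 100.0 * (mean_treated - mean_control) / pooled_sd

# age5565, unmatched sample: means copied from the pstest output above
p_t, p_c = 0.23454, 0.1064
bias = standardized_bias(p_t, p_c, binary_variance(p_t), binary_variance(p_c))
print(round(bias, 1))  # prints 34.6
```

Unlike the t-test p-values, this measure does not shrink or grow with sample size, which makes it easier to compare balance before and after matching.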

Stata Bootstrap Step

•  bootstrap r(att), reps(500): psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died) logit


Stata Bootstrap Results

Bootstrap replications (500)

Bootstrap results                               Number of obs      =     30746
                                                Replications       =       500

      command:  psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died) logit
        _bs_1:  r(att)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |  -.0763359   .0035432   -21.54   0.000    -.0832804   -.0693914
------------------------------------------------------------------------------

Final Results


R Code

•  Use the R package “MatchIt”
   –  install.packages("MatchIt")
   –  library(MatchIt)
•  Also use the R package “Zelig”
   –  install.packages("Zelig")
   –  library(Zelig)
•  MatchIt does the analysis, Zelig gives you confidence intervals


Conclusions

•  Among all patients with AMI there is a mortality risk difference of 14.4 percentage points (1.3% vs. 15.7%) for patients who received PTCA relative to those who did not

•  Of this, approximately half of the risk difference (7.6 percentage points) is attributable to PTCA

•  The rest is attributable to differences in the patient populations


Propensity Scores Cannot …

•  Guarantee balance of characteristics that are not modeled explicitly

•  Capture the effects of being “at risk” if there has been no clinical manifestation

•  Replace randomized trials for efficacy studies

Propensity Scores Can …

•  Achieve a balance not obtainable through matching or stratification

•  Achieve greater balance than randomization for recorded factors

•  Be easily implemented in large database research
•  Be very useful in health services research

Homework

•  What is the attributable cost of SSI in liver transplantation?

•  Are there imbalanced covariates between patients with and without SSI?

•  Perform a propensity score matching analysis on the liver transplant data
   –  Perform a 1:1 match
   –  Sample with replacement
   –  Compute the ATT
   –  Bootstrap 500 iterations to get the 95% confidence interval
