Lecture9-PropensityScore - Pennsylvania State University · 2015. 8. 28. · propensity score...
8/28/15
Lecture 9: Propensity Score Analysis
Christopher S. Hollenbeak, PhD
Jane R. Schubart, PhD
The Outcomes Research Toolbox
Meta-analysis Homework - Stata
cd /Volumes/Hollenbeak/Teaching/Residents/OutcomesResearch/
clear
insheet using "blockers.csv"
// Non-events in each arm (total minus events)
generate net = nt-rt
generate nec = nc-rc
// Mantel-Haenszel (odds ratio)
metan rt net rc nec, or
// Peto
metan rt net rc nec, or peto
// DerSimonian-Laird
metan rt net rc nec, or random
// Mantel-Haenszel (relative risk)
metan rt net rc nec, rr
Meta-analysis Homework - R
library(rmeta)
setwd("/Volumes/Hollenbeak/Teaching/Residents/OutcomesResearch/Round4/")
dat1 <- read.csv("blockers.csv")

# Mantel-Haenszel analysis
# Arguments: n treated, n control, events treated, events control
ma1 <- meta.MH(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="RR")
summary(ma1)
plot(ma1)
ma2 <- meta.MH(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="OR")
summary(ma2)
plot(ma2)

# DerSimonian-Laird analysis
ma3 <- meta.DSL(dat1$nt, dat1$nc, dat1$rt, dat1$rc, names=dat1$study, statistic="OR")
summary(ma3)
plot(ma3)
R Code for Forest Plot from Scratch
library(rmeta)  # provides forestplot() and meta.colors()

# Text columns: study labels and formatted hazard ratios
tabletext <- cbind(
  c("Study", "Botteri 2013", "Sorenson 2013", "Ganz 2011", "Melhem 2011", "Powe 2010", NA, "Summary"),
  c("HR", "0.32 (0.12 - 0.88)", "0.91 (0.82 - 1.01)", "0.86 (0.57 - 1.31)",
    "0.52 (0.31 - 0.88)", "0.43 (0.2 - 0.93)", NA, "0.67 (0.39 - 1.13)"))

# Point estimates and confidence limits, in the same row order as tabletext
# (values set to agree with the text labels)
m <- c(NA, 0.32, 0.91, 0.86, 0.52, 0.43, NA, 0.67)
l <- c(NA, 0.12, 0.82, 0.57, 0.31, 0.20, NA, 0.39)
u <- c(NA, 0.88, 1.01, 1.31, 0.88, 0.93, NA, 1.13)

forestplot(tabletext, m, l, u, zero=1,
  is.summary=c(TRUE, rep(FALSE,5), TRUE, TRUE), align=c(1, 3),
  col=meta.colors(box="black", line="gray", summary="gray80", background="white"))
[Figure: forest plot of hazard ratios (HR) for Botteri 2013, Sorenson 2013, Ganz 2011, Melhem 2011, and Powe 2010, with the summary estimate 0.67 (0.39 - 1.13); x-axis from 0.2 to 1.4]
Background
• The primary criticism of research that utilizes observational data is selection bias
• Because treatments are not assigned in a random fashion (as in randomized trials) we can’t be sure that estimated treatment effects are due to treatment
• There may be other variables that drive both treatment and outcomes
Example: PTCA after AMI
• Suppose we are interested in knowing whether it is better to receive percutaneous transluminal coronary angioplasty (PTCA) after acute myocardial infarction (AMI)
• A randomized trial would randomize some patients to receive PTCA, and some to be medically managed
• But suppose all we have is observational data
Example: PTCA after AMI
• Data are from PHC4 (the Pennsylvania Health Care Cost Containment Council) and include all discharges with an AMI diagnosis
• We can identify PTCA using ICD-9 procedure codes
• Compare outcomes of patients with PTCA to those without, whom we will say are medically managed
PTCA after AMI
• Which has better outcomes?
• PTCA has an odds ratio of 0.07!!
• Does the use of PTCA account for all of the difference in mortality?
• Would we get this same result in a randomized, controlled trial?
• Probably not if patients are preferentially selected for PTCA
– Evidence of selection bias is imbalanced covariates
Imbalanced Covariates
• We have to consider potential selection bias
– Who are the patients getting PTCA? Are they younger? Healthier? If so, not all of this effect should be attributed to the PTCA
• Issue is called “imbalanced covariates”
– Hints at selection bias
• Compare characteristics of groups
Covariates
Problem
• Patients getting PTCA are younger and less sick than medically managed patients
• No wonder their mortality is lower
– Not a fair comparison!
• What would the mortality improvement be if PTCA patients were compared to similar medically managed patients?
Getting a Fair Comparison
• Randomized trial
– Balanced covariates
• Case-control matching
• Propensity score methods
Randomized Controlled Trial
• Not always possible or ethical
• Can be very costly
• Does not achieve matching, but balances all potential covariates on average, measured or not
Case-Control Matching
• Could match PTCA cases to medically managed cases
– Popular epidemiology approach
• The number of matching groups rises exponentially, for example:
– Age (10 categories) x
– Sex (2 categories) x
– Prior diagnoses (5 @ 2 categories each)
→ 10 x 2 x 2^5 = 640 potential matching groups
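The growth in matching groups can be checked with a line of arithmetic. The sketch below takes only the categories listed above as given (an assumption; the original slide may have counted further factors) and multiplies the number of levels of each matching variable. The variable names are illustrative.

```r
# Levels of each exact-matching variable, as listed on the slide
levels_per_var <- c(age = 10, sex = 2, prior_dx = rep(2, 5))

# Each unique combination of levels is one potential matching group
n_cells <- prod(levels_per_var)
n_cells

# Every additional binary covariate doubles the number of groups
cells_by_extra_binary <- n_cells * 2^(0:5)
cells_by_extra_binary
```

With a fixed number of patients, each doubling leaves fewer patients per group, which is why exact matching on many covariates quickly runs out of usable matches.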
Propensity Score Methods
• Akin to a retrospective randomized controlled trial
• Retrospectively matches cases to controls
• Only compare patients with a similar propensity for treatment
• Two Steps:
– 1. Estimate propensity for treatment
– 2. Match on propensity and compare matched groups
Step 1: Estimate Propensity Score
• Use logistic regression to predict who gets treatment
• Compute the predicted probability for each patient
– Predicted probability = Propensity Score

$$\Pr(y_i = 1) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$
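Step 1 can be sketched in base R with glm(). The data here are simulated and the variable names (treat, age, female) are hypothetical stand-ins, not the AMI data; the point is only to show that the propensity score is the fitted probability from the logistic model, and that it agrees with the formula above when computed by hand.

```r
set.seed(9)

# Simulated cohort (hypothetical): older patients are less likely to be treated
n      <- 2000
age    <- round(runif(n, 40, 90))
female <- rbinom(n, 1, 0.45)
treat  <- rbinom(n, 1, plogis(3 - 0.06 * age - 0.3 * female))
dat    <- data.frame(treat, age, female)

# Step 1: logistic regression of treatment on the covariates
ps_model <- glm(treat ~ age + female, family = binomial(link = "logit"), data = dat)

# Predicted probability of treatment = propensity score
dat$pscore <- predict(ps_model, type = "response")

# The same value by hand, from the estimated coefficients
b      <- coef(ps_model)
lp     <- b["(Intercept)"] + b["age"] * dat$age + b["female"] * dat$female
manual <- exp(lp) / (1 + exp(lp))

summary(dat$pscore)
tapply(dat$pscore, dat$treat, mean)  # treated patients have higher scores on average
```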
Step 2: Match Cases and Controls
Step 2: Propensity Score Methods
• There are a few options for the second step of the propensity score analysis
1. Propensity Score Matching: Match cases to controls on the propensity score
2. Stratified Analysis: Stratify results on quantiles (usually quintiles) of the propensity score
3. Kernel Matching: Weight all observations on some function of the propensity score
4. Regression: Include the propensity score as a covariate in a regression analysis
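As an illustration of option 2, the following is a minimal sketch of a stratified analysis on propensity score quintiles. The data are simulated (age is a hypothetical confounder that affects both treatment and death), and pooling the strata with weights proportional to stratum size is one common choice, assumed here rather than taken from the slides.

```r
set.seed(9)

# Simulated cohort (hypothetical): age drives both treatment and death
n     <- 2000
age   <- runif(n, 40, 90)
treat <- rbinom(n, 1, plogis(3 - 0.06 * age))
died  <- rbinom(n, 1, plogis(-6 + 0.07 * age - 0.5 * treat))
dat   <- data.frame(treat, age, died)

# Step 1: propensity score from a logistic regression
dat$pscore <- fitted(glm(treat ~ age, family = binomial, data = dat))

# Option 2: stratify on quintiles of the propensity score
breaks      <- quantile(dat$pscore, probs = seq(0, 1, by = 0.2))
dat$stratum <- cut(dat$pscore, breaks = breaks, include.lowest = TRUE, labels = 1:5)

# Risk difference (treated minus control) within each stratum
risk_diff <- sapply(split(dat, dat$stratum), function(s) {
  mean(s$died[s$treat == 1]) - mean(s$died[s$treat == 0])
})

# Pool the strata, weighting by stratum size
weights <- as.numeric(table(dat$stratum)) / nrow(dat)
pooled  <- sum(risk_diff * weights)

crude <- mean(dat$died[dat$treat == 1]) - mean(dat$died[dat$treat == 0])
c(crude = crude, stratified = pooled)
```

Within a stratum, treated and control patients have similar propensity scores, so the within-stratum comparison is less affected by the confounder than the crude comparison.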
Propensity Score Matching
• Propensity score matching selects a matched group of controls, where matching is done on the propensity score
• Instead of matching on numerous covariates, matches on a single variable
Options for Matching
• k-Nearest neighbor
– Match k:1
– Select control(s) with closest propensity score
• Caliper restriction
– Require that the nearest neighbor is no more than some distance from case
• Sample with or without replacement
– With replacement means a patient may serve as a control for more than one case
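The three options can be seen together in a small greedy matcher. This is a sketch, not the algorithm of any particular package: match_nn is a hypothetical helper, the propensity scores are made up, and treated patients are processed in data order, so a different sort order can give different pairs.

```r
# Greedy 1:1 nearest-neighbor match on the propensity score.
# Returns, for each treated unit, the row index of its control
# (NA if no control lies within the caliper).
match_nn <- function(pscore, treat, caliper = 0.03, replace = FALSE) {
  treated  <- which(treat == 1)
  controls <- which(treat == 0)
  matched  <- rep(NA_integer_, length(treated))
  for (i in seq_along(treated)) {
    if (length(controls) == 0) break
    d <- abs(pscore[controls] - pscore[treated[i]])
    j <- which.min(d)                         # nearest neighbor
    if (d[j] <= caliper) {                    # caliper restriction
      matched[i] <- controls[j]
      if (!replace) controls <- controls[-j]  # each control used at most once
    }
  }
  data.frame(treated = treated, control = matched)
}

# Toy example: five treated (rows 1-5) and five control (rows 6-10) patients
pscore <- c(0.80, 0.62, 0.40, 0.41, 0.10, 0.79, 0.60, 0.42, 0.20, 0.05)
treat  <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)

pairs_norepl <- match_nn(pscore, treat, caliper = 0.03, replace = FALSE)
pairs_repl   <- match_nn(pscore, treat, caliper = 0.03, replace = TRUE)
pairs_norepl
pairs_repl
```

Without replacement, treated patient 4 goes unmatched because the only control within the caliper (row 8) was already used by patient 3; with replacement, row 8 serves both. Patient 5 is unmatched either way because no control is within 0.03.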
Adequacy of Match
• After matching, need to test whether covariates are balanced
• Simple t tests and chi-square tests are adequate
• Stata has functions for this (pstest, installed with psmatch2)
• If important covariates are still not balanced, consider
– Changing the matching method (shrink caliper, etc.)
– Another propensity score method
Propensity Score Matching
• The outcome for a propensity score matching analysis is the average effect of treatment on the treated (ATT)
– The average treatment effect for just those individuals who were “treated”
• This is estimated from the propensity score matched groups

$$\text{ATT} = E(y_1 \mid X, T = 1) - E(y_0 \mid X, T = 1)$$
Bootstrapping
• Usual statistics after matching cannot be trusted
• The standard errors (and therefore confidence intervals and p-values) are incorrect
– Don’t account for uncertainty of step 1 and step 2
• Solution is to bootstrap the data
– Resamples the data (with replacement)
– Repeats step 1 and step 2
– Gives correct standard errors
– 500 to 1000 iterations is preferred
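The bootstrap loop can be sketched in base R as follows. Everything here is illustrative: the data are simulated, att_once is a hypothetical helper that reruns both steps (propensity model, then a greedy 1:1 caliper match without replacement), and the number of replications is kept below the suggested 500 to 1000 only so the example runs quickly.

```r
set.seed(9)

# Simulated cohort (hypothetical): age drives both treatment and death
n     <- 600
age   <- runif(n, 40, 90)
treat <- rbinom(n, 1, plogis(3 - 0.06 * age))
died  <- rbinom(n, 1, plogis(-6 + 0.07 * age - 0.5 * treat))
dat   <- data.frame(treat, age, died)

# One full pass: step 1 (propensity model), step 2 (greedy 1:1 match within
# a caliper, without replacement), then the ATT as a risk difference
att_once <- function(d, caliper = 0.03) {
  ps       <- fitted(glm(treat ~ age, family = binomial, data = d))
  treated  <- which(d$treat == 1)
  controls <- which(d$treat == 0)
  diffs    <- numeric(0)
  for (i in treated) {
    if (length(controls) == 0) break
    gap <- abs(ps[controls] - ps[i])
    j   <- which.min(gap)
    if (gap[j] <= caliper) {
      diffs    <- c(diffs, d$died[i] - d$died[controls[j]])
      controls <- controls[-j]
    }
  }
  mean(diffs)
}

att_hat <- att_once(dat)

# Bootstrap: resample patients with replacement and repeat both steps
B        <- 200
att_boot <- replicate(B, att_once(dat[sample(nrow(dat), replace = TRUE), ]))

se_boot <- sd(att_boot)
ci_norm <- att_hat + c(-1, 1) * qnorm(0.975) * se_boot
c(att = att_hat, se = se_boot, lower = ci_norm[1], upper = ci_norm[2])
```

Because the propensity model and the match are redone inside every replication, the spread of att_boot reflects the uncertainty of both steps, not just the final comparison.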
Recapping
1. Logistic regression to predict treatment
2. Match cases to controls on the propensity score
3. Compare matched cases and controls to make sure covariates are now balanced, adjust if necessary
4. Compute ATT
5. Bootstrap to get confidence intervals and p-value for ATT
Stata Code
• Stata has a set of routines that automate the propensity score matching
– net search psmatch2 will allow you to install them
• Command syntax is:
– psmatch2 depvar indvar1…indvar2, neighbor(k) caliper(x) noreplace out(outcome)
– psgraph
– pstest indvar1…indvar2
– bootstrap r(att), reps(N): psmatch2…
Example: PTCA versus Medical Management
• Propensity score match
– 1:1 nearest neighbor
– Caliper restriction of 0.03
– Sample without replacement
– Compare mortality rates
• psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died) logit
– The logit option requests a logistic model for the propensity score; psmatch2 uses probit by default
Logistic regression                               Number of obs   =      30746
                                                  LR chi2(14)     =    9489.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -14678.604                       Pseudo R2       =     0.2443

------------------------------------------------------------------------------
        ptca |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     age5565 |  -.2631597   .0491659    -5.35   0.000    -.3595231   -.1667964
     age6575 |  -.5231374   .0523722    -9.99   0.000    -.6257849   -.4204899
     age7585 |  -1.030738   .0554906   -18.58   0.000    -1.139498    -.921979
       age85 |   -2.05675   .0787066   -26.13   0.000    -2.211012   -1.902488
      female |  -.2599268    .029954    -8.68   0.000    -.3186356    -.201218
    nonwhite |  -.0228066   .0356434    -0.64   0.522    -.0926664    .0470533
         mq1 |   1.048819   .1870906     5.61   0.000     .6821279     1.41551
         mq2 |   .6223701   .1897966     3.28   0.001     .2503757    .9943646
         mq3 |  -.3106409   .1925283    -1.61   0.107    -.6879896    .0667077
         mq4 |  -1.492374   .2299972    -6.49   0.000     -1.94316   -1.041587
    emergent |  -.8131428   .0552582   -14.72   0.000    -.9214469   -.7048387
      urgent |  -.6004808   .0577716   -10.39   0.000     -.713711   -.4872505
    transfer |   1.078308   .0391937    27.51   0.000     1.001489    1.155126
          qw |   .8630298   .0291064    29.65   0.000     .8059823    .9200772
       _cons |  -.2195265   .1918366    -1.14   0.252    -.5955193    .1564663
------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
        Variable     Sample |      Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
            died  Unmatched |   .013345284   .157498189  -.144152905   .003721816   -38.73
                        ATT |    .01809443   .094430308  -.076335878   .003821262   -19.98
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
 Untreated |         0     20,705 |    20,705
   Treated |     2,967      7,074 |    10,041
-----------+----------------------+----------
     Total |     2,967     27,779 |    30,746
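The numbers in this output can be cross-checked with a few lines of arithmetic. The sketch below copies the four mortality rates from the table above and recomputes the unmatched risk difference, the ATT, the crude odds ratio (for comparison with the 0.07 quoted earlier), and the share of the crude difference that survives matching. The variable names are illustrative.

```r
# Values copied from the psmatch2 output above
p_treated_all   <- 0.013345284  # mortality, all PTCA patients
p_control_all   <- 0.157498189  # mortality, all medically managed patients
p_treated_match <- 0.01809443   # mortality, matched PTCA patients
p_control_match <- 0.094430308  # mortality, their matched controls

# Unmatched (crude) risk difference and the matched ATT
rd_crude <- p_treated_all - p_control_all
att      <- p_treated_match - p_control_match

# Crude odds ratio
odds     <- function(p) p / (1 - p)
or_crude <- odds(p_treated_all) / odds(p_control_all)

# Share of the crude difference that remains after matching
share <- att / rd_crude

round(c(rd_crude = rd_crude, att = att, or_crude = or_crude, share = share), 3)
```

Note that the ATT applies to the 7,074 treated patients on common support; the 2,967 treated patients without a control inside the caliper do not contribute to it.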
Test Covariate Balance
• pstest age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw
----------------------------------------------------------------------------
                        |       Mean               %reduct |     t-test
    Variable     Sample | Treated   Control  %bias  |bias| |     t    p>|t|
------------------------+----------------------------------+----------------
     age5565  Unmatched |  .23454     .1064   34.6         |  30.08   0.000
                Matched |  .23268    .21233    5.5    84.1 |   2.91   0.004
                        |                                  |
     age6575  Unmatched |  .24529    .20246   10.3         |   8.56   0.000
                Matched |  .32075    .29403    6.4    37.6 |   3.44   0.001
                        |                                  |
     age7585  Unmatched |  .18534    .36513  -41.1         | -32.62   0.000
                Matched |  .25163    .24908    0.6    98.6 |   0.35   0.727
                        |                                  |
       age85  Unmatched |  .03077    .24733  -65.9         | -48.45   0.000
                Matched |  .04368    .04227    0.4    99.3 |   0.41   0.678
                        |                                  |
      female  Unmatched |  .34439    .52664  -37.4         | -30.49   0.000
                Matched |  .41532    .40811    1.5    96.0 |   0.87   0.384
                        |                                  |
    nonwhite  Unmatched |   .2196    .18933    7.5         |   6.24   0.000
                Matched |  .25205    .22533    6.6    11.7 |   3.73   0.000
                        |                                  |
         mq1  Unmatched |  .38104    .10558   67.8         |  60.40   0.000
                Matched |  .25516    .27453   -4.8    93.0 |  -2.61   0.009
                        |                                  |
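The %bias column is a standardized difference in means. The sketch below recomputes it for age5565 from the proportions in the table; pct_bias is a hypothetical helper, and the choice of denominator (the average of the two group variances in the unmatched sample, with the variance of a binary covariate taken as p(1 - p)) is an assumption that happens to reproduce the printed values closely, not a statement of how pstest is implemented.

```r
# Standardized percentage bias for a binary covariate, from group proportions.
# The denominator uses the variances in the unmatched sample (assumption).
pct_bias <- function(p_t, p_c, p_t_unmatched = p_t, p_c_unmatched = p_c) {
  v_t <- p_t_unmatched * (1 - p_t_unmatched)
  v_c <- p_c_unmatched * (1 - p_c_unmatched)
  100 * (p_t - p_c) / sqrt((v_t + v_c) / 2)
}

# age5565: proportions copied from the balance table above
bias_before <- pct_bias(0.23454, 0.1064)
bias_after  <- pct_bias(0.23268, 0.21233, 0.23454, 0.1064)

# Percentage reduction in absolute bias achieved by matching
reduction <- 100 * (1 - abs(bias_after) / abs(bias_before))

round(c(before = bias_before, after = bias_after, reduction = reduction), 1)
```

The results should land close to the 34.6, 5.5, and 84.1 shown for age5565. A small standardized bias can still come with a small p-value in a sample this large, which is one reason to read both columns.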
Stata Bootstrap Step
• bootstrap r(att), reps(500): psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died) logit
Stata Bootstrap Results
Bootstrap replications (500)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
..................................................   450
..................................................   500

Bootstrap results                               Number of obs      =     30746
                                                Replications       =       500

      command:  psmatch2 ptca age5565 age6575 age7585 age85 female nonwhite mq1 mq2 mq3 mq4 emergent urgent transfer qw, neighbor(1) caliper(0.03) noreplace out(died) logit
        _bs_1:  r(att)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |  -.0763359   .0035432   -21.54   0.000    -.0832804   -.0693914
------------------------------------------------------------------------------
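The normal-based interval in this output follows directly from the observed ATT and the bootstrap standard error. A short check, with both values copied from the table above:

```r
# Values copied from the bootstrap output above
att     <- -0.0763359
se_boot <-  0.0035432

# Normal-based test statistic, two-sided p-value, and 95% interval
z  <- att / se_boot
p  <- 2 * pnorm(-abs(z))
ci <- att + c(-1, 1) * qnorm(0.975) * se_boot

round(c(z = z, lower = ci[1], upper = ci[2]), 4)
```

The bootstrap standard error here (.0035432) is close to, but not the same as, the analytic one psmatch2 printed for the ATT (.003821262), which carried the note that it ignores estimation of the propensity score.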
Final Results
R Code
• Use the R package “MatchIt”
– install.packages("MatchIt")
– library(MatchIt)
• Also use the R package “Zelig”
– install.packages("Zelig")
– library(Zelig)
• MatchIt does the analysis, Zelig gives you confidence intervals
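The R code from the original slide is not preserved in this transcript. As a stand-in, here is a hedged sketch of how the same kind of match (1:1 nearest neighbor, caliper 0.03, without replacement) might be requested from MatchIt. It assumes MatchIt version 4 or later, runs on simulated data (only the variable names ptca, age85, female, and died come from the slides), is skipped if the package is not installed, and does not use Zelig.

```r
set.seed(9)

# Simulated stand-in for the AMI data (hypothetical values)
n      <- 2000
age85  <- rbinom(n, 1, 0.2)
female <- rbinom(n, 1, 0.45)
ptca   <- rbinom(n, 1, plogis(-0.3 - 2 * age85 - 0.3 * female))
died   <- rbinom(n, 1, plogis(-2.5 + 1.2 * age85 - 0.8 * ptca))
ami    <- data.frame(ptca, age85, female, died)

if (requireNamespace("MatchIt", quietly = TRUE)) {
  # 1:1 nearest-neighbor match on a logistic propensity score, without
  # replacement; std.caliper = FALSE puts the caliper on the raw
  # propensity score scale rather than in standard deviation units
  m_out <- MatchIt::matchit(ptca ~ age85 + female, data = ami,
                            method = "nearest", distance = "glm", link = "logit",
                            ratio = 1, replace = FALSE,
                            caliper = 0.03, std.caliper = FALSE)

  print(summary(m_out))                  # covariate balance before and after
  matched <- MatchIt::match.data(m_out)  # matched sample only

  # ATT as a risk difference in the matched sample
  att <- mean(matched$died[matched$ptca == 1]) -
         mean(matched$died[matched$ptca == 0])
  print(att)
}
```

A confidence interval for the ATT could then come from bootstrapping the whole procedure, as in the Stata example.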
Conclusions
• Among all patients with AMI there is a mortality risk difference of 14.4 percentage points (1.3% vs. 15.7%) for patients who received PTCA relative to those who did not
• Of this, approximately half of the risk difference (7.6 percentage points) is attributable to PTCA
• The rest is attributable to differences in the patient populations
Propensity Scores Cannot …
• Guarantee balance of characteristics that are not modeled explicitly
• Capture the effects of being “at risk” if there has been no clinical manifestation
• Replace randomized trials for efficacy studies
Propensity Scores Can …
• Achieve a balance not obtainable through matching or stratification
• Achieve greater balance than randomization for recorded factors
• Be easily implemented in large database research
• Be very useful in health services research
Homework
• What is the attributable cost of surgical site infection (SSI) in liver transplantation?
• Are there imbalanced covariates between patients with and without SSI?
• Perform a propensity score matching analysis on the liver transplant data
– Perform a 1:1 match
– Sample with replacement
– Compute the ATT
– Bootstrap 500 iterations to get the 95% confidence interval