Quality Measurement: Is the Information Sound Enough to be Used by Decision Makers?
Cheryl L. Damberg, Ph.D., Director of Research, Pacific Business Group on Health
AcademyHealth: June 8, 2004

Reframed Question… How good is good enough? For use by whom, for what purposes?
Purchasers--changes in plan design to reward higher-quality, more efficient providers; steer enrollment (premiums, out-of-pocket costs)
Plans--incentive payments, tiering, narrow networks, channeling, or centers of excellence
Consumers--to guide treatment choices
Providers--quality improvement

How Good is Good Enough?
We don't know what the right standard is
Should standards apply in the same way to all end users?
What are the dangers of "noisy" information? Deming's Toyota studies (Six Sigma) showed that when noisy performance information was fed back:
  Increased variation, decreased quality
  Disorienting; people lost their natural instinct for how to improve
How do we make optimal decisions in the face of uncertainty? Decision-theory analysis could help to inform these questions; research is needed in this area

Reality Check! What information?
Measures exist--few are implemented routinely or universally
Most providers have no clue what their performance is: "I'm following guidelines; it is someone else who isn't"
Is the current information better than no information?
  Absent information, choice is like a flip of the coin (50:50)
  Decisions will still be made with no information or poor information
  Default position is to base decisions solely on price
Consequences differ
  Patient--inconvenience for little gain in outcome
  Provider--ruined reputation, livelihood

What's Currently Going On Out There in Measurement? Two ends of the extreme…examples
Commercial vendors
  Using administrative data, often with poor case-mix adjustment
  Omitted variables that can lead to biased results
  Poor handling of missing data
  Rank-ordering problems that lead the end user to an incorrect decision
Research-level work
  Producing shrinkage estimates to address the noise problem without thinking about issues of underlying data quality (a sketch of the shrinkage idea follows)
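
For readers unfamiliar with shrinkage, here is a minimal sketch of the idea the slide refers to: pulling each provider's raw rate toward the overall mean in proportion to how little data supports it. The function and its default prior strength are illustrative assumptions, not the estimator any particular vendor or researcher used.

```python
import numpy as np

def shrink_rates(events, n, prior_strength=None):
    """Shrink each provider's raw rate toward the overall mean rate.

    events, n: per-provider event counts and denominators.
    prior_strength: pseudo-observations given to the overall mean;
    the mean denominator is used as a rough default (an assumption).
    """
    events = np.asarray(events, dtype=float)
    n = np.asarray(n, dtype=float)
    overall = events.sum() / n.sum()  # grand mean rate
    k = prior_strength if prior_strength is not None else n.mean()
    # Small-n providers are pulled strongly toward the overall rate;
    # large-n providers keep estimates close to their raw rates.
    return (events + k * overall) / (n + k)

# A 10-patient group and a 500-patient group with similar raw rates
print(shrink_rates([1, 45], [10, 500]))
```

The slide's caution stands either way: shrinkage tames noise, but it cannot repair flawed underlying data.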

Where in the Measurement Process Can Things Go Wrong?
[Diagram: failure points at each stage of the measurement process]
Measures--importance; validity; reliability; link to outcomes
Implementation--poor data; small "n"
Display/Reporting--will the end user draw the correct conclusion based on how results are reported?

Data: The Next Generation…

Underlying Problem of Data Quality
One of the greatest threats to the validity of performance results is the data that "feed" the measures
Even if a quality measure is good (i.e., reliable, valid), it can still produce a bad ("biased") result if the data used to score performance are flawed, or if the source of data omits key variables important in predicting the outcome

Example 1: Risk-Adjusted Hospital Outcome for Bypass Surgery--CA CABG Mortality Reporting Program (CCMRP)
70 hospitals submitted data in 1999
Concern about comparability across hospitals in coding
  Potential impact on hospital scores
  Importance of "getting it right" given public reporting
38 hospitals selected for audit
  Focused on outliers or near-outliers, with random selection in the middle; oversampled high-risk cases
2,408 cases audited
Inter-rater reliability 97.6% (range: 95-99%; Cohen's kappa), computed as sketched below
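
As a hedged illustration of that reliability check, the snippet below computes Cohen's kappa for two raters' codes. The toy acuity labels are invented for the example; the deck does not publish the audit's case-level data.

```python
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: inter-rater agreement corrected for chance."""
    cats = sorted(set(rater_a) | set(rater_b))
    idx = {c: i for i, c in enumerate(cats)}
    m = np.zeros((len(cats), len(cats)))
    for a, b in zip(rater_a, rater_b):
        m[idx[a], idx[b]] += 1          # build the confusion matrix
    n = m.sum()
    p_obs = np.trace(m) / n                          # observed agreement
    p_exp = (m.sum(axis=1) @ m.sum(axis=0)) / n**2   # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

# Invented toy acuity codes: hospital submission vs. audit re-abstraction
submitted = ["Elective", "Urgent", "Urgent", "Emergent", "Elective"]
audited   = ["Elective", "Urgent", "Elective", "Emergent", "Elective"]
print(cohens_kappa(submitted, audited))  # ~0.69 on this toy data
```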

Table 1: Comparison of Audited Data and CCMRP Submissions for Acuity, All Hospitals, 1999 Data
(rows: acuity as submitted to CCMRP; columns: acuity per audit)

CCMRP Data   Elective   Urgent   Emergent   Salvage    Total
Elective          447      431          7         1      886
Urgent            140      911         53         4    1,108
Emergent           16      117        199         3      335
Salvage             1       18         29         4       52
Total             604    1,477        288        12    2,381
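
A quick check the reader can reproduce: the diagonal of Table 1 counts the cases where submitted and audited acuity agree, which recovers the 65.6% acuity agreement figure reported on the next slide.

```python
import numpy as np

# Table 1 cross-tabulation: rows = CCMRP-submitted acuity, columns = audited acuity
ccmrp_vs_audit = np.array([
    [447, 431,   7,  1],   # Elective
    [140, 911,  53,  4],   # Urgent
    [ 16, 117, 199,  3],   # Emergent
    [  1,  18,  29,  4],   # Salvage
])
agreement = np.trace(ccmrp_vs_audit) / ccmrp_vs_audit.sum()
print(f"Acuity agreement: {agreement:.1%}")  # ~65.6%, as the audit reported
```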

Results of Audit
Revealed downcoding and upcoding problems
Worst agreement: acuity (65.6%), angina type (65.4%), angina class (45.8%), MI (68.3%), and ejection fraction (78.0%)
Missing data: incorrect classification of risk based on the policy of replacing missing values with the lowest-risk category
  Ejection fraction (15.8%), MI (38.1%)

Table 2: Agreement Statistics, All Hospitals, 1999 Data

Variable                                Records   Missing   % Missing Values   % Agreement   % Lower-Triangle
                                        Audited   Values    That Would Be                    Severity-Weighted
                                                            Incorrectly                      Disagreement
                                                            Classified
Acuity                                    2,408        2       100.00             65.56          64.36
Angina Type (Stable/Unstable)             2,408        0           NA             65.37          34.73
Angina (Yes/No)                           2,408        0           NA             86.21          42.47
CCS Angina Class                          2,408      105        79.05             45.76          53.19
Congestive Heart Failure                  2,408       31        38.71             82.23          32.94
COPD                                      2,408        6         0.00             86.34          73.25
Creatinine (mg/dl)                        2,408      556         3.96             93.31          56.37
Cerebrovascular Disease                   2,408        3         0.00             87.67          45.79
Dialysis                                  2,408       91         0.00             98.13          86.67
Diabetes                                  2,408        3         0.00             94.73          45.67
Ejection Fraction (%)                     2,408      228        15.79             78.95          60.27
Method of measuring ejection fraction     2,408      406         0.00             74.34          Not calculated
Hypertension                              2,408        7        85.71             84.39          40.43
Time from PTCA to surgery                   125       45        42.22             78.40          12.50
Left Main Stenosis                        2,408      388         7.22             85.96          51.46

Results of Audit (continued)
Classification of some hospitals as outliers may be a result of coding deficiencies
When the model was re-run, saw changes in statistical significance and/or risk differential
Death (outcome variable)--small levels of disagreement can change a hospital's rating
Change in rankings:
  1 hospital moved from "no different" to "better than"
  6 hospitals moved from "worse than" to "no different"
  1 hospital moved from "no different" to "worse than"

Impact on Fitted Model Characteristics when Replacing Audited Records with Information from Audit, 1999 Data

                                   Model: CCMRP Data           Model: CCMRP Data and Audited Data
                                                               Where Record Was Audited
Variable                        Estimate  p-value     OR*      Estimate  p-value     OR
Intercept                         -7.74     0.00                 -9.11     0.00
Creatinine (mg/dl)                 0.18     0.00     1.20         0.01     0.15     1.01
Congestive Heart Failure           0.38     0.00     1.46         0.55     0.00     1.73
Hypertension                       0.14     0.18     1.15         0.23     0.04     1.25
Dialysis                           0.39     0.18     1.47         1.24     0.00     3.45
Diabetes                           0.19     0.04     1.21         0.25     0.01     1.29
Acuity:
  Elective                      Reference group               Reference group
  Urgent                           0.26     0.02     1.29         0.33     0.00     1.39
  Emergent                         1.24     0.00     3.46         1.33     0.00     3.77
  Salvage                          2.46     0.00    11.71         3.11     0.00    22.46
Fit statistics:
  R²                               0.188                         0.202
  c-statistic                      0.818                         0.833
  Hosmer-Lemeshow χ² (p-value)     9.303 (0.317)                23.068 (0.003)
*OR = odds ratio
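
For orientation, here is a minimal sketch of the kind of logistic risk-adjustment model summarized above, written against a hypothetical case-level DataFrame. The column names are invented, and this is a sketch of the general technique, not the CCMRP's actual specification.

```python
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

def fit_risk_model(cases: pd.DataFrame):
    """Fit a logistic risk-adjustment model for operative mortality.

    `cases` is assumed to have a 0/1 `death` column plus risk-factor
    columns; all names here (creatinine, chf, acuity, ...) are invented.
    """
    model = smf.logit(
        "death ~ creatinine + chf + hypertension + dialysis + diabetes"
        " + C(acuity, Treatment(reference='Elective'))",
        data=cases,
    ).fit()
    # c-statistic: probability the model ranks a death above a survivor
    c_stat = roc_auc_score(cases["death"], model.predict(cases))
    return model, c_stat
```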

Steps Taken to Safeguard Against Getting It Wrong
Audit; data cross-validation
Training on coding of variables; support to hospital coders
Display of confidence intervals
  Small hospital with zero deaths (CI: 0.0%-10.0%); see the sketch below
Combine data over multiple years
  Generates more stable estimates for small-volume hospitals
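
A hedged illustration of that zero-death interval: an exact (Clopper-Pearson) binomial confidence interval. The 36-case volume below is an assumption chosen so the result roughly reproduces the slide's 0.0%-10.0% example; the deck does not state the hospital's actual volume.

```python
from scipy.stats import beta

def exact_binomial_ci(deaths, n, level=0.95):
    """Clopper-Pearson (exact) confidence interval for a rate of deaths/n."""
    a = 1 - level
    lo = 0.0 if deaths == 0 else beta.ppf(a / 2, deaths, n - deaths + 1)
    hi = 1.0 if deaths == n else beta.ppf(1 - a / 2, deaths + 1, n - deaths)
    return lo, hi

# Hypothetical small hospital: 0 deaths in 36 CABG cases -> roughly (0.0, 0.10)
print(exact_binomial_ci(0, 36))
```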

Example 2: Pay for Performance
Plan payouts to medical groups reward those groups that rank at the 75th percentile or higher
Rank-ordering problems (see the simulation sketch below)
  Medical groups with estimates based on small "n" (i.e., noisy) are more likely to fall in the top or bottom part of the distribution
  Straight ranking ignores uncertainty in the estimates
Potential for rewarding the wrong players
  Rewarding noise, not signal
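
To make the rank-ordering point concrete, here is a small simulation sketch: every group has identical true quality, yet the small-panel groups dominate the extremes of the observed ranking. The panel sizes and group counts are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rate = 0.80                      # identical true quality for every group
sizes = [30] * 50 + [500] * 50        # 50 small and 50 large panels (invented)

# Observed rates differ only through sampling noise
observed = np.array([rng.binomial(n, true_rate) / n for n in sizes])
top10 = np.argsort(observed)[-10:]    # the 10 best-looking groups

small_in_top10 = sum(sizes[i] == 30 for i in top10)
print(f"{small_in_top10}/10 of the apparent top performers are small panels")
```

Under straight ranking at the 75th percentile, these noise-driven leaders would capture the payout.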

Example 3: Individual Physician Performance Measurement
Small "n" problem
  Physician lacks enough events (e.g., diabetics) to score him/her at the level of the individual indicator
  Estimates at the indicator level are noisy (large SEs)
Need to pool more information on the physician's performance across conditions to improve the signal-to-noise ratio
  Create summary scores (e.g., RAND QA Tools); a pooling sketch follows
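
As a sketch of the pooling idea only (not the actual RAND QA Tools scoring method), the function below aggregates pass/fail events across all of a physician's indicators into one summary percentage; the indicator names are hypothetical.

```python
def summary_score(indicator_results):
    """Pool pass/fail events across indicators into one summary score.

    indicator_results maps indicator name -> (numerator, denominator).
    Pooling across conditions yields a far larger effective "n" per
    physician than any single indicator can offer.
    """
    passed = sum(num for num, _ in indicator_results.values())
    eligible = sum(den for _, den in indicator_results.values())
    return passed / eligible if eligible else None

# Hypothetical physician with too few events on any one indicator
print(summary_score({
    "hba1c_tested": (7, 9),
    "bp_controlled": (4, 6),
    "beta_blocker_post_mi": (3, 4),
}))
```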

Can We Proceed?
OK to start with Version 1.0 of the measures
  Means of soliciting feedback
  Helps drive improvement in measurement
  Won't get it perfect on the first attempt
Important to safeguard against possible mistakes in classifying
  Check validity of data (audit, cross-validate)
  Assess extent of disagreement
  Perform sensitivity analyses

Hedging Against Uncertainty
Report conservatively so as not to mislead (convey the level of certainty in the estimate)
  Rank ordering--small groups may rank in either the highest or lowest part of the distribution, yet we are most uncertain of their true performance
Cruder binning (categorization) when faced with more uncertainty or when the consequences are higher (a binning sketch follows)
Use measures as a tool to identify bottom performers, then send out teams to find out what is going on, as a way to validate
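
One hedged way to implement that cruder binning, assuming a confidence interval is available for each group (the function and the benchmark value are illustrative, not a method the deck prescribes):

```python
def bin_performance(ci_low, ci_high, benchmark):
    """Three-bin categorization that respects estimate uncertainty.

    A group is labeled above or below the benchmark only when its entire
    confidence interval clears it; anything ambiguous lands in the middle.
    """
    if ci_low > benchmark:
        return "above benchmark"
    if ci_high < benchmark:
        return "below benchmark"
    return "no different from benchmark"

# The zero-death small hospital from earlier, against an invented 3% benchmark
print(bin_performance(0.00, 0.10, 0.03))  # -> "no different from benchmark"
```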

Measurement Issues Remain
Existing measures
  OK, but difficult to implement (many rely on chart review)
Hospital performance
  Complexity of what to measure (service line vs. overall)
Physician performance
  Small "n" problem; challenges of pooling data
Comprehensive assessment is important, but too much information will overwhelm end users
  Need for summary measures
Need to improve data systems

Why Do We Need to Fill the Gaps?
Lack of information and transparency
  Hard to improve if you don't know where the problem is
  Continues rewarding the status quo
Need to increase competition to improve quality and contain costs
  Information is vital for competitive markets to operate