Approaching Predictive Modeling
Steven S. Eisenberg, MD
Chief Science Officer
United HealthCare
4th Annual DM Colloquium June 23, 2005
Overview
• Types of predictive modeling
• What predictive modeling can do
• The usual
• The less usual (focus)
• Summary
• Q&A
My Favorite Philosopher on Predictive Models
“The future ain't what it used to be”
What Are We Hoping to Learn?
• From an actuarial perspective
  – More accurately predict utilization and cost of populations
  – Adjunct to better and more accurate pricing decisions
  – Perhaps flattening the actuarial cycle
• From a medical management perspective
  – Identify individuals at very high risk for high utilization
  – Open the door to managing those individuals
    • case management/disease management
    • prior to the high utilization
  – Mitigating some of the impact
    • healthier population
    • lower costs
Leading to Improvements in:
• Disease Management
• Patient Care
• Quality
• Cost Management
The Tools
• A model is a mathematical representation of reality
  – Relevant, consistent input data are needed
  – The outcome must be measurable
  – A way to relate the two mathematically must exist
• Currently well over 100 different models
• For our (healthcare) purposes these really break down into three groups:
  – Artificial Intelligence
  – Statistical Models
  – Rules Based Algorithms
One Size Fits All?
[Figure: ACG's, DCG's, ETG's, ERG's, CCG's, CRG's, CART, others]
What Predictive Modeling Can Do
• Stratify members
  – Primary or secondary
  – Enhance impact of interventions
• Identification of high utilizers
  – Assign risk scores
• Describe comparative severity of illness
• Identify members not receiving proper care/requiring special care
  – Case Management/Disease Management
• Highlight inconsistency/inefficiency of care
• Prospectively identify adverse events
• Allow focused interventions
  – Maximize benefits of disease management
  – Allow intervention earlier in disease cycle
• Financial forecasting – Actuarial risk
Application of Predictive Models
• Identifying/managing complexly ill members (hospitalization avoidance)
• Refining disease management strategies
• Managing pharmacy services (integrated with medical management)
• Underwriting more precisely
• Reimbursement based on illness burden
• Assessing physician management strategies
Additional Uses of Modeling
• Influence adoption of best practices
• Track effectiveness of interventions
• Establish pay for performance
• Set more accurate premiums
• Develop contracts with providers
  – Actuarial
• Help plan network composition
  – Based on member needs
• Develop specific, targeted interventions
  – Probabilities for certain outcomes
  – Practice guidelines
  – Practice standardization
    • Decrease variation
Choosing the Right Model
• There is no one model that does everything the best
• What are you trying to do?
  – Is there a model that fits the problem (or the data) better?
• What is available to you?
  – Can you use whatever model/data you have available?
• What can you afford?
• What are you willing to compromise on?
The “Usual Suspects”
• Most DM programs / Healthplans use grouper rules based algorithms prospectively
  – ERG’s/ETG’s
  – DCG’s
  – ACG’s
• Most Fraud & Abuse programs use
  – Decision trees
  – Rules based algorithms
  – Neural nets
  – Pattern analysis
Some Less Usual Suspects
• There are some “inexpensive” and less cumbersome ways to do some predictive modeling
  – Trend lines
  – Time series
  – Markov Models
  – Pharmacy only models
Statistical Models
Trend Lines = Regression
• Definition: the technique of fitting a simple equation to real data points
• Linear regression is the most common type
  – e.g. y = a + bx + e
• Other Types
  – Multilinear regression
  – Logistic Regression
• It is a mathematical way of assessing the impact and contribution of diverse/disparate variables on a process or outcome
• Linear regression is used for continuous variables
• Logistic regression is used for binomial (binary, yes/no) variables
Trend Lines
• “Poor man’s” predictive model
• Built into Excel
Example
[Figure: Change in $PMPM Cost Over Time]
Doing a Prediction
[Figure: Change in PMPM Cost over Time]
• Double click on the trendline
• Project the trendline 6 months forward
• By doing so you are making the assumption that all the variables are and will remain constant
The Prediction
[Figure: Change in PMPM Cost over Time, with the trendline projected forward]
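The same kind of projection can be sketched outside Excel. The example below is a minimal illustration, not the presenter's workbook: it fits a straight line by ordinary least squares and extends it 6 months forward. The PMPM figures and the function names `fit_trend` and `project` are hypothetical.

```python
def fit_trend(y):
    """Least-squares fit of y = a + b*x, with x = 1, 2, ..., len(y) (month index)."""
    n = len(y)
    x = list(range(1, n + 1))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx               # slope: average change per month
    a = mean_y - b * mean_x     # intercept
    return a, b

def project(a, b, n_observed, months_ahead=6):
    """Extend the fitted line beyond the last observed month."""
    first = n_observed + 1
    return [a + b * m for m in range(first, first + months_ahead)]

# Hypothetical PMPM costs for 12 observed months
pmpm = [210.0, 214.0, 213.0, 219.0, 222.0, 221.0,
        226.0, 230.0, 229.0, 235.0, 238.0, 240.0]

a, b = fit_trend(pmpm)
forecast = project(a, b, len(pmpm))
print([round(v, 2) for v in forecast])
```

As with the Excel trendline, the projection is only as good as the assumption that the underlying drivers stay constant.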
Time Series
• Time series analysis accounts for the fact that data points taken over time may have an internal structure reflecting a pattern or more than one pattern
  – Trend
  – Seasonal variation (seasonality)
• General aspects
  – Trend
    • Systematic linear or (most often) nonlinear component that changes over time and does not repeat, or at least does not repeat within the time range captured by our data
  – Seasonality
    • May have a relationship similar to trend but tends to repeat itself in systematic intervals over time
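One simple way to separate the two components is an additive decomposition: fit a linear trend, then average what is left over for each position in the seasonal cycle. The sketch below assumes a linear trend and a 12-month cycle. It is not the method used for the charts in this presentation, the data are hypothetical, and it produces no confidence interval.

```python
def fit_linear_trend(y):
    """Least-squares fit of y_t = a + b*t for t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    mean_t = (n - 1) / 2
    mean_y = sum(y) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (yt - mean_y) for t, yt in enumerate(y))
    b = sxy / sxx
    a = mean_y - b * mean_t
    return a, b

def seasonal_effects(y, a, b, period=12):
    """Average detrended residual for each position in the seasonal cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for t, yt in enumerate(y):
        sums[t % period] += yt - (a + b * t)
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def forecast(y, steps, period=12):
    """Trend plus seasonal effect for the next `steps` periods."""
    a, b = fit_linear_trend(y)
    season = seasonal_effects(y, a, b, period)
    n = len(y)
    return [a + b * t + season[t % period] for t in range(n, n + steps)]

# Hypothetical monthly pharmacy paid amounts ($ millions), 18 months
paid = [6.1, 5.8, 6.4, 6.2, 6.5, 6.3, 6.0, 6.6, 6.9, 7.2, 7.0, 7.8,
        7.1, 6.9, 7.5, 7.3, 7.6, 7.4]
next_six = forecast(paid, steps=6)
print([round(v, 2) for v in next_six])
```

With only 18 months of data, some calendar months are observed once, so their seasonal estimates are rough. Several full cycles give a steadier picture.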
Common Uses of Time Series
• Economic Forecasting
• Sales Forecasting
• Budgetary Analysis
• Stock Market Analysis
• Yield Projections
• Process and Quality Control
• Inventory Studies
• Workload Projections
• Utility Studies
• Census Analysis
Example: Pharmacy Utilization
• Pharmacy utilization over time – Excel with trend line
[Figure: Total Pharmacy $ Paid by Month, Jan through Jun of the following year (18 months); y-axis $0 to $12,000,000; trend line overlaid]
Example When Analyzed via Time Series
[Figure: monthly series, Jan to Dec, showing the trend component, the seasonality component, and the 95% CI]
Example: The Prediction Using Time Series
Markov Models
  – A probabilistic process over a finite set of possibilities, {S_1, ..., S_k}, usually called its states
    • The model is capable of showing the probability of any given state coming up next, pr(x_t = S_i), and this may depend on the prior history (to t-1).
  – Markov chains date to the early 1900’s; the hidden Markov model variant was introduced in the late 1960’s and early 1970’s
    • Used for a variety of applications in science and technology
  – Markov disease state simulations portray the progression of disease over time
    • It does this by dividing the disease into discrete “states,”
    • specifying the risks of progression per unit time between those states,
    • assigning utilities and costs to each state,
    • and conducting a simulation with a defined end-point.
Markov Models Are Often Represented Graphically
• State Transition Figure
• Transition Probability Matrix
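A transition probability matrix can be put to work directly. The sketch below is a minimal first-order example, in which the next state depends only on the current one. The three severity states, the matrix values, and the baseline member counts are all hypothetical. Row i of the matrix holds the probabilities of moving from state i to each state in one time step, so each row sums to 1.

```python
def step(dist, P):
    """Advance a population distribution one time step: new[j] = sum_i dist[i] * P[i][j]."""
    k = len(P)
    return [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]

def project(dist, P, n_steps):
    """Apply the transition matrix n_steps times."""
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist

# Hypothetical severity states and monthly transition probabilities
states = ["low", "moderate", "high"]
P = [
    [0.90, 0.08, 0.02],   # from low
    [0.10, 0.75, 0.15],   # from moderate
    [0.02, 0.18, 0.80],   # from high
]

baseline = [700.0, 200.0, 100.0]        # members per state today
after_1 = step(baseline, P)             # expected counts next month
after_6 = project(baseline, P, 6)       # expected counts in 6 months

for name, count in zip(states, after_6):
    print(name, round(count, 1))
```

Read at the population level this gives expected member counts per state. Read for a single member, the row for that member's current state gives the probabilities of where the member goes next.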
The Midwest Healthplan Project
A Real World Example of Using Markov Models
Project Overview
• A large Midwest Healthplan wants to understand the movement of members by segments over time and be able to identify future high cost utilizers
  – Leverage their investment in medical management programs/techniques
  – Help control runaway medical inflation
  – Make points with their large employers by showing proactive management
• Wanted to do it without having to buy and support another technology
Project Overview, cont.
• Focusing on certain higher risk parts of their book of business, we developed the Markov Model for them to better understand their historical movement of members from one disease state level (severity) to another over time
  – at a population level
• The model is then run to predict what will occur over the ensuing 6-12 months
• The Healthplan can then de-encrypt and identify these members and reach out to them with case/disease management
  – at an individual level
• Outcomes and costs can be monitored over time
  – Pre-/Post Analysis
  – Matched Cohort Analysis
Diabetes Example – Predicting Member Counts (By Age Band and Gender)
[Figure: predicted member counts, all diabetes]
The Distribution of Disease States
Baseline Population Level
The Markov Model Prediction
Individual Member Transition Prediction
[Figure: state transition diagram for Member # A012556 over states 0 to 4; transition probabilities shown: 73.17%, 20.50%, 6.30%, 0.03%, 0%]
Final Analysis*
* with thanks to Ken Kubisty, Bearing Point Solutions
Pharmacy Risk Groups
• Rules based, member centric
• Uses only pharmacy, demographic, and eligibility data as the inputs
• Developed by Symmetry Health Data Systems
• Assigns a weighted risk score to individuals based on
  – distribution of drugs a member is taking, age, and sex
  – weights differ by:
    • Threshold assumption -- $250K, $100K, $50K, $25K
    • Stop-loss amount is typically used as the cut-off point
• Combines PRG profile and weights
  – represents relative health risk for a member for a future period
Advantages
• Data
  – Availability
  – Cleanliness and accuracy
  – Timeliness
• Cost effective – IT and administration
• Supports more frequent risk assessment
• Predictive accuracy
  – R squared and other predictive measures close to those of claims based systems
Disadvantages
• Pharmacy plus medical claims can improve accuracy – e.g.
  – Members w/ medical use, w/o pharmacy use
  – Conditions where drugs not integral component of treatment
  – Further stratification within a disease
• Incentives – linking risk to specific drug treatments may not provide best incentives for efficient and quality care
• Linking risk to disease prevalence – harder to do without disease categorization
Example
DCCs | PRG Profile | Weight
34604 – Riluzole | Agents to treat ALS | 10.327
32000 – Fluoxetine HCL | Antidepressants, Antianxiety Agents | 0.534
Age-Sex | Age-Sex Group | Weight
Male, 58 | Males, 55 to 64 | 1.031
Prospective Risk Score | | 11.892
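The numbers in this example are consistent with a simple additive score: 10.327 + 0.534 + 1.031 = 11.892. The sketch below reproduces that arithmetic. Treating the score as a plain sum of the applicable weights is an assumption that happens to fit this one example; it is not a description of the vendor's scoring logic, and the dictionary keys are only labels.

```python
from decimal import Decimal

# Weights as shown on the example slide (Decimal avoids binary float rounding)
weights = {
    "34604 - Riluzole (Agents to treat ALS)": Decimal("10.327"),
    "32000 - Fluoxetine HCL (Antidepressants, Antianxiety Agents)": Decimal("0.534"),
    "Male, 58 (Males, 55 to 64)": Decimal("1.031"),
}

# Assumption: the prospective risk score is the sum of the applicable weights
prospective_risk_score = sum(weights.values())
print(prospective_risk_score)
```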
The DHS Pilot Project
A Real World Example of Using PRG’s
State of MN, Department of Human Services
• Desire to extend disease management to FFS Medicaid population
  – ~100,000
  – High risk population
    • High morbidity/Chronic Illness
    • Very low income
    • Distrust of managed care
• Need to demonstrate to legislature that concepts work for this population
  – Establish the opportunity for a formalized DM approach to this population
  – Collect a series of success stories
  – Provide the data and the stories to the legislature
The Approach
• Use the tool as the first pass to provide the basic output file
  – Rank order of patients by prospective risk
• Analyze the medical history of the highest risk members
  – Create a clinical vignette of their medical history
  – ? Focus on those conditions and diseases that have a track record of success in disease or case management
• Focus only on the top few percent of highest risk members
  – About 250 for the pilot project
The Pilot
• Members included
  – Medicaid FFS only
  – Continuously enrolled for at least 18 months
• Members excluded
  – Primary serious mental health diagnoses
  – Members in skilled or unskilled nursing homes
    • Primary concern is cognition
• Need for a short time frame
  – Program began mid January ’04
  – Program ended mid June ’04
• Final Dataset – 14,443 members
  – Members with highest prospective risk score had a complete claims dump for the prior 18 months
  – Highest 2% underwent detailed claims analysis
Results – Top 24 Patients
• 11 females, 13 males
  – Range 20-63 y.o.
    • Average 45.4 y.o.
  – Costs for 18 months
    • $5,432 - $491,331
    • Average $117,945
  – Total # of Claims/Pt
    • 195 – 2,531
    • Average 1,129
• Diagnoses
  – Diabetes - 11
  – Chronic Renal Failure/ESRD - 9
  – Post kidney transplant - 5
  – HIV positive - 5
  – AIDS - 2
  – Cystic Fibrosis - 3
  – Active malignancy - 2
  – Smokers - 5
Clinical Vignettes
• Member #1 is a 30 year old woman with long standing Cystic Fibrosis. She has problems with malaise, fatigue, skin disease, and hair loss as well as multiple dislocated vertebrae in her neck. She had a very rocky 18 month course with multiple recurring episodes of pneumonia requiring hospitalization as well as multiple episodes of dehydration and bouts of painful Herpes Simplex.
• Her prospective risk score was over 27 and she had the highest total expenditure in the dataset of $491,000 for the 18 month time period.
Summary and Conclusions
• Predictive Modeling is a tool
  – It is a method, not an answer in itself
  – Modeling is only an arrow to add to the quiver; it is not the whole quiver
• Consider the use of multiple models
  – Just as multiple forms of assessment are done for diagnosis
  – May increase reliability and accuracy
• Predictive modeling is also a way to better understand your data accuracy
  – and conversely where you have problems with your data
Challenges of Predictive Modeling
• All of the models are more accurate at the aggregate (population) level than at the individual level
  – Most results published are at the population level
  – Population level may work well for actuarial
  – Medical Mgmt is typically focused on the individual
• You can adjust (improve) the results by changing the threshold, the specificity, sensitivity, etc.
  – Models demonstrate better R squared values when outliers are excluded
    • e.g. Stop-loss amounts
  – But the outliers may be exactly the members that you are trying to find to have the impact you are looking for
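The effect of moving the threshold can be illustrated with a small sketch. Members whose risk score is at or above the threshold are flagged, and the flags are compared with who actually turned out to be a high utilizer. The scores, the outcomes, and the function name `sensitivity_specificity` are hypothetical and only meant to show the trade-off.

```python
def sensitivity_specificity(scores, actual_high, threshold):
    """Flag members with score >= threshold and compare against actual outcomes."""
    tp = fp = tn = fn = 0
    for score, is_high in zip(scores, actual_high):
        flagged = score >= threshold
        if flagged and is_high:
            tp += 1          # correctly flagged high utilizer
        elif flagged and not is_high:
            fp += 1          # flagged, but did not become a high utilizer
        elif not flagged and is_high:
            fn += 1          # missed high utilizer
        else:
            tn += 1          # correctly left unflagged
    sensitivity = tp / (tp + fn)   # share of true high utilizers that were caught
    specificity = tn / (tn + fp)   # share of non-high utilizers left unflagged
    return sensitivity, specificity

# Hypothetical prospective risk scores and what actually happened
scores      = [0.5, 1.2, 2.8, 3.5, 6.0, 8.1, 11.9, 27.0]
actual_high = [False, False, False, True, False, True, True, True]

low_cut = sensitivity_specificity(scores, actual_high, threshold=3.0)
high_cut = sensitivity_specificity(scores, actual_high, threshold=7.0)
print(low_cut, high_cut)
```

A lower threshold catches more of the true high utilizers at the cost of more false alarms; a higher threshold does the reverse. Which side to favor depends on how costly a missed member is relative to an unnecessary outreach.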
Summary & Conclusions
• There is no one clearly superior predictive model
  – Certain approaches may be more valuable for underwriting
  – Other approaches may be more valuable for managing care
• The actionability quotient must also be considered
  – If you cannot act on the results, the study is merely interesting
• Linking models with interventions can help you improve quality and efficiency of care
Summary and Conclusions
• All predictive models tend to overpredict low utilizers and underpredict very high utilizers
  – Some of this may be mitigated by using a threshold and excluding costs beyond a certain point (typically at a stop-loss amount)
  – But this can exclude exactly those folks you may want to identify
• None of the models can predict “random” events
  – Trauma
  – Pregnancy
  – Catastrophic Claims
• Measurement of “success” is very difficult
  – How do you “unmanage” a case to determine savings?
• But the tools are very valuable, getting better, and can be made to work
  – You will see increasing success over the next several years
“You can either take action or you can hang back and hope for a miracle. Miracles are great, but they are so unpredictable.”
Peter Drucker