UC Irvine Heart Disease dataset
Intelligent Clinical Decision Support with Probabilistic and Temporal EHR Modeling
Simulations over 5,807 IN/TN patients >50% improvement in cost effectiveness >30% improvement in outcomes
beyond existing fee-for-service model
[Bennett and Hauser, Artificial
Intelligence in Medicine 2013]
EHR
SRL POMDP
Domain expert
Provides rules & corrective feedback
Identifies clinically relevant features
Disease progression model
Intelligent Clinical Decision Support System (ICDSS)
Primary clinician Observations, test results, imaging, interventions
Recommended treatment plans w/ human-readable rationale
Kris Hauser* (PI), Sriraam Natarajan* (Co-PI), Shaun Grannis† (Co-PI) *Indiana University School of Informatics and Computing † Indiana University School of Medicine, Regenstrief Institute
With clinical partners:
Overview and Motivation Progress & Expected Outcomes Existing and Ongoing Work
POMDP models in chronic depression treatment
0.45
0.55
0.65
0.75
0.85
0.95
J48 SVM AdaBoost Bagging Logis3c Naïve Bayes Single Rela3onal
Tree
So> Margin boos3ng
AUCR
OC
SRL for medical prediction tasks Cardiology domain
Circulation; 92(8), 2157-62, 1995
JACC; 43, 842-7, 2004
Plaque in the left Coronary artery
Given - age, sex, cholesterol, bmi, glucose, hdl level, ldl level, exercise, trig level, systolic bp and diastolic bp for all years 0 and 20 Predict– high Coronary Artery Calcification (CAC) levels at year 20
Atherosclerosis is the cause of the majority of Acute Myocardial Infarctions (heart attacks)
Model: mixtures of relational probabilistic decision trees Learning: Relational Functional Gradient Boosting (RFGB)
§ New Soft Margin relational learning algorithm improves recall
§ Tunable penalties for false positives vs false negatives
§ State-of-the-art performance on UC Irvine Heart Disease dataset (at right) and others
§ Predic've model: Dynamic Bayesian Network § Predic'on horizon: 8 § Decision variable: treat / not treat § Outcome variables: self-‐reported survey (CDOI), treatment cost § Objec've func'on: tradeoff between CDOI and $$$ (clinician-‐selected weight)
[Yang et al, submitted to ICD
M]
Overview of envisioned CDSS
§ Clinical decision-support systems (CDSS) have potential to exploit the wealth of clinical data in EHRs in addition to expert recommendations
§ New AI techniques needed to plan chronic and multi-stage treatments: compute patient-specific, temporal, statistically-justified treatment plans
§ Balance costs vs patient outcomes, reason with uncertainty § Hypothesis: CDSS can improve state of current clinical practice by
providing outcome-driven and cost-driven optimized decisions
Two promising AI techniques: SRL and POMDP Sta's'cal Rela'onal Learning (SRL): learning probabilis3c models from datasets with rela3onal structure § Handles linked datasets, incomplete/missing data, noise § Excellent for EHR data
Par'ally Observable Markov Decision Processes (POMDP): learning probabilis3c models from datasets with rela3onal structure § Handles linked datasets, incomplete/missing data, noise § Excellent for EHR data
-‐1800
-‐1600
-‐1400
-‐1200
-‐1000
-‐800
-‐600
-‐400
-‐200
0
Trivial Rand. Forest Log. Reg. Handmade BN Learned BN
LOG-‐LIKELIHO
OD
MODEL
Final Outcome, Probabilis[c Predic[on,
10-‐fold cross valida[on Higher is be\er (0 is perfect predic[on)
Developing SRL methods to learn likelihoods of future adverse medical events • CARDIA 20 year longitudinal
dataset (N=5115) • Preliminary results suggest
high predictive power @ year 20
Future: use POMDP to suggest lifestyle changes (smoking, exercise), medications
ER domain
Int’l Stroke Trial dataset N=19,435 patients (1991-1996) with acute stroke symptoms 3 observation, 2 decision points: • Admission (t=0): demographics,
symptoms observed. Medicine administered
• Discharge (t=2 weeks): test results, diagnosis. Possible change of treatment, long-term prescription
• Follow up (t=6 mo): death, long term dependency
• Current: Learning probabilistic conditional models from partial & noisy data
• Future: optimize patient-specific plans
Regenstrief PHESS: 65 million records across Indiana Identify high utilizers: significant source of waste
Stroke domain
Top Related