Post on 02-Jan-2016
Hospitalization Prediction From Health Care Claims
Adithya Renduchintala, Benjamin Martin, & Lance Legel
University of Colorado Boulder
Data Mining, Spring 2012
OVERVIEW
• Why hospitalization?
• What will we do?
• What is our data?
• How will we evaluate?
• How will we research?
• How will we implement?
• When are our milestones?
WHY HOSPITALIZATION?
• 70 million Americans hospitalized / year
• 5 million / year preventable
  → $30 billion / year
• Data mining can help!
WHAT WILL WE DO?
Analyze health care data on 76,000 people over a 3-year period, comprising 2.6 million events
→
Correlate events and hospitalization outcomes to train prediction algorithms
→
Predict the number of days a person will be hospitalized next year, given new event data
WHAT IS OUR DATA?
2,600,000 instances of the above data for 76,000 unique members, plus:
• Member sex and age group
• Number of drugs prescribed per member
• Number of laboratory and pathology tests per member
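One way to turn per-event rows into per-member inputs is to count events by type for each member. The sketch below is only an illustration: the `(member_id, event_type)` row layout, the function name `count_events_per_member`, and the event type labels are assumptions, not the actual schema of the data set.

```python
from collections import Counter, defaultdict


def count_events_per_member(events):
    """Collapse event rows into per-member counts of each event type.

    `events` is an iterable of (member_id, event_type) pairs.
    This row layout is hypothetical, chosen only for illustration.
    """
    counts = defaultdict(Counter)
    for member_id, event_type in events:
        counts[member_id][event_type] += 1
    return counts


# Toy rows, not real data.
toy_events = [
    ("m1", "claim"),
    ("m1", "drug"),
    ("m2", "claim"),
    ("m1", "claim"),
]
per_member = count_events_per_member(toy_events)
```

Counts such as drugs prescribed or lab tests ordered per member could then sit alongside sex and age group as one feature row per member.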
HOW WILL WE EVALUATE?
i = current member
n = number of members
p = predicted days in hospital for i
a = actual days in hospital for i
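The scoring formula itself did not survive in the slide text. The variables i, n, p, and a are consistent with a root mean squared logarithmic error (RMSLE), so the sketch below assumes that metric; treat both the choice of metric and the function name `rmsle` as assumptions rather than the authors' stated method.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error (assumed metric).

    sqrt( (1/n) * sum_i (log(p_i + 1) - log(a_i + 1))^2 )
    where p_i and a_i are predicted and actual days in hospital for member i.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(predicted)
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    )
    return math.sqrt(total / n)
```

Under this metric, predicting 1 day for a member who spent 0 days in hospital costs log(2), while a perfect prediction costs 0. The logarithm damps the penalty for large misses on long stays.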
HOW WILL WE RESEARCH?
1. “Data mining and clinical data repositories: Insights from a 667,000 patient data set”
2. “Introduction to neural networks in health care”
3. “Stock market prediction system with modular neural networks”
HOW WILL WE IMPLEMENT?
Support vector machine to classify members as “yes” or “no” for being hospitalized
↔
Feature engineering of domain model knowledge into learning algorithms
↔
Neural network to quantify number of days “yes” members are hospitalized
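The classify-then-quantify structure can be sketched independently of any particular library. In the example below, `classify` stands in for the support vector machine and `quantify` for the neural network; both toy functions and the single-feature input are hypothetical placeholders, not trained models.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def predict_days(
    features: Features,
    classify: Callable[[Features], bool],
    quantify: Callable[[Features], float],
) -> float:
    """Two-stage prediction: classify first, then quantify only "yes" members."""
    if not classify(features):
        return 0.0  # "no" members are predicted to spend 0 days in hospital
    return max(0.0, quantify(features))  # days in hospital cannot be negative


# Hypothetical stand-ins for the trained models (assumptions, for illustration).
def toy_classify(features: Features) -> bool:
    return features[0] > 2  # e.g. "yes" if more than 2 prior events


def toy_quantify(features: Features) -> float:
    return 0.5 * features[0]


low_risk = predict_days([1.0], toy_classify, toy_quantify)
high_risk = predict_days([4.0], toy_classify, toy_quantify)
```

Keeping the two stages behind plain callables lets either model be swapped out or retrained without touching the combining logic.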
WHEN ARE OUR MILESTONES?
March 1: Regression fitted
March 20: SVM features integrated
April 10: ANN features integrated