Rehospitalization Analytics: Modeling and Reducing the Risks of Rehospitalization Chandan K. Reddy...



Rehospitalization Analytics: Modeling and Reducing the Risks of Rehospitalization

Chandan K. Reddy

Department of Computer Science, Wayne State University NSF Award #1231742 (10/01/2012 – 09/30/2015)

PROJECT GOALS

WORK IN PROGRESS

PROJECT TEAM MEMBERS

CONCLUSIONS

REFERENCES

• Chandan Reddy, PI

• David Lanfear, Co-PI

• Bhanu Vinzamuri, Graduate Student

• Rajiur Rahman, Graduate Student

• Yan Li, Graduate Student

• Chandan K. Reddy, Bhanu Vinzamuri, Yan Li, and David Lanfear, "Predicting 30-day Readmissions using Regularized Regression Methods", (in submission).

• Indranil Palit and Chandan K. Reddy, "Scalable and Parallel Boosting with MapReduce", IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol.24, No.10, pp.1904-1916, October 2012.

• Samir Al-Stouhi and Chandan K. Reddy, "Adaptive Boosting for Transfer Learning using Dynamic Updates", In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Athens, Greece, September 2011.

• Effective risk prediction using several clinical data sources available in hospitals is still in its infancy and building advanced models can benefit the hospitals.

• Knowledge transfer instead of data sharing seems to be a viable option for improving the risk prediction results.

• Advanced models are required to clearly understand the disparities in different race/age/gender groups of patients.

OUTREACH ACTIVITIES

• A special issue on Intelligent Systems for Healthcare will be published in the ACM Transactions on Intelligent Systems and Technology (ACM TIST). http://tist.acm.org/

• Organized ACM SIGKDD Workshop on Health Informatics in August 2012. http://www.ischool.drexel.edu/HI-KDD2012/

• Presented a tutorial on “Big Data Analytics for Healthcare” at SDM 2013. http://dmkd.cs.wayne.edu/TUTORIAL/Healthcare/

CONTACT [email protected]

READMISSION CYCLE

RESEARCH INNOVATIONS

PROBLEM MOTIVATION

PREDICTION RESULTS

Method                        C-index   AUC
LACE                          NA        0.57
RANDOM SURVIVAL FORESTS       0.574     0.597
COX                           0.586     0.61
COX – LASSO                   0.61      0.63
COX – ELASTIC NET             0.62      0.64
COX – ADAPTIVE ELASTIC NET    0.635     0.66
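As a reading aid for the two metrics reported in the table, the following minimal sketch computes a concordance index (C-index) and a 30-day AUC on a four-patient toy example. The function names `concordance_index` and `auc` and all data values are illustrative assumptions, not taken from the poster, and the sketch assumes no patient is censored before day 30 (otherwise the 30-day label would be unknown).

```python
# Toy illustration of C-index and 30-day AUC; all values are made up.

def concordance_index(time, event, risk):
    """Harrell-style C-index: fraction of comparable pairs ordered correctly.

    A pair (i, j) is comparable when subject i had an observed event
    and did so strictly earlier than subject j's recorded time.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Days until readmission (or censoring), event indicator, predicted risk.
time = [5, 12, 35, 40]
event = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

# Binary 30-day readmission label: readmitted within 30 days of discharge.
labels = [int(e == 1 and t <= 30) for t, e in zip(time, event)]

print(concordance_index(time, event, risk))  # 0.8
print(auc(labels, risk))                     # 0.75
```

The C-index uses the full time-to-event ordering, while the AUC only looks at the binary 30-day outcome, which is why the two columns in the table can differ for the same model.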

CRITICAL VARIABLES

Variables       Importance   LR    COX   COX Lasso   COX AEN
HGB             0.81         NO    YES   YES         YES
Ckd             0.75         NO    YES   YES         YES
Diabetes        0.71         NO    NO    YES         YES
Hypertension    0.70         NO    YES   YES         YES
BUN             0.66         YES   YES   YES         YES
Age             0.66         YES   NO    NO          NO
CAD             0.61         NO    NO    YES         YES
Afib            0.60         NO    NO    YES         YES

CLINICAL FEATURE TRANSFORMATION

OUR MODELING FRAMEWORK

• ELASTIC NET (EN): Induces sparsity and can handle correlated variables.

• COX ELASTIC NET: Uses the EN penalty along with the Cox likelihood function. Efficiently optimized using coordinate-descent-style optimization algorithms.

• COX ADAPTIVE ELASTIC NET: Uses the weighted LASSO penalty.
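For readers unfamiliar with these three models, one common textbook way to write them is given here. The poster does not state its exact parameterization, so the symbols (mixing parameter α, regularization strength λ, adaptive weights w_k built from an initial estimate β̃ with exponent γ) are assumptions for illustration only.

```latex
% Cox partial log-likelihood for n patients with covariates x_i,
% observed times t_i, and event indicators \delta_i:
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Big[ x_i^\top \beta - \log \sum_{j \in R_i} \exp\!\big(x_j^\top \beta\big) \Big],
\qquad R_i = \{\, j : t_j \ge t_i \,\}

% Cox elastic net: L1 term induces sparsity, L2 term handles correlated variables
\hat{\beta}_{\mathrm{EN}} = \arg\min_{\beta}\;
  -\tfrac{1}{n}\,\ell(\beta)
  + \lambda \Big[ \alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Big]

% Cox adaptive elastic net: the L1 term is replaced by a weighted LASSO penalty
\hat{\beta}_{\mathrm{AEN}} = \arg\min_{\beta}\;
  -\tfrac{1}{n}\,\ell(\beta)
  + \lambda \Big[ \alpha \sum_{k} w_k \lvert \beta_k \rvert
  + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Big],
\qquad w_k = \frac{1}{\lvert \tilde{\beta}_k \rvert^{\gamma}}
```

Setting α = 1 recovers the Cox LASSO, and setting all w_k = 1 reduces the adaptive variant to the plain Cox elastic net.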

• Sparse Cox methods can improve the prediction ability and simultaneously provide some important variables for rehospitalization.

• Transfer learning methods allow for population variations and provide accurate models at hospitals that have only a few patient records.

• Understanding racial disparities in clinical data is now feasible due to advances in constrained predictive modeling techniques.

• Building clinician-friendly user interfaces.

• Build novel computational models to effectively predict the risk of rehospitalization using patients’ electronic health records.

• Construct adaptable time-sensitive classifiers that make predictions in the presence of inherent concept drifts in the data distribution.

• Develop new methods that can extract the overall significant and population-specific risk factors effectively even in the presence of several correlations in the data.

• Validate the proposed computational models using Heart Failure patient data collected at the Henry Ford Health System since 2001.

• Integrating multiple heterogeneous clinical data sources about the patients.

• Decision making in the presence of partial patient information.

• Understanding disease risk predictors that are previously unknown (or not studied).

• Knowledge transfer between hospitals with different population distributions.

• Studying the population-specific risk factors through modeling subgroup disparities (e.g. racial).

• Modeling time-sensitive medical data.

• Hospitalizations account for more than 30% of the $2 trillion annual cost of healthcare in the US.

• As many as 20% of all hospital admissions occur within 30 days of a previous discharge. Such rehospitalizations are not only expensive but are also potentially harmful, and most importantly, they are often preventable.

• Identifying patients at risk of rehospitalization can guide efficient resource utilization and is a cost-effective measure that can save millions of healthcare dollars each year.

• Integrate data from multiple clinical data sources.

• Apply clinical feature transformation.

• The likelihood function of the Cox regression model will be optimized and evaluated using the 30-day time interval.

• Different penalties will be added to the objective function.
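The last two steps above (optimizing the Cox likelihood and adding penalties to the objective) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's implementation: it uses proximal gradient descent instead of the coordinate descent mentioned on the poster, it assumes no tied event times, and the synthetic data, function names, and hyperparameters (`lam`, `alpha`, `step`) are all hypothetical.

```python
import numpy as np


def cox_loss_and_grad(beta, X, time, event):
    """Average negative Cox partial log-likelihood and its gradient."""
    order = np.argsort(-time)              # descending time: risk sets become prefixes
    Xs, ev = X[order], event[order]
    eta = Xs @ beta
    w = np.exp(eta - eta.max())            # shifted for numerical stability
    cum_w = np.cumsum(w)
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
    log_risk = np.log(cum_w) + eta.max()   # log of sum over the risk set of exp(eta)
    n = len(time)
    loss = -np.sum(ev * (eta - log_risk)) / n
    grad = -(ev[:, None] * (Xs - cum_wx / cum_w[:, None])).sum(axis=0) / n
    return loss, grad


def soft_threshold(z, t):
    """Proximal operator of the L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def penalized_objective(beta, X, time, event, lam, alpha):
    loss, _ = cox_loss_and_grad(beta, X, time, event)
    return loss + lam * (alpha * np.abs(beta).sum()
                         + 0.5 * (1 - alpha) * np.sum(beta ** 2))


def fit_cox_elastic_net(X, time, event, lam=0.05, alpha=0.5,
                        step=0.1, n_iter=500):
    """Proximal gradient descent on the elastic-net-penalized Cox objective."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        _, grad = cox_loss_and_grad(beta, X, time, event)
        grad = grad + lam * (1 - alpha) * beta          # smooth L2 part
        beta = soft_threshold(beta - step * grad, step * lam * alpha)
    return beta


# Synthetic data (hypothetical): only the first of five features affects risk.
rng = np.random.default_rng(0)
n, p = 300, 5
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
t_event = rng.exponential(scale=1.0 / np.exp(X @ beta_true))
t_censor = rng.exponential(scale=2.0, size=n)
time = np.minimum(t_event, t_censor)
event = (t_event <= t_censor).astype(float)

beta_hat = fit_cox_elastic_net(X, time, event)
print(np.round(beta_hat, 3))
```

On data like this, the first coefficient should come out clearly positive while the four noise coefficients are shrunk toward zero, which is the behavior that makes sparse Cox models useful for identifying important variables. Passing per-feature weights into the soft-threshold step would be one way to turn this sketch into an adaptive variant.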