MLPA Highlights · 2017-11-20 · Author: Paul Reilly
MLPA for health care presentation smc
Transcript of MLPA for health care presentation smc
Texas State University SHLC Presentation
Shaun Comfort, MD, MBA Associate Director of Risk Management
Genentech, A Member of the Roche Group
This presentation represents the opinions of Dr. Comfort, and not those of Genentech, A Member of the Roche Group.
Common Buzzwords
• Artificial Intelligence (AI) – The theory and development of computer systems able to perform tasks that normally require human intelligence, such as vision, speech recognition, decision-making, and translation. (Source: Google Search)
• Machine Learning (ML) – A type of AI that provides computers with the ability to learn without being explicitly programmed. (Source: WhatIs.com)
• Predictive Analytics (PA) – The use of statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data. (Source: https://www.sas.com/en_us/insights/analytics/predictive-analytics.html)
• For this presentation, ML and PA are treated as synonyms.
Machine Learning
Source: Downloaded Google Images
Unlike traditional programming (aka “coding”), ML uses a set of input data and the corresponding answers (aka “outputs” or “responses”) to build a program.
Some Applications of ML
General “Supervised” Learning Flow and Examples:

  Input(s)           Fitting Function(s)   Output(s)
  Annotated Emails   Naïve Bayes           Spam (Y/N?)
  Financial Data     CART/Partition        Fraud (Y/N?)
  Google Images      Deep ANN(s)           Image ID (Cat Y/N?)
  Starfield Maps     Deep ANN(s)           Asteroid (Y/N?)
Source: Adapted from Andrew Ng Lecture: Artificial Intelligence is the New Electricity, Stanford MSx Future Forum. January 25, 2017
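The first row of the flow above (annotated emails → Naïve Bayes → spam Y/N) can be sketched from scratch. This is a minimal illustration on toy data — the word lists and labels below are invented for the example, not taken from the talk:

```python
from collections import Counter
import math

# Toy "annotated emails": (words, label) pairs -- illustrative data only.
train = [
    (["free", "winner", "cash"], "spam"),
    (["free", "offer", "now"], "spam"),
    (["meeting", "agenda", "notes"], "ham"),
    (["project", "meeting", "tomorrow"], "ham"),
]

def fit_naive_bayes(data):
    """The 'fitting function': count word frequencies per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    vocab = set()
    for words, label in data:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab

def predict(words, word_counts, class_counts, vocab):
    """Score each class with log P(class) + sum log P(word|class), Laplace-smoothed."""
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label in class_counts:
        n = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

wc, cc, vocab = fit_naive_bayes(train)
print(predict(["free", "cash"], wc, cc, vocab))  # -> "spam"
```

The same inputs → fitting function → outputs pattern holds for every row of the table; only the data and the model family change.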
What Can ML Do for Healthcare?
Some Potential Examples for “Supervised” ML:

  Input(s)                             Fitting Function                      Output(s)
  EHRs, Lab Data                       CART/Partition                        Predict high-risk re-admit patients
  EHRs, Lab Data                       CART/Partition                        Medical diagnostic decision trees
  Payer Data, EHRs                     Deep ANN(s), Log. Regression, etc.    ID adverse events
  Hospital Operations, Pharmacy Data   Deep ANN(s)                           Improve efficiency
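The first healthcare row (EHRs and lab data → CART/partition → re-admission risk) might look like the sketch below. The features, the risk rule, and the use of scikit-learn's DecisionTreeClassifier are all illustrative assumptions — real EHR data and tooling would differ:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for EHR/lab features: age, prior admissions, lab score.
n = 500
X = np.column_stack([
    rng.uniform(20, 90, n),   # age
    rng.poisson(1.0, n),      # prior admissions in last year
    rng.normal(0, 1, n),      # composite lab score
])
# Invented ground-truth rule: older patients with more prior admissions
# re-admit more often (for demonstration only).
risk = 0.02 * (X[:, 0] - 20) + 0.5 * X[:, 1] + 0.3 * X[:, 2]
y = (risk + rng.normal(0, 0.5, n) > 1.5).astype(int)

# CART-style partition model, kept shallow so the splits stay interpretable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:400], y[:400])
acc = tree.score(X[400:], y[400:])
print(f"hold-out accuracy: {acc:.2f}")
```

A shallow tree is deliberately chosen here: in a clinical setting the individual splits (age thresholds, admission counts) can be read and sanity-checked by domain experts.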
SML/PA Process
• What question do you want to answer (e.g., identify high utilizers for intervention, predict re-admissions)?
• What data do you have to train a model?
• What features (i.e., predictors, factors, etc.) in your data do you want to use?
• What kind of model do you want to use (e.g., K-NN, ANN, logistic regression, random forests, etc.)?
• What metrics must your model meet (e.g., high precision, high predictability)?
Performance Yardsticks
Some common metrics used to gauge a model’s performance include:
• Classification models (e.g., high re-admission risk Y/N?, adverse event Y/N?): inter-rater agreement scores (e.g., kappa), sensitivity, specificity, precision, recall, and false positive/negative rates
• Regression models (e.g., forecasting hospital census, resource modeling): root mean square forecast error, etc.
IV Catheter Insertion Example
Mann et al. (2014) identified predictive variables for successful intravenous (IV) catheter insertion, based on data from 592 children in two hospitals. The dataset, which is provided with JMP-SAS software, was used for this exercise.
Goal: Predict the probability of a successful IV start on the first try using 17 features: Mean Difficulty, Mean Nurse Experience, Active Minutes, Nurse Competency Scores, etc.
Technique: Random forest using bootstrap aggregation; 416/176 training/validation cases; Trees/Forest = 6; Terms/Split = 4.
Source: J. Mann, P. Larsen, and J. Brinkley, "Exploring the use of negative binomial regression modeling for pediatric peripheral intravenous catheterization", Journal of Medical Statistics and Informatics, Vol. 2, Article 6, 2014.
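The setup described above can be sketched with scikit-learn's RandomForestClassifier. The original analysis was done in JMP-SAS Pro, and the feature distributions and outcome rule below are invented for illustration — only the sample size, train/validation split, and forest settings mirror the slide:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-ins for the study's features (mean difficulty,
# nurse experience, plus two filler columns) -- not the real data.
n = 592                      # same sample size as the Mann 2014 dataset
difficulty = rng.normal(0, 1, n)
experience = rng.normal(0, 1, n)
X = np.column_stack([difficulty, experience, rng.normal(0, 1, (n, 2))])
# Invented ground truth: harder placements fail more; experience helps.
y = (experience - difficulty + rng.normal(0, 0.8, n) > 0).astype(int)

# Mirror the slide's 416/176 training/validation split and small forest.
X_tr, X_va, y_tr, y_va = train_test_split(X, y, train_size=416, random_state=0)
rf = RandomForestClassifier(
    n_estimators=6,          # Trees/Forest = 6, as on the slide
    max_features=4,          # Terms/Split = 4
    bootstrap=True,          # bagging: each tree fits a bootstrap resample
    random_state=0,
).fit(X_tr, y_tr)
print(f"validation accuracy: {rf.score(X_va, y_va):.2f}")
```

Bootstrap aggregation (bagging) is the `bootstrap=True` part: each of the six trees trains on a resample of the 416 training cases, and the forest averages their votes.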
Catheter Insertion Example, cont.
One of the decision trees from the forest (figure not reproduced)
Model Performance Results
Key results are as follows:
Generalized r2 = 0.79, Misclassification Rate = 10%
Sensitivity (Pos % Agreement) = 79.7%*
Specificity (Neg % Agreement) = 97.1%*
Positive Predictive Value (Precision) = 95.2%*
Recall (% of True Assessments Positive) = 79.7%*
Gwet AC1 Kappa = 80.6%*
F-Score = 86.8%*
ROC Area Under the Curve = 0.97*
Conclusion – Model shows high predictive agreement with actual validation (hold out) data
*Results based on the validation (not training) data set. All analysis performed using JMP-SAS 13 Pro
Catheter Insertion Example, cont.
Resulting ROC and confusion matrix with validation data.

ROC on Validation Data (figure not reproduced)

Confusion Matrix (Success on 1st attempt):
                 Predicted No   Predicted Yes
  Actual No           99              3
  Actual Yes          15             59
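The metrics reported on the previous slide can be re-derived directly from this confusion matrix; a quick check in plain Python:

```python
# Cells of the slide's validation confusion matrix (rows = actual, cols = predicted).
tn, fp = 99, 3     # actual No:  predicted No / predicted Yes
fn, tp = 15, 59    # actual Yes: predicted No / predicted Yes

total = tn + fp + fn + tp
sensitivity = tp / (tp + fn)          # a.k.a. recall, positive % agreement
specificity = tn / (tn + fp)          # negative % agreement
precision   = tp / (tp + fp)          # positive predictive value
f_score     = 2 * precision * sensitivity / (precision + sensitivity)
misclassification = (fp + fn) / total

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")
print(f"precision {precision:.1%}, F-score {f_score:.1%}")
print(f"misclassification {misclassification:.0%}")
# -> sensitivity 79.7%, specificity 97.1%, precision 95.2%,
#    F-score 86.8%, misclassification 10%
```

These values match the slide's reported figures, confirming the confusion matrix and the summary metrics describe the same 176 validation cases.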
Catheter Insertion Example, cont.
Which features were most important? The mean difficulty score was most important, followed by mean nurse experience, active minutes, and competency at IV placements.
  Term                Number of Splits   G^2          Portion
  Mean Difficulty           22           163.388829   0.6582
  Mean Nurse Exp            13           15.7981274   0.0636
  Mean Active Minutes       16           13.5956915   0.0548
  Mean Nurse Comp           13           12.2800927   0.0495
  Mean Distress             12           8.05889412   0.0325
  Weight                     4           6.83369539   0.0275
  Lost IV                    9           4.94248458   0.0199
  Shift                      7           4.6603365    0.0188
  Mean Cooperative           8           4.18512134   0.0169
  Age                        7           3.934808     0.0159
  Gender                     7           2.20229066   0.0089
  Device Assisted            4           2.15128853   0.0087
  Dehydrated                 7           1.99162831   0.0080
  Previous IV                6           1.62364168   0.0065
  Counselor Present          4           1.17042892   0.0047
  Family Present             7           0.87367306   0.0035
  Expert RN                  1           0.32490665   0.0013
  Support Present            3           0.23575945   0.0009
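In scikit-learn, an analogous importance ranking comes from a fitted forest's `feature_importances_` attribute, which is normalized to sum to 1 like the Portion column above. The data below are synthetic, constructed so the first feature dominates the outcome, echoing how Mean Difficulty dominates the table:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Synthetic stand-ins for the study's top features -- illustrative only.
names = ["mean_difficulty", "nurse_experience", "active_minutes", "competency"]
X = rng.normal(0, 1, (400, 4))
# Invented rule: the first feature carries most of the signal.
y = (-2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 400) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Print features ranked by importance, highest first.
for name, imp in sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name:18s} {imp:.3f}")
```

JMP's G^2 portions and scikit-learn's impurity-based importances are computed differently, so the numbers are not directly comparable — but both answer the same question of which splits do the most predictive work.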
Conclusions
Catheter Insertion Prediction Model:
• A random forest model with high predictive ability to estimate the chance of a successful pediatric IV insertion on the first try.
• Key insight: the mean difficulty assessment score for the IV placement is by far the most predictive feature determining the outcome.
• Nurse experience, active time spent on IV insertions, and competency scores for IV insertions are the next most important predictors, so:
  – Use your most experienced IV RNs on your most difficult patients to maximize first-try successful insertion.
  – Train and score your IV insertion nurses to assess competency for successful insertions.
Some Final Thoughts
The Good News:
• MLPA techniques have been used with great success in many industries.
• The rise of large datasets, hardware advancements, and investments in AI are paying off in the rush toward supervised machine learning solutions.
• The health care industry (i.e., medicine, HC delivery, pharma, med dev, etc.) can derive similar benefits with appropriate adoption of this technology.
The Bad News: Garbage In “Still” = Garbage Out (GIGO)
• Not even a super-AI can develop meaningful insights from trash.
• Invest in collecting and cleaning your data appropriately. Solid, clean data is “gold dust” for predictive modeling; treat it as such!
• Compare your model results to human subject-matter expert performance whenever possible; this is your best “ground truth”.