MLPA Highlights · 2017-11-20 · Author: Paul Reilly
MLPA for health care presentation smc
Transcript of MLPA for health care presentation smc
Texas State University SHLC Presentation
Shaun Comfort, MD, MBA Associate Director of Risk Management
Genentech, A Member of the Roche Group
This presentation represents the opinions of Dr. Comfort, and not those of Genentech, A Member of the Roche Group.
Common Buzzwords
• Artificial Intelligence (AI) – The theory and development of computer systems able to perform tasks that normally require human intelligence, such as vision, speech recognition, decision-making, and translation. (Source: Google Search)
• Machine Learning (ML) – A type of AI that provides computers with the ability to learn without being explicitly programmed. (Source: WhatIs.com)
• Predictive Analytics (PA) – The use of statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data. (Source: https://www.sas.com/en_us/insights/analytics/predictive-analytics.html)
• For this presentation, ML and PA are treated as synonyms.
Machine Learning
Source: Downloaded Google Images
Unlike traditional programming (aka “coding”), ML uses a set of input data and the corresponding answers (aka “outputs” or “responses”) to build a program.
Some Applications of ML
General “Supervised” Learning Flow and Examples:

  Input(s)           Fitting Function(s)   Output(s)
  Annotated Emails   Naïve Bayes           Spam (Y/N?)
  Financial Data     CART/Partition        Fraud (Y/N?)
  Google Images      Deep ANN(s)           Image ID (Cat Y/N?)
  Starfield Maps     Deep ANN(s)           Asteroid (Y/N?)
Source: Adapted from Andrew Ng Lecture: Artificial Intelligence is the New Electricity, Stanford MSx Future Forum. January 25, 2017
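The first row of the flow above (annotated emails → Naïve Bayes → spam Y/N) can be sketched from scratch. This is a minimal illustration on toy data — the word lists and labels below are invented for the example, not taken from the talk:

```python
from collections import Counter
import math

# Toy "annotated emails": (words, label) pairs -- illustrative data only.
train = [
    (["free", "winner", "cash"], "spam"),
    (["free", "offer", "now"], "spam"),
    (["meeting", "agenda", "notes"], "ham"),
    (["project", "meeting", "tomorrow"], "ham"),
]

def fit_naive_bayes(data):
    """The 'fitting function': count word frequencies per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    vocab = set()
    for words, label in data:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab

def predict(words, word_counts, class_counts, vocab):
    """Score each class with log P(class) + sum log P(word|class), Laplace-smoothed."""
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label in class_counts:
        n = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

wc, cc, vocab = fit_naive_bayes(train)
print(predict(["free", "cash"], wc, cc, vocab))  # -> "spam"
```

The same inputs → fitting function → outputs pattern holds for every row of the table; only the data and the model family change.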
What Can ML Do for Healthcare?
Some Potential Examples for “Supervised” ML:

  Input(s)                             Fitting Function                      Output(s)
  EHRs, Lab Data                       CART/Partition                        Predict high-risk re-admit patients
  EHRs, Lab Data                       CART/Partition                        Medical diagnostic decision trees
  Payer Data, EHRs                     Deep ANN(s), Log. Regression, etc.    ID adverse events
  Hospital Operations, Pharmacy Data   Deep ANN(s)                           Improve efficiency
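The first healthcare row (EHRs and lab data → CART/partition → re-admission risk) might look like the sketch below. The features, the risk rule, and the use of scikit-learn's DecisionTreeClassifier are all illustrative assumptions — real EHR data and tooling would differ:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for EHR/lab features: age, prior admissions, lab score.
n = 500
X = np.column_stack([
    rng.uniform(20, 90, n),   # age
    rng.poisson(1.0, n),      # prior admissions in last year
    rng.normal(0, 1, n),      # composite lab score
])
# Invented ground-truth rule: older patients with more prior admissions
# re-admit more often (for demonstration only).
risk = 0.02 * (X[:, 0] - 20) + 0.5 * X[:, 1] + 0.3 * X[:, 2]
y = (risk + rng.normal(0, 0.5, n) > 1.5).astype(int)

# CART-style partition model, kept shallow so the splits stay interpretable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:400], y[:400])
acc = tree.score(X[400:], y[400:])
print(f"hold-out accuracy: {acc:.2f}")
```

A shallow tree is deliberately chosen here: in a clinical setting the individual splits (age thresholds, admission counts) can be read and sanity-checked by domain experts.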
SML/PA Process
• What question do you want to answer (e.g., identify high utilizers for intervention, predict re-admissions)?
• What data do you have to train a model?
• What features (i.e., predictors, factors, etc.) in your data do you want to use?
• What kind of model do you want to use (e.g., K-NN, ANN, logistic regression, random forests, etc.)?
• What metrics must your model meet (e.g., high precision, high predictability)?
Performance Yardsticks
Some common metrics used to gauge a model’s performance include:
• Classification models (e.g., high re-admission risk Y/N?, adverse event Y/N?): inter-rater agreement scores (e.g., kappa), sensitivity, specificity, precision, recall, and false positive/negative rates
• Regression models (e.g., forecasting hospital census, resource modeling): root mean square forecast error, etc.
IV Catheter Insertion Example
Mann et al. (2014) identified predictive variables for successful intravenous (IV) catheter insertion, based on data from 592 children in two hospitals. The dataset, which is provided with JMP-SAS software, was used for this exercise.
Goal: Predict the probability of a successful IV start on the first try using 17 features: Mean Difficulty, Mean Nurse Experience, Active Minutes, Nurse Competency Scores, etc.
Technique: Random forest using bootstrap aggregation; 416/176 training/validation cases; Trees/Forest = 6; Terms/Split = 4.
Source: J. Mann, P. Larsen, and J. Brinkley, "Exploring the use of negative binomial regression modeling for pediatric peripheral intravenous catheterization", Journal of Medical Statistics and Informatics, Vol. 2, Article 6, 2014.
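The setup described above can be sketched with scikit-learn's RandomForestClassifier. The original analysis was done in JMP-SAS Pro, and the feature distributions and outcome rule below are invented for illustration — only the sample size, train/validation split, and forest settings mirror the slide:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-ins for the study's features (mean difficulty,
# nurse experience, plus two filler columns) -- not the real data.
n = 592                      # same sample size as the Mann 2014 dataset
difficulty = rng.normal(0, 1, n)
experience = rng.normal(0, 1, n)
X = np.column_stack([difficulty, experience, rng.normal(0, 1, (n, 2))])
# Invented ground truth: harder placements fail more; experience helps.
y = (experience - difficulty + rng.normal(0, 0.8, n) > 0).astype(int)

# Mirror the slide's 416/176 training/validation split and small forest.
X_tr, X_va, y_tr, y_va = train_test_split(X, y, train_size=416, random_state=0)
rf = RandomForestClassifier(
    n_estimators=6,          # Trees/Forest = 6, as on the slide
    max_features=4,          # Terms/Split = 4
    bootstrap=True,          # bagging: each tree fits a bootstrap resample
    random_state=0,
).fit(X_tr, y_tr)
print(f"validation accuracy: {rf.score(X_va, y_va):.2f}")
```

Bootstrap aggregation (bagging) is the `bootstrap=True` part: each of the six trees trains on a resample of the 416 training cases, and the forest averages their votes.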
Catheter Insertion Example, cont.
One of the decision trees from the forest (figure not reproduced)
Model Performance Results
Key results are as follows:
Generalized r2 = 0.79, Misclassification Rate = 10%
Sensitivity (Pos % Agreement) = 79.7%*
Specificity (Neg % Agreement) = 97.1%*
Positive Predictive Value (Precision) = 95.2%*
Recall (% of True Assessments Positive) = 79.7%*
Gwet AC1 Kappa = 80.6%*
F-Score = 86.8%*
ROC Area Under the Curve = 0.97*
Conclusion – Model shows high predictive agreement with actual validation (hold out) data
*Results based on the validation (not training) data set. All analysis performed using JMP-SAS 13 Pro
Catheter Insertion Example, cont.
Resulting ROC and confusion matrix with validation data.

ROC on Validation Data (figure not reproduced)

Confusion Matrix (Success on 1st attempt):
                 Predicted No   Predicted Yes
  Actual No           99              3
  Actual Yes          15             59
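The metrics reported on the previous slide can be re-derived directly from this confusion matrix; a quick check in plain Python:

```python
# Cells of the slide's validation confusion matrix (rows = actual, cols = predicted).
tn, fp = 99, 3     # actual No:  predicted No / predicted Yes
fn, tp = 15, 59    # actual Yes: predicted No / predicted Yes

total = tn + fp + fn + tp
sensitivity = tp / (tp + fn)          # a.k.a. recall, positive % agreement
specificity = tn / (tn + fp)          # negative % agreement
precision   = tp / (tp + fp)          # positive predictive value
f_score     = 2 * precision * sensitivity / (precision + sensitivity)
misclassification = (fp + fn) / total

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")
print(f"precision {precision:.1%}, F-score {f_score:.1%}")
print(f"misclassification {misclassification:.0%}")
# -> sensitivity 79.7%, specificity 97.1%, precision 95.2%,
#    F-score 86.8%, misclassification 10%
```

These values match the slide's reported figures, confirming the confusion matrix and the summary metrics describe the same 176 validation cases.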
Catheter Insertion Example, cont.
Which features were most important? The mean difficulty score was most important, followed by mean nurse experience, active minutes, and competency at IV placements.
  Term                Number of Splits   G^2          Portion
  Mean Difficulty           22           163.388829   0.6582
  Mean Nurse Exp            13           15.7981274   0.0636
  Mean Active Minutes       16           13.5956915   0.0548
  Mean Nurse Comp           13           12.2800927   0.0495
  Mean Distress             12           8.05889412   0.0325
  Weight                     4           6.83369539   0.0275
  Lost IV                    9           4.94248458   0.0199
  Shift                      7           4.6603365    0.0188
  Mean Cooperative           8           4.18512134   0.0169
  Age                        7           3.934808     0.0159
  Gender                     7           2.20229066   0.0089
  Device Assisted            4           2.15128853   0.0087
  Dehydrated                 7           1.99162831   0.0080
  Previous IV                6           1.62364168   0.0065
  Counselor Present          4           1.17042892   0.0047
  Family Present             7           0.87367306   0.0035
  Expert RN                  1           0.32490665   0.0013
  Support Present            3           0.23575945   0.0009
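In scikit-learn, an analogous importance ranking comes from a fitted forest's `feature_importances_` attribute, which is normalized to sum to 1 like the Portion column above. The data below are synthetic, constructed so the first feature dominates the outcome, echoing how Mean Difficulty dominates the table:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Synthetic stand-ins for the study's top features -- illustrative only.
names = ["mean_difficulty", "nurse_experience", "active_minutes", "competency"]
X = rng.normal(0, 1, (400, 4))
# Invented rule: the first feature carries most of the signal.
y = (-2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 400) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Print features ranked by importance, highest first.
for name, imp in sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name:18s} {imp:.3f}")
```

JMP's G^2 portions and scikit-learn's impurity-based importances are computed differently, so the numbers are not directly comparable — but both answer the same question of which splits do the most predictive work.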
Conclusions
Catheter Insertion Prediction Model:
• A random forest model with high predictive ability to estimate the chance of a successful pediatric IV insertion on the first try.
• Key insight: the mean difficulty assessment score for the IV placement is by far the most predictive feature determining the outcome.
• Nurse experience, active time spent on IV insertions, and competency scores for IV insertions are the next most important predictors, so:
  – Use your most experienced IV RNs on your most difficult patients to maximize first-try successful insertion.
  – Train and score your IV insertion nurses to assess competency for successful insertions.
Some Final Thoughts
The Good News:
• MLPA techniques have been used with great success in many industries.
• The rise of large datasets, hardware advancements, and investments in AI are paying off in the rush toward supervised machine learning solutions.
• The health care industry (i.e., medicine, HC delivery, pharma, med dev, etc.) can derive similar benefits with appropriate adoption of this technology.
The Bad News: Garbage In “Still” = Garbage Out (GIGO)
• Not even a super-AI can develop meaningful insights from trash.
• Invest in collecting and cleaning your data appropriately. Solid, clean data is “gold dust” for predictive modeling; treat it as such!
• Compare your model results to human subject-matter expert performance whenever possible; this is your best “ground truth”.