Transitioning from Traditional DW to Apache® Spark™ in Operating Room Predictive Modeling

Transitioning from Traditional DW to Spark in OR Predictive Modeling
Ayad Shammout and Denny Lee
October 21st, 2015



About Ayad Shammout

• Director of Business Intelligence, Beth Israel Deaconess Medical Center

• Helped build the business intelligence and high-availability / disaster-recovery infrastructure for BIDMC


About Denny Lee

• Technology Evangelist, Databricks

• Former Sr. Director of Data Sciences Eng, Concur

• Helped bring Hadoop onto Windows and Azure


We are Databricks, the company behind Spark

Founded by the creators of Apache Spark in 2013

Share of Spark code contributed by Databricks in 2014: 75%

Data Value

Created Databricks on top of Spark to make big data simple.


Why is Operating Room Scheduling Predictive Modeling Important?


$15-$20 / minute for a basic surgical procedure

Time is an OR's most valuable resource

Lack of OR availability means loss of patients

OR efficiency differs depending on the OR staffing and allocation (8, 10, 13, or 16h), not the workload (i.e. cases)


“You are not going to get the elephant to shrink or change its size. You need to face the fact that the elephant is 8 OR tall and 11hr wide”

Steven Shafer, MD


Operating Room

• Better utilization = better profit margins

• Reduce support and maintenance costs

Medical Staff

• Better utilization = better profit margins

• Better medical staff efficiencies = better outcomes

Patients

• Shorter wait times and fewer cancellations

• Better medical staff efficiencies = better outcomes


Develop Predictive Model

• Develop a predictive model that would identify available OR time 15 business days in advance.

• Allow us to confirm wait list cases two weeks in advance, instead of when the blocks normally release four days out.
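The 15-business-day horizon can be illustrated with a small date calculation. This is a minimal sketch, assuming that "business days" means Monday through Friday and ignoring hospital holidays; the function name `add_business_days` is hypothetical and not taken from the presentation.

```python
from datetime import date, timedelta

def add_business_days(start, days):
    """Return the date that is `days` business days after `start`.

    Assumption: business days are Monday to Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current

today = date(2015, 10, 21)                   # the date of the talk, a Wednesday
forecast_day = add_business_days(today, 15)  # horizon of the predictive model
release_day = add_business_days(today, 4)    # when blocks normally release
print(forecast_day, release_day)
```

A production version would also need the hospital's holiday calendar, which the slides do not describe.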


Forecast OR Schedule

• Case load 15 business days in advance

• Book more cases weeks in advance to prevent under-utilization

• Reduce staff overtime and idle time


Background

• Three surgical groups
  • GYN, urology, general surgery, colorectal, surgical oncology
  • Eyes, plastics, ENT
  • Orthopedics, podiatry

• Currently built using SQL Server Data Mining


Using Traditional Data Warehousing Techniques


Traditional Data Warehousing & Data Mining OR Predictive Model

[Architecture diagram. Components: Data Sources, OR DW, SSAS Data Mining, OR Prediction DB, OR Reports. Annotations: data inserts every 3 hours; process mining model every 3 hours; prediction results.]


Original Design

• Multiple data sources pushing data into SQL Server and SQL Server Analysis Services Data Mining

• Hand-built 225 different DM modules (5 days x 15 business days ahead x 3 different groups)

• Pipeline process had to run 225 times / day (3 pools x 75 modules)
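The counts on this slide multiply out as 5 x 15 x 3 = 225, or 75 per group. A minimal sketch of that grid follows, assuming "5 days" means the five weekdays; the labels in `WEEKDAYS` and `GROUPS` are hypothetical placeholders, since the slide gives only the counts.

```python
from itertools import product

# Hypothetical labels: the slide states only the counts (5, 15, 3).
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]  # assumed meaning of "5 days"
LEAD_DAYS = range(1, 16)                        # 1 to 15 business days ahead
GROUPS = ["group_1", "group_2", "group_3"]      # the three surgical groups

def model_keys():
    """Enumerate one key per (group, weekday, lead time) combination."""
    return list(product(GROUPS, WEEKDAYS, LEAD_DAYS))

keys = model_keys()
per_group = len(keys) // len(GROUPS)
print(len(keys), per_group)
```

Treating the grid as data, rather than as 225 hand-built objects, is one way a single parameterized job could cover every combination.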


Regression Calculations

SSAS Data Mining    T-SQL Code
Intercept           R²
Mean                Adjusted R²
Coefficients        Standard Deviation
Variance            Standard Error
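The quantities listed above can be computed directly for a least-squares fit. The sketch below covers only the single-predictor case in plain Python; it is not the SSAS or T-SQL code from the original pipeline, and the function name, dictionary keys, and sample data are all made up for illustration.

```python
from math import sqrt

def regression_stats(x, y):
    """Ordinary least squares for y ~ intercept + coefficient * x (one predictor)."""
    n = len(x)
    p = 1  # number of predictors
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    coefficient = sxy / sxx
    intercept = mean_y - coefficient * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + coefficient * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    variance = ss_tot / (n - 1)  # sample variance of the target
    r2 = 1 - ss_res / ss_tot
    return {
        "intercept": intercept,
        "coefficient": coefficient,
        "mean": mean_y,
        "variance": variance,
        "std_dev": sqrt(variance),
        "r2": r2,
        "adjusted_r2": 1 - (1 - r2) * (n - 1) / (n - p - 1),
        "std_error": sqrt(ss_res / (n - p - 1)),  # residual standard error
    }

# Made-up sample data, for illustration only.
stats = regression_stats([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
for name, value in stats.items():
    print(name, round(value, 4))
```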


Taking advantage of Spark’s DW Capabilities and MLlib


OR Predictive Model in Spark

[Architecture diagram. Components: Data Sources, OR DW, OR Reports. Annotation: data inserts every 3 hours.]


Demo: OR Block Scheduling

Extract historical data and run linear regression with SGD on multiple variables
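The demo itself ran in Spark and its code is not part of this transcript. As a rough illustration of the technique only, the sketch below fits a linear regression on multiple variables with stochastic gradient descent in plain Python. The function name, step size, epoch count, and synthetic data are all assumptions, and the features are kept in the range 0 to 1 because SGD is sensitive to feature scale.

```python
import random

def sgd_linear_regression(rows, targets, step=0.1, epochs=300, seed=0):
    """Fit y ~ bias + weights . x by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            pred = bias + sum(w * v for w, v in zip(weights, rows[i]))
            err = pred - targets[i]
            # Gradient of 0.5 * err**2 with respect to each parameter.
            weights = [w - step * err * v for w, v in zip(weights, rows[i])]
            bias -= step * err
    return weights, bias

# Synthetic, noise-free data (not hospital data): y = 3 + 2*x1 - 1*x2.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(50)]
targets = [3.0 + 2.0 * x1 - 1.0 * x2 for x1, x2 in rows]

weights, bias = sgd_linear_regression(rows, targets)
print([round(w, 3) for w in weights], round(bias, 3))
```

On real scheduling history the targets would be noisy, so the fit would not recover exact coefficients and the step size would likely need tuning.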


OR Schedule Report (example)


Why the model is working

• Can coordinate waitlist scheduling logistics with physicians and patients within two weeks of the surgery

• Plan staff scheduling and resources so there are fewer last-minute staffing issues for nursing and anesthesia

• Utilization metrics are showing us where we can maximize our elective surgical schedule and level demand


Key Learnings when Migrating from Traditional DW to Spark


Transitioning to the Cloud

Beth Israel Deaconess Medical Center is increasingly moving to cloud infrastructure services with the hopes of closing its data center when the hospital's lease is up in the next five years. CIO John Halamka says he's decommissioning HP and Dell servers as he moves more of his compute workloads to Amazon Web Services, where he's currently using 30 virtual machines to test and develop new applications. "It is no longer cost effective to deal with server hosting ourselves because our challenge isn't real estate, it's power and cooling," he says.


Transitioning to the Cloud

• Need time for engineers, analysts, and data scientists to learn how to build for the cloud

• Build for security right from the start: process heavy, a lot of documentation, audits / reviews

• Differentiating data engineers and engineers (REST APIs, services, elasticity, etc.)


Transitioning to Spark

• No more stored procedures or indexes

• Good for Spark SQL, services design

• Prototype, prototype, prototype

• Leverage existing languages and skill sets

• Leverage the MOOCs and other Spark training

• Break down the silos of data engineers, engineers, data scientists, and analysts


Transitioning DW to Spark

• Understand Partitioning, Broadcast Joins, and Parquet

• Not all Hive functions are available in Spark (99% of the time that is okay) due to Hive context

• Don’t limit yourself to building star schemas / snowflake schemas

• Expand outside of traditional DW: machine learning, streaming
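Of the concepts listed above, the broadcast join is perhaps the least familiar to a traditional DW audience. In Spark, a small table is copied to every executor so that each partition of the large table can be joined locally, without shuffling the large table. The sketch below shows that idea in plain Python; it is not the Spark API, the table contents and field names are hypothetical, and it assumes join keys are unique in the small table.

```python
def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join each partition of a large table against a small in-memory table."""
    # Build the lookup once; in Spark this copy would be sent to every executor.
    lookup = {row[key]: row for row in small_rows}
    joined = []
    for partition in large_partitions:
        # Each partition is joined locally, so large rows never change partition.
        out = []
        for row in partition:
            match = lookup.get(row[key])
            if match is not None:  # inner join: unmatched rows are dropped
                out.append({**match, **row})
        joined.append(out)
    return joined

# Hypothetical data: surgical cases split into two partitions, plus a small room table.
cases = [
    [{"case_id": 1, "room_id": "OR1"}, {"case_id": 2, "room_id": "OR2"}],
    [{"case_id": 3, "room_id": "OR1"}, {"case_id": 4, "room_id": "OR9"}],
]
rooms = [{"room_id": "OR1", "group": "A"}, {"room_id": "OR2", "group": "B"}]

result = broadcast_hash_join(cases, rooms, "room_id")
for partition in result:
    print(partition)
```

The approach only pays off when the small table fits comfortably in memory on every worker; otherwise a shuffle-based join is the usual fallback.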


Thank you.

For more information, please contact [email protected]@databricks.com