Predictive Analytics PA WC Conf June 2017 FINAL.pptx

Post on 06-Jun-2020


1

AGENDA

What is predictive analytics?

What is data mining?

What is big data?

What is a data scientist?

How big data is leveraged to save lives

Advanced math – moving away from linear models of data analytics

Predictive analytics in claims and underwriting

Questions

2

WHAT IS PREDICTIVE ANALYTICS?

Predictive analytics describes any approach to data mining with four attributes:

• An emphasis on prediction, rather than description, classification or clustering
• Rapid analysis measured in hours or days, rather than the stereotypical months of traditional data mining
• An emphasis on the business relevance of the resulting insights - no “ivory tower” analyses
• An emphasis on ease of use, thus making the tools accessible to business users

3

4

WHAT IS DATA MINING?

Data mining is an interdisciplinary subfield of computer science. It is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.

WHAT IS BIG DATA?

Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.

5

6

What is a Data Scientist?

A data scientist is someone who blends math, algorithms, and an understanding of human behavior with the ability to hack systems together to get answers to interesting human questions from data.

How Big Data is Leveraged To Save Lives

Kent Szalla

7

Agenda 

• Predictive Analytics
• How to get started ‐ Vision Strategy Plan (VSP)
• Prerequisites for Success
• Closing

8

Predictive Analytics

• Not:
  – Standard statistical analysis
  – Actuarial science
• Ingredients:
  – Lots of data
  – Lots of processing power
  – Tools (TensorFlow, R, JSAT, IBM, etc.)
  – Data scientists
  – Domain knowledge
  – CrowdFlower definition: AI = TD + ML + HITL

9

Predictive Analytics

• Where:
  – Everywhere
• Who:
  – Google
  – Uber
  – PeopleNet
  – Othot

10

“Machine learning...is the next transformation...[it] will be the basis and fundamentals of every successful huge IPO win in 5 years.” – Eric Schmidt, Google Executive Chairman

11

VSP
• Vision: Goal
• Strategy: Logic to reach Goal
• Plan: Steps to reach Strategy

12

Vision ‐ Goal
What we want to accomplish

Why does a company implement a Safety Program?

Protect Life & Reduce Costs

13

Strategy – Logic
How we reach our goal

How can we Protect Life & Reduce Costs?

Reduce & Control Risk

14

Plan ‐ Steps
Steps to implement our Strategy

How can we Reduce & Control Risk?

Measure risk (historical and potential)
• Understand our vulnerabilities
• Create action plan to mitigate

15

VSP
• Vision: Protect Life & Reduce Costs
• Strategy: Reduce & Control Risk
• Plan: Measure Risk, Action Plan

16

Measuring Risk

What metrics do we measure?

Historical Risk
• Incidents/Injuries
• Near Hits
• Types (FA vs Recordable)

Engagement
• Inspections
• Observations
• Observers

Quality
• Wt. % Safe
• Late Fixes
• Indexing

17

What granularities do we measure?

Measuring Risk

Locations

Workers / Contractors

Observers

18

Measuring Risk

Metrics

Historical Risk
• Incidents/Injuries
• Near Hits
• Types (FA vs Recordable)

Engagement
• Inspections
• Observations
• Observers

Quality (Inspection)
• Wt. % Safe
• Late Fixes
• Indexing

Granularities
• Locations
• Workers / Contractors
• Observers

How much data is this?

19

In a given month…

Granularity # of Levels

Locations

Workers / Contractors

Observers

Total

[Chart: Monthly Trend for Average Company, Locations, Past 12 Months; y-axis from 20 to 40; value shown: 37]

20

[Chart: Monthly Trend for Average Company, Workers / Contractors, Past 12 Months; y-axis from 50 to 85; value shown: 77]

21

[Chart: Monthly Trend for Average Company, Observers, Past 12 Months; y-axis from 0 to 120; value shown: 93]

22

23


24

In a given month…

Granularity              # of Levels
Locations                37
Workers / Contractors    77
Observers                93
Total                    207

Metric Type              # of Metrics
Historical Incidents     6
Engagement               5
Quality                  22
Total                    33

207 Granularities × 33 Metrics = 6,831 Data Points
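The arithmetic behind the 6,831 figure can be checked in a few lines. This is only a sketch: the counts come from the slide, while the dictionary and variable names are illustrative.

```python
# Counts are taken from the slide; the names are illustrative only.
granularity_levels = {"Locations": 37, "Workers / Contractors": 77, "Observers": 93}
metric_counts = {"Historical Incidents": 6, "Engagement": 5, "Quality": 22}

total_levels = sum(granularity_levels.values())   # 37 + 77 + 93
total_metrics = sum(metric_counts.values())       # 6 + 5 + 22
data_points = total_levels * total_metrics        # one value per (level, metric) pair

print(f"{total_levels} granularities x {total_metrics} metrics = {data_points:,} data points")
```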

25

How much information can the brain handle at one time?

The information you can hold in your mind at one time is the information you can interrelate.

Nelson Cowan, Ph.D., Psychology Professor, Univ. Missouri‐Columbia

Answer: 4

4 < 6,831

26

Problem

Mind’s Limit (4) < Information to Process (6,831)

Problem Solver (Data Scientist)

27

How can a Data Scientist help?

6,831 → 1

28

Prediction

Condense Information Measure Risk

Plan
• Measure Risk
• Action Plan

29

Interpreting Prediction

Probability

An injury is likely to occur

30

Interpreting Prediction

Probability

80% Chance of Rain

80% Chance of Injury

31

Interpreting Prediction

Project Risk

Palo Construction

RFK Bridge

OPD Headquarters

One Life Way

National Harbor

Flag

32

Interpreting Prediction

Project              Risk
Palo Construction    20%
RFK Bridge           95%
OPD Headquarters     10%
One Life Way         50%
National Harbor      80%

33

What else can we do with Probabilities?

Grouping

[Figure: three probability scales running from 0% (Not Likely) to 100% (Very Likely), with cut points at 33% and 66%; at 50% and 80%; and at 70%]

34

What else can we do with Probabilities?

Grouping

Project              Risk
Palo Construction    20%
RFK Bridge           95%
OPD Headquarters     10%
One Life Way         50%
National Harbor      80%
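Grouping can be sketched as mapping each probability to a band. The probabilities below are the ones on the slide and the 33%/66% cut points are one of the scales shown earlier; the band names "Low", "Medium" and "High" and the function name are assumptions made for illustration.

```python
from bisect import bisect_right

# Project probabilities from the slide, expressed as fractions.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

def group(probability, cut_points, labels):
    """Return the label of the band that contains `probability`."""
    # bisect_right counts how many cut points are <= probability,
    # which is the index of the band the value falls into.
    return labels[bisect_right(cut_points, probability)]

cut_points = [0.33, 0.66]            # cut points from the slide
labels = ["Low", "Medium", "High"]   # hypothetical band names
groups = {name: group(p, cut_points, labels) for name, p in project_risk.items()}

for name, band in groups.items():
    print(f"{name}: {band}")
```

Swapping in the other cut points from the slide (50% and 80%, or a single cut at 70% with two labels) only changes the `cut_points` and `labels` lists.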

35

What else can we do with Probabilities?

Ranking

Project              Risk
RFK Bridge           95%
National Harbor      80%
One Life Way         50%
Palo Construction    20%
OPD Headquarters     10%
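Ranking is a sort on the predicted probability, highest first. A minimal sketch using the values from the slide (the variable names are illustrative):

```python
# Project probabilities from the slide, expressed as fractions.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

# Sort by probability, highest risk first.
ranked = sorted(project_risk.items(), key=lambda item: item[1], reverse=True)

for position, (name, probability) in enumerate(ranked, start=1):
    print(f"{position}. {name}: {probability:.0%}")
```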

[Chart: Location Probability Trends for RFK Bridge and National Harbor; y-axis from 0% to 100%]

36

What else can we do with Probabilities?

Trending

37

What else can we do with Probabilities?

Aggregation
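The slide does not say how probabilities are aggregated, so the following is only one possible reading. It rolls the five project probabilities from the earlier slides up to a company-level view in two ways: the expected number of projects with an injury (a plain sum) and the chance of at least one injury, which additionally assumes the projects are independent. That independence assumption is ours, not the presenter's.

```python
import math

# Project probabilities from the earlier slides, expressed as fractions.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

# Expected number of projects with an injury: the sum of the probabilities.
expected_projects_with_injury = sum(project_risk.values())

# Chance of at least one injury anywhere, ASSUMING projects are independent.
p_no_injury_anywhere = math.prod(1 - p for p in project_risk.values())
p_at_least_one = 1 - p_no_injury_anywhere

print(f"Expected projects with an injury: {expected_projects_with_injury:.2f}")
print(f"Chance of at least one injury: {p_at_least_one:.2%}")
```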

38

Measure Risk
• Grouping
• Ranking
• Trending
• Aggregation

Plan
• Measure Risk
• Action Plan

39

Plan
• Measure Risk
• Action Plan

How do we come up with an Action Plan?

40

[Chart: axes Metric 1 and Metric 2]

41

[Chart: axes Metric 1 and Metric 2; points grouped into Profile 1 and Profile 2]

42

43

Profile 1
• Low Engagement
• High # At‐Risk
• Low # Focused

Action Plan using Best Practices

44

Measure Risk
• Grouping
• Ranking
• Trending
• Aggregation

Plan
• Measure Risk
• Action Plan

45

Action Plan

Plan
• Measure Risk
• Action Plan

Expanding the Concept

47

Project              Risk   Recordable   Body Part   Cause
RFK Bridge           95%    41%          Arm         Struck By
National Harbor      80%    79%          Back        Slip/Trip
One Life Way         50%    15%          Ankle       Slip/Trip
Palo Construction    20%    2%           Eye         Foreign Object
OPD Headquarters     10%    1%           Arm         Laceration


Future Capabilities

49

Project   Risk   Recordable   Body Part   Cause

Prerequisites for Success

• People = data scientists and domain experts
• Processes = collect and scrub data. Data quality.
• Tools = many available
• Adequate VSP
• Engagement at all levels
• Adequate plan to review/interpret results
  – Data Use Plan
  – Seat at the Table

50

51

52

ADVANCED MATH VERSUS BLACK BOX

Moving Away from Linear (Traditional) Models

Predict health given height and weight

[Chart: axes Weight and Height; legend: Healthy Individual, Unhealthy Individual]

Moving Away from Linear (Traditional) Models

Predict health given height and weight

[Chart: axes Weight and Height; legend: Healthy Individual, Unhealthy Individual; regions: Predict Healthy, Predict Unhealthy]

Logistic Regression
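To make the logistic regression slide concrete, here is a minimal sketch that fits a two-feature logistic model with plain gradient descent. The height/weight rows are invented for illustration, and the learning rate and iteration count are arbitrary choices; nothing here comes from the presenter's data or tooling.

```python
import math

# Hypothetical (height_cm, weight_kg, healthy) rows, invented for illustration.
rows = [
    (160, 55, 1), (165, 62, 1), (170, 64, 1), (175, 72, 1), (180, 74, 1), (185, 82, 1),
    (160, 84, 0), (165, 92, 0), (170, 94, 0), (175, 102, 0), (180, 104, 0), (185, 110, 0),
]

def mean_std(values):
    m = sum(values) / len(values)
    s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return m, s

h_mean, h_std = mean_std([r[0] for r in rows])
w_mean, w_std = mean_std([r[1] for r in rows])

def features(height, weight):
    """Standardize both inputs so gradient descent behaves well."""
    return (height - h_mean) / h_std, (weight - w_mean) / w_std

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit P(healthy) = sigmoid(b + c_h * height + c_w * weight) by batch gradient descent.
b = c_h = c_w = 0.0
learning_rate = 0.5
for _ in range(2000):
    g_b = g_h = g_w = 0.0
    for height, weight, healthy in rows:
        x_h, x_w = features(height, weight)
        error = sigmoid(b + c_h * x_h + c_w * x_w) - healthy
        g_b += error
        g_h += error * x_h
        g_w += error * x_w
    b -= learning_rate * g_b / len(rows)
    c_h -= learning_rate * g_h / len(rows)
    c_w -= learning_rate * g_w / len(rows)

def probability_healthy(height, weight):
    x_h, x_w = features(height, weight)
    return sigmoid(b + c_h * x_h + c_w * x_w)

accuracy = sum((probability_healthy(h, w) >= 0.5) == bool(y) for h, w, y in rows) / len(rows)
print(f"training accuracy: {accuracy:.2f}")
```

The fitted model separates the two groups with a single straight line in the height/weight plane, which is the linear boundary the slide draws.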


Moving Away from Linear (Traditional) Models

Predict health given height and weight

[Chart: axes Weight and Height; legend: Healthy Individual, Unhealthy Individual; regions: Predict Unhealthy, Predict Unhealthy, Predict Healthy]

Decision Tree
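A decision tree carves the chart into rectangular regions instead of drawing one line. The sketch below grows a small tree by choosing, at each step, the threshold with the lowest weighted Gini impurity. The rows are invented so that both very low and very high weights are labeled unhealthy, a pattern a single straight line cannot capture; the data, depth limit and function names are all assumptions.

```python
from collections import Counter

# Hypothetical (height_cm, weight_kg, label) rows, invented for illustration.
rows = [
    (160, 45, "Unhealthy"), (170, 48, "Unhealthy"), (180, 47, "Unhealthy"),
    (165, 60, "Healthy"), (170, 70, "Healthy"), (175, 75, "Healthy"), (180, 80, "Healthy"),
    (165, 95, "Unhealthy"), (175, 100, "Unhealthy"), (185, 110, "Unhealthy"),
]

def gini(rows):
    """Gini impurity of the labels in `rows` (0 means a pure group)."""
    counts = Counter(r[2] for r in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_split(rows):
    """Return (feature_index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(rows)
    for feature in (0, 1):  # 0 = height, 1 = weight
        values = sorted({r[feature] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for r in rows if r[feature] <= threshold]
            right = [r for r in rows if r[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best

def build(rows, depth=2):
    """Grow a tree of nested (feature, threshold, left, right) tuples; leaves are labels."""
    split = best_split(rows) if depth > 0 else None
    if split is None:
        return Counter(r[2] for r in rows).most_common(1)[0][0]
    feature, threshold = split
    left = [r for r in rows if r[feature] <= threshold]
    right = [r for r in rows if r[feature] > threshold]
    return (feature, threshold, build(left, depth - 1), build(right, depth - 1))

def predict(tree, height, weight):
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if (height, weight)[feature] <= threshold else right
    return tree

tree = build(rows)
print(tree)
print(predict(tree, 170, 70))
```

On this toy data the tree ends up with three regions (two "Unhealthy", one "Healthy"), mirroring the three prediction labels on the slide.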

• Leverage more of the data being captured

Traditional Approach: Analyze small subsets of data
[Diagram labels: Analyzed information; All available information]

Big Data Approach: Analyze all data
[Diagram label: All available information analyzed]

Analytics can help identify “Useful” data

[Reviewer comment 3 on Slide 57, Ann Gergen, 3/3/2015: there's a point here about validating intuition AND finding new useful data, maybe]

Text Mining Variables

• Text mining refers to the process of deriving relevant and usable text that can be parsed and codified into a word or numerical value.

• Text mining can identify co‐morbid conditions and/or situations that will have a profound impact on the outcome of a claim.

SAMPLE KEY WORDS/PHRASES

• Diabetes/insulin/injections
• Packs day/coughing
• Pain killers/anti‐depression
• Children/school
• Pain unchanged
• Height/Weight
• Homemaker wife went to work
• c/o, CXR, FB, FX
• CBT – Cognitive Behavior Therapy

Highlighted examples: smoking, Pain unchanged, CXR

Text sources: Adjuster notes, medical reports, independent medical exams, etc.
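As a rough illustration of turning free text into model variables, the sketch below scans a note for a few of the sample key words above and codes each as a 0/1 flag. The variable names, the grouping of key words and the sample note are all hypothetical; a production text-mining pipeline would be considerably more involved.

```python
import re

# Key words come from the sample list; the variable names and groupings are hypothetical.
KEYWORD_FLAGS = {
    "diabetes": ["diabetes", "insulin"],
    "smoking": ["smoking", "coughing"],
    "pain_unchanged": ["pain unchanged"],
    "chest_xray": ["cxr"],
}

def text_flags(note):
    """Return a 0/1 indicator per variable for one free-text note."""
    text = note.lower()
    return {
        name: int(any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords))
        for name, keywords in KEYWORD_FLAGS.items()
    }

# Invented adjuster note, for illustration only.
note = "Claimant c/o back pain. Pain unchanged since last visit. Smoking daily. CXR ordered."
flags = text_flags(note)
print(flags)
```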

Modeling Architecture 

Data Store – all historical data collected and organized

Training – identifying company/internal/external data specific patterns

Testing – using “hold out” sets to measure the accuracy of predictions
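The training/testing split described above can be sketched as a simple random hold-out. The 20% hold-out fraction, the fixed seed and the integer stand-ins for claims are assumptions chosen for illustration.

```python
import random

def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and set aside a fraction as the hold-out set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

claims = list(range(100))  # stand-in claim IDs, not real data
training_set, holdout_set = holdout_split(claims)

print(len(training_set), "claims for training,", len(holdout_set), "held out for testing")
```

The model is fitted only on `training_set`; predictions on `holdout_set` are then compared with what actually happened to estimate accuracy.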

Segmentation Analysis - Tests Model Accuracy

Divide all scored claims into segments

● After scoring, distribute claims by ranking risks by score

○ Highest Risk to the Right

○ Lowest Risk to the Left

○ Each claim has an individual score

○ Worst Claim far right vs. Best Claim far left

○ Then add actual losses to test model accuracy
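The segmentation steps above can be sketched as: sort claims by score, cut them into equal-sized segments, then total the actual losses in each. The claim scores and dollar amounts below are invented; only the procedure follows the slide.

```python
def segment_loss_shares(claims, n_segments=5):
    """claims: list of (score, actual_loss) pairs.

    Returns each segment's share of total actual loss,
    ordered from the lowest-scored segment to the highest.
    """
    ranked = sorted(claims, key=lambda claim: claim[0])  # lowest risk first
    size = len(ranked) / n_segments
    total = sum(loss for _, loss in ranked)
    shares = []
    for i in range(n_segments):
        segment = ranked[round(i * size):round((i + 1) * size)]
        shares.append(sum(loss for _, loss in segment) / total)
    return shares

# Hypothetical scored claims: (model score, actual loss in dollars).
claims = [(0.05, 100), (0.10, 200), (0.20, 300), (0.30, 400), (0.40, 500),
          (0.50, 600), (0.60, 900), (0.70, 1000), (0.85, 2500), (0.95, 3500)]

shares = segment_loss_shares(claims)
for i, share in enumerate(shares, start=1):
    print(f"Segment {i}: {share:.0%} of actual losses")
```

If the model ranks claims well, the share of actual losses should climb steeply toward the highest-scored segment, as it does in this toy example.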

Segmentation Analysis - Tests Model Accuracy

Divide all scored claims into segments

Lowest Risk (Best Claims) to Highest Risk (Worst Claims)

Five segments, each 20% of scored claims

Risk bands: Low Risk, Medium Risk, High Risk

Early ID (< Day 30) models identify 20% of claims that have 78% of total costs

Predictive Modeling in Action

[Chart values: 2.19%, 3.15%, 4.74%, 11.47%]

[Reviewer comment 4 on Slide 62, Ann Gergen, 3/3/2015: early identification is most meaningful here! focus on how that translates to reserving, actuarial evals, etc.]

63

APPLYING ADVANCED ANALYTICS TO UNDERWRITING

GLM

Traditional Linear vs. Multivariate results

MULTIVARIATE

Daily Claim Alert Dashboard

67

CASE LEVEL RESERVING DASHBOARDS

Case‐Level Reserving Dashboard

69