Post on 06-Jun-2020
AGENDA
What is predictive analytics?
What is data mining?
What is big data?
What is a data scientist?
How big data is leveraged to save lives
Advanced math – moving away from linear models of data analytics
Predictive analytics in claims and underwriting
Questions
WHAT IS PREDICTIVE ANALYTICS?
Predictive analytics describes any approach to data mining with four attributes:
• An emphasis on prediction, rather than description, classification or clustering
• Rapid analysis measured in hours or days, rather than the stereotypical months of traditional data mining
• An emphasis on the business relevance of the resulting insights - no “ivory tower” analyses
• An emphasis on ease of use, thus making the tools accessible to business users
WHAT IS DATA MINING?
Data mining is an interdisciplinary subfield of computer science. It is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.
WHAT IS BIG DATA?
Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.
What is a Data Scientist?
A data scientist is someone who blends math, algorithms, and an understanding of human behavior with the ability to hack systems together to get answers to interesting human questions from data.
How Big Data is Leveraged To Save Lives
Kent Szalla
Agenda
• Predictive Analytics
• How to get started ‐ Vision Strategy Plan (VSP)
• Prerequisites for Success
• Closing
Predictive Analytics
• Not:
  – Standard statistical analysis
  – Actuarial science
• Ingredients:
  – Lots of data
  – Lots of processing power
  – Tools (TensorFlow, R, JSAT, IBM, etc.)
  – Data scientists
  – Domain knowledge
  – CrowdFlower definition: AI = TD + ML + HITL
Predictive Analytics
• Where:
  – Everywhere
• Who:
  – Google
  – Uber
  – PeopleNet
  – Othot
“Machine learning...is the next transformation...[it] will be the basis and fundamentals of every
successful huge IPO win in 5 years.” – Eric Schmidt, Google Executive Chairman
VSP
• Vision: Goal
• Strategy: Logic to reach Goal
• Plan: Steps to reach Strategy
Vision ‐ Goal
What we want to accomplish
Why does a company implement a Safety Program?
Protect Life & Reduce Costs
Strategy – Logic
How we reach our goal
How can we Protect Life & Reduce Costs?
Reduce & Control Risk
Plan ‐ Steps
Steps to implement our Strategy
How can we Reduce & Control Risk?
Measure risk (historical and potential)
• Understand our vulnerabilities
• Create action plan to mitigate
VSP
Vision
• Protect Life & Reduce Costs
Strategy
• Reduce & Control Risk
Plan
• Measure Risk
• Action Plan
Measuring Risk
What metrics do we measure?
Historical Risk
• Incidents/Injuries
• Near Hits
• Types (FA vs Recordable)
Engagement
• Inspections
• Observations
• Observers
Quality
• Wt. % Safe
• Late Fixes
• Indexing
Measuring Risk
What granularities do we measure?
• Locations
• Workers / Contractors
• Observers
Measuring Risk
Metrics
Historical Risk
• Incidents/Injuries
• Near Hits
• Types (FA vs Recordable)
Engagement
• Inspections
• Observations
• Observers
Quality (Inspection)
• Wt. % Safe
• Late Fixes
• Indexing
Granularities
• Locations
• Workers / Contractors
• Observers
How much data is this?
In a given month…
Granularity # of Levels
Locations
Workers / Contractors
Observers
Total
[Figure: Monthly Trend for Average Company, Locations, Past 12 Months (value shown: 37)]
[Figure: Monthly Trend for Average Company, Workers / Contractors, Past 12 Months (value shown: 77)]
[Figure: Monthly Trend for Average Company, Observers, Past 12 Months (value shown: 93)]
In a given month…
Granularity              # of Levels
Locations                37
Workers / Contractors    77
Observers                93
Total                    207

Metric Type              # of Metrics
Historical Incidents     6
Engagement               5
Quality                  22
Total                    33

207 Granularities × 33 Metrics = 6,831 Data Points
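The arithmetic behind the 6,831 figure can be checked with a few lines of Python. This is only a sketch of the calculation; the dictionary and variable names are illustrative, and the counts are the ones given in the tables above.

```python
# Counts taken from the "In a given month" tables.
granularity_levels = {"Locations": 37, "Workers / Contractors": 77, "Observers": 93}
metric_counts = {"Historical Incidents": 6, "Engagement": 5, "Quality": 22}

total_granularities = sum(granularity_levels.values())
total_metrics = sum(metric_counts.values())

# Every metric is tracked at every granularity level.
data_points = total_granularities * total_metrics

print(f"{total_granularities} granularities x {total_metrics} metrics "
      f"= {data_points:,} data points")
```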
How much information can the brain handle at one time?
The information you can hold in your mind at one time is the information you can interrelate.
Nelson Cowan, Ph.D., Psychology Professor, Univ. Missouri‐Columbia
Answer: 4
4 < 6,831
Problem
Mind’s Limit (4) < Information to Process (6,831)
Problem Solver (Data Scientist)
How can a Data Scientist help?
6,831 → 1
Prediction
Condense Information → Measure Risk
Plan
• Measure Risk
• Action Plan
Interpreting Prediction
Probability
An injury is likely to occur
Interpreting Prediction
Probability
80% Chance of Rain
80% Chance of Injury
Interpreting Prediction
Project Risk
Palo Construction
RFK Bridge
OPD Headquarters
One Life Way
National Harbor
Flag
Interpreting Prediction
Project Risk
Palo Construction 20%
RFK Bridge 95%
OPD Headquarters 10%
One Life Way 50%
National Harbor 80%
What else can we do with Probabilities?
Grouping
Three groups: 0% | 33% | 66% | 100%
Three groups: 0% | 50% | 80% | 100%
Two groups: 0% | 70% | 100%
Scale runs from Not Likely (0%) to Very Likely (100%)
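Grouping can be sketched as a lookup against the cut points shown on the scales. The function name `group`, the middle label "Somewhat Likely", and the rule that a value exactly on a cut point goes to the higher band are assumptions made for illustration; the slides only name the two ends of the scale.

```python
from bisect import bisect_right

def group(probability, cut_points, labels):
    """Return the label of the band that `probability` falls into.

    A probability exactly on a cut point goes to the higher band
    (an assumption; the slide does not say how ties are handled).
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, probability)]

# Project probabilities from the Project Risk slide.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

three_bands = ["Not Likely", "Somewhat Likely", "Very Likely"]  # middle label is hypothetical
two_bands = ["Not Likely", "Very Likely"]

for name, p in project_risk.items():
    print(
        f"{name}: "
        f"33/66 -> {group(p, [0.33, 0.66], three_bands)}, "
        f"50/80 -> {group(p, [0.50, 0.80], three_bands)}, "
        f"70 -> {group(p, [0.70], two_bands)}"
    )
```

Moving the cut points changes which projects get flagged, which is why the slide shows several alternative scales.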
What else can we do with Probabilities?
Grouping
Project Risk
Palo Construction 20%
RFK Bridge 95%
OPD Headquarters 10%
One Life Way 50%
National Harbor 80%
What else can we do with Probabilities?
Ranking
Project Risk
RFK Bridge 95%
National Harbor 80%
One Life Way 50%
Palo Construction 20%
OPD Headquarters 10%
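Ranking is a sort on the predicted probability, highest first. A minimal sketch using the five project probabilities from the slides (the variable names are illustrative):

```python
# Project probabilities from the Project Risk slide, in their original order.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

# Sort by probability, highest risk first.
ranked = sorted(project_risk.items(), key=lambda item: item[1], reverse=True)

for rank, (name, probability) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {probability:.0%}")
```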
What else can we do with Probabilities?
Trending
[Figure: Location Probability Trends for RFK Bridge and National Harbor; vertical axis from 0% to 100%]
What else can we do with Probabilities?
Aggregation
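The slides do not say how the probabilities are aggregated. As one possible sketch, the code below rolls the five project probabilities up in two ways: a simple average, and the chance of at least one injury across all projects. The second calculation assumes the projects are independent of each other, which is a simplifying assumption and not something the source states; both function names are hypothetical.

```python
import math

def average_risk(probabilities):
    """Simple mean of the individual probabilities."""
    probabilities = list(probabilities)
    return sum(probabilities) / len(probabilities)

def probability_of_at_least_one(probabilities):
    """Chance that at least one event occurs, ASSUMING independence."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

# Project probabilities from the Project Risk slide.
project_risk = {
    "Palo Construction": 0.20,
    "RFK Bridge": 0.95,
    "OPD Headquarters": 0.10,
    "One Life Way": 0.50,
    "National Harbor": 0.80,
}

company_average = average_risk(project_risk.values())
company_any = probability_of_at_least_one(project_risk.values())

print(f"Average project risk: {company_average:.0%}")
print(f"At least one injury (independence assumed): {company_any:.2%}")
```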
Measure Risk
Plan
• Measure Risk
  – Grouping
  – Ranking
  – Trending
  – Aggregation
• Action Plan
Plan
• Measure Risk
• Action Plan
How do we come up with an Action Plan?
[Figure: Metric 1 vs. Metric 2]
[Figure: Metric 1 vs. Metric 2, with Profile 1 and Profile 2 marked]
Profile 1
• Low Engagement
• High # At‐Risk
• Low # Focused
Action Plan using Best Practices
Action Plan
Plan
• Measure Risk
• Action Plan
Expanding the Concept
Project              Risk   Recordable   Body Part   Cause
RFK Bridge           95%    41%          Arm         Struck By
National Harbor      80%    79%          Back        Slip/Trip
One Life Way         50%    15%          Ankle       Slip/Trip
Palo Construction    20%    2%           Eye         Foreign Object
OPD Headquarters     10%    1%           Arm         Laceration
Future Capabilities
Prerequisites for Success
• People = data scientists and domain experts
• Processes = collect and scrub data. Data quality.
• Tools = many available
• Adequate VSP
• Engagement at all levels
• Adequate plan to review/interpret results
  – Data Use Plan
  – Seat at the Table
ADVANCED MATH VERSUS BLACK BOX
Moving Away from Linear (Traditional) Models
Predict health given height and weight
[Figure: Weight vs. Height plot; legend: Healthy Individual, Unhealthy Individual]
[Figure: the same plot with a Logistic Regression boundary separating Predict Healthy from Predict Unhealthy]
[Figure: the same plot with Decision Tree regions: Predict Unhealthy, Predict Unhealthy, Predict Healthy]
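The contrast between the two model types can be sketched in code. Everything numeric below is hypothetical: the coefficients and thresholds are hand-picked for illustration, not fitted to data, and carry no medical meaning. The point is only that a logistic regression splits the height/weight plane with a single straight boundary, while a decision tree can carve out several regions, such as two separate "unhealthy" zones.

```python
import math

def logistic_predict(height_cm, weight_kg,
                     w_height=0.08, w_weight=-0.10, bias=-6.0):
    """Linear model: one straight boundary in the height/weight plane.

    The weights and bias are made-up values, not fitted parameters.
    """
    z = w_height * height_cm + w_weight * weight_kg + bias
    probability_healthy = 1.0 / (1.0 + math.exp(-z))
    return "Healthy" if probability_healthy >= 0.5 else "Unhealthy"

def tree_predict(height_cm, weight_kg):
    """Decision tree: axis-aligned splits giving three regions.

    The thresholds are made-up values; height is unused in this tiny tree.
    """
    if weight_kg >= 100:
        return "Unhealthy"   # first unhealthy region
    if weight_kg < 45:
        return "Unhealthy"   # second unhealthy region
    return "Healthy"

# Three hypothetical individuals of the same height.
for weight in (40, 70, 110):
    print(weight, logistic_predict(170, weight), tree_predict(170, weight))
```

With these made-up numbers the linear model cannot flag the lightest individual, because one straight boundary separates only one side of the plane; the tree flags both extremes.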
• Leverage more of the data being captured
Traditional Approach: Analyze small subsets of data
[Diagram: analyzed information shown within all available information]
Big Data Approach: Analyze all data
[Diagram: all available information analyzed]
Analytics can help identify “Useful” data
Text Mining Variables
• Text mining refers to the process of deriving relevant and usable text that can be parsed and codified into a word or numerical value.
• Text mining can identify co‐morbid conditions and/or situations that will have a profound impact on the outcome of a claim.
SAMPLE KEY WORDS/PHRASES
• Diabetes/insulin/injections
• Packs day/coughing
• Pain killers/anti‐depression
• Children/school
• Pain unchanged
• Height/Weight
• Homemaker wife went to work
• c/o, CXR, FB, FX
• CBT – Cognitive Behavior Therapy
Highlighted examples: smoking, Pain unchanged, CXR
Text sources: Adjuster notes, medical reports, independent medical exams, etc.
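A minimal sketch of how key words and phrases in free text might be codified into numerical variables. The flag names, the way terms are bundled under each flag, and the sample note are all hypothetical; only the key words themselves come from the sample list.

```python
import re

# Hypothetical mapping from a model variable to key words/phrases.
KEYWORD_FLAGS = {
    "diabetes": ["diabetes", "insulin", "injections"],
    "smoking": ["smoking", "packs day", "coughing"],
    "pain_unchanged": ["pain unchanged"],
    "chest_xray": ["cxr"],
}

def text_flags(note):
    """Return a 0/1 flag per variable depending on whether any term appears."""
    text = note.lower()
    flags = {}
    for flag, terms in KEYWORD_FLAGS.items():
        found = any(
            re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms
        )
        flags[flag] = int(found)
    return flags

# Invented adjuster note, for illustration only.
sample_note = (
    "Claimant c/o back pain. Pain unchanged since last visit. "
    "Reports smoking 2 packs day. CXR ordered."
)
note_flags = text_flags(sample_note)
print(note_flags)
```

Real adjuster notes would need more than exact matching (abbreviations, misspellings, negation such as "denies smoking"), so this only shows the codification step.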
Modeling Architecture
Data Store – all historical data collected and organized
Training – identifying company/internal/external data specific patterns
Testing – using “hold out” sets to measure the accuracy of predictions
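The training and testing steps can be sketched as a hold-out split: part of the historical data is set aside, the model is built on the rest, and accuracy is measured only on the part the model never saw. The function names, the 20% hold-out fraction, and the fixed seed are assumptions for illustration.

```python
import random

def holdout_split(records, holdout_fraction=0.2, seed=0):
    """Shuffle the records and return (training_set, holdout_set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def accuracy(predictions, actuals):
    """Share of predictions that match the actual outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    matches = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return matches / len(actuals)

# 100 placeholder claim IDs standing in for the historical data store.
claim_ids = list(range(100))
training_set, holdout_set = holdout_split(claim_ids)
print(len(training_set), "claims for training,", len(holdout_set), "held out for testing")
```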
Segmentation Analysis - Tests Model Accuracy
Divide all scored claims into segments
● After scoring, rank the claims by score and distribute them across the segments
○ Highest Risk to the Right
○ Lowest Risk to the Left
○ Each claim has an individual score
○ Worst Claim at the far right vs. Best Claim at the far left
○ Then add actual losses to each segment to test model accuracy
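The segmentation test described above can be sketched as follows: sort claims by score, cut them into equal-sized segments, then see what share of actual losses lands in each segment. If the model works, the share should rise from the lowest-risk segment to the highest. The function name and the ten claims below are invented for illustration; they are not results from the presentation.

```python
def segment_loss_shares(claims, n_segments=5):
    """Share of total actual loss per segment, lowest-risk segment first.

    `claims` is a list of (score, actual_loss) pairs.
    """
    ranked = sorted(claims, key=lambda claim: claim[0])  # lowest score first
    total_loss = sum(loss for _, loss in ranked)
    shares = []
    for i in range(n_segments):
        start = i * len(ranked) // n_segments
        end = (i + 1) * len(ranked) // n_segments
        segment_loss = sum(loss for _, loss in ranked[start:end])
        shares.append(segment_loss / total_loss)
    return shares

# Invented (score, actual_loss) pairs, deliberately listed out of order.
claims = [
    (0.85, 36000), (0.15, 500), (0.45, 3000), (0.95, 34000), (0.05, 1000),
    (0.65, 10000), (0.25, 2000), (0.75, 9500), (0.35, 1500), (0.55, 2500),
]

shares = segment_loss_shares(claims)
for number, share in enumerate(shares, start=1):
    print(f"Segment {number}: {share:.1%} of actual losses")
```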
Segmentation Analysis - Tests Model Accuracy
Divide all scored claims into segments
[Figure: five segments, each holding 20% of scored claims, ordered from Lowest Risk (Best Claims) to Highest Risk (Worst Claims); bands labeled Low Risk, Medium Risk, High Risk]
Early ID < Day 30 Models Identify 20% of Claims that have 78% of total costs
Predictive Modeling in Action
[Chart values: 2.19%, 3.15%, 4.74%, 11.47%]
APPLYING ADVANCED ANALYTICS TO UNDERWRITING
GLM
Traditional Linear vs. Multivariate results
MULTIVARIATE
Daily Claim Alert Dashboard
CASE LEVEL RESERVING DASHBOARDS
Case‐Level Reserving Dashboard