Automated Machine Learning and Understanding its Potential · 2020. 9. 17. · Bojan Tunguz,...

Automated Machine Learning and Understanding its Potential


Jeff Heaton, Ph.D., Vice President, Data Science, RGA

Relationship between AutoML and automation

Data scientists are adept at automating many jobs, including their own.
Automation is not binary; it is often a combination of human and machine.
Complete automation of complex tasks is hard.
Closing the final 10-15% of a complex automation task is the most challenging part.

What is AutoML? What’s it Not?

Vendors Participating In the Insurance Industry

Free: AutoKeras, Auto-Sklearn, TPOT
Commercial: AWS SageMaker Autopilot, Azure AutoML, BigSquid, DataRobot, Dataiku, DotData, Google Cloud AutoML, H2O, and ~20 others

Insurers Work With Multiple Data Types

Cross-sectional and time-series data may be of the following forms:

Structured: Tabular Data, Denormalized Relational Data
Semi-Structured: XML/JSON, Relational, Graph/Network
Unstructured: Raw Text, Images, PDFs

Insurers Work With Multiple Model Types

Supervised: Neural Networks, GLMs, Support Vector Machines, Gradient Boosted Machines. Example: lapse modeling.
Unsupervised: Neural Networks, K-Means, t-SNE. Example: customer segmentation.
Self-supervised: Neural Networks, Reinforcement Learning, GAMs. Example: complex underwriting automation.

The Typical AutoML System Accepts Data and Produces a Model
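The "data in, model out" idea can be illustrated with a very small sketch. The code below is not how any of the vendors listed earlier work internally; it is a toy, standard-library-only illustration in which a hypothetical `auto_fit` function tries three simple candidate learners on a single numeric feature, scores each on a held-out split, and returns the best one refit on all the data. All function names and the choice of candidates are assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]  # maps one feature value to a prediction


def fit_mean(xs: List[float], ys: List[float]) -> Model:
    """Baseline candidate: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs: List[float], ys: List[float]) -> Model:
    """Candidate: ordinary least squares with a single feature."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # constant feature: fall back to the mean baseline
        return fit_mean(xs, ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def fit_knn(xs: List[float], ys: List[float], k: int = 3) -> Model:
    """Candidate: average the targets of the k nearest training points."""
    points = list(zip(xs, ys))

    def predict(x: float) -> float:
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def mse(model: Model, xs: List[float], ys: List[float]) -> float:
    """Mean squared error of a fitted model on (xs, ys)."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def auto_fit(
    xs: List[float], ys: List[float], holdout: float = 0.25, seed: int = 0
) -> Tuple[str, Model, Dict[str, float]]:
    """Toy AutoML loop: score each candidate on a holdout split, keep the best."""
    candidates = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    n_val = max(1, int(len(order) * holdout))
    val, train = order[:n_val], order[n_val:]
    train_x, train_y = [xs[i] for i in train], [ys[i] for i in train]
    val_x, val_y = [xs[i] for i in val], [ys[i] for i in val]
    scores = {
        name: mse(fit(train_x, train_y), val_x, val_y)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)
    # Refit the winning candidate on all available data before returning it.
    return best, candidates[best](xs, ys), scores


# Usage on synthetic data (made up for this example).
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 for x in xs]
best_name, model, scores = auto_fit(xs, ys)
print(best_name, scores)
```

Real AutoML systems search far larger model families and use more careful validation, but the accept-data, search, return-model shape is the same.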

Human-Led Advancements Enabled AutoML

Key ML advancements both produce better models and drive automation:

Deep Learning: feature engineering
Bayesian Optimization: hyperparameter optimization
LIME: model “explain-ability” and interpretation
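To make the hyperparameter optimization item concrete, here is a minimal sketch of an automated search loop. Note that it uses plain random search, a simpler stand-in, and not Bayesian optimization: a Bayesian optimizer would use the history of past trials to choose the next point, whereas this loop samples independently. The `random_search` helper, the search-space names, and the `toy_validation_loss` surface are all hypothetical and exist only for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def random_search(
    objective: Callable[[Params], float],
    space: Dict[str, Tuple[float, float]],
    n_trials: int = 50,
    seed: int = 0,
) -> Tuple[Params, float, List[Tuple[float, Params]]]:
    """Sample each hyperparameter uniformly from its (low, high) range
    and keep the setting with the lowest objective value."""
    rng = random.Random(seed)
    history: List[Tuple[float, Params]] = []
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        history.append((objective(params), params))
    best_loss, best_params = min(history, key=lambda trial: trial[0])
    return best_params, best_loss, history


def toy_validation_loss(params: Params) -> float:
    """Made-up loss surface standing in for 'train a model and validate it'.
    Its minimum is at log10(learning rate) = -2 and dropout = 0.3."""
    return (params["log10_learning_rate"] + 2.0) ** 2 + (params["dropout"] - 0.3) ** 2


space = {"log10_learning_rate": (-5.0, 0.0), "dropout": (0.0, 1.0)}
best_params, best_loss, history = random_search(toy_validation_loss, space, n_trials=200)
print(best_params, best_loss)
```

In practice the objective would train and validate a real model, which is why more sample-efficient strategies such as Bayesian optimization matter when each trial is expensive.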

Six Levels of Car Autonomy (levels 0-2, we are here)

SAE International and the National Highway Traffic Safety Administration (NHTSA) recognize six tiers of autonomous driving capability in cars. Levels 0-2 are what you can currently purchase.

Level 0: The automated system issues warnings and may momentarily intervene but has no sustained vehicle control.
Level 1 (“hands on”): The driver and the automated system share control of the vehicle. Examples are Adaptive Cruise Control and Parking Assistance.
Level 2 (“hands off”): The automated system takes full control of the vehicle (accelerating, braking, and steering).

Six Levels of Car Autonomy (levels 3-5, where we hope to go)

Some automakers are currently working on levels 3-5; however, production models are not yet available.

Level 3 (“eyes off”): The driver can safely turn their attention away from the driving tasks, e.g. the driver can text or watch a movie.
Level 4 (“mind off”): As level 3, but no driver attention is ever required for safety, e.g. the driver may safely go to sleep or leave the driver’s seat.
Level 5 (“steering wheel optional”): No human intervention is required at all. An example would be a robotic taxi.

Six Levels of AutoML (levels 0-3, we are here)

Bojan Tunguz (competitive machine learning at NVIDIA; physicist; Kaggle) describes six levels of AutoML. Levels 0-3 are what you can currently purchase. (Source: https://medium.com/@tunguz)

Level 0: No automation. You code your own ML algorithms. From scratch. In C++.
Level 1: Use of high-level algorithm APIs. Sklearn, Keras, Pandas, H2O, XGBoost, etc.
Level 2: Automatic hyperparameter tuning and ensembling. Basic model selection.
Level 3: Automatic (technical) feature engineering and feature selection, technical data augmentation, GUI.
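As a rough illustration of what Level 3 "technical" feature engineering and selection can mean, the sketch below mechanically generates squares and pairwise products of numeric columns, then keeps the candidates most correlated with the target. This is only one simple possibility, not a description of any particular product; the helper names (`generate_features`, `select_features`) and the column names are hypothetical.

```python
import itertools
import math
from typing import Dict, List

Columns = Dict[str, List[float]]


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def generate_features(columns: Columns) -> Columns:
    """Keep the raw columns and add squares and pairwise products."""
    features: Columns = dict(columns)
    for name, values in columns.items():
        features[f"{name}^2"] = [v * v for v in values]
    for (n1, v1), (n2, v2) in itertools.combinations(columns.items(), 2):
        features[f"{n1}*{n2}"] = [a * b for a, b in zip(v1, v2)]
    return features


def select_features(features: Columns, target: List[float], top_k: int = 3) -> List[str]:
    """Rank features by absolute correlation with the target; keep the top_k names."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:top_k]


# Usage on made-up data where the target is an interaction of two columns.
columns = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]}
target = [a * b for a, b in zip(columns["x1"], columns["x2"])]
print(select_features(generate_features(columns), target, top_k=2))
```

The generated features here are purely mechanical, which is the distinction the levels draw: domain-specific and problem-specific features belong to Level 4.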

Six Levels of AutoML (levels 4-5, where we hope to go)

Bojan Tunguz (competitive machine learning at NVIDIA; physicist; Kaggle) wrote an article comparing the six levels of AutoML to the six levels of autonomous driving capability in cars. Levels 0-3 are what you can currently purchase. (Source: https://medium.com/@tunguz)

Level 4: Automatic domain and problem specific feature engineering, data augmentation, and data integration.
Level 5: Full ML Automation. Ability to come up with super-human strategies for solving hard ML problems without any input or guidance. Fully conversational interaction with the human user.

Appropriate Domains for AutoML in the Insurance Industry

Large amounts of tabular data
Relational databases
Models that must be regenerated and retrained in near real-time
Companies with limited access to data science talent

AutoML Domains Where Human Judgement Is Needed

Not that much data
Highly sparse data
Data that might be augmented with human intuition

THANK YOU
