Automated Machine Learning and Understanding its Potential · 2020. 9. 17. · Bojan Tunguz,...
Automated Machine Learning and Understanding its Potential
Jeff Heaton, Ph.D., Vice President, Data Science, RGA
Relationship between AutoML and automation
Data scientists are adept at automating many jobs, including their own.
Automation is not binary; it is often a combination of human and machine.
Complete automation of complex tasks is hard.
Closing the final 10-15% of a complex automation task is the most challenging part.
What Is AutoML? What Is It Not?
Vendors Participating In the Insurance Industry
Free: AutoKeras, Auto-Sklearn, TPOT
Commercial: AWS SageMaker Autopilot, Azure AutoML, BigSquid, DataRobot, Dataiku, DotData, Google Cloud AutoML, H2O, and ~20 others
Insurers Work With Multiple Data Types
Cross-sectional and time-series data may be of the following forms
Structured: Tabular Data, Denormalized Relational Data
Semi-Structured: XML/JSON, Relational, Graph/Network
Unstructured: Raw Text, Images, PDFs
Insurers Work With Multiple Model Types
Supervised: Neural Networks, GLMs, Support Vector Machines, Gradient Boosted Machines. Example: lapse modeling
Unsupervised: Neural Networks, K-Means, t-SNE. Example: customer segmentation
Self-supervised: Neural Networks, Reinforcement Learning, GAMs. Example: complex underwriting automation
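Semi-structured records such as the JSON listed among the data types usually have to be flattened into tabular rows before a typical modeling tool can use them. The sketch below shows one simple way to do that with the Python standard library. The `flatten` helper, the dotted column names, and the sample policy record are all hypothetical and chosen only for illustration.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dicts and lists into one flat dict with dotted column names."""
    if isinstance(record, dict):
        items = list(record.items())
    elif isinstance(record, list):
        # List positions become part of the column name, e.g. "riders.0".
        items = [(str(i), value) for i, value in enumerate(record)]
    else:
        # Scalars end the recursion and become a single column.
        return {prefix: record}

    flat = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, name))
    return flat


# Hypothetical semi-structured policy record (not real data).
raw = '{"policy_id": 17, "insured": {"age": 42, "smoker": false}, "riders": ["waiver", "adb"]}'
row = flatten(json.loads(raw))
print(row)
```

Each nested field ends up as its own column, which is the denormalized tabular form listed under structured data.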
The Typical AutoML System Accepts Data and Produces a Model
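The "data in, model out" idea can be sketched in a few lines. This is a minimal illustration, not how any particular vendor works: it assumes a single numeric feature, three toy candidate models, and a hold-out split for scoring. The names `auto_fit` and `CANDIDATES` are hypothetical.

```python
import random


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Candidate: least-squares line on one feature."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """Candidate: return the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def auto_fit(xs, ys, holdout=0.25, seed=0):
    """Accept data; return the best candidate's name, a refit model, and all scores."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_val = max(1, int(len(idx) * holdout))
    val, train = idx[:n_val], idx[n_val:]
    tr_x, tr_y = [xs[i] for i in train], [ys[i] for i in train]
    va_x, va_y = [xs[i] for i in val], [ys[i] for i in val]

    # Score every candidate on the held-out rows.
    scores = {name: mse(fit(tr_x, tr_y), va_x, va_y) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    # Refit the winning candidate on all of the data before returning it.
    return best, CANDIDATES[best](xs, ys), scores


xs = [float(i) for i in range(40)]
ys = [3.0 * x + 2.0 for x in xs]
best_name, model, scores = auto_fit(xs, ys)
print(best_name, model(100.0))
```

Real systems search far larger spaces of models and preprocessing steps, but the interface is the same: the caller supplies data and receives a fitted model.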
Human-Led Advancements Enabled AutoML
Key ML advancements both produce better models and drive automation
Deep Learning: Feature Engineering
Bayesian Optimization: Hyperparameter Optimization
LIME: Model explainability and interpretation
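To make the LIME idea concrete, here is a simplified, LIME-style local surrogate written with the standard library only. It is a sketch of the general approach (perturb an instance, weight the samples by proximity, fit a weighted linear model) and not the API of the `lime` package. The function names, the Gaussian perturbation, the kernel width, and the toy `black_box` model are all assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def explain_locally(predict, instance, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Return one local linear effect per feature around `instance`."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Perturb the instance and query the black-box model.
        z = [v + rng.gauss(0.0, scale) for v in instance]
        offsets = [zi - vi for zi, vi in zip(z, instance)]
        dist2 = sum(o * o for o in offsets)
        rows.append([1.0] + offsets)  # intercept column plus centered features
        targets.append(predict(z))
        # Closer samples get larger weights (exponential kernel).
        weights.append(math.exp(-dist2 / kernel_width ** 2))

    # Weighted least squares via the normal equations (X^T W X) b = X^T W y.
    k = len(instance) + 1
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * t for r, t, w in zip(rows, targets, weights)) for i in range(k)]
    return solve(xtwx, xtwy)[1:]  # drop the intercept


def black_box(x):
    """Toy stand-in for an opaque model."""
    return x[0] ** 2 + 3.0 * x[1]


effects = explain_locally(black_box, [2.0, 1.0])
print(effects)
```

Near the point (2, 1) the toy model has slopes of 4 and 3, so the two fitted effects should land close to those values, up to sampling noise.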
Six Levels of Car Autonomy (levels 0-2, we are here)
SAE International and the National Highway Traffic Safety Administration (NHTSA) recognize six tiers of autonomous driving capability in cars. Levels 0-2 are what you can currently purchase.
Level 0: Automated system issues warnings and may momentarily intervene but has no sustained vehicle control.
Level 1 (“hands on”): The driver and the automated system share control of the vehicle. Examples are adaptive cruise control and parking assistance.
Level 2 (“hands off”): The automated system takes full control of the vehicle (accelerating, braking, and steering).
Six Levels of Car Autonomy (levels 3-5, where we hope to go)
Some automakers are currently working on levels 3-5; however, production models are not yet available.
Level 3 (“eyes off”): The driver can safely turn their attention away from the driving tasks, e.g. the driver can text or watch a movie.
Level 4 (“mind off”): As level 3, but no driver attention is ever required for safety, e.g. the driver may safely go to sleep or leave the driver’s seat.
Level 5 (“steering wheel optional”): No human intervention is required at all. An example would be a robotic taxi.
Six Levels of AutoML (levels 0-3, we are here)
Bojan Tunguz, a physicist and Kaggle competitor who works on competitive machine learning at NVIDIA, describes six levels of AutoML. Levels 0-3 are what you can currently purchase. (Source: https://medium.com/@tunguz)
Level 0: No automation. You code your own ML algorithms. From scratch. In C++.
Level 1: Use of high-level algorithm APIs. Sklearn, Keras, Pandas, H2O, XGBoost, etc.
Level 2: Automatic hyperparameter tuning and ensembling. Basic model selection.
Level 3: Automatic (technical) feature engineering and feature selection, technical data augmentation, GUI.
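As a rough illustration of what Level 2 automates, the sketch below tunes one hyperparameter (the number of neighbors `k` in a toy k-nearest-neighbor regressor) by exhaustive grid search with leave-one-out validation. Grid search is used here because it is the simplest option; it is not Bayesian optimization. The function names, the grid, and the synthetic data are all hypothetical.

```python
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_error(data, k):
    """Leave-one-out mean squared error for a given k."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out exactly one point
        total += (knn_predict(rest, x, k) - y) ** 2
    return total / len(data)


def tune_k(data, grid=(1, 3, 5, 9, 15)):
    """Score every k in the grid and return the best one with all scores."""
    scores = {k: loo_error(data, k) for k in grid}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Synthetic noisy data: a gentle linear trend plus Gaussian noise.
rng = random.Random(0)
data = [(float(x), 0.1 * x + rng.gauss(0.0, 1.0)) for x in range(60)]
best_k, scores = tune_k(data)
print(best_k, scores)
```

At Level 1 a person would pick `k` by hand; at Level 2 the loop over the grid and the validation scoring are done automatically.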
Six Levels of AutoML (levels 4-5, where we hope to go)
Bojan Tunguz, a physicist and Kaggle competitor who works on competitive machine learning at NVIDIA, wrote an article comparing the six levels of AutoML to the six levels of autonomous driving capability in cars. Levels 0-3 are what you can currently purchase. (Source: https://medium.com/@tunguz)
Level 4: Automatic domain and problem specific feature engineering, data augmentation, and data integration.
Level 5: Full ML Automation. Ability to come up with super-human strategies for solving hard ML problems without any input or guidance. Fully conversational interaction with the human user.
Appropriate Domains for AutoML in the Insurance Industry
Large amounts of tabular data
Relational databases
Models that must be regenerated and retrained in near real-time
Companies with limited access to data science talent
AutoML Domains Where Human Judgement Is Needed
Not that much data
Highly sparse data
Data that might be augmented with human intuition
THANK YOU