Runway: Model Lifecycle Management at Netflix
USENIX OpML 2020
Eugen Cepoi, Liping Peng
Overview
● Motivation for Building Runway
● Runway Features
● Design Decisions
● Future Work
Before Runway
● Not much insight into who is producing or consuming a model
● No easy way to discover existing models and see what’s inside a model (data, transformations, features...)
● No standard way to release models
● No standard way to validate models
● Ad-hoc monitoring and alerting
● Compute resources sometimes wasted
A Model Lifecycle Management System is Needed
Currently Runway Provides
● A store to keep track of model-related information
● Friendly UI to discover and understand existing models
● Paved path for model validation, release, monitoring, and alerting
● SDK and clients that make it easy to integrate and interact with Runway programmatically
Runway Features
Discover Models
● Search by model name, owner, feature name, etc.
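Search across several fields, as described on this slide, can be sketched as a simple filter over a catalog. This is an illustration only: `ModelSummary`, `search_models`, and the field names are hypothetical and are not Runway's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelSummary:
    # Hypothetical catalog entry; field names are assumptions for illustration.
    name: str
    owner: str
    features: List[str] = field(default_factory=list)


def _contains(haystack: str, needle: Optional[str]) -> bool:
    # A missing criterion matches everything; otherwise do a
    # case-insensitive substring match.
    return needle is None or needle.lower() in haystack.lower()


def search_models(
    catalog: List[ModelSummary],
    name: Optional[str] = None,
    owner: Optional[str] = None,
    feature: Optional[str] = None,
) -> List[ModelSummary]:
    """Return catalog entries that match every criterion that was supplied."""
    return [
        m
        for m in catalog
        if _contains(m.name, name)
        and _contains(m.owner, owner)
        and (feature is None or any(_contains(f, feature) for f in m.features))
    ]
```

Criteria combine with AND, so supplying both an owner and a feature name narrows the result rather than widening it.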
Track Model Instances
● Pipeline and run
● Jars and model files
● Publication information
● Alerting configuration
● Validation configuration and results
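The information tracked per model instance could be held in a record like the one below. This is a minimal sketch under stated assumptions: the class names (`ModelInstance`, `ValidationResult`) and all field names are hypothetical and do not reflect Runway's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ValidationResult:
    check_name: str
    passed: bool
    details: str = ""


@dataclass
class ModelInstance:
    # All field names are assumptions chosen to mirror the slide's bullets.
    model_name: str
    pipeline_id: str                       # pipeline that produced the instance
    run_id: str                            # specific run of that pipeline
    artifacts: List[str] = field(default_factory=list)   # jars and model files
    published_at: Optional[datetime] = None               # publication information
    alerting_config: Dict[str, str] = field(default_factory=dict)
    validation_config: Dict[str, str] = field(default_factory=dict)
    validation_results: List[ValidationResult] = field(default_factory=list)

    def is_published(self) -> bool:
        return self.published_at is not None

    def passed_validation(self) -> bool:
        # Require at least one recorded result so an unvalidated
        # instance is not reported as passing.
        return bool(self.validation_results) and all(
            r.passed for r in self.validation_results
        )
```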
Model Instance Visualization
● How was data used?
● What transforms and feature encoders were applied?
Model Instance Details
● Search by feature name, type, etc.
● Link to source code
Model Monitoring
● Model publication history
● Alert history and metrics
Model Consumers
● App clusters
● Parent models
Global View of Unused Models
Global View of Alerted Models
Standardized Validation and Monitoring
● Make sure models can be loaded by application clusters before publishing
● Make sure a model can be published by at most one active pipeline
● Staleness check and alerting
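Two of the checks above, the single-publisher rule and the staleness check, can be sketched as small pure functions. The function names, the input shapes, and the age threshold are assumptions made for illustration; the talk does not describe how Runway implements them. The load check against application clusters is omitted because it depends on the serving environment.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def publishing_conflicts(
    active_pipelines: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Map each model published by more than one active pipeline to those pipelines.

    `active_pipelines` maps a pipeline id to the model names it publishes
    (a hypothetical input shape).
    """
    publishers: Dict[str, List[str]] = {}
    for pipeline, models in active_pipelines.items():
        for model in set(models):  # ignore duplicates within one pipeline
            publishers.setdefault(model, []).append(pipeline)
    return {m: sorted(p) for m, p in publishers.items() if len(p) > 1}


def is_stale(
    last_published: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the last publication is older than `max_age`."""
    now = now or datetime.now(timezone.utc)
    return now - last_published > max_age
```

An empty result from `publishing_conflicts` means every model has at most one active publisher; a stale model would be a candidate for an alert.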
Design Decisions
Runway Architecture
[Architecture diagram. Labeled components: orchestration system, training pipelines, pub/sub, online apps, batch jobs, model consumers & subscribers, staleness check, UI, notebooks, web app, JVM/Python SDK]
Future Work
● (Hyper)Parameter tracking and model selection
● Model quality guard
● Model interpretability / debugging
● Synchronize rollback of models and code in an automated fashion
Thank you. Q & A
Eugen Cepoi: [email protected]
Liping Peng: [email protected]
Padmanabhan: [email protected]