Machine Learning in Production // Josh Bloom, Wise.io [FirstMark's Data Driven]
Transcript of Machine Learning in Production // Josh Bloom, Wise.io [FirstMark's Data Driven]
COPYRIGHT 2015, WISE.IO INC.
Machine Intelligence For Customer Success
Support & Service as a bridge to value creation
Support Product:
○ Intelligent Routing/Triage
○ Response Recommendation
○ Auto-Response
○ Knowledge-base Deflection
○ Federated Search
○ Spam Filtering
○ Sentiment Prediction
○ Proactive Support (IOT)
Enhancing Decisions in (SaaS) Workflows
WiseFactory: automated feature extraction, learning, prediction, deployment
WiseTransfer: efficient manipulation of large objects
WiseDataSet
WiseML: high-productivity data science in Python
WiseAlgorithm
WindTunnel: detect drift in CPU, Mem, Accuracy, Statistics Quality
Wrapping
High-Level API
Deployment & Monitoring
C++ SDK
Core ML Workflow Stack at Wise.io
cf. Pydata 15 keynote
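The drift detection named in the stack above can be illustrated with a rough sketch. It compares a recent window of a monitored metric (accuracy, CPU, memory) against a baseline window. The function name `detect_drift`, the z-score rule, the threshold, and the sample numbers are all assumptions for illustration, not Wise.io's implementation.

```python
from statistics import mean, stdev


def detect_drift(baseline, recent, z_threshold=3.0):
    """Flag drift when the recent mean sits far from the baseline mean.

    Hypothetical helper: scores the recent window mean against the
    baseline using the standard error of the mean.
    """
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least 2 baseline points and 1 recent point")
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    if base_sd == 0:
        # A constant baseline: any change at all counts as drift.
        return mean(recent) != base_mean
    std_err = base_sd / len(recent) ** 0.5
    z = abs(mean(recent) - base_mean) / std_err
    return z > z_threshold


# Made-up weekly accuracy readings for a deployed model.
baseline_accuracy = [0.91, 0.92, 0.90, 0.93, 0.91, 0.92]
stable_week = [0.92, 0.91, 0.90]
degraded_week = [0.84, 0.83, 0.85]

print(detect_drift(baseline_accuracy, stable_week))
print(detect_drift(baseline_accuracy, degraded_week))
```

The same check could be pointed at memory or latency readings; a production monitor would likely need something more robust than a single z-score.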
Wise Architecture
As composable, SOA & 3rd party as possible, except for the core IP
• Front End: angular, javascript, APEX, CasperJS
• API Layer: Python (glue), RDS (Postgresql), Redshift (reporting), stormpath, runscope, periscope, lambda, kinesis
• Orchestration/Ops: docker-compose, docker, elastic beanstalk, ECS, EC2, fab, bamboo, cloudwatch
What are you optimizing for?
Component            What
Algorithm/Model      Learning rate, convexity, error bounds, scaling, …
+ Software/Hardware  Accuracy, Memory usage, Disk usage, CPU needs, time to learn, time to predict
+ Project Staff      time to implement, people/resource costs, reliability, maintainability, experimentability
+ Consumers          direct value, useability, explainability, actionability
+ Society            indirect value
- multi-axis optimizations in a given component
- highly coupled optimization considerations between components
- myopic view can be costly further up the stack
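To make the multi-axis point concrete, here is a minimal sketch that reports several of the software/hardware axes (accuracy, time to learn, time to predict, peak memory) for one model in a single pass. `MajorityClassifier` and `profile_model` are hypothetical names, and the toy data is invented for illustration.

```python
import time
import tracemalloc


class MajorityClassifier:
    """Toy stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def profile_model(model, X_train, y_train, X_test, y_test):
    """Measure several optimization axes at once for one model."""
    tracemalloc.start()  # note: tracing inflates the timings somewhat
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    time_to_learn = time.perf_counter() - t0

    t0 = time.perf_counter()
    predictions = model.predict(X_test)
    time_to_predict = time.perf_counter() - t0

    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    correct = sum(p == t for p, t in zip(predictions, y_test))
    return {
        "accuracy": correct / len(y_test),
        "time_to_learn_s": time_to_learn,
        "time_to_predict_s": time_to_predict,
        "peak_memory_bytes": peak_bytes,
    }


report = profile_model(
    MajorityClassifier(),
    [[0], [1], [2], [3]], ["a", "a", "a", "b"],
    [[4], [5]], ["a", "b"],
)
print(report)
```

Putting the axes side by side in one report is the design choice here: a model that wins on accuracy alone may lose once learning time or memory is counted.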
One ML Algorithmic Trade-Off
[Figure: Interpretability (Low to High) plotted against Accuracy (Low to High). Methods shown: Linear/Logistic Regression, Naive Bayes, Decision Trees, SVMs, Bagging, Boosting, Decision Forests, Neural Nets, Deep Learning, Nearest Neighbors, Gaussian/Dirichlet Processes, Splines, Lasso.]
* on real-world data sets
Warning: Unscientific & opinionated!
Optimization Metric
[Figure labels: <$50k Prize, >$50k Prize, Netflix, winning metric, best benchmark. Leaderboard data from Kaggle & Netflix.]
many teams get within ~few % of optimum
so which is easier to put into production?
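One way to act on that question is to pick the cheapest candidate whose score is close enough to the best one. The sketch below assumes a higher-is-better score and a made-up "engineering cost" number per candidate; the function name, the 2% tolerance, and all figures are hypothetical and not taken from the leaderboard data.

```python
def pick_production_model(candidates, tolerance=0.02):
    """Return the cheapest candidate scoring within `tolerance`
    (relative) of the best score. Assumes higher scores are better."""
    best = max(c["score"] for c in candidates)
    good_enough = [c for c in candidates if c["score"] >= best * (1 - tolerance)]
    return min(good_enough, key=lambda c: c["engineering_cost"])


# Invented candidates: scores and costs are illustrative only.
candidates = [
    {"name": "large ensemble", "score": 0.900, "engineering_cost": 10},
    {"name": "single forest", "score": 0.890, "engineering_cost": 3},
    {"name": "logistic regression", "score": 0.850, "engineering_cost": 1},
]

chosen = pick_production_model(candidates)
print(chosen["name"])
```

For an error metric such as RMSE, where lower is better, the comparison would need to be flipped.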
“We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.”
Xavier Amatriain and Justin Basilico (April 2012)
On the Prize
http://research.google.com/pubs/pub43146.html
“It may be surprising to the academic community to know that only a fraction of the code … is actually doing ‘machine learning’. A mature system might end up being (at most) 5% machine learning code and (at least) 95% glue code.”
• Complex models erode abstraction boundaries
• Data dependencies cost more than code dependencies
• System-level Spaghetti
• Changing External World
Prediction API
in-house / as a service
experimental/sandbox
production/scale ready
watsonAPI
time & cost to implement cost to maintain
Fault Tolerant ML: augmentation vs. full automation
Random forest prediction of body segment in Xbox Kinect
Gmail
https://www.reddit.com/r/funny/comments/3e7gy4/yes_netflix_because_my_6_year_old_will_enjoy_the/
“Yes Netflix, because my 6 year old will enjoy the animated fun of Sons of Anarchy”
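The augmentation vs. full automation choice can be sketched as a confidence-based router: act automatically only when the model is very sure, suggest to a human when it is fairly sure, and otherwise hand off entirely. The function name and both thresholds are assumptions chosen for illustration.

```python
def route_prediction(label, confidence, auto_threshold=0.95, suggest_threshold=0.60):
    """Decide how much to trust one prediction.

    Returns a (mode, label) pair where mode is one of:
      "automate": act on the prediction without review
      "suggest":  show the prediction to a human as a recommendation
      "human":    withhold the prediction and let a person decide
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= auto_threshold:
        return ("automate", label)
    if confidence >= suggest_threshold:
        return ("suggest", label)
    return ("human", None)


print(route_prediction("spam", 0.99))
print(route_prediction("billing question", 0.72))
print(route_prediction("kids cartoon", 0.31))
```

Where an error is cheap the thresholds can be loose; where a wrong automatic action is costly, raising `auto_threshold` pushes more cases toward human review.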
Π vs. Γ
(or “Data Science is a Team Sport”)
Π: deep domain skill/knowledge/training and deep methodological knowledge/skill
Γ: deep domain or methodological skill/knowledge/training and strong methodological or domain knowledge/skill
Goal: empower teams of gammas to excel
Intelligent Systems: It Takes a Village
“Weak Contracts”, i.e. abstractions within components bleed through to other components (cf. Sculley …)
1. A smart programmer makes an inventive use of a trained object recognizer.
2. The object recognizer receives data that does not resemble the testing data and outputs nonsense.
3. The code of the smart programmer does not work.
Example (via Bottou)
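One partial defense against the failure in this example is to make the contract explicit: wrap the recognizer so it declines inputs that fall outside what it saw in training, rather than returning nonsense. The sketch below uses a crude per-feature range check. `GuardedRecognizer`, the toy `predict_fn`, and the data are hypothetical, and a range check will miss many kinds of distribution shift.

```python
class GuardedRecognizer:
    """Wraps a model and declines inputs outside the training range."""

    def __init__(self, predict_fn, training_rows):
        self.predict_fn = predict_fn
        columns = list(zip(*training_rows))
        self.lows = [min(col) for col in columns]
        self.highs = [max(col) for col in columns]

    def in_range(self, row):
        """True when every feature lies within its observed training range."""
        if len(row) != len(self.lows):
            return False
        return all(lo <= v <= hi for v, lo, hi in zip(row, self.lows, self.highs))

    def predict(self, row):
        # None signals "no trustworthy answer" so callers must handle it.
        if not self.in_range(row):
            return None
        return self.predict_fn(row)


# Toy recognizer and training data, invented for illustration.
recognizer = GuardedRecognizer(
    lambda row: "cat" if row[0] > 0.5 else "dog",
    [(0.1, 1.0), (0.9, 3.0)],
)

print(recognizer.predict((0.7, 2.0)))
print(recognizer.predict((5.0, 2.0)))
```

Returning `None` instead of a label forces the calling code to decide what to do when the recognizer has no basis for an answer.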