Machine Learning in Production // Josh Bloom, Wise.io [FirstMark's Data Driven]

Machine Learning for Customer Success Joshua Bloom, Ph.D. @profjsb

Transcript of Machine Learning in Production // Josh Bloom, Wise.io [FirstMark's Data Driven]

Machine Learning for Customer Success

Joshua Bloom, Ph.D. @profjsb

c. 1890 Harvard College Observatory

COPYRIGHT 2015, WISE.IO INC.

Machine Intelligence For Customer Success

Support & Service as a bridge to value creation

Support Product:
○ Intelligent Routing/Triage
○ Response Recommendation
○ Auto-Response
○ Knowledge-base Deflection
○ Federated Search
○ Spam Filtering
○ Sentiment Prediction
○ Proactive Support (IOT)

Enhancing Decisions in (SaaS) Workflows


Selected Customers

WiseFactory: automated feature extraction, learning, prediction, deployment

WiseTransfer: efficient manipulation of large objects

WiseDataSet

WiseML: high-productivity data science in Python

WiseAlgorithm

WindTunnel: detect drift in CPU, Mem, Accuracy, Statistics Quality

Wrapping

High-Level API

Deployment & Monitoring

C++ SDK

Core ML Workflow Stack at Wise.io

cf. Pydata 15 keynote
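The stack above lists drift detection in CPU, memory, accuracy and statistics quality as a monitoring concern. As a rough illustration only (this is not Wise.io's WindTunnel, whose internals the slides do not describe), a drift check on any one monitored metric could be sketched as follows. The function name `has_drifted`, the z-score rule and the `z_cut` threshold are all assumptions made for this sketch.

```python
from statistics import mean, stdev


def has_drifted(baseline, recent, z_cut=3.0):
    """Hypothetical drift check for one monitored metric.

    Returns True if the mean of `recent` lies more than `z_cut`
    standard errors away from the mean of `baseline`.
    """
    if len(baseline) < 2 or not recent:
        raise ValueError("need >= 2 baseline values and >= 1 recent value")
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any change at all counts as drift.
        return mean(recent) != mean(baseline)
    standard_error = spread / len(recent) ** 0.5
    return abs(mean(recent) - mean(baseline)) / standard_error > z_cut
```

The same check could be run per metric (accuracy, memory, latency), with the baseline taken from a window in which the deployed model was known to behave well.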


Front End • angular, javascript, APEX, CasperJS

Wise Architecture: as composable, SOA & 3rd party as possible, except for the core IP

API Layer • Python (glue), RDS (Postgresql), Redshift (reporting), stormpath, runscope, periscope, lambda, kinesis

Orchestration/Ops • docker-compose, docker, elastic beanstalk, ECS, EC2, fab, bamboo, cloudwatch

What are you optimizing for?

Component: What

Algorithm/Model: learning rate, convexity, error bounds, scaling, …

+ Software/Hardware: accuracy, memory usage, disk usage, CPU needs, time to learn, time to predict

+ Project Staff: time to implement, people/resource costs, reliability, maintainability, experimentability

+ Consumers: direct value, usability, explainability, actionability

+ Society: indirect value

- multi-axis optimizations in a given component

- highly coupled optimization considerations between components

- myopic view can be costly further up the stack
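Several of the software/hardware axes above (accuracy, time to learn, time to predict) can be measured side by side for any candidate model. The following is a minimal sketch, assuming a hypothetical `fit`/`predict` callable interface; `profile_model` and the majority-class baseline are illustrative names, not part of any Wise.io API.

```python
import time


def profile_model(fit, predict, train, test):
    """Measure accuracy, time to learn and time to predict for one model.

    Assumed interface: fit(rows, labels) -> model and
    predict(model, rows) -> list of labels.
    `train` and `test` are (rows, labels) pairs.
    """
    start = time.perf_counter()
    model = fit(*train)
    learn_s = time.perf_counter() - start

    start = time.perf_counter()
    predicted = predict(model, test[0])
    predict_s = time.perf_counter() - start

    correct = sum(p == y for p, y in zip(predicted, test[1]))
    return {
        "accuracy": correct / len(test[1]),
        "time_to_learn_s": learn_s,
        "time_to_predict_s": predict_s,
    }


# Toy baseline: always predict the most common training label.
def fit_majority(rows, labels):
    return max(set(labels), key=labels.count)


def predict_majority(model, rows):
    return [model for _ in rows]
```

Reporting the axes together, rather than accuracy alone, makes the coupling between components visible before it becomes costly further up the stack.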


One ML Algorithmic Trade-Off

[Figure: methods placed by interpretability (low to high) against accuracy (low to high): Linear/Logistic Regression, Lasso, Naive Bayes, Decision Trees, SVMs, Bagging, Boosting, Decision Forests, Neural Nets, Deep Learning, Nearest Neighbors, Gaussian/Dirichlet Processes, Splines]

* on real-world data sets

Warning: unscientific & opinionated!

Optimization Metric

[Figure: leaderboard data from Kaggle & Netflix, showing the winning metric and the best benchmark for competitions with <$50k prizes, >$50k prizes, and Netflix]

Many teams get within ~few % of optimum.

So which is easier to put into production?
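One way to act on that question is to treat every model within a small tolerance of the best score as equivalent and then pick the cheapest one to engineer. This is a sketch under assumptions: the `(name, score, cost)` tuple layout, the relative `tolerance` and the notion of a single numeric engineering cost are all hypothetical, and scores are assumed positive with higher meaning better.

```python
def pick_for_production(candidates, tolerance=0.02):
    """Return the cheapest candidate whose score is close to the best.

    candidates: list of (name, score, cost) tuples, with positive scores
    where higher is better and cost is a rough engineering-effort estimate.
    tolerance: relative gap from the best score that is still acceptable.
    """
    if not candidates:
        raise ValueError("no candidates")
    best = max(score for _, score, _ in candidates)
    close_enough = [c for c in candidates if c[1] >= best * (1 - tolerance)]
    return min(close_enough, key=lambda c: c[2])
```

With a 2% tolerance, a large ensemble that edges out a single forest by one point would lose to the forest if the forest is cheaper to deploy and maintain.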


“We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.”

Xavier Amatriain and Justin Basilico (April 2012)

On the Prize

http://research.google.com/pubs/pub43146.html

“It may be surprising to the academic community to know that only a fraction of the code … is actually doing ‘machine learning’. A mature system might end up being (at most) 5% machine learning code and (at least) 95% glue code.”


• Complex models erode abstraction boundaries

• Data dependencies cost more than code dependencies

• System-level Spaghetti

• Changing External World


[Figure: quadrant chart with axes "in-house" to "as a service" and "experimental/sandbox" to "production/scale ready"; surviving labels: Prediction API, watsonAPI; annotations: time & cost to implement, cost to maintain]

Fault Tolerant ML: augmentation vs. full automation

[Figure: quadrant chart of Certainty of Prediction (low to high) against Risk/Cost (up to high), with quadrants labeled Automate, Augment, Manual, Augment]
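The augment-versus-automate choice can be expressed as a small decision rule. The sketch below assumes a quadrant layout the slide residue does not confirm: high certainty with low risk is automated, low certainty with high risk stays manual, and the two mixed cases augment a human. The names `Action` and `choose_action` and both cut-off values are hypothetical.

```python
from enum import Enum


class Action(Enum):
    AUTOMATE = "automate"
    AUGMENT = "augment"
    MANUAL = "manual"


def choose_action(certainty, risk, certainty_cut=0.9, risk_cut=0.5):
    """Map prediction certainty and risk/cost of an error to an action.

    Both inputs are assumed to be scaled to [0, 1].
    Assumed quadrant layout (not confirmed by the slide):
      high certainty, low risk  -> act automatically
      low certainty, high risk  -> leave the decision to a human
      everything else           -> suggest, and let a human confirm
    """
    if not (0.0 <= certainty <= 1.0 and 0.0 <= risk <= 1.0):
        raise ValueError("certainty and risk must be in [0, 1]")
    confident = certainty >= certainty_cut
    risky = risk >= risk_cut
    if confident and not risky:
        return Action.AUTOMATE
    if risky and not confident:
        return Action.MANUAL
    return Action.AUGMENT
```

In a support workflow, for example, a confident spam classification might be applied automatically, while a suggested reply to a customer would be shown to an agent for confirmation.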

Random forest prediction of body segment in Xbox Kinect

gmail

https://www.reddit.com/r/funny/comments/3e7gy4/yes_netflix_because_my_6_year_old_will_enjoy_the/

“Yes Netflix, because my 6 year old will enjoy the animated fun of Sons of Anarchy”

Thanks!

@profjsb

Machine Learning for Customer Success

(and yes, we're hiring…)

Π vs. Γ

(or “Data Science is a Team Sport”)

Π: deep domain skill/knowledge/training + deep methodological knowledge/skill

Γ: deep domain or methodological skill/knowledge/training + strong methodological or domain knowledge/skill

Goal: empower teams of gammas to excel

Intelligent Systems: It Takes a Village

“Weak Contracts”, i.e. abstractions within components bleed through to other components (cf. Sculley …)


1. A smart programmer makes an inventive use of a trained object recognizer.

2. The object recognizer receives data that does not resemble the testing data and outputs nonsense.

3. The code of the smart programmer does not work.

Example (via Bottou)
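One partial defense against the failure in step 2 is to make the contract explicit: record what the recognizer's data looked like and refuse to predict on inputs far outside it. This is a deliberately crude sketch using per-feature ranges; `RangeGuard`, `guarded_predict` and the `slack` parameter are hypothetical names, and real systems would likely need a richer notion of "resembles".

```python
class RangeGuard:
    """Record per-feature min/max of known data; flag inputs outside them."""

    def __init__(self, training_rows):
        if not training_rows:
            raise ValueError("need at least one training row")
        columns = list(zip(*training_rows))
        self.lows = [min(c) for c in columns]
        self.highs = [max(c) for c in columns]

    def resembles_training(self, row, slack=0.1):
        """True if every feature lies within the recorded range plus slack."""
        if len(row) != len(self.lows):
            return False
        for value, low, high in zip(row, self.lows, self.highs):
            margin = slack * (high - low)
            if value < low - margin or value > high + margin:
                return False
        return True


def guarded_predict(model, guard, row):
    """Return the model's prediction, or None when the input looks unfamiliar."""
    if not guard.resembles_training(row):
        return None  # caller should fall back to a human or a default
    return model(row)
```

Returning `None` instead of a confident-looking label lets the calling code in step 3 detect the situation rather than silently consume nonsense.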