Serverless Machine Learning on AWS - Serverless Meetup Milano

Serverless Machine Learning on Amazon Web Services. clda.co/serverless-milano, 11/03/2016. Artificial Intelligence Applications with AWS Lambda.

Transcript of Serverless Machine Learning on AWS - Serverless Meetup Milano

Serverless Machine Learning on Amazon Web Services

clda.co/serverless-milano, 11/03/2016

Artificial Intelligence Applications with AWS Lambda

@alex_casalboni

Serverless Meetup @ Milan, clda.co/serverless-milano

About me

Web Developer (6+ years)

Sr. Software Engineer @ Cloud Academy

Master in Computer Science

What is Machine Learning?

Back to 1959 (Arthur Samuel)

How computers learn from Data

How to solve decision problems

Machine Learning pipeline

Feature extraction (batch)

Training (batch)

Prediction (real-time)

Diagram labels: data, information, features, ML models
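As a rough illustration of the three stages above, the sketch below runs feature extraction and training as batch steps and prediction as a per-request call. The toy messages, the rules inside `extract_features` and the nearest-centroid model are assumptions chosen for brevity; none of them come from the talk.

```python
from statistics import mean

# Hypothetical raw records: (message text, label). Invented for illustration.
RAW_DATA = [
    ("WIN a FREE prize NOW", "spam"),
    ("FREE money, click NOW", "spam"),
    ("Lunch at noon tomorrow?", "ham"),
    ("Please review my pull request", "ham"),
]


def extract_features(text):
    """Feature extraction: turn raw text into a small numeric vector."""
    words = text.split()
    upper_ratio = sum(w.isupper() for w in words) / len(words)
    has_free = 1.0 if "free" in text.lower() else 0.0
    return [upper_ratio, has_free]


def train(dataset):
    """Batch training: compute one centroid (mean feature vector) per label."""
    by_label = {}
    for text, label in dataset:
        by_label.setdefault(label, []).append(extract_features(text))
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(model, text):
    """Real-time prediction: return the label of the closest centroid."""
    features = extract_features(text)

    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(RAW_DATA)  # offline, batch
print(predict(model, "FREE tickets NOW"))  # online, one request at a time
```

The split matters for deployment: `train` can run on a schedule wherever the data lives, while `predict` only needs the small `model` dictionary and the same `extract_features` function.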

Machine Learning taxonomy

Supervised Learning

classification: ?

regression: 170cm

Unsupervised Learning

clustering: group A, group B

rule extraction: A, B → C
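To make the supervised/unsupervised distinction concrete, here is a minimal sketch on invented one-dimensional data. The supervised part fits a least-squares line to labelled pairs (regression); the unsupervised part splits unlabelled values into two groups with a tiny 2-means loop (clustering). All names and numbers are assumptions for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised (regression): least-squares line y = a * x + b from labelled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return a, y_bar - a * x_bar


def two_means(values, iterations=10):
    """Unsupervised (clustering): split unlabelled numbers into two groups.

    Assumes `values` contains at least two distinct numbers.
    """
    low, high = min(values), max(values)  # initial centroids
    for _ in range(iterations):
        group_a = [v for v in values if abs(v - low) <= abs(v - high)]
        group_b = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(group_a), mean(group_b)
    return group_a, group_b


# Supervised: inputs come with known targets (age in years -> height in cm).
ages = [4, 6, 8, 10]
heights = [102, 114, 126, 138]
a, b = fit_line(ages, heights)
print(a * 12 + b)  # predicted height for an unseen age

# Unsupervised: only raw values, no targets; the groups are discovered.
print(two_means([150, 152, 155, 181, 185, 190]))
```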

What problems can ML solve for you?

Supervised Learning: classification, regression

Unsupervised Learning: clustering, rule extraction

Examples: fraud detection, price of a stock over time, purchase likelihood, user segmentation

[Word cloud: Machine, Learning, Data, Cloud, Big, Science, Information, Internet, Statistics, Technology, Python, Future, Mining, Social, Deep, IoT, Algorithms, Management, Storage, Petabytes, Parallel, Network, Privacy, Million, NoSQL, PaaS, SQL, Database, Exabytes, Billion, Dataset, Hadoop, R]

Generated data started growing ~10 years ago…

“90% of the data in the world today has been created in the last two years alone” - IBM

“300+ hours worth of video content is being uploaded to the site every minute” - YouTube

… and it keeps getting bigger!

What does a real Data Scientist look like?

Data Scientist

Very smart & curious

Numbers lover (i.e. Data)

Great teamwork skills

40% analysis, 30% design, 30% code

Big data challenges

Manual exploration is not an option

Data-driven decisions are a must

Distributed/parallel computing

The curse of dimensionality

Why is deploying ML models a challenge?

Data Scientist + Data + Time

ML Model + Data Visualisation + Prototype

Web Developer + DevOps + A lot of Time

Production Code

Why is deploying ML models a challenge?

1. Prototyping != Production-ready

2. We need Elasticity

3. Too many nice-to-have features

4. Multi-model architectures

5. Avoid lack of ownership

The Lack of Ownership

Data Scientist != DevOps

Data Scientist: Mathematical modeling, Statistical analysis, Data mining

DevOps: (Cloud) Operations, System administration, Software best practices

Machine Learning as a Service (MLaaS)

Amazon Machine Learning

Azure Machine Learning

Google Prediction API

IBM Watson Analytics

BigML

cloudacademy.com/blog/machine-learning

Amazon Machine Learning

One of the first MLaaS solutions (Apr 2015)

It’s great for classification and regression problems

Only linear models (linear & logistic regression + SGD)

No support for advanced scenarios yet
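For readers unfamiliar with the "logistic regression + SGD" combination, the following is a generic sketch in plain Python on invented data. It is not Amazon ML's implementation, which the talk does not show; the learning rate, epoch count, seed and toy dataset are arbitrary assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_sgd(samples, epochs=200, learning_rate=0.1, seed=0):
    """Fit weights and bias with stochastic gradient descent on the log loss.

    `samples` is a list of (features, label) pairs with label in {0, 1}.
    """
    rng = random.Random(seed)
    samples = list(samples)  # copy so the caller's list is not shuffled
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for features, label in samples:
            p = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            error = p - label  # gradient of the log loss w.r.t. the logit
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict_proba(model, features):
    """Probability that `features` belongs to class 1."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Invented, linearly separable toy data: label is 1 when x0 + x1 is large.
DATA = [([0.0, 0.1], 0), ([0.2, 0.3], 0), ([0.3, 0.1], 0),
        ([0.8, 0.9], 1), ([0.9, 0.7], 1), ([1.0, 1.0], 1)]

model = train_logistic_sgd(DATA)
print(predict_proba(model, [0.1, 0.2]), predict_proba(model, [0.9, 0.9]))
```

Because the model is linear, it can only separate classes with a straight boundary in feature space, which is one reason a linear-only service struggles with more advanced scenarios.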

Serverless computing to the rescue!

Versioning, staging & caching

1 model = 1 microservice

Flexible RESTful interface

High Availability (no downtime)

Very little operational effort

Transparent elasticity (PAYG)

Failure isolation / Decentralisation

Offline training phase

Production-ready prototypes

A/B testing through composition
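One way to read "1 model = 1 microservice" and "A/B testing through composition" is sketched below: each model sits behind its own handler, and a thin router handler picks one of them per user. The handler names, the event shape (an API Gateway-style JSON `body`), the 20% traffic share and the toy models are all assumptions; in a real deployment the router would more likely invoke separate Lambda functions than call local Python functions.

```python
import hashlib
import json


def _response(status, payload):
    """Build an API Gateway proxy-style response."""
    return {"statusCode": status, "body": json.dumps(payload)}


def model_a_handler(event, context=None):
    """Microservice A: hypothetical baseline model (fixed linear score)."""
    features = json.loads(event["body"])["features"]
    return _response(200, {"model": "A", "score": 0.5 * sum(features)})


def model_b_handler(event, context=None):
    """Microservice B: hypothetical challenger model."""
    features = json.loads(event["body"])["features"]
    return _response(200, {"model": "B", "score": max(features)})


def choose_variant(user_id, b_share=0.2):
    """Deterministically map a user to a variant so they always see the same model."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32  # uniform in [0, 1)
    return "B" if bucket < b_share else "A"


def router_handler(event, context=None):
    """Composition: the A/B test is just another function in front of the models."""
    body = json.loads(event["body"])
    if "user_id" not in body or "features" not in body:
        return _response(400, {"error": "user_id and features are required"})
    variant = choose_variant(body["user_id"])
    target = model_b_handler if variant == "B" else model_a_handler
    return target(event, context)


event = {"body": json.dumps({"user_id": "user-42", "features": [0.2, 0.6]})}
print(router_handler(event))
```

Hashing the user id instead of drawing a random number keeps the assignment stable across requests without storing any state, which suits stateless functions.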

Quick Example

clda.co/ML-Lambda

Real-world Example

Serverless ML @ Cloud Academy

Multi-model architecture

1 Lambda Function for each ML model

S3 to store models (1-10 MB each)

RDS to store training data (PostgreSQL)

Periodic training (offline)
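A minimal sketch of how one such function might serve a model stored in S3: the model is fetched once per container and kept in a module-level cache, so only cold invocations pay for the download. The bucket name, object key, JSON model format and the `fetch` parameter are hypothetical, and the talk does not show Cloud Academy's actual code. `boto3` is imported lazily so the rest of the sketch runs without AWS access.

```python
import json

# Hypothetical locations; replace with real values in an actual deployment.
MODEL_BUCKET = "example-ml-models"
MODEL_KEY = "recommender/model.json"

_MODEL_CACHE = {}  # survives across invocations while the container stays warm


def fetch_from_s3(bucket, key):
    """Download an object from S3 and return its bytes."""
    import boto3  # imported lazily; only needed when really talking to S3

    s3 = boto3.client("s3")
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


def load_model(fetch=fetch_from_s3):
    """Return the cached model, downloading it only on the first call."""
    cache_key = (MODEL_BUCKET, MODEL_KEY)
    if cache_key not in _MODEL_CACHE:
        _MODEL_CACHE[cache_key] = json.loads(fetch(MODEL_BUCKET, MODEL_KEY))
    return _MODEL_CACHE[cache_key]


def lambda_handler(event, context=None, fetch=fetch_from_s3):
    """Score one request with a linear model: weights . features + bias."""
    model = load_model(fetch)
    features = event["features"]
    score = sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]
    return {"score": score}


# Local usage with a stand-in for S3 (no AWS access needed).
calls = []


def fake_fetch(bucket, key):
    calls.append((bucket, key))
    return json.dumps({"weights": [2.0, -1.0], "bias": 0.5}).encode("utf-8")


print(lambda_handler({"features": [1.0, 3.0]}, fetch=fake_fetch))
print(lambda_handler({"features": [0.0, 0.0]}, fetch=fake_fetch))
print(len(calls))  # number of downloads performed so far
```

With models of a few megabytes, the download adds noticeable latency to a cold invocation only; the periodic offline training job would simply overwrite the object in S3.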

Limitations of Serverless ML

AWS Lambda

No real-time models (only pseudo real-time)

Deployment package management: size limit and OS libraries

Not suitable for model training yet (5 min max execution time)

Cold start time is long and hard to avoid

Unit/integration tests help, but not enough
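The execution time limit can at least be handled defensively. The sketch below runs a long job in small steps and stops while there is still time left to hand over progress. `context.get_remaining_time_in_millis()` is the method the Lambda Python runtime exposes on the context object; the work items, the safety margin and the `FakeContext` stand-in used to run the example locally are assumptions.

```python
import time


class FakeContext:
    """Local stand-in for the Lambda context object (for testing only)."""

    def __init__(self, budget_ms):
        self._deadline = time.monotonic() + budget_ms / 1000.0

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))


def process_in_budget(items, context, step, safety_margin_ms=50):
    """Apply `step` to items until done or until time is about to run out.

    Returns (results, next_index); next_index < len(items) means the job
    should be resumed by a later invocation starting from that index.
    """
    results = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() <= safety_margin_ms:
            return results, index
        results.append(step(item))
    return results, len(items)


def slow_square(x):
    time.sleep(0.02)  # pretend each step is expensive
    return x * x


done, next_index = process_in_budget(list(range(100)), FakeContext(200), slow_square)
print(len(done), next_index)
```

This only works for jobs that can be cut into resumable steps, which is why long-running model training remains a poor fit.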

Additional Resources

cloudacademy.com/blog/machine-learning

cloudacademy.com/blog/serverless

cloudacademy.com/webinars

cloudacademy.com/community

Thanks =)

cloudacademy.com

Q & A

11/03/2016