Josh Wills, MLconf 2013


description

Josh Wills, Senior Director of Data Science, Cloudera: Building a Production Machine Learning Infrastructure (Quickly)

Transcript of Josh Wills, MLconf 2013

Page 1: Josh Wills, MLconf 2013

From The Lab to the Factory: Building A Production Machine Learning Infrastructure

Josh Wills, Senior Director of Data Science, Cloudera

Page 2: Josh Wills, MLconf 2013

About Me

Page 3: Josh Wills, MLconf 2013

Data Science: Another Definition

Page 4: Josh Wills, MLconf 2013

Data Scientists Build Data Products.

Page 5: Josh Wills, MLconf 2013

All* Products Become Data Products

Page 6: Josh Wills, MLconf 2013

Identifying the Bottlenecks

Page 7: Josh Wills, MLconf 2013

Oryx: Model Building and Serving

• Algorithms
  • ALS Recommenders
  • K-Means Parallel
  • RDF
• Batch model building via MapReduce
• Server for real-time scoring and updates
• PMML 4.1 Models
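The split between batch model building and a server for real-time scoring and updates can be illustrated with a small k-means example. This is a minimal sketch in Python and not Oryx's actual API: the class and method names (`KMeansServingModel`, `score`, `update`) are hypothetical, the centroids are assumed to come from some earlier batch job, and the running-mean update is one simple choice of online rule, not necessarily the one Oryx uses.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Cluster:
    center: List[float]  # centroid produced by the (assumed) batch job
    count: int           # number of points folded into this centroid so far


class KMeansServingModel:
    """Hypothetical serving-side wrapper around batch-built centroids."""

    def __init__(self, centers: Sequence[Sequence[float]], counts: Sequence[int]):
        self.clusters = [Cluster(list(c), n) for c, n in zip(centers, counts)]

    def score(self, point: Sequence[float]) -> int:
        """Real-time scoring: return the index of the nearest centroid."""
        return min(
            range(len(self.clusters)),
            key=lambda i: math.dist(point, self.clusters[i].center),
        )

    def update(self, point: Sequence[float]) -> int:
        """Real-time update: move the nearest centroid toward the new point.

        Uses a running mean, so after n points the centroid is the average
        of everything assigned to it (an assumption made for this sketch).
        """
        i = self.score(point)
        cluster = self.clusters[i]
        cluster.count += 1
        rate = 1.0 / cluster.count
        cluster.center = [c + rate * (x - c) for c, x in zip(cluster.center, point)]
        return i


# Centroids as they might arrive from a batch build (illustrative values).
model = KMeansServingModel(centers=[[0.0, 0.0], [10.0, 10.0]], counts=[1, 1])
assigned = model.score([1.0, 1.0])   # score a request without changing the model
updated = model.update([2.0, 2.0])   # fold a new observation into the model
```

The point of the design is that the expensive pass over historical data happens offline, while the server only holds a compact model that it can score against and nudge incrementally between batch rebuilds.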

Page 8: Josh Wills, MLconf 2013

Gertrude: Evaluation via Experiments

• Multivariate Testing
  • Define and explore a space of parameters
• Overlapping Experiments
  • Tang et al. (2010)
  • Runs multiple independent experiments on every request
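Running multiple independent experiments on every request is commonly done by grouping parameters into layers and hashing each request into a bucket separately per layer. The following is a minimal Python sketch of that idea, not Gertrude's API: `Experiment`, `Layer`, `bucket_for`, and `assign` are hypothetical names, and the layer contents are made-up examples.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Experiment:
    name: str
    buckets: range              # buckets of the layer owned by this experiment
    params: Dict[str, object]   # parameter overrides applied when selected


@dataclass
class Layer:
    name: str
    num_buckets: int
    experiments: List[Experiment]


def bucket_for(layer_name: str, unit_id: str, num_buckets: int) -> int:
    """Hash the (layer, unit) pair so assignments differ across layers."""
    digest = hashlib.sha256(f"{layer_name}:{unit_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def assign(
    layers: List[Layer], unit_id: str, defaults: Dict[str, object]
) -> Tuple[Dict[str, object], List[str]]:
    """Pick at most one experiment per layer and merge its parameters."""
    params = dict(defaults)
    active: List[str] = []
    for layer in layers:
        bucket = bucket_for(layer.name, unit_id, layer.num_buckets)
        for exp in layer.experiments:
            if bucket in exp.buckets:
                params.update(exp.params)
                active.append(exp.name)
                break  # one experiment per layer for a given request
    return params, active


# Two layers that own disjoint parameters (illustrative values only).
layers = [
    Layer("ranking", 100, [
        Experiment("ranker_a", range(0, 50), {"ranker": "a"}),
        Experiment("ranker_b", range(50, 100), {"ranker": "b"}),
    ]),
    Layer("ui", 100, [
        Experiment("blue_button", range(0, 100), {"button_color": "blue"}),
    ]),
]
defaults = {"ranker": "default", "button_color": "grey"}
params, active = assign(layers, "user-42", defaults)
```

Because the layer name is part of the hash input, a request's bucket in one layer carries no information about its bucket in another, which is what lets the experiments overlap. The sketch assumes each parameter is owned by exactly one layer, so overrides from different layers never conflict.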

Page 9: Josh Wills, MLconf 2013

Planning For The Future

Page 10: Josh Wills, MLconf 2013

Josh Wills, Director of Data Science, Cloudera
@josh_wills

Thank you!