Complexity vs. Performance: Empirical Analysis of Machine Learning as a Service Yuanshun Yao, Zhujun Xiao, Bolun Wang*, Bimal Viswanath, Haitao Zheng and Ben Y. Zhao The University of Chicago *University of California, Santa Barbara [email protected]

Page 1: Complexity vs. Performance: Empirical Analysis of Machine ...people.cs.uchicago.edu/~ysyao/papers/mlaas-imc17-slides.pdf · Complexity vs. Performance: Empirical Analysis of Machine

Complexity vs. Performance: Empirical Analysis of Machine Learning as a Service

Yuanshun Yao, Zhujun Xiao, Bolun Wang*, Bimal Viswanath, Haitao Zheng and Ben Y. Zhao

The University of Chicago
*University of California, Santa Barbara

[email protected]

Page 2

ML in Network Research

congestion control protocols
• Sivaraman et al., SIGCOMM ’14
• Winstein & Balakrishnan, SIGCOMM ’13

network link prediction
• Liu et al., IMC ’16
• Zhao et al., IMC ’12

user behavior analysis
• Wang et al., IMC ’14
• Zannettou et al., IMC ’17

Page 3

Running ML is Hard

dataset

model

Solution: Machine Learning as a Service

(ML-as-a-Service)

Page 4

ML-as-a-Service

[Diagram labels: ML-as-a-Service; training data; user input (model, parameter etc.)]

Page 5

Is my model good enough?

Why Study ML-as-a-Service?

Q: How well do they perform?

Q: How much does the amount of user control impact ML performance?

Page 6

ML-as-a-Service Platforms

• Google Prediction
• Amazon ML
• Microsoft ML
• PIO
• ABM
• BigML

[Axis: amount of user input, from less to more]

Page 7

Control in ML

training data

?

trained model

Page 8

Control in ML

training data

Data Cleaning
• Invalid/dup/missing data

?

trained model

Page 9

Control in ML

training data

Data Cleaning
• Invalid/dup/missing data

Feature Selection
• Mutual Info, Pearson, Chi…

?

trained model

Page 10

Control in ML

training data

Data Cleaning
• Invalid/dup/missing data

Feature Selection
• Mutual Info, Pearson, Chi_square…

Classifier Choice
• Logistic Regression, Decision Tree, kNN…

?

trained model

Page 11

Control in ML

training data

Data Cleaning
• Invalid/dup/missing data

Feature Selection
• Mutual Info, Pearson, Chi_square…

Classifier Choice
• Logistic Regression, Decision Tree, kNN…

Parameter Tuning
• Logistic Regression: L1, L2, max_iter…

trained model
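The first two control dimensions listed above (data cleaning and feature selection) can be sketched in plain Python. This is only an illustration under stated assumptions: the row layout (numeric features followed by a 0/1 label), the helper names `clean_rows`, `pearson`, and `top_k_features`, and the toy rows are all hypothetical and are not taken from the slides.

```python
import math


def clean_rows(rows):
    """Drop rows with missing or invalid (non-numeric) values, then drop duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        bad = any(
            v is None
            or isinstance(v, str)
            or (isinstance(v, float) and math.isnan(v))
            for v in row
        )
        if bad:
            continue  # missing or invalid entry
        key = tuple(row)
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        cleaned.append(list(row))
    return cleaned


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def top_k_features(rows, k):
    """Rank feature columns by |Pearson r| against the label (last column)."""
    labels = [r[-1] for r in rows]
    n_features = len(rows[0]) - 1
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Toy rows: [feature_0, feature_1, label]. Values are made up for illustration.
raw = [
    [1.0, 5.0, 0],
    [2.0, 3.0, 0],
    [2.0, 3.0, 0],    # duplicate
    [None, 4.0, 1],   # missing value
    [8.0, 4.0, 1],
    [9.0, 5.0, 1],
    ["bad", 1.0, 1],  # invalid value
]
clean = clean_rows(raw)
selected = top_k_features(clean, k=1)
print(len(clean), selected)
```

Pearson is only one of the filters named on the slide; mutual information or chi-square scoring would slot into `top_k_features` in the same way.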

Page 12

Control in ML-as-a-Service

[Figure: platforms placed along a user control/complexity axis, from low to high: Google and ABM; Amazon; PIO and BigML; Microsoft. Control dimensions shown: Data Cleaning, Feature Selection, Classifier Choice, Parameter Tuning.]

Complexity vs. Performance?

Page 13

Performance Measurement

Page 14

Characterizing Performance
• Theoretical modeling is hard
  • Output of ML model depends on dataset
  • No access to implementation details
• Empirical data-driven analysis
  • Simulate a real-world scenario from end to end
  • Need a large number of diverse datasets
• Focus on binary classification

Page 15

Dataset
• 119 datasets
• From diverse application domains
• Sample size: 15 - 245K, number of features: 1 - 4K
• 79% of them are from UCI ML Repository

[Chart: datasets by application domain]
Life Science: 37%
Computer Applications: 15%
Artificial Test: 14%
Social Science: 9%
Physical Science: 8%
Financial & Business: 6%
Other: 11%

Page 16

Methodology
• Tune all available control dimensions

[Diagram: training data to trained model, with each control dimension marked as available or not on the example platform and set through the API]
Feature Selection: ✖
Classifier Choice: ✔ (API)
Parameter Tuning: ✔ (API)

Classifier Choice options (API)
• Logistic Regression
• KNN
• SVM
• …

Parameter Tuning options (API)
• L1_reg
• L2_reg
• Max_iter
• …

Page 17

Methodology
• Tune all available control dimensions

[Diagram: training data to trained model, with testing data sent to the trained model through the API]
Feature Selection: ✖
Classifier Choice: ✔ (API)
Parameter Tuning: ✔ (API)
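"Tune all available control dimensions" amounts to enumerating every combination of the settings a platform exposes and scoring each one on testing data. A minimal sketch follows, assuming a hypothetical `train_and_score(config)` callable that stands in for the platform's training and prediction API; the control names, values, and the stand-in scoring function are made up for illustration and do not describe any real service.

```python
from itertools import product

# Hypothetical controls exposed by some platform (illustrative names and values).
CONTROLS = {
    "classifier": ["logistic_regression", "knn"],
    "l2_reg": [0.01, 0.1, 1.0],
    "max_iter": [100, 1000],
}


def all_configurations(controls):
    """Yield one dict per point in the Cartesian product of control values."""
    keys = list(controls)
    for values in product(*(controls[k] for k in keys)):
        yield dict(zip(keys, values))


def best_configuration(controls, train_and_score):
    """Score every configuration and return (best_score, best_config)."""
    scored = [(train_and_score(cfg), cfg) for cfg in all_configurations(controls)]
    return max(scored, key=lambda pair: pair[0])


def fake_train_and_score(config):
    """Stand-in for a real train-then-test round trip; returns a made-up score."""
    base = 0.8 if config["classifier"] == "logistic_regression" else 0.7
    penalty = abs(config["l2_reg"] - 0.1) * 0.1
    bonus = 0.01 if config["max_iter"] == 1000 else 0.0
    return base - penalty + bonus


best_score, best_cfg = best_configuration(CONTROLS, fake_train_and_score)
print(best_cfg)
```

The grid grows multiplicatively with each added control, which is one concrete sense in which more user control means more complexity.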

Page 18

Trade-offs between Complexity and Performance

Page 19

Complexity vs. Performance
• Q: How does the complexity correlate with performance?
• High complexity -> high performance

[Figure: average F-score (y-axis, 0.5 to 1) of the optimized configuration for each platform, ordered by complexity from low to high: ABM, Google, Amazon, BigML, PIO, Microsoft, Scikit]
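For reference, the F-score of a binary classifier can be computed as below. The sketch assumes "F-score" here means the usual F1 (harmonic mean of precision and recall) with class 1 as the positive class, which the slide does not spell out; the label vectors are toy values.

```python
def f_score(y_true, y_pred):
    """F1 score for binary labels, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_score(results):
    """Mean F-score over (y_true, y_pred) pairs, one pair per dataset."""
    scores = [f_score(t, p) for t, p in results]
    return sum(scores) / len(scores)


# Toy example with two "datasets".
results = [
    ([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]),  # tp=2, fp=1, fn=1
    ([1, 0, 1, 0], [1, 0, 1, 0]),        # perfect prediction
]
print(average_f_score(results))
```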

Page 20

Complexity vs. Risk
• Q: How does the risk correlate with complexity?
• High complexity -> high risk

[Figure: performance variance in F-score (y-axis, 0 to 0.5) for each platform, ordered by complexity from low to high: ABM, Google, Amazon, BigML, PIO, Microsoft, Scikit]
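One way to read "risk" is as the spread of F-scores a user could land on across all configurations a platform allows. The slide does not define the exact statistic, so the sketch below reports both the range and the population variance as assumptions; the score lists are hypothetical and do not correspond to any platform in the study.

```python
import statistics


def performance_spread(f_scores):
    """Summarize how much the F-score varies across configurations."""
    return {
        "range": max(f_scores) - min(f_scores),
        "variance": statistics.pvariance(f_scores),
    }


# Hypothetical F-scores, one per configuration (values are illustrative only).
few_controls = [0.70, 0.72, 0.71]          # small configuration space
many_controls = [0.45, 0.60, 0.75, 0.90]   # large configuration space

print(performance_spread(few_controls))
print(performance_spread(many_controls))
```

In this toy input the larger configuration space reaches a higher best score but also a much lower worst score, which is the trade-off the slide summarizes.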

Page 21

Understanding Server-side Optimization

Page 22

Reverse-engineering Optimization

[Figure: two synthetic two-class datasets, labeled Circular and Linear, each plotted as Feature #1 vs. Feature #2 with Class 0 and Class 1 points]

• Q: Does server-side adapt to different datasets?
• Reverse-engineering using datasets
  • Create synthetic datasets
  • Use prediction results to infer classifier information
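Synthetic datasets of the two kinds named above can be generated in a few lines. This is a sketch under assumptions: the sampling ranges, the circle radius, and the line `x2 = x1` are arbitrary choices made for illustration, not the parameters used in the talk.

```python
import math
import random


def make_linear(n, seed=0):
    """Two classes separated by the line x2 = x1 (boundary chosen arbitrarily)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-3, 3), rng.uniform(-3, 3)
        data.append((x1, x2, 1 if x2 > x1 else 0))
    return data


def make_circular(n, seed=0, radius=1.0):
    """Class 1 inside a circle around the origin, class 0 outside."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)
        data.append((x1, x2, 1 if math.hypot(x1, x2) < radius else 0))
    return data


linear_data = make_linear(200)
circular_data = make_circular(200)
print(linear_data[0], circular_data[0])
```

A linear classifier can fit the first dataset but not the second, so how a black-box service handles each one hints at what it runs internally.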

Page 23

Understanding Optimization

Google decision boundaries

[Figure: two panels of Google decision boundaries, each plotted as Feature #1 vs. Feature #2 with Class 0 and Class 1 regions]

• Google switches between classifiers based on the dataset
• Use supervised learning to infer classifier family used
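The slide says the classifier family is inferred with supervised learning. The sketch below is not that method; it swaps in a much simpler probe to show how prediction results alone can reveal boundary shape. It relies on one fact: a linear boundary splits the plane into two convex regions, so the midpoint of any two same-class points must get the same label. The `predict` callables are hypothetical stand-ins for a black-box prediction API.

```python
import random


def consistent_with_linear(predict, bounds=(-3.0, 3.0), n_pairs=2000, seed=0):
    """Return False if any same-class pair has a midpoint in the other class.

    Passing the check does not prove the boundary is linear; failing it
    shows that at least one predicted region is non-convex.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    for _ in range(n_pairs):
        a = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        b = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        if predict(a) != predict(b):
            continue  # only same-class pairs are informative
        mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        if predict(mid) != predict(a):
            return False
    return True


def linear_model(p):
    """Stand-in black box with a linear boundary x2 = x1."""
    return 1 if p[1] > p[0] else 0


def circular_model(p):
    """Stand-in black box with a circular boundary of radius 1."""
    return 1 if p[0] ** 2 + p[1] ** 2 < 1.0 else 0


print(consistent_with_linear(linear_model))
print(consistent_with_linear(circular_model))
```

With a real service each `predict` call would be a remote request, so the number of probed pairs would need to be budgeted against API cost.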

Page 24

Takeaways
• ML-as-a-Service is an attractive tool to reduce workload
• But user control still has a large impact on performance
• Fully automated systems are less risky

Page 25

Thank you! Questions?