Optimizing Supervised and Implementing Unsupervised Machine Learning Algorithms in HPCC Systems
Supervised Learning Algorithms - Analysis of different approaches
Transcript of Supervised Learning Algorithms - Analysis of different approaches
Supervised Learning Algorithms
Analysis of Different Approaches
Evgeniy Marinov, ML Consultant
Philip Yankov, x8 academy
ML Definition
• There are plenty of definitions...
• Informal: The field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959)
• Formal: A computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E (Tom Mitchell, 1998).
From Wikipedia
• Machine learning is:
– a subfield of computer science that evolved from the study of pattern recognition and AI in the 1980s (ML is a separate field flourishing from the 1990s, first benefiting from statistics and then from the increasing availability of digitized information at that time).
Why ML?
Key factors enabling ML growth today
• Cloud Computing
• Internet of Things
• Big Data (+ Unstructured Data)
Why is data so important?
• Google Photos – unlimited storage
• Google voice – "OK, Google"
Nowadays
• It is so easy to get the data you need, and to use an API or service of some company to experiment with it
Methods for collecting data
• Download
– Spreadsheet
– Text
• API
• Crawling/scraping
Supervised Learning
Task Description
Pipeline
Initial example
Notation
The regression function f(x)
How to evaluate our model?
Pipeline
Assessing the Model Accuracy
Bias-variance trade-off
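The trade-off behind this slide is the standard decomposition of the expected squared prediction error at a point x into bias, variance, and irreducible noise (a textbook identity, stated here for completeness):

```latex
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

Here y = f(x) + ε with Var[ε] = σ²; more flexible models lower the bias but raise the variance.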
Cross-validation
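Cross-validation estimates generalization error by rotating which part of the data is held out. A minimal k-fold sketch in Python (illustrative only; the helper `kfold_indices` is our own name, not something from the talk):

```python
import numpy as np

# k-fold cross-validation: shuffle the sample indices once, split them into
# k folds, and let each fold serve once as the validation set while the
# remaining k-1 folds form the training set.
def kfold_indices(n, k, seed=0):
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

splits = list(kfold_indices(10, 5))
for train, val in splits:
    print(len(train), len(val))  # 8 2, five times
```

Averaging the validation error over the k rounds gives a lower-variance estimate than a single hold-out split.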
Generalization Error and Overfitting
Choosing a model by the data type of the response
Pipeline
Data types and the Generalized Linear Model
• Simple and General linear models
• Restrictions of the linear model
• Data type of the response Y:
1) (General) Linear model: Y in R, Y ~ Gaussian(µ, σ^2) – continuous data
2) Logistic regression: Y in {0, 1}, Y ~ Bernoulli(p) – binary data
3) Poisson regression: Y in {0, 1, ...}, Y ~ Poisson(µ) – counting data
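The three models in this list differ only in the assumed response distribution and the link function connecting the mean of Y to the linear predictor η = Xβ. A small sketch of the corresponding inverse (canonical) links, as an illustration of how each family keeps the mean in the right range (our own code, not from the slides):

```python
import numpy as np

# Each GLM pairs a response distribution with a canonical link g such that
# g(E[Y]) = eta = X @ beta. The inverse link maps the unrestricted linear
# predictor eta back onto the natural range of the mean of Y.
inverse_link = {
    "gaussian": lambda eta: eta,                          # identity: mean in R
    "bernoulli": lambda eta: 1.0 / (1.0 + np.exp(-eta)),  # inverse logit: (0, 1)
    "poisson": lambda eta: np.exp(eta),                   # inverse log: (0, inf)
}

eta = np.array([-2.0, 0.0, 3.0])   # arbitrary linear-predictor values
for name, inv in inverse_link.items():
    print(name, np.round(inv(eta), 3))
```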
Simple and General linear models
Simple: Y = β0 + β1·X + ε
General: Y = β0 + β1·X1 + ... + βp·Xp + ε
Error of the General Linear model
Restrictions of Linear models
Although the General linear model is a useful framework, it is not appropriate in the following cases:
• The range of Y is restricted (e.g. binary, count, positive/negative)
• Var[Y] depends on the mean E[Y] (for the Gaussian they are independent)
Name            Mean  Variance
Bernoulli(p)    p     p(1 - p)
Binomial(n, p)  np    np(1 - p)
Poisson(µ)      µ     µ
Binary response Y ∈ {0, 1}
• The Bernoulli(p) is a discrete r.v. with two possible outcomes, with probabilities p and q = 1 – p
• The parameter p does not change over time
• Bernoulli is a building block for other, more complicated distributions
• Examples:
– Coin flips {Heads, Tails} – if unbiased, then p = 0.5
– Click on Ad, Fail/Success on Exam
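As a tiny illustration of the coin-flip example: for i.i.d. Bernoulli(p) data, the maximum-likelihood estimate of p is just the fraction of successes (a standard fact; the simulation below is our own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate n coin flips with success probability p = 0.5 ("unbiased coin").
p_true = 0.5
flips = rng.random(100_000) < p_true   # Bernoulli(p) draws as booleans

# For i.i.d. Bernoulli(p) data the maximum-likelihood estimate of p is
# simply the fraction of successes (the sample mean).
p_hat = flips.mean()
print(round(float(p_hat), 2))  # close to 0.5
```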
Generalized Linear model – Intuition
Exponential Family
General linear model
Binary Data
Modeling Counting/Poisson Data
Maximizing the Log-Likelihood and Parameter Estimation
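As an illustrative sketch of maximum-likelihood estimation for one of the models above: the Poisson-regression log-likelihood is Σᵢ [yᵢηᵢ − exp(ηᵢ)] up to a β-free constant, and its gradient Xᵀ(y − exp(Xβ)) can be followed by plain gradient ascent (our own toy code, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate Poisson-regression data: log E[Y] = b0 + b1 * x  (log link)
n = 5000
x = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.0])
y = rng.poisson(np.exp(X @ beta_true))

# Maximize the Poisson log-likelihood sum_i [y_i*eta_i - exp(eta_i)]
# (dropping the beta-free term -log y_i!) by gradient ascent;
# the gradient is X^T (y - exp(X @ beta)).
beta = np.zeros(2)
for _ in range(5000):
    beta += 0.1 * X.T @ (y - np.exp(X @ beta)) / n

print(np.round(beta, 2))  # close to beta_true = [0.5, 1.0]
```

In practice GLM software uses iteratively reweighted least squares rather than plain gradient ascent, but the objective being maximized is the same log-likelihood.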
Preprocessing
Pipeline
Problems with feature types
• Big number of features -> Dimensionality reduction -> SVD, PCA
– Dimensionality reduction: "compress" the data from a high-dimensional representation into a lower-dimensional one (useful for visualization or as an internal transformation for other ML algorithms)
• Sparse features -> Hashing
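The "Sparse features -> Hashing" bullet refers to the hashing trick: project an unbounded sparse feature space into a fixed number of buckets, with no vocabulary to store. A minimal sketch (our own code; production versions such as scikit-learn's HashingVectorizer also hash a +/- sign to reduce collision bias):

```python
import numpy as np

# Hashing trick: map an unbounded, sparse feature space (e.g. words) into a
# fixed-size vector without storing a vocabulary. Colliding features simply
# share a bucket.
def hash_features(tokens, n_buckets=16):
    vec = np.zeros(n_buckets)
    for tok in tokens:
        vec[hash(tok) % n_buckets] += 1.0
    return vec

v = hash_features("the cat sat on the mat".split())
print(v.sum())  # 6.0 -- one increment per token
```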
• Instead of using two coordinates (x, y) to describe point locations, let's use only one coordinate (z)
• A point's position is its location along vector v1
• How to choose v1? Minimize the reconstruction error
SVD – Dimensionality Reduction
[Figure: user points plotted by Movie 1 rating vs Movie 2 rating; v1 is the first right singular vector]
SVD – Dimensionality Reduction (PCA generalization)
More details
• Q: How exactly is dim. reduction done?
• A: Set the smallest singular values to zero

For the 7×5 user-movie rating matrix A, the SVD A ≈ U Σ Vᵀ gives:

A =
1 1 1 0 0
3 3 3 0 0
4 4 4 0 0
5 5 5 0 0
0 2 0 4 4
0 0 0 5 5
0 1 0 2 2

U =
0.13  0.02 -0.01
0.41  0.07 -0.03
0.55  0.09 -0.04
0.68  0.11 -0.05
0.15 -0.59  0.65
0.07 -0.73 -0.67
0.07 -0.29  0.32

Σ =
12.4  0    0
 0    9.5  0
 0    0    1.3

Vᵀ =
0.56  0.59  0.56  0.09  0.09
0.12 -0.02  0.12 -0.69 -0.69
0.40 -0.80  0.40  0.09  0.09

Setting the smallest singular value (1.3) to zero keeps only the first two columns of U, the top-left 2×2 block of Σ, and the first two rows of Vᵀ. The resulting rank-2 approximation is

B =
 0.92 0.95  0.92  0.01  0.01
 2.91 3.01  2.91 -0.01 -0.01
 3.90 4.04  3.90  0.01  0.01
 4.82 5.00  4.82  0.03  0.03
 0.70 0.53  0.70  4.11  4.11
-0.69 1.34 -0.69  4.78  4.78
 0.32 0.23  0.32  2.01  2.01

and the reconstruction error ǁA - BǁF = √(Σij (Aij - Bij)²) is "small"
(Frobenius norm: ǁMǁF = √(Σij Mij²))
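The truncation shown on these slides can be reproduced with NumPy's SVD (illustrative code; the matrix is the rating example from the slides):

```python
import numpy as np

# User-movie rating matrix from the slides (7 users x 5 movies)
A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

# Thin SVD: A = U @ diag(s) @ Vt, singular values in decreasing order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.round(s, 1))  # leading values approx. 12.4, 9.5, 1.3 (rest ~0)

# "Set smallest singular values to zero": keep only the top k = 2
k = 2
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The Frobenius error of the rank-k truncation equals the norm of the
# discarded singular values -- here essentially sigma_3.
err = np.linalg.norm(A - B, "fro")
print(np.round(err, 1))  # approx. 1.3
```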
Feature selection - example
Dummy Encoding
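Dummy (one-hot) encoding turns a categorical feature with k levels into k binary indicator columns (k−1 when the model has an intercept, to avoid perfect collinearity). A minimal sketch with our own toy data:

```python
import numpy as np

# Dummy (one-hot) encoding: a categorical feature with k levels becomes k
# 0/1 indicator columns. With an intercept in the model, one level is
# usually dropped (k-1 dummies) to avoid perfect collinearity.
colors = ["red", "green", "blue", "green"]   # toy categorical feature
levels = sorted(set(colors))                 # ['blue', 'green', 'red']
onehot = np.array([[1.0 if c == lvl else 0.0 for lvl in levels]
                   for c in colors])
print(onehot)  # 4x3 matrix, exactly one 1 per row
```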
(De)Motivation
Solution to those problems with features
Pipeline
Factorization Machine (degree 2)
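A degree-2 Factorization Machine (Rendle, 2010, listed in the references) models every pairwise feature interaction through k-dimensional latent factors, and the interaction term can be evaluated in O(kn) instead of O(n²) via Rendle's reformulation. A sketch of the prediction step with our own toy data:

```python
import numpy as np

# Degree-2 Factorization Machine (Rendle, 2010):
#   y(x) = w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j
# The pairwise term looks O(n^2) but can be computed in O(k*n) via
#   sum_{i<j} <v_i, v_j> x_i x_j
#     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
def fm_predict(x, w0, w, V):
    s = V.T @ x                                         # (k,) per-factor sums
    pairwise = 0.5 * ((s ** 2).sum() - ((V ** 2).T @ (x ** 2)).sum())
    return w0 + w @ x + pairwise

rng = np.random.default_rng(2)
n, k = 6, 3                       # toy sizes: n features, k latent factors
x = rng.random(n)
w0, w, V = 0.1, rng.random(n), rng.random((n, k))

# Sanity check against the naive O(n^2) double sum
naive = w0 + w @ x + sum(V[i] @ V[j] * x[i] * x[j]
                         for i in range(n) for j in range(i + 1, n))
print(np.isclose(fm_predict(x, w0, w, V), naive))  # True
```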
General Applications of FMs
Summary
Pipeline
From prototype to production
• Prototype vs Production time?
– The model (pipeline) should stay the same
Libraries
Questions?
Thank you!!!
References
• https://www.coursera.org/learn/machine-learning
• http://www.cs.cmu.edu/~tom/
• http://scikit-learn.org/stable/
• http://www.scalanlp.org/
• http://www.algo.uni-konstanz.de/members/rendle/pdf/Rendle2010FM.pdf
• https://securityintelligence.com/factorization-machines-a-new-way-of-looking-at-machine-learning/
• An Introduction to Generalized Linear Models – Annette Dobson, Adrian Barnett
• Applying Generalized Linear Models – James Lindsey
• https://www.codementor.io/jadianes/building-a-recommender-with-apache-spark-python-example-app-part1-du1083qbw
• https://www.chrisstucchio.com/blog/index.html