Neural Networks, Spark MLlib, Deep Learning
NEURAL NETWORKS AND DEEP LEARNING
BY ASIM JALIS
GALVANIZE
WHO AM I?
ASIM JALIS
Galvanize/Zipfian, Data Engineering
Cloudera, Microsoft, Salesforce
MS in Computer Science from University of Virginia
https://www.linkedin.com/in/asimjalis
WHAT IS GALVANIZE'S DATA ENGINEERING PROGRAM?
DO YOU WANT TO . . .
Play with terabytes of data
Build data applications using Spark, Hadoop, Hive, Kafka, Storm, HBase
Use Data Science algorithms at scale
WHAT IS INVOLVED?
Learn concepts in interactive lectures
Develop skills in hands-on labs
Design and build your Capstone Project
Show project to SF tech companies at Hiring Day
FOR MORE INFORMATION
Check out
Talk to me
WHAT IS THIS TALK ABOUT?
What are Neural Networks and how do they work?
What is Deep Learning?
What is the difference?
How can we build neural networks in Apache Spark?
HOW MANY PEOPLE HERE ARE FAMILIAR WITH NEURAL
HOW MANY PEOPLE HERE ARE FAMILIAR WITH CONVOLUTION
HOW MANY PEOPLE HERE ARE FAMILIAR WITH DEEP LEARNING?
HOW MANY PEOPLE HERE ARE FAMILIAR WITH APACHE SPARK
WHAT IS A NEURON?
Receives signal on synapse
When triggered, sends signal on axon
Mathematical abstraction
Inspired by biological neuron
Either on or off based on sum of input
Neuron is a mathematical function
Adds up (weighted) inputs
Applies the sigmoid function
This determines if it fires or not
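The neuron described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the bias term, the 0.5 firing threshold, and the example weights are all assumptions added for the sketch.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias=0.0):
    """Add up the weighted inputs, then apply the sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)

# Hypothetical inputs and weights, chosen only for illustration.
activation = neuron([1.0, 0.0, 1.0], [0.5, -0.3, 0.8], bias=-1.0)

# Assumed convention: treat an activation above 0.5 as "firing".
fires = activation > 0.5
print(activation, fires)
```

The weighted sum here is 0.5 + 0.8 - 1.0 = 0.3, so the activation lands a little above 0.5 and the neuron counts as firing under the assumed threshold.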
WHAT ARE NEURAL NETWORKS?
Biologically inspired machine learning algorithm
Mathematical neurons arranged in layers
Accumulate signals from the previous layer
Fire when signal reaches threshold
HOW MANY NEURONS SHOULD I HAVE IN MY NETWORK?
HOW MANY INPUT LAYER NEURONS SHOULD WE HAVE?
The number of inputs or features
HOW MANY OUTPUT LAYER NEURONS SHOULD WE HAVE?
The number of classes we are classifying the input into.
HOW MANY HIDDEN LAYER NEURONS SHOULD WE HAVE?
SIMPLEST OPTION IS TO USE 0.
SINGLE LAYER PERCEPTRON
WHAT ARE THE DOWNSIDES OF NO HIDDEN LAYERS?
Only works if data is linearly separable.
Identical to logistic regression.
MULTILAYER PERCEPTRON
For most realistic classification tasks you will need a hidden layer.
Rule of thumb:
Number of hidden layers equals one
Number of neurons in hidden layer is the mean of the sizes of the input and output layers.
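As a rough sketch, the rule of thumb above can be turned into a helper that proposes layer sizes. The function name and the choice to round the mean down are assumptions made for this example; the 4-feature, 3-class input is hypothetical.

```python
def rule_of_thumb_layers(num_features, num_classes):
    """Return [input, hidden, output] sizes with one hidden layer.

    The hidden size is the mean of the input and output sizes,
    rounded down (an assumption) and kept at least 1.
    """
    hidden = max(1, (num_features + num_classes) // 2)
    return [num_features, hidden, num_classes]

# Hypothetical problem: 4 features classified into 3 classes.
print(rule_of_thumb_layers(4, 3))   # [4, 3, 3]
```

This only gives a starting point; the topology still has to be checked against held-out data.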
HOW DO WE USE THIS THING?
NEURAL NETWORK WORKFLOW
Split labeled data into train and test sets
Train with labeled data
Test and compare prediction with actual labels
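The split and test steps of this workflow might look like the sketch below. The helper names, the 80/20 split, and the toy data are assumptions; the `predict` function is a stand-in for a trained network, since training itself is not shown here.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle labeled records and split them into train and test sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, test_set):
    """Fraction of test records where the prediction matches the label."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)

# Toy labeled data: one feature, label is 1 when the feature exceeds 5.
records = [([x], int(x > 5)) for x in range(10)]
train, test = train_test_split(records)

# Stand-in for a trained model (hypothetical, not learned from `train`).
predict = lambda features: int(features[0] > 5)
print(len(train), len(test), accuracy(predict, test))
```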
HOW DO WE TRAIN IT?
FEED FORWARD
Also called forward propagation or forward prop
Initialize inputs
Weigh inputs into hidden layer, sum, apply sigmoid
Calculate activation of hidden layer
Weigh inputs into output layer, sum, apply sigmoid
Calculate activation of output layer
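The feed-forward steps above can be sketched for a tiny network with one hidden layer. The weight layout (`weights[j][i]` connects input `i` to neuron `j`), the absence of bias terms, and the example weights are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights):
    """Weigh the inputs into each neuron of a layer, sum, apply sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def feed_forward(inputs, hidden_weights, output_weights):
    """Return the activations of the hidden layer and the output layer."""
    hidden = layer_forward(inputs, hidden_weights)
    output = layer_forward(hidden, output_weights)
    return hidden, output

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.5], [0.3, 0.8]]
output_w = [[1.0, -1.0]]
hidden, output = feed_forward([1.0, 0.0], hidden_w, output_w)
print(hidden, output)
```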
BACK PROPAGATION
Use forward prop to calculate the error
Error is function of all network weights
Adjust weights using gradient descent
Repeat with next record
Keep going over training set until convergence
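A minimal sketch of this loop for a network with one hidden layer and a single output is shown below. Several choices are assumptions that the slides do not specify: squared error as the error function, a learning rate of 0.5, a constant 1.0 input standing in for a bias, a fixed number of passes instead of a convergence test, and the toy OR data.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Forward prop: hidden activations, then the single output activation."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One backprop update for one record; returns the error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    error = 0.5 * (output - target) ** 2
    # Derivative of the error with respect to the output neuron's weighted sum.
    delta_out = (output - target) * output * (1.0 - output)
    # Push the output delta back through the (not yet updated) output weights.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Gradient descent: move every weight against its derivative.
    for j in range(len(hidden)):
        w_out[j] -= rate * delta_out * hidden[j]
        for i in range(len(x)):
            w_hidden[j][i] -= rate * delta_hidden[j] * x[i]
    return error

# Toy OR data; the trailing 1.0 is a constant input acting as a bias.
data = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
        ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 1.0)]

rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [rng.uniform(-1, 1) for _ in range(2)]

# Go over the training set repeatedly, one record at a time.
history = []
for epoch in range(2000):
    history.append(sum(train_step(x, t, w_hidden, w_out) for x, t in data))
print(history[0], history[-1])
```

The total error per pass is recorded so that the first and last values can be compared; a real training loop would stop once that value stops improving.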
WHAT IS GRADIENT DESCENT?
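HOW DO YOU FIND THE MINIMUM IN AN N-DIMENSIONAL SPACE?
Take a step in the steepest downhill direction.
The steepest direction comes from the gradient: the vector of partial derivatives, one per dimension.
Step against the gradient to move downhill.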
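As a small illustration, here is gradient descent on a bowl-shaped function of two variables. The function f(x, y) = (x - 3)^2 + (y + 1)^2, the step size, and the step count are all assumptions picked for the example; its minimum is at (3, -1).

```python
def gradient(point):
    """Partial derivatives of f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = point
    return [2 * (x - 3), 2 * (y + 1)]

def gradient_descent(start, rate=0.1, steps=100):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    point = list(start)
    for _ in range(steps):
        grad = gradient(point)
        point = [p - rate * g for p, g in zip(point, grad)]
    return point

minimum = gradient_descent([0.0, 0.0])
print(minimum)  # close to [3, -1]
```

With this step size each update shrinks the distance to the minimum by a constant factor, so 100 steps are more than enough here. For a neural network the same idea applies, with one dimension per weight.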
PUTTING ALL THIS TOGETHER
Use forward prop to activate
Use back prop to train
Then use forward prop to test
WHY NOT HAVE MULTIPLE LAYERS?
DOWNSIDE OF MULTIPLE LAYERS
Number of weights between two layers is the product of their sizes
The mathematics quickly becomes intractable
Particularly when your input is an image with tens of thousands of pixels
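To make the growth concrete, the sketch below counts the weights in a fully connected network. It ignores bias terms, and both example topologies are hypothetical; the second assumes a 100x100 pixel image as input.

```python
def count_weights(layer_sizes):
    """Sum the products of each pair of adjacent layer sizes (no biases)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Small network: 4*3 + 3*3 weights.
print(count_weights([4, 3, 3]))            # 21

# 10,000-pixel image, 5,000 hidden neurons, 10 classes.
print(count_weights([10000, 5000, 10]))    # 50050000
```

Every one of those weights is a dimension that gradient descent has to search over.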
APACHE SPARK MLLIB
WHAT IS SPARK
Framework for processing data across a cluster
By sending the code to the data
And executing the code where the data lives
WHAT IS MLLIB?
Library for Machine Learning.
Builds on top of Spark RDDs.
Provides RDDs for Machine Learning.
Implements common Machine Learning algorithms.
DEMO USING APACHE TOREE
WHAT IS APACHE TOREE?
Like IPython Notebook but for Spark/Scala.
Jupyter kernel for Spark/Scala.
HOW CAN I INSTALL TOREE?
Use pip to install IPython or Jupyter.
Install Apache Spark by downloading tgz file and expanding.
SPARK_HOME=$HOME/spark-1.6.0
pip install toree
jupyter toree install \
  --spark_home=$SPARK_HOME
HOW CAN I RUN A TOREE NOTEBOOK
jupyter notebook
Visit
Create new notebook.
Set kernel to Toree.
sc in notebook should print Spark Context.
HOW CAN I FIGURE OUT HOW MANY LAYERS?
To figure out how many layers to use and what topology to use you have to rely on standard machine learning techniques.
Use cross-validation.
In general k-fold cross validation.
10-fold cross validation is popular.
WHAT IS 10-FOLD CROSS VALIDATION OR K-FOLD CROSS VALIDATION?
Split your data into 10 (or in general k) equal-sized subsets.
Train model on 9 of them, set one aside for cross-validation.
Validate model on 10th and remember your error rate.
Repeat by setting aside each one of the 10.
Average the 10 error rates.
Then repeat for the next model.
Choose the model with the lowest error rate.
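The procedure above can be sketched in plain Python. The `fit` and `error_rate` callables are hypothetical placeholders supplied by the caller, and the threshold classifier and toy data exist only to exercise the loop; none of this comes from the talk or from MLlib.

```python
def k_fold_splits(records, k=10):
    """Yield (train, validation) pairs, setting each fold aside once."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]

def cross_validate(fit, error_rate, records, k=10):
    """Train on k-1 folds, validate on the held-out fold, average the errors."""
    errors = []
    for train, validation in k_fold_splits(records, k):
        model = fit(train)
        errors.append(error_rate(model, validation))
    return sum(errors) / len(errors)

# Toy data: the label is 1 when x is at least 50.
records = [(x, int(x >= 50)) for x in range(100)]

def fit_threshold(train):
    """Toy model: a cut-off halfway between the two classes."""
    largest_zero = max(x for x, label in train if label == 0)
    smallest_one = min(x for x, label in train if label == 1)
    return (largest_zero + smallest_one) / 2

def threshold_error(threshold, validation):
    wrong = sum(1 for x, label in validation if int(x >= threshold) != label)
    return wrong / len(validation)

average_error = cross_validate(fit_threshold, threshold_error, records)
print(average_error)
```

To compare candidate models, run `cross_validate` once per candidate and keep the one with the lowest average error.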
HOW DO I DEPLOY MY NEURAL NETWORK INTO PRODUCTION?
There are two phases.
The training phase can be run on the back-end servers.
Cross-validate your model and its hyper-parameters on the back-end.
Then deploy the model to the front-end servers, browsers, devices.
The front-end only uses forward prop and is always fast.
WHAT IS DEEP LEARNING?
Deep Learning is a learning method that can train the system with more than 2 or 3 non-linear hidden layers.
WHAT IS DEEP LEARNING?
Machine learning techniques which enable unsupervised feature learning and pattern analysis/classification.
The essence of deep learning is to compute representations of the data.
Higher-level features are defined from lower-level ones.
HOW IS DEEP LEARNING DIFFERENT FROM REGULAR NEURAL NETWORKS?
Training neural networks requires applying gradient descent on millions of dimensions.
This is intractable for large networks.
Deep learning places constraints on neural networks.
This allows them to be solvable iteratively.
The constraints are generic.
WHAT IS THE BIG DEAL ABOUT IT?
AlexNet, submitted to the ImageNet ILSVRC challenge in 2012, is partly responsible for the renaissance.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton used Deep Learning techniques.
They combined this with GPUs and some other techniques.
The result was a neural network that could classify images of cats and dogs.
It had an error rate of 16% compared to 26% for the runner-up.
ILYA SUTSKEVER, ALEX KRIZHEVSKY, GEOFFREY HINTON
WHAT ARE THE DIFFERENT KINDS OF DEEP ARCHITECTURES?
WHAT ARE GENERATIVE ARCHITECTURES
Extract features from data
Find common features in unlabelled data
Like Principal Component Analysis
Unsupervised: no labels required
WHAT ARE DISCRIMINATIVE ARCHITECTURES
Classify inputs into classes
Require labels
Require supervised training
WHAT ARE HYBRID ARCHITECTURES?
STEP 1
Combination of generative and discriminative
Extract features using generative network
Use unsupervised learning
STEP 2
Train discriminative network on extracted features
Use supervised learning
WHAT ARE AUTO-ENCODERS?
An auto-encoder is a learning algorithm.
It applies backpropagation and sets the target values to be equal to its inputs.
In other words it trains itself to do the identity transformation.
WHY DOES IT DO THIS?
By placing constraints on it, like restricting the number of hidden neurons, it can find a good representation of the data.
IS THE AUTO-ENCODER SUPERVISED OR UNSUPERVISED?
It is unsupervised.
The data is unlabeled.
Auto-encoders are similar to PCA (Principal Component Analysis).
PCA is a technique for reducing the dimensions of data.
WHAT ARE CONVOLUTION NEURAL NETWORKS?
Feedforward neural networks.
Connection pattern inspired by visual cortex.
The convolution layer's parameters are a set of learnable filters.
Every filter is small along width and height.
During the forward pass, each filter slides across the width and height of the input, producing a 2-dimensional activation map.
As we slide across the input we compute the dot product between the filter and the input.
Intuitively, the network learns filters that activate when they see a specific type of feature anywhere.
In this way it creates translation invariance.
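The sliding dot product can be sketched as follows. This assumes no padding and a stride of 1, and it uses a fixed, hand-written vertical-edge filter on a hypothetical 4x4 image; in a real network the filter values would be learned.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the 2-D activation map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Dot product between the filter and the patch under it.
            sum(kernel[a][b] * image[r + a][c + b]
                for a in range(kh) for b in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Dark left half, bright right half: a vertical edge down the middle.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds when the right column is brighter than the left column.
kernel = [
    [-1, 1],
    [-1, 1],
]
activation_map = convolve2d(image, kernel)
print(activation_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The filter activates only in the column where the edge sits, at every row, which is the "same feature anywhere" behavior described above.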
WHAT IS A POOLING LAYER?
The pooling layer reduces the resolution of the image further.
It tiles the output area with 2x2 mask and takes the maximum activation value of the area.
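A minimal sketch of 2x2 max pooling follows. It assumes non-overlapping tiles and silently drops a trailing row or column when a dimension is odd; the input values are made up for the example.

```python
def max_pool_2x2(activation_map):
    """Tile the map with 2x2 windows and keep the maximum of each window."""
    rows, cols = len(activation_map), len(activation_map[0])
    return [
        [
            max(activation_map[r][c], activation_map[r][c + 1],
                activation_map[r + 1][c], activation_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

pooled = max_pool_2x2([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
])
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 map becomes 2x2, halving the resolution along each dimension.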
DOES SPARK SUPPORT DEEP LEARNING?
Not directly yet
https://issues.apache.org/jira/browse/SPARK-2352
WHAT ARE SOME MAJOR DEEP LEARNING PLATFORMS?
Theano: Low-level GPU-enabled tensor library.
Lasagne, Blocks: NN libraries that make Theano easier to use.
Torch7: NN library. Uses Lua for binding. U