Neural Networks, Spark MLlib, Deep Learning

  • NEURAL NETWORKS AND DEEP LEARNING

    BY ASIM JALIS, GALVANIZE

  • WHO AM I?

  • ASIM JALIS

    Galvanize/Zipfian, Data Engineering
    Cloudera, Microsoft, Salesforce
    MS in Computer Science from University of Virginia
    https://www.linkedin.com/in/asimjalis

  • WHAT IS GALVANIZE'S DATA ENGINEERING PROGRAM?

  • DO YOU WANT TO . . .

    Play with terabytes of data
    Build data applications using Spark, Hadoop, Hive, Kafka, Storm, HBase
    Use Data Science algorithms at scale

  • WHAT IS INVOLVED?

    Learn concepts in interactive lectures
    Develop skills in hands-on labs
    Design and build your Capstone Project
    Show project to SF tech companies at Hiring Day

  • FOR MORE INFORMATION

    Check out http://galvanize.com
    Talk to me: asim.jalis@galvanize.com

  • INTRO

  • WHAT IS THIS TALK ABOUT?

    What are Neural Networks and how do they work?
    What is Deep Learning?
    What is the difference?
    How can we build neural networks in Apache Spark?

  • HOW MANY PEOPLE HERE ARE FAMILIAR WITH NEURAL NETWORKS?

  • HOW MANY PEOPLE HERE ARE FAMILIAR WITH CONVOLUTION NEURAL NETWORKS?

  • HOW MANY PEOPLE HERE ARE FAMILIAR WITH DEEP LEARNING?

  • HOW MANY PEOPLE HERE ARE FAMILIAR WITH APACHE SPARK AND MLLIB?

  • NEURAL NETWORKS

  • WHAT IS A NEURON?

  • Receives signals on its synapses
    When triggered, sends a signal along its axon

  • Mathematical abstraction
    Inspired by biological neuron
    Either on or off based on sum of input

  • Neuron is a mathematical function
    Adds up (weighted) inputs
    Applies the sigmoid function
    This determines if it fires or not
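
    The neuron described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the bias term, the example weights, and the 0.5 firing threshold are assumptions added for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by the sigmoid.

    The bias term is an assumption; the slides only mention weighted inputs.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical inputs and weights, chosen only for illustration.
activation = neuron([1.0, 0.0, 1.0], [0.5, -0.3, 0.8], bias=-1.0)
fires = activation > 0.5  # treat 0.5 as the firing threshold (an assumption)
```

    The sigmoid gives a smooth value between 0 and 1 rather than a hard on/off switch, which is what later makes gradient-based training possible.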

  • WHAT ARE NEURAL NETWORKS?

    Biologically inspired machine learning algorithm
    Mathematical neurons arranged in layers
    Accumulate signals from the previous layer
    Fire when signal reaches threshold

  • HOW MANY NEURONS SHOULD I HAVE IN MY NETWORK?

  • HOW MANY INPUT LAYER NEURONS SHOULD WE HAVE?

  • The number of inputs or features

  • HOW MANY OUTPUT LAYER NEURONS SHOULD WE HAVE?

  • The number of classes we are classifying the input into.

  • HOW MANY HIDDEN LAYER NEURONS SHOULD WE HAVE?

  • SIMPLEST OPTION IS TO USE 0.

  • SINGLE LAYER PERCEPTRON

  • WHAT ARE THE DOWNSIDES OF NO HIDDEN LAYERS?

    Only works if data is linearly separable.
    Identical to logistic regression.

  • MULTILAYER PERCEPTRON

    For most realistic classification tasks you will need a hidden layer.
    Rule of thumb:
    Number of hidden layers equals one.
    Number of neurons in hidden layer is the mean of the sizes of the input and output layers.
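
    The rule of thumb above is simple arithmetic. A small sketch, where the function name and the choice to round up are assumptions made for the example:

```python
def hidden_layer_size(n_inputs, n_outputs):
    """Rule of thumb: mean of the input and output layer sizes.

    Rounding up to a whole neuron is an assumption; the rule itself
    does not say how to round.
    """
    return (n_inputs + n_outputs + 1) // 2

# Hypothetical problem with 4 features and 3 classes.
layers = [4, hidden_layer_size(4, 3), 3]
```

    This only gives a starting point. The cross-validation procedure later in the talk is the way to compare it against other topologies.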

  • HOW DO WE USE THIS THING?

  • NEURAL NETWORK WORKFLOW

    Split labeled data into train and test sets
    Train with labeled data
    Test and compare prediction with actual labels
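
    The split and compare steps of this workflow might look like the sketch below. The 80/20 split, the seed, and the function names are assumptions chosen for illustration.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle the labeled records and hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predictions, labels):
    """Fraction of predictions that match the actual labels."""
    return sum(1 for p, y in zip(predictions, labels) if p == y) / len(labels)

# Hypothetical labeled records: (feature, label) pairs.
records = [(i, i % 2) for i in range(10)]
train, test = train_test_split(records)
```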

  • HOW DO WE TRAIN IT?

  • FEED FORWARD

    Also called forward propagation or forward prop
    Initialize inputs
    Weight inputs into hidden layer, sum, apply sigmoid
    Calculate activation of hidden layer
    Weight hidden activations into output layer, sum, apply sigmoid
    Calculate activation of output layer
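
    The feed-forward steps above can be sketched for a network with one hidden layer. The weight layout (one list of incoming weights per neuron) and the 2-3-1 example network are assumptions for illustration, and bias terms are left out to match the slide.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights):
    """weights[j] holds the incoming weights of neuron j in this layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def feed_forward(inputs, hidden_weights, output_weights):
    """Return the activations of the hidden layer and of the output layer."""
    hidden = layer_forward(inputs, hidden_weights)
    output = layer_forward(hidden, output_weights)
    return hidden, output

# Hypothetical 2-3-1 network: 2 inputs, 3 hidden neurons, 1 output neuron.
hidden_weights = [[0.1, 0.4], [-0.2, 0.3], [0.5, -0.6]]
output_weights = [[0.7, -0.1, 0.2]]
hidden, output = feed_forward([1.0, 0.5], hidden_weights, output_weights)
```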

  • BACK PROPAGATION

    Use forward prop to calculate the error
    Error is function of all network weights
    Adjust weights using gradient descent
    Repeat with next record
    Keep going over training set until convergence
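
    A minimal sketch of this loop for a network with one hidden layer and one output neuron, using squared error. Everything specific here is an assumption for illustration: the bias weights, the logical OR training set, the starting weights, the learning rate, and the fixed number of passes that stands in for a real convergence check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    # Each weight row ends with a bias weight, fed by a constant input of 1.0.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, out

def train_record(x, target, w_hidden, w_out, rate):
    """One forward pass plus one gradient-descent update, done in place."""
    hidden, out = forward(x, w_hidden, w_out)
    # Gradient of 0.5 * (out - target)^2 at the output neuron's weighted sum.
    delta_out = (out - target) * out * (1.0 - out)
    # Propagate the error back to the hidden neurons before changing w_out.
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
    for j, v in enumerate(hidden + [1.0]):
        w_out[j] -= rate * delta_out * v
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] -= rate * delta_hidden[j] * v

def total_error(data, w_hidden, w_out):
    return sum(0.5 * (forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data)

# Hypothetical training set: logical OR.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden = [[0.15, -0.2, 0.1], [0.25, 0.3, -0.1]]  # 2 hidden neurons, 2 inputs + bias
w_out = [0.4, -0.45, 0.05]                         # 2 hidden activations + bias

error_before = total_error(data, w_hidden, w_out)
for epoch in range(2000):            # fixed number of passes over the training set
    for x, target in data:           # repeat with the next record
        train_record(x, target, w_hidden, w_out, rate=0.5)
error_after = total_error(data, w_hidden, w_out)
```

    Comparing `error_before` with `error_after` shows whether the updates are reducing the error on the training set.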

  • WHAT IS GRADIENT DESCENT?

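  • HOW DO YOU FIND THE MINIMUM IN AN N-DIMENSIONAL SPACE?

    Take a step in the steepest downhill direction.
    The steepest direction is given by the gradient: the vector made up of the partial derivatives along each dimension.
    Step against the gradient and repeat.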
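
    As a small worked example, here is gradient descent on a two-dimensional bowl. The function f(x, y) = (x - 3)^2 + (y + 1)^2, the learning rate, and the step count are assumptions picked so the minimum is easy to check by hand.

```python
def gradient(point):
    """Partial derivatives of f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = point
    return [2.0 * (x - 3.0), 2.0 * (y + 1.0)]

def gradient_descent(start, rate=0.1, steps=100):
    point = list(start)
    for _ in range(steps):
        grad = gradient(point)
        # Step against the gradient, the direction of steepest descent.
        point = [p - rate * g for p, g in zip(point, grad)]
    return point

minimum = gradient_descent([0.0, 0.0])  # approaches the true minimum at (3, -1)
```

    In a neural network the same idea applies, except that the point is the full set of weights and the function is the training error.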

  • PUTTING ALL THIS TOGETHER

  • Use forward prop to activate
    Use back prop to train
    Then use forward prop to test

  • WHY NOT HAVE MULTIPLE LAYERS?

  • DOWNSIDE OF MULTIPLE LAYERS

    Number of weights between two layers is the product of their sizes
    The mathematics quickly becomes intractable
    Particularly when your input is an image with tens of thousands of pixels
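
    To see how quickly the weight count grows, the sketch below counts the weights of a fully connected network. The layer sizes are hypothetical and bias terms are ignored.

```python
def count_weights(layer_sizes):
    """Weights in a fully connected network, ignoring bias terms.

    Each pair of adjacent layers contributes the product of their sizes.
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

small = count_weights([4, 4, 3])                # 4*4 + 4*3
image = count_weights([10000, 5000, 5000, 10])  # a 100x100 pixel input
```

    The small network has 28 weights, while the image network has about 75 million, each of which is one dimension for gradient descent to search.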

  • APACHE SPARK MLLIB

  • WHAT IS SPARK?

  • Framework for processing data across a cluster
    By sending the code to the data
    And executing the code where the data lives

  • WHAT IS MLLIB?

    Library for Machine Learning.
    Builds on top of Spark RDDs.
    Provides RDDs for Machine Learning.
    Implements common Machine Learning algorithms.

  • DEMO USING APACHE TOREE

  • WHAT IS APACHE TOREE?

    Like IPython Notebook but for Spark/Scala.
    Jupyter kernel for Spark/Scala.

  • HOW CAN I INSTALL TOREE?

    Use pip to install IPython or Jupyter.
    Install Apache Spark by downloading the tgz file and expanding it.

    SPARK_HOME=$HOME/spark-1.6.0
    pip install toree
    jupyter toree install \
      --spark_home=$SPARK_HOME

  • HOW CAN I RUN A TOREE NOTEBOOK?

    jupyter notebook
    Visit http://localhost:8888
    Create new notebook.
    Set kernel to Toree.
    sc in notebook should print Spark Context.

  • NEURAL NETWORK CONSTRUCTION

  • HOW CAN I FIGURE OUT HOW MANY LAYERS?

    To figure out how many layers to use and what topology to use you have to rely on standard machine learning techniques.
    Use cross-validation.
    In general k-fold cross validation.
    10-fold cross validation is popular.

  • WHAT IS 10-FOLD CROSS VALIDATION OR K-FOLD CROSS VALIDATION?

  • Split your data into 10 (or in general k) equal-sized subsets.
    Train model on 9 of them, set one aside for cross-validation.
    Validate model on 10th and remember your error rate.
    Repeat by setting aside each one of the 10.
    Average the 10 error rates.
    Then repeat for the next model.
    Choose the model with the lowest error rate.
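
    The procedure above can be sketched as follows. The function names, the way records are assigned to folds (by index modulo k), and the stand-in majority-class "model" are all assumptions for illustration. In practice `train_and_score` would train a neural network on the training fold and return its error rate on the validation fold.

```python
def k_fold_indices(n_records, k):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_records))
    for fold in range(k):
        validation = indices[fold::k]
        train = [i for i in indices if i % k != fold]
        yield train, validation

def cross_validate(records, k, train_and_score):
    """Average the error rate returned by train_and_score over k folds."""
    errors = []
    for train_idx, valid_idx in k_fold_indices(len(records), k):
        train = [records[i] for i in train_idx]
        valid = [records[i] for i in valid_idx]
        errors.append(train_and_score(train, valid))
    return sum(errors) / len(errors)

def majority_class_error(train, valid):
    """Hypothetical stand-in model: always predict the most common label."""
    labels = [label for _, label in train]
    guess = max(set(labels), key=labels.count)
    return sum(1 for _, label in valid if label != guess) / len(valid)

# Hypothetical labeled records: (feature, label) pairs.
records = [(i, 1 if i < 15 else 0) for i in range(20)]
average_error = cross_validate(records, 10, majority_class_error)
```

    Running `cross_validate` once per candidate topology and keeping the one with the lowest average error implements the last two steps of the list.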

  • HOW DO I DEPLOY MY NEURAL NETWORK INTO PRODUCTION?

    There are two phases.
    The training phase can be run on the back-end servers.
    Cross-validate your model and its hyper-parameters on the back-end.
    Then deploy the model to the front-end servers, browsers, devices.
    The front-end only uses forward prop and is always fast.

  • DEEP LEARNING

  • WHAT IS DEEP LEARNING?

    Deep Learning is a learning method that can train the system with more than 2 or 3 non-linear hidden layers.

  • WHAT IS DEEP LEARNING?

    Machine learning techniques which enable unsupervised feature learning and pattern analysis/classification.
    The essence of deep learning is to compute representations of the data.
    Higher-level features are defined from lower-level ones.

  • HOW IS DEEP LEARNING DIFFERENT FROM REGULAR NEURAL NETWORKS?

    Training neural networks requires applying gradient descent on millions of dimensions.
    This is intractable for large networks.
    Deep learning places constraints on neural networks.
    This allows them to be solvable iteratively.
    The constraints are generic.

  • WHAT IS THE BIG DEAL ABOUT IT?

    AlexNet, submitted to the ImageNet ILSVRC challenge in 2012, is partly responsible for the renaissance.
    Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton used Deep Learning techniques.
    They combined this with GPUs and some other techniques.
    The result was a neural network that could classify images, for example of cats and dogs.
    It had a top-5 error of about 16% compared to about 26% for the runner-up.

  • ILYA SUTSKEVER, ALEX KRIZHEVSKY, GEOFFREY HINTON

  • WHAT ARE THE DIFFERENT KINDS OF DEEP ARCHITECTURES?

    Generative
    Discriminative
    Hybrid

  • WHAT ARE GENERATIVE ARCHITECTURES?

    Extract features from data
    Find common features in unlabelled data
    Like Principal Component Analysis
    Unsupervised: no labels required

  • WHAT ARE DISCRIMINATIVE ARCHITECTURES?

    Classify inputs into classes
    Require labels
    Require supervised training

  • WHAT ARE HYBRID ARCHITECTURES?

    Combination of generative and discriminative

    STEP 1
    Extract features using generative network
    Use unsupervised learning

    STEP 2
    Train discriminative network on extracted features
    Use supervised learning

  • WHAT ARE AUTO-ENCODERS?

    An auto-encoder is a learning algorithm.
    It applies backpropagation and sets the target values to be equal to its inputs.
    In other words it trains itself to do the identity transformation.

  • WHY DOES IT DO THIS?

    By placing constraints on it, like restricting the number of hidden neurons, it can find a good representation of the data.

  • IS THE AUTO-ENCODER SUPERVISED OR UNSUPERVISED?

  • It is unsupervised.
    The data is unlabeled.
    Auto-encoders are similar to PCA (Principal Component Analysis).
    PCA is a technique for reducing the dimensions of data.

  • WHAT ARE CONVOLUTION NEURAL NETWORKS?

    Feedforward neural networks.
    Connection pattern inspired by visual cortex.

  • CONVOLUTION NEURAL NETWORKS

    The convolution layer's parameters are a set of learnable filters.
    Every filter is small along width and height.
    During the forward pass, each filter slides across the width and height of the input, producing a 2-dimensional activation map.
    As we slide across the input we compute the dot product between the filter and the input.
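
    The sliding dot product described above can be sketched in plain Python. The 4x4 image and the 2x2 vertical-edge filter are hypothetical, and the sketch assumes no padding, a stride of 1, and no flipping of the filter (the convention most neural network libraries use under the name convolution).

```python
def convolve2d(image, kernel):
    """Slide the filter over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    activation_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product between the filter and the patch under it.
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        activation_map.append(row)
    return activation_map

# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Hypothetical 2x2 filter that responds to a dark-to-light vertical edge.
edge_filter = [[-1, 1],
               [-1, 1]]
activation_map = convolve2d(image, edge_filter)
```

    The activation map is nonzero only in the column where the filter straddles the edge, and it would respond the same way wherever in the image that edge appeared.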

  • CONVOLUTION NEURAL NETWORKS

    Intuitively, the network learns filters that activate when they see a specific type of feature anywhere.
    In this way it creates translation invariance.

  • WHAT IS A POOLING LAYER?

    The pooling layer reduces the resolution of the image further.
    It tiles the output area with a 2x2 mask and takes the maximum activation value of the area.
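
    A minimal sketch of 2x2 max pooling, assuming non-overlapping tiles and dropping any odd trailing row or column. The input values are hypothetical.

```python
def max_pool_2x2(activation_map):
    """Tile the map with non-overlapping 2x2 windows and keep each maximum."""
    pooled = []
    for r in range(0, len(activation_map) - 1, 2):
        row = []
        for c in range(0, len(activation_map[0]) - 1, 2):
            row.append(max(activation_map[r][c], activation_map[r][c + 1],
                           activation_map[r + 1][c], activation_map[r + 1][c + 1]))
        pooled.append(row)
    return pooled

# Hypothetical 4x4 activation map, reduced to 2x2.
pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 1],
                       [0, 0, 5, 6],
                       [1, 2, 7, 8]])
```

    Each pooling layer halves the width and height, so the layers that follow have a quarter as many values to process.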

  • DOES SPARK SUPPORT DEEP LEARNING?

    Not directly yet
    https://issues.apache.org/jira/browse/SPARK-2352

  • WHAT ARE SOME MAJOR DEEP LEARNING PLATFORMS?

  • Theano: Low-level GPU-enabled tensor library.
    Lasagne, Blocks: NN libraries that make Theano easier to use.
    Torch7: NN library. Uses Lua for binding. U