Deep learning on a mixed cluster with Deeplearning4j and Spark

Barcelona Spark meetup, Dec 9, 2016 (right after NIPS)

francois@garillot.net @huitseeker

Agenda

Intro

Why Deep Learning on a Cluster

Big Data Architecture

Deeplearning4j

Spark challenges

Introduction : Deep Learning in the trenches today

The bad thing about doing a talk right after NIPS

You guys are scary.

The good thing about doing a talk right after NIPS

You guys don't need to be told SkyNet is a fantasy (for now).

Paying algorithms

Anomaly detection in many forms (bad guys / predictive maintenance / market rally)

Fraud detection

Network intrusion

Fintech securities churn prediction

Video object detection (security)

Models that are being neglected in benchmarks and implementation efforts

Autoencoders

How to deal with this in the Spark world ?

experiment with trained model application: Tensorframes,

what are the deep learning frameworks that let you train?

Why Deep Learning on a cluster ?

Practically ... let's look at benchmarks

Training, but how ?

New Amazon GPU instances

Training, but how ?

Cluster training in the enterprise

it's really about multi-tenancy & economies of scale

a big bunch of machines shared among everybody shares better

if only because you can reuse it for other workloads

Minor reasons

enterprises may not have GPUs

Distributing training

basically distributing SGD (R)

challenge is AllReduce Communication

Sparse updates, async communications

Distributing training : good engineering matters

Cluster training in your (experimenter) case ?
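As a rough illustration of "distributing SGD", here is a minimal single-process sketch of synchronous parameter averaging: each simulated worker runs SGD on its own data shard, then all workers replace their weights with the element-wise mean. The class name, the least-squares model and the toy data are assumptions made for this example only; on a real cluster the averaging step is where the AllReduce communication happens.

```java
import java.util.Arrays;

/** Toy, single-process simulation of synchronous parameter averaging (illustrative only). */
class ParameterAveraging {

    /** One SGD step for least-squares linear regression (y ~ w.x) on a single example. */
    static void sgdStep(double[] w, double[] x, double y, double lr) {
        double pred = 0.0;
        for (int i = 0; i < w.length; i++) pred += w[i] * x[i];
        double err = pred - y;
        for (int i = 0; i < w.length; i++) w[i] -= lr * err * x[i];
    }

    /** Element-wise mean of the workers' parameter vectors. */
    static double[] average(double[][] workerParams) {
        int dim = workerParams[0].length;
        double[] avg = new double[dim];
        for (double[] p : workerParams)
            for (int i = 0; i < dim; i++) avg[i] += p[i] / workerParams.length;
        return avg;
    }

    /** Each round: every worker starts from the global weights, trains on its shard, then all average. */
    static double[] train(double[][][] shardX, double[][] shardY, int dim, int rounds, double lr) {
        double[] global = new double[dim];
        for (int r = 0; r < rounds; r++) {
            double[][] local = new double[shardX.length][];
            for (int wk = 0; wk < shardX.length; wk++) {
                local[wk] = global.clone();
                for (int n = 0; n < shardX[wk].length; n++)
                    sgdStep(local[wk], shardX[wk][n], shardY[wk][n], lr);
            }
            global = average(local); // the communication step on a real cluster
        }
        return global;
    }

    public static void main(String[] args) {
        // Hypothetical data: two workers, two examples each, generated from y = 2*x0 + 1*x1.
        double[][][] x = {{{1, 0}, {0, 1}}, {{1, 1}, {2, 1}}};
        double[][] y = {{2, 1}, {3, 5}};
        System.out.println(Arrays.toString(train(x, y, 2, 200, 0.1)));
    }
}
```

Averaging once per pass over the shard, rather than after every example, trades some statistical efficiency for far fewer communication rounds.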

it's a fun problem : AllReduce

Ultimately solved for people with a large amount of images

that solution is not open-source (but at Facebook, Google, Amazon, Microsoft¹, Baidu)

¹: 1-bit SGD is under non-commercial license in CNTK 2.0
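To make the AllReduce problem concrete, here is a small in-memory simulation of the well-known ring all-reduce pattern (a scatter-reduce phase followed by an all-gather phase). This is a sketch of the general technique, not the proprietary solutions mentioned above; the class name and the restriction that the vector length be divisible by the worker count are simplifying assumptions.

```java
import java.util.Arrays;

/** In-memory simulation of ring all-reduce (sum) across n workers; illustrative only. */
class RingAllReduce {

    /**
     * data[i] is worker i's vector. After the call, every row holds the element-wise sum.
     * Assumes the vector length is divisible by the number of workers.
     */
    static void allReduce(double[][] data) {
        int n = data.length;
        int chunk = data[0].length / n;

        // Phase 1, scatter-reduce: at step s, worker i sends chunk (i - s) mod n to its
        // right neighbour, which adds it in. The chunk a worker receives in a step is never
        // the one it sends in that step, so a sequential simulation is safe.
        for (int s = 0; s < n - 1; s++) {
            for (int i = 0; i < n; i++) {
                int c = Math.floorMod(i - s, n);
                int dst = (i + 1) % n;
                for (int k = c * chunk; k < (c + 1) * chunk; k++) data[dst][k] += data[i][k];
            }
        }

        // Worker i now owns the fully reduced chunk (i + 1) mod n.
        // Phase 2, all-gather: pass the completed chunks around the ring, overwriting.
        for (int s = 0; s < n - 1; s++) {
            for (int i = 0; i < n; i++) {
                int c = Math.floorMod(i + 1 - s, n);
                int dst = (i + 1) % n;
                System.arraycopy(data[i], c * chunk, data[dst], c * chunk, chunk);
            }
        }
    }

    public static void main(String[] args) {
        double[][] grads = {
            {1, 2, 3, 4, 5, 6},
            {10, 20, 30, 40, 50, 60},
            {100, 200, 300, 400, 500, 600}
        };
        allReduce(grads);
        for (double[] row : grads) System.out.println(Arrays.toString(row));
    }
}
```

Each worker sends 2(n - 1) chunks of size 1/n of the vector, so the per-worker traffic stays close to twice the vector size however many workers join the ring.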

Big Data architecture

With a parameter server

With Spark

Spark does the initial ETL

Spark ingests the final result

In the middle : parameter server.
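The parameter-server idea can be sketched in a few lines: a shared store that workers pull current weights from and push gradients to, without waiting for each other. Everything below (the class name, the in-process threads standing in for cluster nodes, the toy quadratic objective) is an assumption for illustration; a real parameter server shards parameters across machines and has to handle the network and fault tolerance.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy in-process parameter server: workers pull weights and push gradients asynchronously. */
class ToyParameterServer {
    private final double[] params;
    private final double learningRate;

    ToyParameterServer(int dim, double learningRate) {
        this.params = new double[dim];
        this.learningRate = learningRate;
    }

    /** Returns a snapshot of the current parameters. */
    synchronized double[] pull() {
        return params.clone();
    }

    /** Applies one gradient step; the gradient may have been computed on stale parameters. */
    synchronized void push(double[] gradient) {
        for (int i = 0; i < params.length; i++) params[i] -= learningRate * gradient[i];
    }

    public static void main(String[] args) throws InterruptedException {
        ToyParameterServer server = new ToyParameterServer(2, 0.1);
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (int w = 0; w < 4; w++) {
            workers.execute(() -> {
                for (int step = 0; step < 100; step++) {
                    double[] current = server.pull(); // may already be stale when we push
                    // Gradient of the hypothetical objective 0.5 * ||p - (3, -1)||^2.
                    double[] grad = {current[0] - 3.0, current[1] + 1.0};
                    server.push(grad);
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(Arrays.toString(server.pull()));
    }
}
```

In the architecture above, Spark would sit on either side of this component: preparing the training data before, and consuming the trained parameters after.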

Spark cluster modes

Mesos GPU support merged

devices cgroups !

YARN GPU support through tags

Spark Standalone : ?

Deeplearning4j

Deeplearning4j

the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala

Skymind its commercial support arm

Scientific computing on the JVM

libnd4j : Vectorization, 32-bit addressing, linalg (BLAS!)

JavaCPP: generates JNI bindings to your CPP libs

ND4J : numpy for the JVM, native superfast arrays

Datavec : one-stop interface to an NDArray

DeepLearning4J: orchestration, backprop, layer definition

ScalNet: gateway drug, inspired from (and closely following) Keras

RL4J : Reinforcement learning for the JVM

With Spark

    JavaSparkContext sc = ...;
    JavaRDD<DataSet> trainingData = ...;
    MultiLayerConfiguration networkConfig = ...;

    // Create the TrainingMaster instance
    int examplesPerDataSetObject = 1;
    TrainingMaster trainingMaster = new ParameterAveragingTrainingMaster.Builder(examplesPerDataSetObject)
            // .(other configuration options)
            .build();

    // Create the SparkDl4jMultiLayer instance
    SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, networkConfig, trainingMaster);

    // Fit the network using the training data
    sparkNetwork.fit(trainingData);

Spark Challenges

Even if you don't care about Deeplearning

(from Kazuaki Ishizaki @ IBM Japan)

SPARK-6442 : better linear algebra than Breeze

ND4J will have sparse representations soon

Even if you don't care about Deeplearning II

Meta-RDDs

Killing the bottlenecks

Spark has already changed its networking backend once.

better support for parameter servers and their fault tolerance.

A Last Word (from Andrew Y. Ng)

get involved !

don't just read papers, reproduce research results

Also

We're happy to mentor contributions, and there's a book !

Questions ?
