Jammy Zhou - Linaro · Deep Learning Pipelines Deep Learning Pipelines is built on Spark ML...

The Convergence of Big Data and AI Jammy Zhou - Linaro


Page 1:

The Convergence of Big Data and AI
Jammy Zhou - Linaro

Page 2:

Big Data and AI

[Diagram labels: Data Science; Data Analytics; Artificial Intelligence; Machine Learning; Deep Learning; Big Data; Datasets; Algorithms]

A unified cluster for both!

Page 3:

ML/DL Integration with Big Data

[Diagram labels: TensorFlowOnSpark; CaffeOnSpark; Deep Learning Pipelines. Categories: Apache Initiatives; 3rd Party Solutions; From Industry Vendors; From Chip Vendors]

Page 4:

Apache Spark Ecosystem
A unified analytics engine for large scale data processing

● Languages: Java, Scala, Python, R
● Service Modules: Spark SQL, Spark Streaming, MLlib, GraphX
● Computing Engine: Spark Core
● Resource Manager: Standalone, Hadoop YARN, Mesos, Kubernetes
● Storage & Data Sources: HDFS, HBase, Cassandra, Hive, …
● Other labels in the diagram: AWS EC2, SQL, Alluxio

Page 5:

ML Pipelines in MLlib

● ML Pipelines provide a set of APIs built on top of DataFrames from Spark SQL
● Transformer is an algorithm to transform one DataFrame into another
● Estimator is an algorithm to be fit on a DataFrame to produce a Transformer
● Pipeline chains multiple Transformers and Estimators to specify a workflow

[Diagram]
Training: DataFrame -> Pipeline (Transformer, Transformer, Estimator) -> PipelineModel
Prediction: DataFrame -> PipelineModel (Transformer, Transformer, Model) -> Result
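The Transformer / Estimator / Pipeline contract on this slide can be illustrated with a toy, pure-Python sketch. This is not the Spark MLlib API: the "DataFrame" is just a list of dict rows, and the names `Lowercase`, `VocabularyEstimator`, and `VocabularyModel` are hypothetical stand-ins. The only point is that fitting a Pipeline yields a PipelineModel in which each Estimator has been replaced by the Transformer it produced.

```python
# Toy illustration of the Transformer / Estimator / Pipeline contract.
# NOT the Spark MLlib API: a "DataFrame" here is just a list of dict rows.

class Lowercase:
    """Transformer: maps one 'DataFrame' to another."""
    def transform(self, rows):
        return [{**r, "text": r["text"].lower()} for r in rows]


class VocabularyModel:
    """Transformer produced by fitting VocabularyEstimator."""
    def __init__(self, vocab):
        self.vocab = vocab

    def transform(self, rows):
        # Add a bag-of-words count vector over the fitted vocabulary.
        return [
            {**r, "features": [r["text"].split().count(w) for w in self.vocab]}
            for r in rows
        ]


class VocabularyEstimator:
    """Estimator: fit() on a 'DataFrame' returns a Transformer."""
    def fit(self, rows):
        vocab = sorted({w for r in rows for w in r["text"].split()})
        return VocabularyModel(vocab)


class PipelineModel:
    """Fitted pipeline: a chain of Transformers only."""
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Chains Transformers and Estimators into one workflow."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)   # Estimator -> Transformer
            rows = stage.transform(rows)  # output feeds the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


train = [{"text": "Spark ML"}, {"text": "Deep Learning on Spark"}]
model = Pipeline([Lowercase(), VocabularyEstimator()]).fit(train)
result = model.transform([{"text": "SPARK spark learning"}])
```

In this sketch `model.stages[1]` is a `VocabularyModel`, not the original estimator, which mirrors the Estimator-to-Model substitution shown in the slide's diagram.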

Page 6:

ML Algorithms in MLlib

● Classification & Regression
    ○ Logistic Regression
    ○ Decision Tree
    ○ Random Forest
    ○ Gradient-Boosted Tree
    ○ Linear Regression
    ○ Multilayer Perceptron
    ○ Linear Support Vector Machine
    ○ Naive Bayes
● Clustering
    ○ K-means
    ○ Latent Dirichlet Allocation
    ○ Bisecting k-means
    ○ Gaussian Mixture Model
● Collaborative Filtering
● Frequent Pattern Mining
    ○ FP-Growth
    ○ PrefixSpan

How about Deep Learning?

Page 7:

Deep Learning Pipelines

● Deep Learning Pipelines is built on Spark ML Pipelines by Databricks
● Images are loaded into a DataFrame and decoded automatically
● Enable fast transfer learning with Featurizer to reuse pre-trained models
● Apply pre-trained deep learning models as Transformers
    ○ TF-backed Keras models and TF Graphs are supported
● Deploy models with Spark DataFrames and SQL UDFs
● Distributed hyperparameter tuning with Estimator and MLlib built-in tools like CrossValidator and TrainValidationSplit
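As a rough illustration of what cross-validated hyperparameter tuning does, the sketch below runs a k-fold grid search in plain Python. It is not the MLlib `CrossValidator` API and it runs sequentially rather than distributed; the toy one-dimensional ridge model and the helper names `fit_ridge_1d`, `mse`, and `cross_validate` are assumptions made only for this example.

```python
from itertools import product


def fit_ridge_1d(rows, reg):
    """Fit y ~ w * x with an L2 penalty `reg`; returns the weight w."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, y in rows)
    return sxy / (sxx + reg)


def mse(w, rows):
    """Mean squared error of the model y = w * x on `rows`."""
    return sum((y - w * x) ** 2 for x, y in rows) / len(rows)


def cross_validate(rows, param_grid, k=3):
    """Score every grid combination with k-fold cross-validation.

    Returns (best_params, results), where results is a list of
    (average_fold_mse, params) tuples in grid order.
    """
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for fold in range(k):
            held_out = rows[fold::k]  # every k-th row, starting at `fold`
            training = [r for i, r in enumerate(rows) if i % k != fold]
            w = fit_ridge_1d(training, **params)
            fold_scores.append(mse(w, held_out))
        results.append((sum(fold_scores) / k, params))
    best_score, best_params = min(results, key=lambda item: item[0])
    return best_params, results


# Noise-free data y = 2x, so the unregularized model should win.
data = [(float(x), 2.0 * x) for x in range(1, 10)]
best, scores = cross_validate(data, {"reg": [0.0, 1.0, 10.0]})
```

Each (parameter combination, fold) pair is an independent fit, which is what makes this kind of search a natural candidate for distribution across a cluster.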


Page 8:

Project Hydrogen

● A Spark initiative to unify Big Data and AI workloads
● Barrier execution mode was introduced in Spark to run a distributed DL job as a Spark job with gang scheduling
    ○ Horovod integration via HorovodRunner (by Databricks Runtime ML) or horovod.spark (by Horovod) to run Horovod as a Spark job
● Optimized data exchange between Spark and DL frameworks
    ○ Pandas UDF implementation via Apache Arrow
● Accelerator aware scheduling
    ○ Heterogeneous accelerator support by resource managers like YARN, Mesos & Kubernetes
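Gang scheduling, which the slide names as the basis of barrier execution mode, means that all tasks of a job are launched together or not at all, since a distributed DL job cannot make progress with only some of its workers. Below is a minimal sketch of that all-or-nothing rule in plain Python. It is a toy model, not Spark's scheduler; the function `try_launch_gang` and its slot bookkeeping are hypothetical.

```python
def try_launch_gang(num_tasks, free_slots):
    """Launch all tasks together or none (all-or-nothing placement).

    free_slots maps an executor name to its number of free task slots.
    Returns a task-index -> executor assignment, or None if the whole
    gang does not fit yet. Toy model only, not Spark's scheduler.
    """
    if sum(free_slots.values()) < num_tasks:
        # A partial launch could leave the started tasks waiting
        # forever for peers that never arrive, so launch nothing.
        return None

    assignment = {}
    task = 0
    for executor, slots in free_slots.items():
        for _ in range(slots):
            if task == num_tasks:
                return assignment
            assignment[task] = executor
            task += 1
    return assignment


partial = try_launch_gang(4, {"exec-1": 2, "exec-2": 1})  # only 3 slots
full = try_launch_gang(4, {"exec-1": 2, "exec-2": 2})     # 4 slots
```

Here `partial` is `None` because only three slots are free for four tasks, while `full` places every task at once.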

Page 9:

Distributed Deep Learning

● Distributed support is critical to integrate DL frameworks with Spark
● Parallelism for Deep Learning
    ○ Data parallelism (a.k.a. between-graph replication)
        ■ Synchronous vs. asynchronous
        ■ Centralized vs. decentralized for synchronous training
            ● Parameter server for centralized mode
            ● Ring-allreduce for decentralized mode
        ■ Parameter server can also be used for asynchronous training
    ○ Model parallelism (a.k.a. in-graph replication)
● Multi-device & multi-node communication
    ○ Interconnect: PCIe, NVLink, xGMI, InfiniBand, Omni-Path, High-Speed Ethernet, RoCE
    ○ Libraries: OpenMPI, NCCL (Nvidia), RCCL (AMD), libfabric (OpenFabrics), UCX
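To make the decentralized mode concrete, here is a single-process simulation of ring-allreduce for synchronous data parallelism. It is a sketch under simplifying assumptions, not the NCCL or Horovod implementation: each worker's gradient vector has exactly as many elements as there are workers (one element per chunk), and the "sends" are plain list updates. The function name `ring_allreduce` is chosen for this example.

```python
def ring_allreduce(worker_grads):
    """Sum equal-length gradient vectors across workers arranged in a ring.

    Single-process simulation: worker i only ever "sends" to worker
    (i + 1) % n. For simplicity each vector is split into n chunks of
    one element, so the vector length must equal the number of workers.
    """
    n = len(worker_grads)
    assert all(len(g) == n for g in worker_grads)
    data = [list(g) for g in worker_grads]  # per-worker local buffers

    # Phase 1, scatter-reduce: after n - 1 steps, worker i holds the
    # fully summed chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all messages first so the sends happen "simultaneously".
        sends = [((i - step) % n, data[i][(i - step) % n]) for i in range(n)]
        for i, (chunk, value) in enumerate(sends):
            data[(i + 1) % n][chunk] += value

    # Phase 2, allgather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        sends = [((i + 1 - step) % n, data[i][(i + 1 - step) % n]) for i in range(n)]
        for i, (chunk, value) in enumerate(sends):
            data[(i + 1) % n][chunk] = value

    return data


grads = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
reduced = ring_allreduce(grads)
```

Every worker ends with the same summed vector after sending 2(n - 1) chunks to its neighbour, and no single node has to receive every worker's gradients, which is the contrast with the centralized parameter-server mode.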


Page 10:

Distributed Framework Support

[Diagram labels: TensorFlow; Keras; Backend; PyTorch; MXNet; TensorFlowOnSpark; Horovod; HopsML; HorovodEstimator; Angel ML; Parameter server; Ring-allreduce; footnote markers [1][2] and [3]. The layout of the diagram is not recoverable from the transcript.]

[1] TensorFlow has MPI collectives for Baidu allreduce; Horovod replaces Baidu allreduce with NCCL
[2] CollectiveAllReduceStrategy is used by HopsML
[3] HorovodEstimator is Horovod integration with Spark MLlib for distributed training

Page 11:

The Arm Story

● Linaro Data Center and Cloud Group
    ○ Big Data Lead Project
    ○ HPC SIG (SVE, MPI, math libraries, etc)
● Linaro Machine Intelligence Initiative
    ○ Initial focus on inference support with Cortex-A SoCs
    ○ ArmNN, TVM, etc
● Nvidia AI and HPC stack for Arm (planned for end of 2019)
    ○ Announced at ISC 19 in Frankfurt on June 17th
    ○ Lift the major barrier to integrate AI solutions with Big Data on Arm platforms
● What’s next?

Page 12:

Thank you

Join Linaro to accelerate deployment of your Arm-based solutions through collaboration

[email protected]