Spark Meetup TensorFrames

TensorFrames: Google Tensorflow on Apache Spark Tim Hunter Spark Summit 2016 - Meetup


Page 1: Spark Meetup TensorFrames

TensorFrames: Google TensorFlow on Apache Spark

Tim Hunter

Spark Summit 2016 - Meetup

Page 2: Spark Meetup TensorFrames

How familiar are you with Spark?

1. What is Apache Spark?

2. I have used Spark

3. I am using Spark in production or I contribute to its development


Page 3: Spark Meetup TensorFrames

How familiar are you with TensorFlow?

1. What is TensorFlow?

2. I have heard about it

3. I am training my own neural networks


Page 4: Spark Meetup TensorFrames

About Databricks

Founded by the team who created Apache Spark

Offers a hosted service:
- Apache Spark in the cloud
- Notebooks
- Cluster management
- Production environment

Page 5: Spark Meetup TensorFrames

About me

Software engineer at Databricks

Apache Spark contributor

Ph.D. UC Berkeley in Machine Learning

(and Spark user since Spark 0.2)

Page 6: Spark Meetup TensorFrames

Outline

• Numerical computing with Apache Spark

• Using GPUs with Spark and TensorFlow

• Performance details

• The future

Page 7: Spark Meetup TensorFrames

Numerical computing for Data Science

• Queries are data-heavy

• However, algorithms are computation-heavy

• They operate on simple data types: integers, floats, doubles, vectors, matrices

Page 8: Spark Meetup TensorFrames

The case for speed

• Numerical bottlenecks are good targets for optimization

• Let data scientists get faster results

• Faster turnaround for experiments

• How can we run these numerical algorithms faster?

Page 9: Spark Meetup TensorFrames

Evolution of computing power

[Figure labels: "Failure is not an option: it is a fact"; "When you can afford your dedicated chip"; GPGPU; Scale out; Scale up]

Page 10: Spark Meetup TensorFrames

Evolution of computing power

[Figure labels: NLTK; Theano; "Today’s talk: Spark + TensorFlow"]

Page 11: Spark Meetup TensorFrames

Evolution of computing power

• Processor speed cannot keep up with memory and network improvements

• Access to the processor is the new bottleneck

• Project Tungsten in Spark: leverage the processor’s heuristics for executing code and fetching memory

• Does not account for the fact that the problem is numerical


Page 12: Spark Meetup TensorFrames

Asynchronous vs. synchronous

• Asynchronous algorithms perform updates concurrently

• Spark uses a synchronous model; deep learning frameworks are usually asynchronous

• A large number of ML computations are synchronous

• Even deep learning may benefit from synchronous updates
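A minimal sketch of what a synchronous update means, in plain Python with no Spark or TensorFlow: every partition computes a gradient against the same weight, and the weight changes only once all of them have reported. The function names, the toy least-squares loss, and the data are illustrative assumptions, not part of the talk.

```python
# Illustrative only: synchronous gradient descent fitting y = w * x with one weight.

def partition_gradient(w, partition):
    """Mean gradient of 0.5 * (w*x - y)^2 over the (x, y) pairs of one partition."""
    return sum((w * x - y) * x for x, y in partition) / len(partition)

def synchronous_step(w, partitions, learning_rate):
    """All partitions read the same w; w is updated once, after all gradients arrive."""
    gradients = [partition_gradient(w, p) for p in partitions]  # a barrier on a real cluster
    return w - learning_rate * sum(gradients) / len(gradients)

# Hypothetical data generated from y = 2x, split into two partitions.
partitions = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]

w = 0.0
for _ in range(200):
    w = synchronous_step(w, partitions, learning_rate=0.05)
```

An asynchronous variant would let each partition apply its own gradient as soon as it is ready, so later gradients may have been computed against a stale weight.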

Page 13: Spark Meetup TensorFrames

Outline

• Numerical computing with Apache Spark

• Using GPUs with Spark and TensorFlow

• Performance details

• The future

Page 14: Spark Meetup TensorFrames

GPGPUs

• Graphics Processing Units for General Purpose computations

[Figure: theoretical peak throughput and theoretical peak bandwidth, GPU vs. CPU]

Page 15: Spark Meetup TensorFrames

Google TensorFlow

• Library for writing “machine intelligence” algorithms

• Very popular for deep learning and neural networks

• Can also be used for general purpose numerical computations

• Interface in C++ and Python

Page 16: Spark Meetup TensorFrames

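Numerical dataflow with TensorFlow

# TensorFlow 1.x graph API
import tensorflow as tf

# Build the graph: z = x + 3 * y
x = tf.placeholder(tf.int32, name="x")
y = tf.placeholder(tf.int32, name="y")
output = tf.add(x, 3 * y, name="z")

# Run the graph with concrete values for the placeholders
session = tf.Session()
output_value = session.run(output, {x: 3, y: 5})

[Figure: dataflow graph with inputs x: int32 and y: int32, a "mul 3" node, and output z]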
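The point of this slide is that building the graph and running it are separate steps. The sketch below mimics that split in plain Python; the `Placeholder`, `Scale`, `Add`, and `run` names are hypothetical stand-ins for illustration and are not TensorFlow's API.

```python
# Illustrative stand-in for a dataflow graph; none of these classes are TensorFlow's.

class Placeholder:
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]  # value supplied at run time

class Scale:
    def __init__(self, factor, node):
        self.factor, self.node = factor, node

    def evaluate(self, feed):
        return self.factor * self.node.evaluate(feed)

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)

def run(node, feed):
    """Evaluate a graph node given values for its placeholders."""
    return node.evaluate(feed)

# Build the graph first: z = x + 3 * y. Nothing is computed yet.
x = Placeholder("x")
y = Placeholder("y")
z = Add(x, Scale(3, y))

# Then run it with concrete inputs, as session.run does on the slide.
output_value = run(z, {"x": 3, "y": 5})  # 3 + 3 * 5 = 18
```

Because the graph is a description rather than a result, the same `z` can be evaluated again with different inputs, or handed to another runtime to execute.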

Page 17: Spark Meetup TensorFrames

Numerical dataflow with Spark

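import tensorflow as tf
import tensorframes as tfs

df = sqlContext.createDataFrame(…)

# Same graph as before: z = x + 3 * y
x = tf.placeholder(tf.int32, name="x")
y = tf.placeholder(tf.int32, name="y")
output = tf.add(x, 3 * y, name="z")

# Apply the graph to every row of the DataFrame
output_df = tfs.map_rows(output, df)

output_df.collect()

[Figure: df: DataFrame[x: int, y: int] feeds the graph (x: int32, y: int32, "mul 3", z), producing output_df: DataFrame[x: int, y: int, z: int]]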
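As a rough mental model of what the slide shows `tfs.map_rows` doing, the plain-Python sketch below applies the same expression to each row and appends the result as a new column `z`. Rows are modelled as dictionaries and `map_rows_sketch` is a hypothetical helper; the real TensorFrames call runs the TensorFlow graph inside the Spark workers.

```python
# Illustrative model of row-wise mapping; not the TensorFrames implementation.

def map_rows_sketch(fn, rows):
    """Return new rows with an extra column "z" computed from each input row."""
    return [dict(row, z=fn(row["x"], row["y"])) for row in rows]

# Hypothetical contents of df: DataFrame[x: int, y: int]
rows = [{"x": 3, "y": 5}, {"x": 1, "y": 0}]

output_rows = map_rows_sketch(lambda x, y: x + 3 * y, rows)
```

The input rows are left untouched and the output keeps the original columns, which matches the `DataFrame[x: int, y: int, z: int]` schema on the slide.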

Page 18: Spark Meetup TensorFrames

Outline

• Numerical computing with Apache Spark

• Using GPUs with Spark and TensorFlow

• Performance details

• The future

Page 19: Spark Meetup TensorFrames

It is a communication problem

[Figure: data moving between the Spark worker process and the worker Python process. Labels: Tungsten binary format; Java object; Python pickle (twice); C++ buffer]

Page 20: Spark Meetup TensorFrames

TensorFrames: native embedding of TensorFlow

[Figure: Spark worker process only. Labels: Tungsten binary format; Java object; C++ buffer]
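To make the serialization cost behind these two diagrams more concrete, the sketch below compares two ways of handing a column of doubles to other code, using only the Python standard library: pickling a list of Python objects, and packing the values into a contiguous binary buffer. The variable names and sizes are illustrative assumptions, and the sketch says nothing about the actual byte layouts used by Spark or TensorFrames.

```python
import pickle
from array import array

values = [float(i) for i in range(1000)]  # hypothetical column of doubles

# Path 1: object serialization, as when rows cross into a Python worker.
pickled = pickle.dumps(values)
restored_from_pickle = pickle.loads(pickled)

# Path 2: a packed buffer of 8-byte doubles, a layout native code can read directly.
packed = array("d", values).tobytes()
restored_from_buffer = array("d")
restored_from_buffer.frombytes(packed)

# memoryview exposes the packed bytes as doubles without copying them.
view = memoryview(packed).cast("d")
```

The packed path has a fixed size per element and can be read in place, which is the property the native embedding relies on to skip the pickle round trip.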

Page 21: Spark Meetup TensorFrames

An example: kernel density scoring

• Estimation of distribution from samples

• Non-parametric

• Unknown bandwidth parameter

• Can be evaluated with goodness of fit

Page 22: Spark Meetup TensorFrames

An example: kernel density scoring

• In practice, compute: [formula not preserved in this transcript]

with: [formula not preserved in this transcript]

• In a nutshell: a complex numerical function
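The formulas on this slide did not survive extraction. Judging from the Scala and TensorFlow code on the following slides, the quantity being computed appears to be the one below, where the z_k are the N points and b is the bandwidth. The symbol d_k is introduced here for readability and is an assumption, not the slide's notation.

```latex
\mathrm{score}(x) = \log\left( \frac{1}{bN} \sum_{k=1}^{N} \exp\bigl(d_k(x)\bigr) \right),
\qquad
d_k(x) = -\frac{(x - z_k)^2}{2 b^2}
```

This matches the code term by term: the code evaluates the logarithm of the sum after subtracting a shift from every d_k, adds the shift back, and subtracts log(bN).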

Page 23: Spark Meetup TensorFrames

Speedup

[Figure: bar chart of run time (sec), axis from 0 to 180, comparing Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU]

def score(x: Double): Double = {
  val dis = points.map { z_k => -(x - z_k) * (x - z_k) / (2 * b * b) }
  // Shift by the minimum before exponentiating; the shift is added back below
  val minDis = dis.min
  val exps = dis.map(d => math.exp(d - minDis))
  minDis - math.log(b * N) + math.log(exps.sum)
}

val scoreUDF = sqlContext.udf.register("scoreUDF", score _)
sql("select sum(scoreUDF(sample)) from samples").collect()

Page 24: Spark Meetup TensorFrames

Speedup

[Figure: bar chart of run time (sec), axis from 0 to 180, comparing Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU]

def score(x: Double): Double = {
  // Preallocated array and while loops instead of collection operations
  val dis = new Array[Double](N)
  var idx = 0
  while (idx < N) {
    val z_k = points(idx)
    dis(idx) = -(x - z_k) * (x - z_k) / (2 * b * b)
    idx += 1
  }
  val minDis = dis.min
  var expSum = 0.0
  idx = 0
  while (idx < N) {
    expSum += math.exp(dis(idx) - minDis)
    idx += 1
  }
  minDis - math.log(b * N) + math.log(expSum)
}

val scoreUDF = sqlContext.udf.register("scoreUDF", score _)
sql("select sum(scoreUDF(sample)) from samples").collect()

Page 25: Spark Meetup TensorFrames

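Speedup

[Figure: bar chart of run time (sec), axis from 0 to 180, comparing Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU]

# Assumes the TensorFlow ops (square, constant, ...) and tensorframes (tfs) are imported,
# and that X (the points) and N (their count) are defined.
def cost_fun(block, bandwidth):
    distances = -square(constant(X) - block) / (2 * bandwidth * bandwidth)
    m = reduce_max(distances, 0)
    x = log(reduce_sum(exp(distances - m), 0))
    return identity(x + m - log(bandwidth * N), name="score")

sample = tfs.block(df, "sample")
score = cost_fun(sample, bandwidth=0.5)
df.agg(sum(tfs.map_blocks(score, df))).collect()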
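Both the Scala and the TensorFlow versions subtract a shift before exponentiating and add it back afterwards: the Scala code shifts by the minimum, the TensorFlow code by the maximum. A small plain-Python sketch of this log-sum-exp trick, with hypothetical points and bandwidth, shows that the shift does not change the result on well-scaled input:

```python
import math

def kde_log_score(x, points, b, shift=max):
    """log( 1/(b*N) * sum_k exp(-(x - z_k)^2 / (2*b^2)) ), computed with a shift."""
    distances = [-(x - z_k) ** 2 / (2 * b * b) for z_k in points]
    m = shift(distances)
    exp_sum = sum(math.exp(d - m) for d in distances)
    return m - math.log(b * len(points)) + math.log(exp_sum)

points = [0.0, 0.5, 1.0, 2.0]  # hypothetical sample points z_k
b = 0.5                        # hypothetical bandwidth

score_max_shift = kde_log_score(1.0, points, b, shift=max)
score_min_shift = kde_log_score(1.0, points, b, shift=min)

# Direct evaluation without any shift, for comparison on this small input.
direct = math.log(
    sum(math.exp(-(1.0 - z) ** 2 / (2 * b * b)) for z in points) / (b * len(points))
)
```

Shifting by the maximum keeps every exponent at or below zero, so the sum always contains a term equal to 1 and the logarithm stays finite even when `x` is far from every point, a case where the unshifted sum would underflow to zero.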

Page 26: Spark Meetup TensorFrames

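Speedup

[Figure: bar chart of run time (sec), axis from 0 to 180, comparing Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU]

# Assumes the TensorFlow ops (square, constant, device, ...) and tensorframes (tfs) are imported,
# and that X (the points) and N (their count) are defined.
def cost_fun(block, bandwidth):
    distances = -square(constant(X) - block) / (2 * bandwidth * bandwidth)
    m = reduce_max(distances, 0)
    x = log(reduce_sum(exp(distances - m), 0))
    return identity(x + m - log(bandwidth * N), name="score")

# Same computation, with the graph placed on the GPU
with device("/gpu"):
    sample = tfs.block(df, "sample")
    score = cost_fun(sample, bandwidth=0.5)
df.agg(sum(tfs.map_blocks(score, df))).collect()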

Page 27: Spark Meetup TensorFrames

Demo: Deep dreams


Page 28: Spark Meetup TensorFrames

Demo: Deep dreams


Page 29: Spark Meetup TensorFrames

Outline

• Numerical computing with Apache Spark

• Using GPUs with Spark and TensorFlow

• Performance details

• The future

Page 30: Spark Meetup TensorFrames

Improving communication

[Figure: Spark worker process. Labels: Tungsten binary format; Java object; C++ buffer; Direct memory copy; Columnar storage]

Page 31: Spark Meetup TensorFrames

The future

• Integration with Tungsten:
  • Direct memory copy
  • Columnar storage

• Better integration with MLlib data types

• GPU instances in Databricks: official support coming this summer

Page 32: Spark Meetup TensorFrames

Recap

• Spark: an efficient framework for running computations on thousands of computers

• TensorFlow: high-performance numerical framework

• Get the best of both with TensorFrames:
  • Simple API for distributed numerical computing
  • Can leverage the hardware of the cluster

Page 33: Spark Meetup TensorFrames

Try these demos yourself

• TensorFrames source code and documentation:
  github.com/tjhunter/tensorframes
  spark-packages.org/package/tjhunter/tensorframes

• Demo available later on Databricks

• The official TensorFlow website: www.tensorflow.org

• More questions and attending the Spark Summit? We will hold office hours at the Databricks booth.

Page 34: Spark Meetup TensorFrames

Thank you.