Distributed Multi-device Execution of TensorFlow – an Outlook


Page 1: Distributed Multi-device Execution of TensorFlow – an Outlook

Unrestricted © Siemens AG. 2016. All rights reserved.

Distributed Multi-device Execution of TensorFlow – an Outlook

Meetup “TensorFlow & OpenAI – a match made in Heaven?” | 2016-03-01

Page 2: Distributed Multi-device Execution of TensorFlow – an Outlook

Page 2 March 2016 Sebnem Rusitschka Unrestricted © Siemens AG. 2016. All rights reserved.

What is TensorFlow?

numerical computation library using data flow graphs, deployable on heterogeneous distributed systems

Machine Learning Perspective

Distributed Computing Perspective

source: http://www.tomlichtenheld.com/childrens_books/duckrabbit!.html

Page 3: Distributed Multi-device Execution of TensorFlow – an Outlook

What is TensorFlow?

numerical computation library using data flow graphs, deployable on heterogeneous distributed systems

Machine Learning Perspective

Distributed, Embedded Computing Perspective

source: http://www.tomlichtenheld.com/childrens_books/duckrabbit!.html

Page 4: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed computing perspective

• processor, memory, network hierarchies
• automatically assign to computational devices
• execute in parallel
• multi-dimensional data flow computations

source: https://www.tensorflow.org/
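
A minimal sketch of what "execute in parallel" means for a data flow graph: operations whose inputs are all available form a wave and could run concurrently on different devices. The graph, the node names and the helper `parallel_waves` are assumptions made up for illustration; this is not TensorFlow's API or its actual scheduler.

```python
def parallel_waves(graph):
    """Group the nodes of a data flow graph into waves.

    `graph` maps each node name to the list of nodes it depends on.
    All nodes in one wave have their inputs ready and could run in parallel.
    """
    remaining = {node: set(deps) for node, deps in graph.items()}
    waves = []
    while remaining:
        # Nodes with no unmet dependencies are ready; sort for a stable order.
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("graph contains a cycle")
        waves.append(ready)
        for n in ready:
            del remaining[n]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Hypothetical graph for y = relu(matmul(x, w) + b).
example_graph = {
    "x": [], "w": [], "b": [],
    "matmul": ["x", "w"],
    "add": ["matmul", "b"],
    "relu": ["add"],
}
waves = parallel_waves(example_graph)
```

Here the three inputs form the first wave, and the dependent operations follow one per wave. Wider graphs yield wider waves and therefore more room for multi-device execution.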

Page 5: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed computing perspective

processor, memory, network hierarchies

multi-dimensional data flow computations

Task Scheduling: placement, parallelization

Resource Management: resource availability, costs

Google's cluster management system “Borg” 1)

“Significant area of future work: improving the placement and node scheduling algorithms” 1)

1) http://download.tensorflow.org/paper/whitepaper2015.pdf

source: https://www.tensorflow.org/
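
The interplay of placement and costs can be sketched as a greedy loop: walk the operations in dependency order and put each one on the device where it would finish earliest, given estimated compute and transfer costs. This is an illustrative sketch only, loosely in the spirit of the cost-model-driven heuristic discussed in the white paper 1). The function `greedy_placement`, the device names and all cost numbers are assumptions invented for the example.

```python
def greedy_placement(order, deps, compute_cost, transfer_cost, devices):
    """Place each op on the device where it would finish earliest.

    order         -- ops in a valid topological order
    deps          -- op -> list of input ops
    compute_cost  -- (op, device) -> estimated run time
    transfer_cost -- cost of moving one input between two different devices
    devices       -- device names; list order breaks ties deterministically
    """
    placement, finish = {}, {}
    device_free = {d: 0.0 for d in devices}
    for op in order:
        best = None
        for d in devices:
            # An input is ready once produced, plus a transfer if it lives elsewhere.
            ready = max(
                (finish[i] + (0.0 if placement[i] == d else transfer_cost)
                 for i in deps[op]),
                default=0.0,
            )
            start = max(ready, device_free[d])
            end = start + compute_cost[(op, d)]
            if best is None or end < best[0]:
                best = (end, d)
        finish[op], placement[op] = best
        device_free[best[1]] = best[0]
    return placement, finish


# Hypothetical three-op pipeline with made-up cost estimates.
devices = ["cpu:0", "gpu:0"]
order = ["load", "matmul", "sum"]
deps = {"load": [], "matmul": ["load"], "sum": ["matmul"]}
compute_cost = {
    ("load", "cpu:0"): 1.0, ("load", "gpu:0"): 5.0,
    ("matmul", "cpu:0"): 8.0, ("matmul", "gpu:0"): 2.0,
    ("sum", "cpu:0"): 1.0, ("sum", "gpu:0"): 1.0,
}
placement, finish = greedy_placement(order, deps, compute_cost, 1.0, devices)
```

With these numbers `load` stays on the CPU, while `matmul` and `sum` move to the GPU because the transfer cost is smaller than the compute saving. Ties are broken by the order of `devices`, so the same inputs always give the same placement.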

Page 6: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed, embedded systems perspective?

A presentation by Pete Warden, Tech Lead of the TensorFlow Mobile/Embedded team:

“GoogLeNet v1 is 7MB after just quantization”

http://ip.cadence.com/uploads/presentations/1100AM_TensorFlow_on_Embedded_Devices_PeteWarden.pdf
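
To make the quantization remark concrete, here is a minimal sketch of linear 8-bit quantization: each float weight is replaced by a one-byte code over the min/max range of the weights. It is not claimed to be the scheme used in the presentation above; the function names and the sample weights are hypothetical.

```python
def quantize_8bit(weights):
    """Map floats to one-byte codes (0..255) spanning the min/max range."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # constant input: any non-zero scale works
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize_8bit(codes, lo, scale):
    """Recover approximate floats from the one-byte codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights, not taken from any real model.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, lo, scale = quantize_8bit(weights)
restored = dequantize_8bit(codes, lo, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

A weight stored as a 32-bit float takes 4 bytes and its code takes 1 byte, so this kind of scheme shrinks the weights by roughly a factor of four, at the price of a rounding error of at most half a quantization step per weight.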

sources:

https://www.youtube.com/watch?v=b0hqhcwDIi4
https://www.autonomous.ai/deep-learning-robot
http://www.nvidia.com/object/embedded-systems.html
http://www.iphoneincanada.ca/news/tesla-autopilot-summon/
https://www.youtube.com/watch?v=AbcRlDBnwjM
http://www.dexterindustries.com/shop/gopigo-starter-kit-2/

Page 7: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed, embedded systems perspective

All things Tensor

• embedded systems sense multidimensional, multimodal, streaming data
• tensor networks for easy implementation of most complex mathematical operations

Dataflow paradigm

• data is king
• deterministic data acquisition & calculation
• real-time constraints
• concurrency
• multi-core, GPU, FPGA
• enables true portability

source: https://www.tensorflow.org/

Page 8: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed, embedded systems perspective

Insufficient tensor support

• BLAS up to matrix-matrix ops

a start: extensions to Eigen by Benoit Steiner for TensorFlow http://eigen.tuxfamily.org/dox-devel/unsupported/classEigen_1_1Tensor.html

source: https://www.tensorflow.org/

Page 9: Distributed Multi-device Execution of TensorFlow – an Outlook

TensorFlow from a distributed, embedded systems perspective

Insufficient tensor support

• BLAS up to matrix-matrix ops

a start: extensions to Eigen by Benoit Steiner for TensorFlow http://eigen.tuxfamily.org/dox-devel/unsupported/classEigen_1_1Tensor.html

Heuristic placement algorithm

• suited for cloud resources

need: determinism

Resource Management

• suited for large-scale clusters

need: including resources in embedded systems

source: https://www.tensorflow.org/

Page 10: Distributed Multi-device Execution of TensorFlow – an Outlook

Upcoming workshop on tensor computing for IoT

Topics of interest

• multidimensional IoT data
• tensor methods and deep learning
• distributed data and computing models
• across heterogeneous architectures of multi-core cluster and embedded computing
• optimized and verifiable composition of operations in an n-dimensional array/tensor algebra

(Perfect timing, TensorFlow!)

Manifesto will be available here:

http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16152

Page 11: Distributed Multi-device Execution of TensorFlow – an Outlook

Sneak Peek: Multidimensional IoT data

Large-scale autonomous systems generate massive amounts of data captured by embedded devices

• about dynamic flows
• in dynamic networks
• streaming, GPS-synchronized
• captures various aspects, measurements
• highly correlated, coming from networked systems

Page 12: Distributed Multi-device Execution of TensorFlow – an Outlook

Sneak Peek: Tensor Networks (TN)

“Geometrization”, graphical representation

• modify, optimize TN structure
• reduce complexity, compare, analyze structures
• detect common, hidden components

Links between TNs & graphical models in ML

[Figure: example notation]

[Figure: example transformations: contraction, unfolding, matrix factorization, SVD, reshaping]
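
Two of the transformations named on this slide, unfolding and contraction, can be sketched on nested Python lists. The helpers `shape_of`, `get`, `unfold` and `contract` are hypothetical names, and the column ordering used in `unfold` (row-major over the remaining axes) is only one of several conventions. SVD is left out because it would need a linear algebra library.

```python
from itertools import product


def shape_of(t):
    """Shape of a rectangular nested list, e.g. (2, 2, 2)."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0]
    return tuple(shape)


def get(t, index):
    """Element of nested list t at a full multi-index."""
    for i in index:
        t = t[i]
    return t


def unfold(t, mode):
    """Matricize t: rows run over axis `mode`, columns over all other axes."""
    shape = shape_of(t)
    other = [a for a in range(len(shape)) if a != mode]
    rows = []
    for r in range(shape[mode]):
        row = []
        for rest in product(*(range(shape[a]) for a in other)):
            index = [0] * len(shape)
            index[mode] = r
            for a, v in zip(other, rest):
                index[a] = v
            row.append(get(t, index))
        rows.append(row)
    return rows


def contract(a, b):
    """Contract the last axis of matrix a with the first axis of matrix b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# A 2 x 2 x 2 example tensor with made-up entries.
t = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
mode0 = unfold(t, 0)              # 2 x 4 matrix
summed = contract([[1, 1]], mode0)  # sums the tensor over its first axis
```

Unfolding turns a contraction over one tensor axis into an ordinary matrix product, which is why the two transformations usually appear together.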

Page 13: Distributed Multi-device Execution of TensorFlow – an Outlook

Sneak Peek: Mathematics of Arrays, Psi Calculus

• indexing operations based on shapes
• compose array operations to minimize temporary arrays

Determinism

• for any number of tensor operations, predict
  • length of contiguous blocks
  • values in each block
• correctly pre-fetch blocks
• overlap computation & IO
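
A small sketch of shape-based indexing, assuming a row-major memory layout: from the shape alone one can compute where the block selected by a partial index starts and how long it is, before touching any data. That is the information a pre-fetcher would need. The names `gamma_row` and `psi_block` are chosen here for illustration and are not taken from a particular library.

```python
from math import prod


def gamma_row(index, shape):
    """Row-major offset of `index` within an array of the given shape."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for extent {n}")
        offset = offset * n + i
    return offset


def psi_block(partial_index, shape):
    """(start, length) of the contiguous block selected by a partial index.

    A partial index fixes the leading axes; the trailing axes stay free,
    so the selected sub-array is one contiguous run in row-major memory.
    """
    k = len(partial_index)
    length = prod(shape[k:])
    start = gamma_row(partial_index, shape[:k]) * length
    return start, length


# Hypothetical 2 x 3 x 4 array stored flat in row-major order.
shape = (2, 3, 4)
flat = list(range(prod(shape)))
start, length = psi_block((1, 2), shape)
block = flat[start:start + length]
```

Because `start` and `length` depend only on the shape and the index, they are known ahead of execution, so the block can be fetched while an earlier computation is still running.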

Page 14: Distributed Multi-device Execution of TensorFlow – an Outlook

What, now?

Stay tuned, try out

• https://github.com/tensorflow/
  • Distributed TensorFlow 2/26/2016
    • uses gRPC http://www.grpc.io/
  • TensorFlow Serving 2/16/2016
    • model lifecycle management
• Dagstuhl perspectives: Tensor Computing for IoT
  • intuitive handling of tensor operations, optimizations
  • deterministic placement and scheduling
  • applications in cyber-physical systems
  • reference implementations, evaluations & publications
• Embedded Multicore Building Blocks EMB2 https://github.com/siemens/embb
• Eigen Tensor Module https://bitbucket.org/eigen/eigen/src/265a621240a21b201cc9e73cffc1021e12e6fc93/unsupported/Eigen/CXX11/src/Tensor/?at=default

Page 15: Distributed Multi-device Execution of TensorFlow – an Outlook

The future of embedded computing is being built now – starting at the processor level

“Neo – The tiny chip that could disrupt exascale computing”

Raspberry Pi Zero: 1 GHz Linux computer for $5

http://www.nvidia.com/object/embedded-systems.html
http://rexcomputing.com/REX_OCPSummit2015.pdf
http://www.nextplatform.com/2015/03/12/the-little-chip-that-could-disrupt-exascale-computing/
https://medium.com/software-is-eating-the-world/what-s-next-in-computing-e54b870b80cc#.r6k84z51m