Google Big Data Expo


A 30 min short walk

Robert Saxby, Big Data Product Specialist

Sustainability

Google datacenters have half the overhead of typical industry data centers

Largest private investor in renewables: $2 billion generating 3.2 GW

Applying machine learning produced a 40% reduction in cooling energy

Drivers of Success in AI/ML Projects

Large Datasets · Cutting-Edge Models · Compute at Scale

End to End: Google Cloud AI Spectrum

App Developer: use pre-built models (ML Perception services)

Data Scientist: build custom models (Cloud MLE)

ML Researcher: use/extend the open-source SDK

What is TensorFlow?

● A system for distributed, parallel machine learning
● Based on general-purpose dataflow graphs
● Targets heterogeneous devices:

○ a single PC with a CPU
○ a single PC with GPU(s)
○ a mobile device
○ clusters of hundreds or thousands of CPUs, GPUs and TPUs
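A minimal sketch of the idea (TF 1.x style, as used throughout this deck): build a graph first, then let a session execute it on whatever device is available.

import tensorflow as tf

# Build a dataflow graph: nodes are ops, edges are tensors
a = tf.constant([[1.0, 2.0]])    # shape [1, 2]
w = tf.constant([[3.0], [4.0]])  # shape [2, 1]
y = tf.matmul(a, w)              # a MatMul op producing shape [1, 1]

# Execute the graph; the same graph could run on CPU, GPU, or a cluster
with tf.Session() as sess:
    print(sess.run(y))  # [[11.]]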

Another data flow system

[Diagram: a dataflow graph. Inputs (examples, labels) and parameters (weights, biases) flow through MatMul, Add and ReLU ops into Xent. A graph of nodes, also called operations or ops.]

With tensors

[Same diagram: the edges between nodes are N-dimensional arrays, called tensors.]
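As a hedged sketch, the pictured graph in TF 1.x code (the variable names simply mirror the diagram's labels, not any particular sample):

import tensorflow as tf

examples = tf.placeholder(tf.float32, [None, 784])  # input images
labels = tf.placeholder(tf.float32, [None, 10])     # one-hot labels
weights = tf.Variable(tf.zeros([784, 10]))
biases = tf.Variable(tf.zeros([10]))

# MatMul -> Add -> Relu, as in the diagram
hidden = tf.nn.relu(tf.add(tf.matmul(examples, weights), biases))

# Xent: cross-entropy between the network's output and the labels
xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=hidden)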

What's in a name?

Rank 0: scalar (magnitude only), s = 483

Rank 1: vector (magnitude and direction), v = [1.1, 2.2, 3.3]

Rank 2: matrix (table of numbers), m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Rank 3: 3-tensor (cube of numbers), t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]

Rank 4: n-tensor (you get the idea) ....
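The same ranks written as TensorFlow constants, with their shapes printed for illustration:

import tensorflow as tf

s = tf.constant(483)                                # rank 0: scalar
v = tf.constant([1.1, 2.2, 3.3])                    # rank 1: vector
m = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # rank 2: matrix
t = tf.constant([[[2], [4], [6]],
                 [[8], [10], [12]],
                 [[14], [16], [18]]])               # rank 3: 3-tensor

print(s.shape, v.shape, m.shape, t.shape)           # () (3,) (3, 3) (3, 3, 1)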

Convolutional layer

[Diagram: stacked convolutional + subsampling layers. The weight tensor W[4, 4, 3, 2] encodes the filter size (4x4), the number of input channels (3) and the number of output channels (2); two filters W1[4, 4, 3] and W2[4, 4, 3] slide across the input with a given stride, with padding at the edges.]
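A sketch of such a layer with tf.nn.conv2d (the 28x28 input size is an assumption for illustration):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 3])  # batch of 3-channel images

# W[4, 4, 3, 2]: 4x4 filter size, 3 input channels, 2 output channels
W = tf.Variable(tf.truncated_normal([4, 4, 3, 2], stddev=0.1))

# stride 1 in both directions; 'SAME' padding keeps the output 28x28
y = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')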

With state

[Diagram: the training graph. 'biases' is a variable, i.e. a stateful node; some ops compute gradients; a "−=" op scales them by the learning rate and updates the variable.]
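A minimal sketch of that update (the toy loss is an assumption, standing in for the real network's loss):

import tensorflow as tf

biases = tf.Variable(tf.zeros([10]))           # a stateful node
loss = tf.reduce_sum(tf.square(biases - 1.0))  # toy loss for illustration

grad = tf.gradients(loss, [biases])[0]         # some ops compute gradients
train = tf.assign_sub(biases, 0.003 * grad)    # the "−=" op updates 'biases'

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train)                            # one gradient descent step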

And distributed

[Diagram: the same training graph split across Device A and Device B.]
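Placement can be pinned explicitly; a quick sketch (soft placement falls back to CPU when no GPU is present):

import tensorflow as tf

with tf.device('/cpu:0'):                        # "Device A"
    a = tf.constant([[1.0, 2.0]])

with tf.device('/gpu:0'):                        # "Device B"
    y = tf.matmul(a, a, transpose_b=True)

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(y))                           # [[5.]]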

The TensorFlow stack, bottom to top:

TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

Frontends: Python, C++, ...

Layers: build models

Estimator: train and evaluate models

Keras Model and Canned Estimators: models in a box
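A canned Estimator really is a model in a box; a hedged sketch (the 'pixels' feature name is an illustrative assumption, the layer sizes are borrowed from a later slide):

import tensorflow as tf

feature_columns = [tf.feature_column.numeric_column('pixels', shape=[784])]

estimator = tf.estimator.DNNClassifier(
    hidden_units=[200, 100, 60, 30],
    feature_columns=feature_columns,
    n_classes=10)

# estimator.train(input_fn=...) and estimator.evaluate(input_fn=...) run the loops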

Artificial Intelligence: the science of making things smart

Machine Learning: building machines that can learn

Neural Network: a type of algorithm in machine learning

The popular imagination of what ML is

Lots of data + complex mathematics in multidimensional spaces = magical results

In reality, ML is

1. Define objectives
2. Collect data
3. Understand and prepare the data
4. Create the model
5. Refine the model
6. Serve the model

A neural network is a function that can learn.

How about this?

A neural network can extract hidden features from data.

A very simple model

[Diagram: 28x28 = 784 input pixels feed 10 softmax neurons, one per digit 0-9; each neuron's output is a weighted sum of all pixels plus a bias.]

100 images at a time

[Diagram: batched matrix multiply. X holds 100 images, one per line, each flattened to 784 pixels, so X has 100 lines and 784 columns. The weights form a 784x10 matrix (w0,0 ... w783,9). The product L = X·W has 100 lines of 10 values (L0,0 ... L99,9), and the same 10 biases b0 ... b9 are broadcast onto every line.]

Going deep, 5 layers deep

[Diagram: a 5-layer network. 784 inputs (overkill ;-), sigmoid hidden layers of 200, 100, 60 and 30 neurons, and a 10-neuron softmax output for digits 0-9.]

Activation Functions

Identity, Binary Step, Sigmoid, Tanh and ReLU; Softmax for the output layer.

Each neuron computes a weighted sum of all pixels plus a bias, then applies the activation. Softmax turns scores (logits) such as [2.0, 1.0, 0.1] into probabilities such as [0.7, 0.2, 0.1].
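Checking those numbers with a quick NumPy sketch (the slide rounds to one decimal):

import numpy as np

logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax
print(probs)  # [0.659 0.242 0.099] ~ [0.7, 0.2, 0.1]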

Softmax on a batch of images

Y = softmax(X·W + b), tensor shapes in [ ]:

Predictions Y[100, 10], Images X[100, 784], Weights W[784, 10], Biases b[10]

The matrix multiply is applied line by line; the biases are broadcast onto all lines.
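With the slide's shapes this is a single TensorFlow line (it reappears in the full program below):

# X[100, 784] · W[784, 10] + b[10]  ->  Y[100, 10]
Y = tf.nn.softmax(tf.matmul(X, W) + b)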

Cross entropy: −Σ (actual probabilities) × log(computed probabilities)

Actual probabilities, "one-hot" encoded (this is a "6"), digits 0-9:
0 0 0 0 0 0 1 0 0 0

Computed probabilities, digits 0-9:
0.1 0.2 0.1 0.3 0.2 0.1 0.9 0.2 0.1 0.1

Success?
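Working the example through in NumPy (the computed row is illustrative and does not sum to 1); only the one-hot position contributes:

import numpy as np

actual = np.array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0], dtype=float)  # one-hot "6"
computed = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.9, 0.2, 0.1, 0.1])

cross_entropy = -np.sum(actual * np.log(computed))
print(cross_entropy)  # -log(0.9) ≈ 0.105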

Gradient Descent

The whole code

import tensorflow as tf

# initialisation
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()

# model
Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, 784]), W) + b)

# placeholder for correct answers
Y_ = tf.placeholder(tf.float32, [None, 10])

# loss function
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))

# success metrics: % of correct answers found in batch
is_correct = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

# training step
optimizer = tf.train.GradientDescentOptimizer(0.003)
train_step = optimizer.minimize(cross_entropy)

# run
sess = tf.Session()
sess.run(init)

for i in range(10000):
    # load batch of images and correct answers
    batch_X, batch_Y = mnist.train.next_batch(100)
    train_data = {X: batch_X, Y_: batch_Y}

    # train
    sess.run(train_step, feed_dict=train_data)

    # success? add code to print it
    a, c = sess.run([accuracy, cross_entropy], feed_dict=train_data)

    # success on test data?
    test_data = {X: mnist.test.images, Y_: mnist.test.labels}
    a, c = sess.run([accuracy, cross_entropy], feed_dict=test_data)

Workshop

Self-paced code lab (summary below ↓): goo.gl/mVZloU
Code: github.com/martin-gorner/tensorflow-mnist-tutorial

1-5. Theory (install, then sit back and listen or read)

Neural networks 101: softmax, cross-entropy, mini-batching, gradient descent, hidden layers, sigmoids, and how to implement them in TensorFlow

6. Practice (full instructions for this step)

Open file: mnist_1.0_softmax.py
Run it, play with the visualisations (keyboard shortcuts on the previous slide), and read and understand the code as well as the basic structure of a TensorFlow program.

7. Practice (full instructions for this step)

Start from the file mnist_1.0_softmax.py and add one or two hidden layers.

Solution in: mnist_2.0_five_layers_sigmoid.py

8. Practice (full instructions for this step)

Special care for deep neural networks: use ReLU activation functions, use a better optimiser, initialise weights with random values, and beware of log(0)

9-10. Practice (full instructions for this step)

Use a decaying learning rate and then add dropout

Solution in: mnist_2.2_five_layers_relu_lrdecay_dropout.py

11. Theory (sit back and listen or read)

Convolutional networks

12. Practice (full instructions for this step)

Replace your model with a convolutional network, without dropout.

Solution in: mnist_3.0_convolutional.py

13. Challenge (full instructions for this step)

Try a bigger neural network (good hyperparameters on slide 43) and add dropout on the last layer to get >99%

Solution in: mnist_3.0_convolutional_bigger_dropout.py


Distributed TensorFlow on Compute Engine

https://cloud.google.com/solutions/running-distributed-tensorflow-on-compute-engine

Cloud ML Engine: Machine Learning on any data, of any size

● Portable models with TensorFlow
● Services are designed to work together
● Managed distributed training infrastructure that supports CPUs and GPUs
● Automatic hyperparameter tuning

Custom Estimators: The Model

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census

...
def _model_fn(mode, features, labels):
    ...
    if mode == Modes.PREDICT:
        ...
        return tf.estimator.EstimatorSpec(
            mode, predictions=predictions, export_outputs=export_outputs)
    ...
    if mode == Modes.TRAIN:
        ...
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
...
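The sample elides the model details; as a rough self-contained sketch of the same shape (a hypothetical one-layer classifier, not the census sample's code):

import tensorflow as tf

def model_fn(features, labels, mode):
    # hypothetical model: one dense layer over a flat 'x' feature
    logits = tf.layers.dense(features['x'], 10)
    predictions = {'classes': tf.argmax(logits, axis=1)}

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.GradientDescentOptimizer(0.003).minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn)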

Custom Estimators: The Task

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census

...
train_input = lambda: model.generate_input_fn(
    hparams.train_files,
    num_epochs=hparams.num_epochs,
    batch_size=hparams.train_batch_size)
...

def _experiment_fn(run_config, hparams):
    """This function is used by learn_runner to create an Experiment which
    executes model code provided in the form of an Estimator and input
    functions."""
    ...
    return tf.contrib.learn.Experiment(
        tf.estimator.Estimator(
            model.generate_model_fn(
                ...
            ),
            ...
        ),
        train_input_fn=train_input,
        eval_input_fn=eval_input,
        **experiment_args
    )
...

Running locally

gcloud ml-engine local train \
    --module-name trainer.task \
    --package-path trainer/ \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --job-dir $MODEL_DIR

local train = train locally; $TRAIN_DATA = training data; $EVAL_DATA = evaluation data; $MODEL_DIR = output directory

Single trainer running in the cloud

gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.0 \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --verbosity DEBUG

jobs submit training = train in the cloud; $REGION = region; $OUTPUT_PATH = Google Cloud Storage location

Distributed training in the cloud

gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.0 \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    --scale-tier STANDARD_1 \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --verbosity DEBUG

--scale-tier STANDARD_1 = distributed

In reality, ML is: define objectives → collect data → understand and prepare the data → create the model → refine the model → serve the model.

Refine the model

● Feature engineering
● Better algorithms
● More examples, more data
● Hyperparameter tuning

Hyperparameter tuning

● Automatic hyperparameter tuning service
● Build better-performing models faster and save many hours of manual tuning
● A Google-developed search algorithm (Bayesian Optimisation) efficiently finds better hyperparameters for your model/dataset

[Plot: objective vs. HyperParam #1. We want to find the global optimum, not the local ones.]

https://cloud.google.com/blog/big-data/2017/08/hyperparameter-tuning-in-cloud-machine-learning-engine-using-bayesian-optimization

Hyperparameter tuning

gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.0 \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    --scale-tier STANDARD_1 \
    --config $HPTUNING_CONFIG \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --verbosity DEBUG

--config $HPTUNING_CONFIG = hypertuning

Hyperparameter tuning

hptuning_config.yaml:

trainingInput:
  hyperparameters:
    goal: MAXIMIZE
    hyperparameterMetricTag: accuracy
    maxTrials: 4
    maxParallelTrials: 2
    params:
      - parameterName: first-layer-size
        type: INTEGER
        minValue: 50
        maxValue: 500
        scaleType: UNIT_LINEAR_SCALE

task.py:

...
# Construct layer sizes with exponential decay
hidden_units=[
    max(2, int(hparams.first_layer_size * hparams.scale_factor**i))
    for i in range(hparams.num_layers)
],
...
parser.add_argument(
    '--first-layer-size',
    help='Number of nodes in the 1st layer of the DNN',
    default=100,
    type=int)
...

In reality, ML is: define objectives → collect data → understand and prepare the data → create the model → refine the model → serve the model.

Deploying the model

Creating the model:

gcloud ml-engine models create $MODEL_NAME --regions=$REGION

Creating a version:

gcloud ml-engine versions create v1 \
    --model $MODEL_NAME \
    --origin $MODEL_BINARIES \
    --runtime-version 1.0

Listing models:

gcloud ml-engine models list


Predicting

gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../test.json

Using REST:

POST https://ml.googleapis.com/v1/{name=projects/**}:predict

JSON format (in this case):

{"age": 25, "workclass": "private", "education": "11th", "education_num": 7, "marital_status": "Never-married", "occupation": "machine-op-inspector", "relationship": "own-child", "gender": " male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 40, "native_country": " United-States"}