Introduction to TensorFlow - unibo.itvision.deis.unibo.it/~smatt/DIDATTICA/Sistemi... · What is...

Introduction to TensorFlow Filippo Aleotti, Università di Bologna Corso di Sistemi Digitali M Stefano Mattoccia, Università di Bologna

What is TensorFlow

TensorFlow is an open-source machine learning and deep learning framework developed by Google.

It lets you develop machine learning models, and it also has strong support for deployment.

Today it is widely used in production, not only by Google but also by many other companies.

TensorFlow offers APIs for Python, Java and C/C++.

In this course we will use TensorFlow 1.x, even though a new version, TensorFlow 2, is about to be released (it follows a different paradigm from the previous version).

In TensorFlow 1.x there are two phases: building the graph and executing it.

Models trained with TensorFlow can be ported to mobile devices (iOS and Android) using a lightweight version of TensorFlow called tf-lite.

What is a Tensor

Informally, a tensor is a multi-dimensional array

● A 0-D tensor is a scalar

● A 1-D tensor is an array

● A 2-D tensor is a matrix

● A 3-D tensor is an array of matrices

● ...

We can think of an image as a 3-D tensor with height H, width W and C channels

Picture from Wikipedia

Graph

As noted earlier, TensorFlow keeps graph construction separate from the execution phase.

The graph specifies which operations to perform and how the tensors flow from the inputs to generate the desired output.

Internally, the compiler can prune unused branches of the graph, saving memory and computation time.

Suppose we have two scalar tensors, x and y, and we need to compute the dot product between them (for scalars, this is simply their product)

The resulting graph is

Of course, we can build more complex graphs

In the graph, we have edges and nodes:

● edges: they represent the tensors that flow between nodes

● nodes: they are tf.Operation objects added to the graph. They take zero or more input tensors and generate zero or more output tensors.

In the dot example, x, y and dot are Tensors, while mul is an Operation

Tensor

In the previous example we used tf.Variable to represent both x and y

In TensorFlow we can represent data using various types of tensors:

● Variable: a tensor whose value can be changed by running operations on it

● Constant: a tensor whose value is fixed

● Placeholder: used to feed input values into the graph during execution

Tensor: rank and shape

The rank of a Tensor is its number of dimensions, while its shape is the number of elements in each dimension

Rank  Entity    Shape              Example
0     Scalar    []                 A single number
1     Array     [N0]               A list of numbers
2     Matrix    [N0, N1]           A matrix
3     3-Tensor  [N0, N1, N2]       An image
4     4-Tensor  [N0, N1, N2, N3]   A batch of images
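
To make the table concrete, here is a minimal pure-Python sketch (no TensorFlow needed) that computes the shape and the rank of a nested list. It assumes the list is rectangular, and the helper name shape_of is hypothetical, introduced only for this illustration.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a list of ints."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:       # empty list: nothing left to inspect
            break
        value = value[0]    # assumption: all elements share the same shape
    return shape


scalar = 7
array = [1, 2, 3]
matrix = [[1, 5], [3, 2]]

for name, value in [("scalar", scalar), ("array", array), ("matrix", matrix)]:
    shape = shape_of(value)
    # The rank is simply the number of entries in the shape.
    print(name, "shape:", shape, "rank:", len(shape))
```

The rank is never stored separately: it is the length of the shape, which is why a scalar has shape [] and rank 0.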

Tensor

What is the rank of x? And of y? To answer these questions, just print the two ranks and look at the results!

Even if it looks strange at first, the previous result makes sense:

We asked TensorFlow for the rank of the two Tensors, and it added to the graph the operation that computes it

What we printed was the description of the resulting rank Tensor (generated by the rank operation), including its shape, and not the rank value!

To get the rank, we have to execute the graph: at the end of the execution, the rank Tensor will hold the rank value

Execution

In TensorFlow, the execution of the graph must be performed inside a Session.

The Session takes a graph and executes the Operations it needs, from the starting point up to the set of Tensors we want to evaluate.

If no graph is specified, the default graph is used.

Execution: Session

[Figure: the graph definition, followed by the graph execution inside a Session. Labels: inputs, outputs, Outside TensorFlow, Inside TensorFlow]

We can create a new Session using the with statement, which opens and closes the Session automatically (releasing its resources at the end).

In the previous snapshot we:

● Created a new Session, called session, with a configuration Config. In the config we enabled the soft placement option: if no GPU is available (or if an Operation does not have a GPU implementation), then the CPU is the target device

● Initialized all local and global variables. This step allocates the variables previously defined (and added to the graph) and places them on the right device. We run both the local and the global initialization ops since some variables may have been placed in the LOCAL collection (by default variables are GLOBAL)

● Asked the Session to run the graph and return the value of the dot Tensor

● Finally, printed dot_value, dot and their types. As we noticed before, dot is a Tensor object (so it belongs to TensorFlow), while dot_value is a scalar (as expected) of type np.float32
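
The steps listed above can be sketched as follows. This is only a minimal sketch, not necessarily the script shown in the course: it assumes TensorFlow 1.14 or later with the 1.x API reached through tensorflow.compat.v1, and the values 3.0 and 4.0 are made up for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Required on TensorFlow 2.x, where eager execution is the default.
tf.disable_eager_execution()

# Graph construction: nothing is computed here.
x = tf.Variable(3.0, dtype=tf.float32, name="x")  # illustrative value
y = tf.Variable(4.0, dtype=tf.float32, name="y")  # illustrative value
dot = tf.multiply(x, y)

# Soft placement: fall back to the CPU when no GPU (or GPU kernel) is available.
config = tf.ConfigProto(allow_soft_placement=True)

# Graph execution: the with statement closes the Session for us.
with tf.Session(config=config) as session:
    session.run(tf.global_variables_initializer())
    session.run(tf.local_variables_initializer())
    dot_value = session.run(dot)

print(dot, type(dot))              # a symbolic Tensor of the graph
print(dot_value, type(dot_value))  # a NumPy scalar
print(isinstance(dot_value, np.float32))
```

Printing dot shows a description of the Tensor, while dot_value is the number produced by actually running the graph.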

Let’s try again with the rank example

Running the two rank operations (notice that we used a single session.run with a list as input), TensorFlow now returns the expected ranks.

Execution

To summarize:

● TensorFlow has two phases: graph creation and graph execution

● Some operations are performed during the first phase, while others during the second one.

An example of this is a piece of code inside a Python if statement: since the if is evaluated during graph construction, that block either adds a branch to the graph or it does not. At runtime the branch will then always be executed (if it was added) or never (if the condition was not satisfied during graph creation).

If you need to check a condition at runtime, you have to use the tf.cond op.
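
The difference between the two kinds of condition can be sketched as follows, assuming TensorFlow 1.14 or later with the 1.x API reached through tensorflow.compat.v1. The flag use_double, the two branches and the fed values are hypothetical and chosen only for illustration.

```python
import tensorflow.compat.v1 as tf

# Required on TensorFlow 2.x, where eager execution is the default.
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[])

# Python if: evaluated once, while the graph is being built.
use_double = True  # hypothetical flag
if use_double:
    static_branch = x * 2.0     # only this op is added to the graph
else:
    static_branch = x + 100.0   # never added, so it can never run

# tf.cond: both branches are added, the choice is made at every run.
runtime_branch = tf.cond(x > 0.0, lambda: x * 2.0, lambda: x + 100.0)

with tf.Session() as session:
    pos = session.run([static_branch, runtime_branch], feed_dict={x: 1.0})
    neg = session.run([static_branch, runtime_branch], feed_dict={x: -1.0})

print(pos)  # both outputs come from the "double" branch
print(neg)  # static_branch still doubles, runtime_branch switches branch
```

With a negative input only runtime_branch changes behaviour, because the Python if was resolved before any value was fed.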

About TensorFlow versions

TensorFlow has been released in many versions (the latest release as of November 2019 is r1.15).

Releases may expose different functions and modules, but since a lot of existing code depends on older releases, all the versions remain available.

The scripts presented in these slides can be run even with the newest TensorFlow releases (e.g., tf 1.15), but some warnings will be displayed: for instance, in newer versions tf.Session is deprecated in favour of tf.compat.v1.Session.

Shapes

Another example of this dualism concerns the shape: to get the shape of a Tensor we can run the following operations

Why two operations? Are they the same? Let’s look at what they print out

So they are not the same: the first returns a Tensor, while the other returns a list

However, when we run the Tensor we get the right shape

What is happening here is that we are asking for two different shapes: the static one and the dynamic one

The former is the shape known at graph creation, while the latter is the shape the tensor actually takes when we run the graph in the session.

In the previous case the two shapes were equal, but sometimes they are not.

In particular, the dynamic shape (i.e., the one we get with tf.shape) always has a value, since at runtime a specific Tensor with a given shape flows through the graph, while the static shape may be unknown.

The static shape represents the shape of a Tensor while we are building the graph, and at that point we may not know its shape (or we might know only part of it)!

For instance, when we test a trained CNN on images, we might expect images with 3 channels (RGB) without imposing any constraint on height and width

Loading data

In the previous example we initialized two variables to perform their product.

However, a common approach (e.g., when training a neural network) consists of reading the data from a dataset, typically stored on the file system of your machine

How can we load data into our sessions with fewer constraints?

Loading data: placeholders

TensorFlow offers many solutions to load the data you need.

Placeholders are a simple yet effective way to load data. As the name suggests, they stand in for something we do not know yet: at graph creation time we can use the placeholder to define operations, and then in session.run we run the graph, assigning a specific value to each placeholder

Using the feed_dict argument of session.run, we force the placeholder to take the given value

With placeholders the graph remains the same: we are just feeding the same graph with different values

In this case, we are feeding the graph first with the 2x2 matrix [[1,5], [3,2]], then with the 1x2 matrix [[3,2]].

Loading data: placeholders and static shape

Moreover, notice that we did not specify the full static shape of the placeholder: since we created it with shape [None, 2], at runtime the program expects N pairs (with N >= 1).

Thanks to this feature of the static shape, we can run the graph first with a 2x2 matrix and then with a 1x2 matrix, but we could have used any Nx2 matrix as input!

Let’s look at the static shape:

Note that the shape of the tensor is (2,), since we know that the placeholder has rank 2 (so we expect two dimensions)

On the other hand, the dynamic shape looks like:

Notice that we can obtain more than one value from a single run just by passing a list!
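
A minimal sketch of the placeholder workflow described above, assuming TensorFlow 1.14 or later with the 1.x API reached through tensorflow.compat.v1. The doubling operation and the name inputs are hypothetical: the original operation applied to the placeholder is not shown in this transcript.

```python
import tensorflow.compat.v1 as tf

# Required on TensorFlow 2.x, where eager execution is the default.
tf.disable_eager_execution()

# Static shape [None, 2]: unknown number of rows, exactly 2 columns.
inputs = tf.placeholder(tf.float32, shape=[None, 2], name="inputs")
doubled = inputs * 2.0                        # hypothetical operation

static_shape = inputs.get_shape().as_list()   # Python list, known right now
dynamic_shape = tf.shape(inputs)              # Tensor, known only at run time

print(static_shape)   # the unknown dimension is reported as None
print(dynamic_shape)  # a Tensor description, not the shape values

results = []
with tf.Session() as session:
    for batch in ([[1, 5], [3, 2]], [[3, 2]]):
        # One run, two fetched values: just pass a list.
        shape_value, values = session.run(
            [dynamic_shape, doubled], feed_dict={inputs: batch})
        results.append((shape_value.tolist(), values.tolist()))

for shape_value, values in results:
    print(shape_value, values)
```

The same graph is run twice: only the value bound to the placeholder through feed_dict changes, and the dynamic shape follows it.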

Loading data: tf.data

Although placeholders are useful for feeding the network with arbitrary values, TensorFlow offers a more sophisticated and powerful tool for handling input data: tf.data

tf.data allows us to build a pipeline that can include loading the data, processing them (e.g., data augmentation to enlarge the dataset) and feeding them to the network (e.g., data shuffling and batch creation)

Neural Network Concepts

A Neural Network (and also a Convolutional Neural Network) can be seen as a set of weights that, linked together to form a particular architecture, is able to turn the input into the (hopefully) desired output

Each weight is generally represented as a float32 variable, but to save space and increase speed, int32, int16 or even int8 are sometimes used.

In general, these weights are initialized randomly at the beginning (the values are random, but there are some rules to follow, e.g. He initialization), and during training they are updated to obtain better and better results.

Neural Network Concepts: loss functions

To understand whether we are moving in the right direction (i.e., our outputs are correct rather than noise) we need a loss (cost) function: the loss measures the error we make by returning a given output.

For many tasks the loss function may be a distance function (e.g., L1 distance, L2 distance and so on), but the choice depends on the final goal.

If we measure the distance between our predictions and the desired, perfect outcomes, then our training is supervised. If instead we minimise a loss computed without any kind of label, then the training is called unsupervised.

Neural Network Concepts: monocular depth estimation example

For instance, suppose you have to estimate the depth starting from a single image (monocular depth estimation).

Supervised training: you have to collect a dataset in which each training image is paired with its ground truth (i.e., a depth value for each pixel). Obtaining ground-truth values may be hard and expensive, since you have to use active sensors to obtain good results. The loss function is the distance between ground truth and predictions. In short: expensive

Unsupervised training: your dataset is made up of standard RGB images and no active sensors are needed (unless you also want to build a testing split). You might train your network using image reconstruction techniques based on stereo sequences or monocular sequences. Your loss function may be a photometric similarity between images. In short: cheap

Neural Network Concepts: gradient

But how can we obtain better weights (and so a lower cost)?

We could move randomly in the variable space… but of course this is not feasible! The number of variables may be extremely large (millions of variables), so by sampling randomly from this space we would only find the best of the few hypotheses we happened to try.

A better method consists in following the gradient of the function: the gradient is the vector that contains all the partial derivatives of the function at a given point.

The gradient points in the direction of greatest increase of the function

Neural Network Concepts: minimisation

So, given a differentiable loss L that is a function of the weights of the network, we can minimise it iteratively by estimating the gradient of the loss and “moving” in the opposite direction.

In other terms, given the current state of the weights, we have to measure the error that our network is committing (this error is a function of the weights), then update each weight in the opposite direction of the gradient: $w \leftarrow w - \alpha \nabla_w L(w)$

α is the learning rate, the hyperparameter that controls the “intensity” of the change.
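
As an illustration of the update rule, here is a minimal pure-Python sketch of gradient descent on a toy one-weight loss. The loss (w - 3)^2, the learning rate 0.1 and the 100 steps are assumptions chosen only for this example; a real network has millions of weights and its gradient comes from backpropagation.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def loss_gradient(w):
    """Analytical derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


alpha = 0.1  # learning rate (illustrative value)
w = 0.0      # initial weight
for step in range(100):
    # Move against the gradient, scaled by the learning rate.
    w = w - alpha * loss_gradient(w)

print(w, loss(w))
```

With a learning rate that is too large (here, anything above 1.0) the same loop would diverge instead of converging, which is why α has to be tuned.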

Neural Network Concepts: backpropagation

We can automatically obtain the partial derivatives that compose the gradient by exploiting the chain rule of derivatives: given two functions f and g, the derivative of f ( g (x) ) is f ’ ( g (x) ) * g ’(x).

In other terms, writing y = g(x) and z = f(y): $\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$

We can use this property to obtain the partial derivative of the loss function with respect to each weight.
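
A minimal pure-Python sketch of the chain rule applied to a single "neuron". The model tanh(w * x), the squared-error loss and all the numeric values are assumptions made for this illustration; the analytical gradient is checked against a numerical estimate.

```python
import math


def loss(w, x, target):
    """Squared error of a one-weight model: prediction = tanh(w * x)."""
    prediction = math.tanh(w * x)
    return (prediction - target) ** 2


def loss_gradient(w, x, target):
    """dL/dw obtained by multiplying the local derivatives (chain rule)."""
    s = w * x
    prediction = math.tanh(s)
    dloss_dprediction = 2.0 * (prediction - target)
    dprediction_ds = 1.0 - prediction ** 2   # derivative of tanh
    ds_dw = x
    return dloss_dprediction * dprediction_ds * ds_dw


w, x, target = 0.5, 2.0, 0.3
eps = 1e-6
# Central finite difference as an independent check.
numeric = (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2 * eps)
analytic = loss_gradient(w, x, target)
print(analytic, numeric)
```

The two values agree up to numerical error, which is the same kind of check often used to debug a hand-written backward pass.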

Neural Network Concepts: references

Detailed information about CNNs and backpropagation can be found in the Convolutional Neural Networks for Visual Recognition course offered by Stanford University.

Slides can be found here

TensorFlow can compute derivatives automatically by exploiting the chain rule!

Example

Handwritten digit recognition

In this example, we are going to build our first Neural Network (in particular, a Convolutional Neural Network) with TensorFlow

The network has to recognise handwritten digits: given a picture of a digit between 0 and 9, its task is to return a scalar representing that digit

[Figure: an input image goes through the CNN, which outputs 2]

Example: handwritten digit recognition

In this example we will use the MNIST Dataset, a collection of 70 000 handwritten digits with ground-truth labels. The dataset is made up of 60 000 training images and 10 000 testing images.

As for the network, the one we are going to build is similar to LeNet-5, a CNN proposed by Yann LeCun and colleagues (among them Yoshua Bengio).

LeNet-5 is a simple yet effective handwritten digit recognizer proposed in 1998. Nowadays there are more complex and more accurate networks, but it is a good starting point for getting acquainted with CNNs

Example: dataset

First of all, we have to download the MNIST dataset from here

The training set is composed of 60 000 28x28 handwritten digit images. Each image has a ground-truth label

We are going to use a split of 53 600 images for training the network, and the remaining 6 400 for validation. Once the network has been trained, we test it on the test split, made up of another 10 000 images.

Do you know why validation is important? What are the differences with testing?

NOTE: the dataset doesn’t contain a validation split. We are going to extract it directly from the training one.
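
One possible way to extract such a validation split is sketched below in pure Python. It is an assumption for illustration only: the mnist_converter.py script used in the course may select the validation samples differently, and the function name and seed are hypothetical.

```python
import random


def split_train_validation(num_samples, num_validation, seed=0):
    """Shuffle the sample indices and reserve some of them for validation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # fixed seed: reproducible split
    validation = indices[:num_validation]
    train = indices[num_validation:]
    return train, validation


train_idx, val_idx = split_train_validation(60000, 6400)
print(len(train_idx), len(val_idx))
```

Splitting by index keeps images and labels aligned, since the same index selects both.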

Once the full dataset (training + test) has been downloaded, put all the files into a single folder called, for instance, MNIST

your_path/MNIST/

- train-images-idx3-ubyte

- train-labels-idx1-ubyte

- t10k-labels-idx1-ubyte

- t10k-images-idx3-ubyte

Then, just run the mnist_converter.py script to extract the dataset

The script will create three folders: train, test and validation. Each folder will contain an images folder and a labels.txt file

Example: dataloader

Images have been stored as grayscale PNG files

Let’s prepare the portion of the script in charge of loading our data: the Data Loader.

Some considerations:

● For training, we can use 2 placeholders: one for loading the images and one for their corresponding labels. We also know that images are 28x28x1

● Feeding the network just a single example at each step may lead to noisy gradients during backprop. A better solution is to feed the network a larger training batch (e.g., 10 or 100 samples)

● During training, it is preferable to shuffle the samples, avoiding any spurious correlation due to the loading order

In the following script, we load a training batch selecting its elements at random. Notice also that we normalize the images (dividing by 255)
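
A minimal pure-Python sketch of random batch selection with normalization. It is not the course's data loader: the function name sample_batch and the tiny fake dataset (4 pixels per "image" instead of 28x28) are assumptions made to keep the example self-contained.

```python
import random


def sample_batch(images, labels, batch_size, rng):
    """Pick batch_size random samples and scale pixels from [0, 255] to [0, 1]."""
    chosen = rng.sample(range(len(images)), batch_size)
    batch_images = [[pixel / 255.0 for pixel in images[i]] for i in chosen]
    batch_labels = [labels[i] for i in chosen]
    return batch_images, batch_labels


# Tiny fake dataset: 5 flattened "images" with their labels.
images = [[0, 64, 128, 255], [255, 255, 0, 0], [10, 20, 30, 40],
          [0, 0, 0, 0], [200, 100, 50, 25]]
labels = [3, 1, 4, 0, 9]

batch_images, batch_labels = sample_batch(images, labels, 2, random.Random(42))
print(batch_labels)
print(batch_images)
```

Sampling indices rather than images guarantees that each image stays paired with its own label.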

Example: network

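[Figure: network architecture with layers C1, S2, C3, S4, C5, F6 and O; the output in the example is 2]

The network is composed of 5 trainable layers (2 convolutional and 3 fully connected) plus 2 max pooling layers, so 3 different layer types:

● Convolutional layers (C1, C3) with 5x5 kernels and stride 1

● Max pooling layers (S2, S4) with 2x2 kernels and stride 2

● Fully Connected (FC) layers (C5, F6, O) with 120, 84 and 10 neurons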
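
To see how these kernels and strides change the spatial size of a 28x28 input, here is a small pure-Python sketch based on TensorFlow's 'VALID' and 'SAME' padding rules. The padding used by the convolutions in the course code is not stated in this transcript, so both options are shown as assumptions; the function names are hypothetical.

```python
import math


def output_size(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer under TensorFlow's padding rules."""
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    if padding == "SAME":
        return math.ceil(size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


def trace(size, conv_padding):
    """Follow the feature-map size through C1, S2, C3 and S4."""
    sizes = [size]
    for name in ("C1", "S2", "C3", "S4"):
        if name.startswith("C"):
            size = output_size(size, kernel=5, stride=1, padding=conv_padding)
        else:
            size = output_size(size, kernel=2, stride=2, padding="VALID")
        sizes.append(size)
    return sizes


print(trace(28, "VALID"))
print(trace(28, "SAME"))
```

The size after S4, multiplied by the number of feature maps, gives the number of inputs seen by the first fully connected layer.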

Example: convolutional block

To create a convolutional block, we can use the following function

In the function, we first create the weights and the biases for the kernel and then we apply the convolution between the inputs and the kernel, adding the biases as the last step.
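
The computation performed by such a block can be sketched in pure Python for a single channel. This is only an illustration of the arithmetic, not the course's conv2d function: it assumes stride 1, no padding, one input and one output channel, and fixed weights instead of trainable variables.

```python
def conv2d_valid(image, kernel, bias):
    """Single-channel 2-D convolution (computed as cross-correlation, as deep
    learning libraries usually do) with stride 1 and no padding, plus a bias."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total + bias)  # the bias is added as the last step
        output.append(row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_valid(image, kernel, bias=0.5)
print(result)
```

In a real convolutional layer the kernel values and the bias are the variables that training updates.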

Example: network

Now we have all the building blocks for setting up our network

In the previous snapshot, we used the conv2d block to build the C1 and C3 layers, the native tf.nn.max_pool for the pooling layers S2 and S4, and tf.contrib.layers.fully_connected for the FC layers

We used the tf.tanh activation function to add non-linearity.

The tanh is bounded between -1 and 1 and it is differentiable.

Example: loss function

Once the network has been defined, we need a loss function. Remember that we are considering a classification scenario.

Each training image has a corresponding label (we stored these labels in the labels.txt file): our loss function has to receive as input the labels and the predictions of the network and return a measure of the distance between them.

Moreover, the network predicts 10 values (one for each digit) while the correct label is just one. We have to transform each label into a so-called One-Hot vector: a vector with a single 1 (at the index of the ground-truth value) and 0 everywhere else.

7 -> [0,0,0,0,0,0,0,1,0,0]
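
The transformation can be sketched in a few lines of pure Python. The helper name one_hot is hypothetical and used only for illustration.

```python
def one_hot(label, num_classes=10):
    """Return a list with a single 1 at position label and 0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if index == label else 0 for index in range(num_classes)]


print(one_hot(7))
```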

A common and widely used loss function for classification is the Cross-Entropy loss, which measures the distance between two distributions.

In the build_loss function we apply the cross entropy between the two vectors, then reduce the cross-entropy results (one for each sample in the training batch) to a single scalar value by taking their mean

Example: cross entropy

In information theory, the cross entropy $H(p, q) = -\sum_x p(x) \log q(x)$ measures the error we commit by using the distribution q instead of the true distribution p

Supposing that p and q are probability distributions, the cross entropy measures the distance between them. Our goal is to make the model distribution as close as possible to the empirical distribution (the real one).
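
A minimal pure-Python sketch of the cross entropy H(p, q) = -sum_i p_i * log(q_i) between a one-hot ground truth and two candidate predictions. The three-class distributions are made up for illustration, and the small eps term is an assumption added here only to avoid log(0).

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)."""
    return -sum(p_i * math.log(q_i + eps) for p_i, q_i in zip(p, q))


true_dist = [0.0, 0.0, 1.0]      # one-hot ground truth: class 2
good_pred = [0.05, 0.05, 0.90]   # most of the mass on the right class
bad_pred = [0.70, 0.20, 0.10]    # most of the mass on a wrong class

ce_good = cross_entropy(true_dist, good_pred)
ce_bad = cross_entropy(true_dist, bad_pred)
print(ce_good, ce_bad)
```

With a one-hot target only the term of the correct class survives, so the loss reduces to minus the logarithm of the probability assigned to that class.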

How can we turn p and q into probability distributions? Using the softmax!

Example: softmax

The softmax σ is defined as the normalized exponential function: $\sigma(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$

That is, we divide the exponential of each network prediction z_i by the sum of the exponentials of all the predictions. The exponential amplifies large predictions, and the normalization ensures that:

● each result will be in the range [0,1]

● they will sum up to 1

● they are all non-negative

In other words, each result can be seen as a probability!

What we want is a high probability for the correct class and a near-zero probability for the others.

NOTE: for numerical stability, in TensorFlow the softmax is performed inside the cross-entropy itself, so we do not have to call it externally.
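
For reference, a minimal pure-Python sketch of the softmax. Subtracting the maximum before exponentiating is a common stabilisation trick and is an assumption of this sketch: it is not a description of how TensorFlow implements its fused softmax cross-entropy. The input values are made up.

```python
import math


def softmax(logits):
    """Normalized exponential of a list of raw network outputs."""
    shift = max(logits)  # exp(z - max) cannot overflow and gives the same result
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# Without the shift, math.exp(1000.0) would raise an OverflowError.
print(softmax([1000.0, 1000.0]))
```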

Example: training the network

When the network is ready, we can write the code that iterates N times, where N is the number of training steps, applying at each iteration the minimization operation to the loss measured on that batch.

The optimizer object is in charge of backprop and the weight update operations.

On a MacBook Pro equipped with an Intel i5 processor and no GPU, an 18 000-step training run takes about 20 minutes

On a PC with an Intel i7 processor and an NVIDIA Titan X GPU, the same training takes ~4 minutes.

[Figure: the graph of our network]

Example: testing the network

At the end of the training, our network should be able to correctly identify new, unseen samples.

As in classical software development, before the deployment stage we have to test the quality of the network.

To do so, we use the test split of MNIST: given a new test image, our network must return the scalar digit depicted in it.

NOTE: the network predicts a distribution of values (i.e., 10 values, one for each digit), and the prediction is the index of the maximum. So, a way to retrieve the prediction from the distribution is to use the argmax operation.
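
As a minimal plain-Python sketch of this step (the probabilities below are invented for illustration), the predicted digit is simply the position of the largest value. In TensorFlow the same result is obtained with tf.argmax along the class axis.

```python
def argmax(values):
    """Index of the largest element (the first one in case of ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# One value per digit, from 0 to 9 (made-up network output)
probabilities = [0.01, 0.02, 0.05, 0.80, 0.02, 0.02, 0.03, 0.02, 0.02, 0.01]
predicted_digit = argmax(probabilities)
print(predicted_digit)
```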

In this case, we load the ground-truth labels only to compare them with the predictions, so no one-hot vector has to be created.

During testing, we measure the accuracy of the network, defined as the ratio between the number of correct predictions and the number of test samples.
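
This definition translates into a few lines of plain Python (a sketch; the function name and the sample values are illustrative, not taken from the course code):

```python
def accuracy(predictions, labels):
    """Fraction of predictions that are equal to the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)


# Five test samples, four of them classified correctly
print(accuracy([7, 2, 1, 0, 4], [7, 2, 1, 0, 9]))
```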

Example: recap

In the previous example, we built a network for handwritten digit recognition. The network is similar to LeNet-5 by Y. LeCun. We used the MNIST dataset both for training and for testing.

You can download the full example here and run the training yourself. That code also performs validation.

The provided code also includes a pretrained model ready for testing. It achieves ~99% accuracy on the test set.

Example: tf.data

In the repository there is also an implementation that uses tf.data instead of placeholders to load data from the file system.

In this case, since the images are very small, we cannot fully exploit the GPU (in fact, we use ~30% of its overall capability), but with larger images tf.data is the better solution.

Moreover, data augmentation (e.g., image flips, color augmentation, etc.) can be implemented with TensorFlow operations, so these operations can also be executed on the GPU!

Save and Load models

During training, we can serialize the current state of the network (i.e., the value of each weight) into a file called a checkpoint.

This is fundamental because:

● at the end of the training, the final checkpoint is tested and, if it is good enough, it is deployed in your application

● training may take days or even weeks: you should save checkpoints regularly so that, in case of trouble (e.g., a power outage), you do not have to restart from scratch

We can save a checkpoint by first creating a tf.train.Saver object and then invoking its save method. The Saver can receive as input the list of variables to save (the default is None, which means all the saveable objects). The save operation creates three files: data, index and meta.

● The data file contains the value of each saved variable

● The index file contains the mapping between each tensor and some metadata related to it (e.g., offset in the file, data type, etc.)

● The meta file contains the whole graph

To restore a checkpoint, we create a tf.train.Saver and call its restore method, passing the checkpoint path prefix (the common path of these files, without the data, index or meta suffix).
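
A minimal save/restore round trip might look like the sketch below. It is not the course code: the single variable, the temporary directory and the step number are placeholders. The TensorFlow 1.x API is imported through tensorflow.compat.v1 so that the snippet also runs on a TensorFlow 2 installation (an assumption about your setup; on TensorFlow 1.x a plain import tensorflow as tf exposes the same names). Note that in TensorFlow 1.x the Saver class lives under tf.train.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

checkpoint_dir = tempfile.mkdtemp()
checkpoint_prefix = os.path.join(checkpoint_dir, "model.ckpt")

# Build a graph with one variable (initialised to ones) and save it.
tf.reset_default_graph()
weights = tf.get_variable("weights", shape=[2], initializer=tf.ones_initializer())
saver = tf.train.Saver()  # var_list=None: save all the saveable objects
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Returns the checkpoint prefix, with the step number appended
    saved_path = saver.save(sess, checkpoint_prefix, global_step=100)

# Later (e.g., in another process): rebuild the same graph and restore.
tf.reset_default_graph()
weights = tf.get_variable("weights", shape=[2], initializer=tf.zeros_initializer())
saver = tf.train.Saver()
with tf.Session() as sess:
    # latest_checkpoint maps a directory to the prefix of its newest checkpoint
    latest = tf.train.latest_checkpoint(checkpoint_dir)
    saver.restore(sess, latest)  # no initializer needed: values come from the file
    restored = sess.run(weights).tolist()

print(sorted(os.listdir(checkpoint_dir)))
print(restored)
```

The restored values are the saved ones, not the zeros requested by the second initializer, because restore overwrites the variables with the content of the checkpoint.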

Save and Load models: explore a saved model

TensorFlow offers utilities to explore a saved model: we can print tensors, discover their names, the shapes and the values.
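
One possible way to do this from Python is sketched below, using tf.train.list_variables and tf.train.load_checkpoint. The tiny checkpoint created at the top is only a stand-in so that the snippet is self-contained; its variable names and shapes are invented. As an assumption about your setup, the TensorFlow 1.x API is imported through tensorflow.compat.v1.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

# Stand-in checkpoint: two variables inside a "conv1" scope.
tf.reset_default_graph()
with tf.variable_scope("conv1"):
    tf.get_variable("weights", shape=[3, 3, 1, 8])
    tf.get_variable("bias", shape=[8], initializer=tf.zeros_initializer())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    checkpoint_path = tf.train.Saver().save(
        sess, os.path.join(tempfile.mkdtemp(), "model.ckpt"))

# Names and shapes of every tensor stored in the checkpoint
variables = tf.train.list_variables(checkpoint_path)

# The reader gives access to the stored values as NumPy arrays
reader = tf.train.load_checkpoint(checkpoint_path)
for name, shape in variables:
    values = reader.get_tensor(name)
    print(name, shape, values.dtype)
```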

Save and Load models

The output will look like this:

Save and Load models: scopes

As you may notice in the previous screenshot, the full name of a tensor is composed of a sequence of names separated by / (similar to file system paths).

In fact, the full name of a tensor is given by the name of the tensor preceded by all the scopes that surround it.

We can open a new scope using tf.variable_scope.
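
The sketch below shows how nested scopes end up in the names. The conv_block helper, the scope names and the shapes are hypothetical, chosen only for illustration; as an assumption about your setup, the TensorFlow 1.x API is imported through tensorflow.compat.v1.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
tf.reset_default_graph()


def conv_block(inputs, filters, name):
    """3x3 convolution + ReLU; every tensor created here lives under `name`."""
    with tf.variable_scope(name):
        in_channels = int(inputs.shape[-1])
        kernel = tf.get_variable("weights", shape=[3, 3, in_channels, filters])
        bias = tf.get_variable("bias", shape=[filters],
                               initializer=tf.zeros_initializer())
        conv = tf.nn.conv2d(inputs, kernel, strides=[1, 1, 1, 1], padding="SAME")
        return tf.nn.relu(conv + bias)


images = tf.placeholder(tf.float32, shape=[None, 28, 28, 1], name="images")
with tf.variable_scope("encoder"):
    features = conv_block(images, filters=8, name="conv1")

variable_names = [v.name for v in tf.global_variables()]
print(variable_names)
print(features.name)
```

Both variables carry the encoder/conv1/ prefix, and so does the output tensor of the block, because a variable scope also opens a name scope for the operations created inside it.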

Scopes are useful to organize your tensors: for instance, we can open a scope at the beginning of a function, so that all the tensors created by that function share the same scope.

Moreover, we can use scopes to save/load a subset of tensors: if we want to load just some tensors from a stored checkpoint, we can select, among the available tensors, those whose scope contains a given key.
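
The selection itself is just a filter on names, as in this plain-Python sketch (the variable names are invented, and matching a whole scope component rather than a substring is a design choice made here). In TensorFlow, the variables selected this way would then be passed to the Saver as the list of variables to save or restore.

```python
def select_by_scope(variable_names, key):
    """Return the names that have `key` among their enclosing scopes."""
    selected = []
    for name in variable_names:
        # Everything before the last "/" is the chain of scopes
        scopes = name.split("/")[:-1]
        if key in scopes:
            selected.append(name)
    return selected


names = [
    "encoder/conv1/weights",
    "encoder/conv1/bias",
    "decoder/conv1/weights",
    "global_step",
]
print(select_by_scope(names, "encoder"))
```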

TensorFlow Lite

So far, we have run TensorFlow on powerful devices such as laptops or even GPU-equipped computers.

TensorFlow, however, also supports production environments (TFX) and mobile computing (TensorFlow Lite).

In the following section, we will see some core concepts of TensorFlow Lite.

[Figure: powerful server vs. mobile device]

TensorFlow Lite: flat buffers

To run TensorFlow directly on mobile devices and build deep-learning-based applications, you first have to train your network on a powerful device (such as a GPU-equipped server).

Then, you have to export the model obtained at the end of the training procedure to a FlatBuffer file (.tflite). This file stores all the data of your network using a memory-efficient serialization.

Finally, you store the .tflite file in the assets of your Android/iOS application and invoke the TensorFlow Lite interpreter on it.

TensorFlow Lite: model conversion

We can obtain a tflite file using Python functions or the command line.

The TensorFlow Lite converter can export a FlatBuffer starting from a running Session, a Keras model, a frozen graph (protobuf) or a SavedModel. Examples can be found here.

TensorFlow offers many ways to produce the tflite file, depending on your needs. Notice that some tools are more flexible than others (for instance, frozen graphs also allow graph pruning optimisations).
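
As a sketch of the Python route starting from a running Session, the snippet below converts a tiny stand-in graph (a placeholder followed by a multiply-add, not a trained network) and then checks the result with the Python version of the TensorFlow Lite interpreter. Tensor names and shapes are invented. It assumes a TensorFlow release that provides tf.lite.TFLiteConverter.from_session (1.13 or later) and imports the 1.x API through tensorflow.compat.v1.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
tf.reset_default_graph()

# Stand-in graph: outputs = 2 * inputs + 1
inputs = tf.placeholder(tf.float32, shape=[1, 4], name="inputs")
outputs = tf.identity(2.0 * inputs + 1.0, name="outputs")

with tf.Session() as sess:
    # The converter needs the session plus the input and output tensors
    converter = tf.lite.TFLiteConverter.from_session(sess, [inputs], [outputs])
    tflite_model = converter.convert()  # serialized FlatBuffer, as bytes

# This is the file that would be stored in the assets of the mobile application
tflite_path = os.path.join(tempfile.mkdtemp(), "model.tflite")
with open(tflite_path, "wb") as f:
    f.write(tflite_model)

# Sanity check on the desktop with the Python interpreter
interpreter = tf.lite.Interpreter(model_path=tflite_path)
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]
interpreter.set_tensor(input_index, np.array([[0.0, 1.0, 2.0, 3.0]], dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(output_index)
print(result)
```

Running the converted model once on the desktop is a cheap way to catch conversion problems before moving to the device.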

TensorFlow Lite: execution

Depending on the target OS, to run your neural network inside the application you have to:

Android) create a TensorFlow Lite Interpreter: it loads the serialized model stored in the tflite file, prepares the input data and returns the outputs.

iOS) convert the model into a proprietary format. At the end of the conversion, we obtain a Core ML model, which can run directly on iOS devices (iPhone, iPad, etc.). This conversion makes it possible to exploit the Metal framework and the GPU of the device.

TensorFlow Lite

In the MobilePydnet repository you can find the source code to build an app that estimates depth from monocular images acquired by a smartphone.

The code is open source, and it contains both the iOS and the Android implementations. You can start from this code to implement your own deep-learning-based application!

Moreover, for Android users, in the official TensorFlow Lite repository there are some examples for image classification, gesture recognition, style transfer using GANs and more.

Monocular depth estimation and PyDNet

PyDNet

In the previous section, we cited MobilePydnet.

PyDNet is a Convolutional Neural Network able to infer the depth of the scene starting from a single frame.

In the picture, closer points (w.r.t. the optical center of the camera) are encoded with warmer colors, while farther points are encoded with colder ones.

M. Poggi, F. Aleotti, F. Tosi, S. Mattoccia, “Towards real-time unsupervised monocular depth estimation on CPU”, IROS 2018

Monocular depth estimation: benefits

Monocular depth estimation is particularly appealing in real-world scenarios, since it removes the need for a binocular (stereo) camera.

Monocular camera devices are widespread (e.g., smartphones and tablets are mostly monocular), and monocular depth estimation isn’t affected by classical stereo problems (e.g., occlusions).

We may want to use monocular depth in many situations, such as:

● AR/VR

● Robotics

● Autonomous driving

Monocular depth estimation: problems

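Unfortunately, it is an ill-posed problem: it has, in theory, an infinite number of possible solutions! For instance, the same image can be produced by a small object close to the camera or by a larger one farther away.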

https://www.moillusions.com/these-3-cars-are-same-in-size/

Monocular depth estimation: training

We can think about depth estimation as a regression problem, in which for each pixel the network has to predict a continuous value (e.g., the 3D point in the world that generated a given pixel of the image is 15.02 meters away from the camera).
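
To make the regression view concrete, here is a plain-Python sketch of a per-pixel loss between a predicted depth map and a reference one. The mean absolute error is only an example of a regression loss chosen for illustration, not necessarily the loss used to train PyDNet, and the tiny 2 x 2 maps are invented.

```python
def mean_absolute_depth_error(predicted, target):
    """Average of |predicted - target| over all the pixels of a depth map."""
    total = 0.0
    count = 0
    for predicted_row, target_row in zip(predicted, target):
        for p, t in zip(predicted_row, target_row):
            total += abs(p - t)
            count += 1
    return total / count


# Depth maps in meters, stored as lists of rows
predicted = [[15.0, 10.0], [2.0, 4.0]]
target = [[15.5, 10.0], [1.0, 4.5]]
print(mean_absolute_depth_error(predicted, target))
```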

How can we learn the depth?

We can train the network using different approaches:

● using ground truths

● using monocular videos and geometrical constraints

● using a stereo camera at training time

● ...

PyDNet trainings

The paper version of PyDNet exploited a stereo camera at training time, while in Mobile PyDNet we leveraged “ground truth” data provided by an active sensor.

The networks are identical: they differ only in the loss we minimise during training!

PyDNet

PyDNet is an Encoder-Decoder (Autoencoder) fully convolutional network with a pyramidal structure:

● the Encoder is composed of 6 levels: each level extracts useful information (features) from the output of the previous one and maps it into a higher-dimensional space.

● the Decoder has to map the encoder features into depth values. It is actually made up of 6 tiny decoders.

We are going to see them in detail.

Encoder

The numbers of filters are 16, 32, 64, 96, 128 and 196 for levels 1 to 6, respectively.

We are gradually decreasing the spatial dimension of the inputs of each level (so convolutions become faster!) but at the same time we are increasing the number of features extracted.

For instance, the last level returns B x H/64 x W/64 x 196 volumes, where B is the batch size, while H and W are the height and the width of the image.

Each level in the encoder first applies a 2D conv with stride 2 (in orange), followed by a 2D conv with stride 1 (in blue).
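
The shape arithmetic can be checked with a few lines of plain Python. This is only a sketch: the function name is made up, and rounding up odd sizes (as "SAME" padding would do) is an assumption, since the padding is not stated here.

```python
FILTERS = [16, 32, 64, 96, 128, 196]  # levels 1 to 6


def encoder_output_shapes(batch, height, width):
    """Output shape (B, H, W, C) of each encoder level for a given input."""
    shapes = []
    h, w = height, width
    for filters in FILTERS:
        # The stride-2 conv halves the spatial size (rounded up, assuming
        # "SAME" padding); the stride-1 conv that follows keeps it.
        h, w = -(-h // 2), -(-w // 2)
        shapes.append((batch, h, w, filters))
    return shapes


for level, shape in enumerate(encoder_output_shapes(1, 256, 512), start=1):
    print("level", level, shape)
```

For a 256 x 512 input, each level halves the height and the width of the previous one, while the number of channels follows the filter list above.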

Decoder

Each decoder is able to predict a depth map at a certain resolution: starting from the bottom (L6), depth is estimated at 1/64, 1/32, 1/16, 1/8, 1/4 and 1/2 of the original input resolution.

Each decoder applies four 2D convolutions to obtain a feature volume. This volume is both convolved by a final conv2D to obtain a depth map, and upsampled through bilinear upsampling. The upsampled volume is given to the next decoder in the pyramid.

To restore the original resolution, we can apply upsampling operations (e.g., bilinear upsampling) to the depth map.
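
As a quick check of these resolutions, the plain-Python sketch below lists the size of the depth map produced at each level and the factor needed to bring it back to the input size. The function name is made up, and the input size is assumed to be divisible by 64.

```python
def decoder_output_sizes(height, width, levels=6):
    """Map each pyramid level to (depth map height, depth map width, upsampling factor)."""
    sizes = {}
    for level in range(levels, 0, -1):  # from the bottom (L6) up to L1
        factor = 2 ** level             # L6 -> 1/64 ... L1 -> 1/2
        sizes[level] = (height // factor, width // factor, factor)
    return sizes


for level, (h, w, factor) in decoder_output_sizes(256, 512).items():
    print("L%d: %d x %d depth map, upsample by %d to restore the input size" % (level, h, w, factor))
```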

Why multiple predictions?

The key idea is that if we are using a powerful device we can process the full pyramid; otherwise, we can stop early!

You have to find the right trade-off between accuracy and computational time, depending on your application.

[Figure: RGB image and the depth maps predicted at half, quarter and eighth resolution]

References

● Deep Learning, I. Goodfellow, Y. Bengio and A. Courville, MIT Press, website

● Convolutional Neural Networks for Visual Recognition, Stanford, website

● Pattern Recognition and Machine Learning, C. Bishop, Springer Verlag

● Online TensorFlow examples, such as this repository

● TensorFlow Lite examples from the official website