Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)?...
Transcript of Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)?...
![Page 1: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/1.jpg)
images/logos
Deep multi-task learning with evolving weightsMachine learning - computer vision
published in European Symposium on Artificial Neural Networks (ESANN 2016)
Soufiane Belharbi Romain Hérault Clément Chatelain Sébastien Adam
LITIS lab., Apprentissage team - INSA de Rouen, France
JDD, Le Havre. 14 June, 2016
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights
![Page 2: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/2.jpg)
images/logos
Introduction
Machine learning
What is machine learning (ML)?
ML is programming computers (algorithms) to optimize aperformance criterion using example data or past experience.
Learning a task
Learn general models from data to perform a specific task f .
fw : x −→ y
x: inputy: output (target, label)w: parameters of ff (x;w) = y
From training to predicting the future: Learn to predict1 Train the model using data examples (x,y)2 Predict the ynew for the new coming xnew
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 1/29
![Page 3: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/3.jpg)
images/logos
Introduction
Machine learning
What is machine learning (ML)?
ML is programming computers (algorithms) to optimize aperformance criterion using example data or past experience.
Learning a task
Learn general models from data to perform a specific task f .
fw : x −→ y
x: inputy: output (target, label)w: parameters of ff (x;w) = y
From training to predicting the future: Learn to predict1 Train the model using data examples (x,y)2 Predict the ynew for the new coming xnew
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 1/29
![Page 4: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/4.jpg)
images/logos
Introduction
Machine learning
What is machine learning (ML)?
ML is programming computers (algorithms) to optimize aperformance criterion using example data or past experience.
Learning a task
Learn general models from data to perform a specific task f .
fw : x −→ y
x: inputy: output (target, label)w: parameters of ff (x;w) = y
From training to predicting the future: Learn to predict1 Train the model using data examples (x,y)2 Predict the ynew for the new coming xnew
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 1/29
![Page 5: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/5.jpg)
images/logos
Introduction
Machine learning applicationsFace detection/recognitionImage classificationHandwriting recognition(postal address recognition, signature verification,writer verification, historical document analysis (DocExplorehttp://www.docexplore.eu))Speech recognition, Voice synthesizingNatural language processing (sentiment/intent analysis, statistical machinetranslation, Question answering (Watson), Text understanding/summarizing,text generation)Anti-virus, anti-spamWeather forecastFraud detection at banksMail targeting/advertisingPricing insurance premiumsPredicting house prices in real estate companiesWin-tasting ratingsSelf-driving cars, Autonomous robotsFactory Maintenance diagnosticsDeveloping pharmaceutical drugs (combinatorial chemistry)Predicting tastes in music (Pandora)Predicting tastes in movies/shows (Netflix)Search engines (Google)Predicting interests (Facebook)Web exploring (sites like this one)Biometrics (finger prints, iris)Medical analysis (image segmentation, disease detection from symptoms)Advertisements/Recommendations engines, predicting other books/productsyou may like (Amazon)Computational neuroscience, bioinformatics/computational biology, geneticsContent (image, video, text) categorizationSuspicious activity detectionFrequent pattern mining (super-market)Satellite/astronomical image analysis
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 2/29
![Page 6: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/6.jpg)
images/logos
Introduction
ML in physics
Event detection at CERN (The European Organization for Nuclear Research)
⇒ Use ML models to determine the probability of the eventbeing of interest.⇒ Higgs Boson Machine Learning Challenge(https://www.kaggle.com/c/higgs-boson)
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 3/29
![Page 7: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/7.jpg)
images/logos
Introduction
ML in quantum chemistry
Computing the electronic density of a molecule⇒ Instead of using physics laws, use ML (FAST).
See Stéphane Mallat et al. work: https://matthewhirn.files.wordpress.com/2016/01/hirn_pasc15.pdf
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 4/29
![Page 8: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/8.jpg)
images/logos
Function estimation
How to estimate fw?
ModelsParametric (w) vs. non-parametricEstimate fw = train the model using dataTraining: supervised (use (x,y)) vs. unsupervised (use only x)Training = optimizing an objective cost
Different models to learn fwKernel models (support vector machine (SVM))Decision treeRandom forestLinear regressionK-nearest neighborGraphical models
Bayesian networksHidden Markov Models (HMM)Conditional Random Fields (CRF)
Neural networks (Deep learning): DNN, CNN, RBM, DBN, RNN.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 5/29
![Page 9: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/9.jpg)
images/logos
Function estimation
How to estimate fw?
ModelsParametric (w) vs. non-parametricEstimate fw = train the model using dataTraining: supervised (use (x,y)) vs. unsupervised (use only x)Training = optimizing an objective cost
Different models to learn fwKernel models (support vector machine (SVM))Decision treeRandom forestLinear regressionK-nearest neighborGraphical models
Bayesian networksHidden Markov Models (HMM)Conditional Random Fields (CRF)
Neural networks (Deep learning): DNN, CNN, RBM, DBN, RNN.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 5/29
![Page 10: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/10.jpg)
images/logos
Deep neural network
Deep neural networks (DNN)x1
x2
x3
x4
x5
y1
y2
State of the art in many task: computer vision, natual languageprocessing.Training requires large dataTo speed up the training: use GPUs cardsTraining deep neural networks is difficult⇒ Vanishing gradient⇒ More parameters⇒ Need more data
Some solutions:⇒ Pre-training technique [Y.Bengio et al. 06, G.E.Hinton et al. 06]⇒ Use unlabeled data
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 6/29
![Page 11: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/11.jpg)
images/logos
Deep neural network
Deep neural networks (DNN)x1
x2
x3
x4
x5
y1
y2
State of the art in many task: computer vision, natual languageprocessing.Training requires large dataTo speed up the training: use GPUs cardsTraining deep neural networks is difficult⇒ Vanishing gradient⇒ More parameters⇒ Need more data
Some solutions:⇒ Pre-training technique [Y.Bengio et al. 06, G.E.Hinton et al. 06]⇒ Use unlabeled data
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 6/29
![Page 12: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/12.jpg)
images/logos
Deep neural network
Deep neural networks (DNN)x1
x2
x3
x4
x5
y1
y2
State of the art in many task: computer vision, natual languageprocessing.Training requires large dataTo speed up the training: use GPUs cardsTraining deep neural networks is difficult⇒ Vanishing gradient⇒ More parameters⇒ Need more data
Some solutions:⇒ Pre-training technique [Y.Bengio et al. 06, G.E.Hinton et al. 06]⇒ Use unlabeled data
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 6/29
![Page 13: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/13.jpg)
images/logos
Context
Semi-supervised learning
General case:
Data = { labeled data︸ ︷︷ ︸expensive (money, time), few
,unlabeled data︸ ︷︷ ︸cheap, abundant
}
E.g: medical images
⇒ semi-supervised learning:
Exploit unlabeled data to improve the generalization
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 7/29
![Page 14: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/14.jpg)
images/logos
Context
Pre-training and semi-supervised learning
The pre-training technique can exploit the unlabeled data
A sequential transfer learning performed in 2 steps:1 Unsupervised task (x labeled and unlabeled data)2 Supervised task ( (x,y) labeled data)
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 8/29
![Page 15: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/15.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
y1
y2
A DNN to train
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 9/29
![Page 16: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/16.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
x1
x2
x3
x4
x5
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 17: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/17.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
h1,1
h1,2
h1,3
h1,4
h1,5
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 18: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/18.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
h1,1
h1,2
h1,3
h1,4
h1,5
h1,1
h1,2
h1,3
h1,4
h1,5
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 19: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/19.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
h2,1
h2,2
h2,3
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 20: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/20.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
h2,1
h2,2
h2,3
h2,1
h2,2
h2,3
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 21: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/21.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
h3,1
h3,2
h3,3
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
![Page 22: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/22.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 10/29
1) Step 1: Unsupervised layer-wise training
Train layer by layer sequentially using only x (labeled or unlabeled)
At each layer:⇒ When to stop training?
⇒ What hyper-parameters to use?
⇒ How to make sure that the training improves the supervised task?
![Page 23: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/23.jpg)
images/logos
Pre-training technique and semi-supervised learning
Layer-wise pre-training: auto-encoders
x1
x2
x3
x4
x5
y1
y2
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 11/29
2) Step 2: Supervised training
Train the whole network using (x, y)
Back-propagation
![Page 24: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/24.jpg)
images/logos
Pre-training technique and semi-supervised learning
Pre-training technique: Pros and cons
ProsImprove generalizationCan exploit unlabeled dataProvide better initialization than randomTrain deep networks⇒ Circumvent the vanishing gradient problem
ConsAdd more hyper-parametersNo good stopping criterion during pre-training phase
Good criterion for the unsupervised taskBut
May not be good for the supervised task
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 12/29
![Page 25: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/25.jpg)
images/logos
Pre-training technique and semi-supervised learning
Pre-training technique: Pros and cons
ProsImprove generalizationCan exploit unlabeled dataProvide better initialization than randomTrain deep networks⇒ Circumvent the vanishing gradient problem
ConsAdd more hyper-parametersNo good stopping criterion during pre-training phase
Good criterion for the unsupervised taskBut
May not be good for the supervised task
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 12/29
![Page 26: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/26.jpg)
images/logos
Pre-training technique and semi-supervised learning
Proposed solution
Why is it difficult in practice?
⇒ Sequential transfer learning
Possible solution:
⇒ Parallel transfer learning
Why in parallel?
Interaction between tasksReduce the number of hyper-parameters to tuneProvide one stopping criterion
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 13/29
![Page 27: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/27.jpg)
images/logos
Proposed approach
Parallel transfer learning: Tasks combination
Train cost = supervised task + unsupervised task︸ ︷︷ ︸reconstruction
l labeled samples, u unlabeled samples, wsh : shared parameters.
Reconstruction (auto-encoder) task:
Jr (D;w′ = {wsh,wr}) =l+u∑i=1
Cr (R(xi ;w′),xi) .
Supervised task:
Js(D;w = {wsh,ws}) =l∑
i=1
Cs(M(xi ;w),yi) .
Weighted tasks combination
J (D; {wsh,ws,wr}) = λs · Js(D; {wsh,ws}) + λr · Jr (D; {wsh,wr}) .
λs, λr ∈ [0, 1]: importance weight, λs + λr = 1.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 14/29
![Page 28: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/28.jpg)
images/logos
Proposed approach
Tasks combination with evolving weightsWeighted tasks combination:
J (D; {wsh,ws,wr}) = λs · Js(D; {wsh,ws}) + λr · Jr (D; {wsh,wr}) .
λs, λr ∈ [0, 1]: importance weight, λs + λr = 1.
ProblemsHow to fix λs, λr ?At the end of the training, only Js should matters
Tasks combination with evolving weights (our contribution)
J (D; {wsh,ws,wr}) = λs(t) · Js(D; {wsh,ws}) + λr (t) · Jr (D; {wsh,wr}) .
t : learning epochs, λs(t), λr (t) ∈ [0, 1]: importance weight, λs(t) + λr (t) = 1.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 15/29
![Page 29: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/29.jpg)
images/logos
Proposed approach
Tasks combination with evolving weightsWeighted tasks combination:
J (D; {wsh,ws,wr}) = λs · Js(D; {wsh,ws}) + λr · Jr (D; {wsh,wr}) .
λs, λr ∈ [0, 1]: importance weight, λs + λr = 1.
ProblemsHow to fix λs, λr ?At the end of the training, only Js should matters
Tasks combination with evolving weights (our contribution)
J (D; {wsh,ws,wr}) = λs(t) · Js(D; {wsh,ws}) + λr (t) · Jr (D; {wsh,wr}) .
t : learning epochs, λs(t), λr (t) ∈ [0, 1]: importance weight, λs(t) + λr (t) = 1.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 15/29
![Page 30: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/30.jpg)
images/logos
Proposed approach
Tasks combination with evolving weightsWeighted tasks combination:
J (D; {wsh,ws,wr}) = λs · Js(D; {wsh,ws}) + λr · Jr (D; {wsh,wr}) .
λs, λr ∈ [0, 1]: importance weight, λs + λr = 1.
ProblemsHow to fix λs, λr ?At the end of the training, only Js should matters
Tasks combination with evolving weights (our contribution)
J (D; {wsh,ws,wr}) = λs(t) · Js(D; {wsh,ws}) + λr (t) · Jr (D; {wsh,wr}) .
t : learning epochs, λs(t), λr (t) ∈ [0, 1]: importance weight, λs(t) + λr (t) = 1.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 15/29
![Page 31: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/31.jpg)
images/logos
Proposed approach
Tasks combination with evolving weights
J (D; {wsh,ws,wr}) = λs(t)·Js(D; {wsh,ws})+λr (t)·Jr (D; {wsh,wr}) .
start
0
0.2
0.4
0.6
0.8
1
Exponential schedule
Impo
rtan
cew
eigh
ts
t : Train epochs
λr (t)λs(t)
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 16/29
{λr (t) = exp(−t
σ) , σ : slope
λs(t) = 1− λr (t)
![Page 32: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/32.jpg)
images/logos
Proposed approach
Optimization using gradient descent (GD)
wt ← wt−1 − ∂J (D;w)∂w
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 17/29
![Page 33: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/33.jpg)
images/logos
Proposed approach
Optimization using gradient descent (GD)
wt ← wt−1 − ∂J (D;w)∂w
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 18/29
![Page 34: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/34.jpg)
images/logos
Proposed approach
Tasks combination with evolving weights: Optimization
Tasks combination with evolving weights (our contribution)
J (D; {wsh,ws,wr}) = λs(t)·Js(D; {wsh,ws})+λr (t)·Jr (D; {wsh,wr}) .
t : learning epochs, λs(t), λr (t) ∈ [0, 1]: importance weight, λs(t) + λr (t) = 1.
Algorithm 1 Training our model for one epoch
1: D is the shuffled training set. B a mini-batch.2: for B in D do3: Make a gradient step toward Jr using B (update w′)4: Bs ⇐ labeled examples of B,5: Make a gradient step toward Js using Bs (update w)6: end for
[R.Caruana 97, J.Weston 08, R.Collobert 08, Z.Zhang 15]
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 19/29
![Page 35: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/35.jpg)
images/logos
Proposed approach
Overview of the model
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 20/29
![Page 36: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/36.jpg)
images/logos
Results
Experimental protocol
Objective: Compare Training DNN using different approaches:
No pre-training (base-line)With pre-training (Stairs schedule)Parallel transfer learning (proposed approach)
Studied evolving weights schedules:
0
1Stairs (Pre-training) Linear
start t10
1Linear until t1
start
Exponential
Impo
rtan
cew
eigh
ts
t : Train epochs
λrλs
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 21/29
![Page 37: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/37.jpg)
images/logos
Results
MNIST dataset: digits dataset
Train set: 60 000 samples.Test set: 10 000 samples.
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 22/29
![Page 38: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/38.jpg)
images/logos
Results
Experimental protocol
Task: Classification (MNIST)Number of hidden layers K : 1, 2, 3, 4.Optimization:
Epochs: 5000Batch size: 600Options: No regularization, No adaptive learning rate
Hyper-parameters of the evolving schedules:t1: 100 σ: 40
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 23/29
![Page 39: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/39.jpg)
images/logos
Results
Shallow networks: (K = 1, l = 1E2)
0 1E+03 2E+03 5E+03 1E+04 2E+04 4E+04 49900
Size of unlabeled data (u)
28.0
28.5
29.0
29.5
30.0
30.5
31.0
31.5
32.0
32.5
Calssification error MNIST test (%)
Evaluation of the eloving weight schedules (size of labeled data l=100), K=1
baseline
stairs100
lin100
lin
exp40
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 24/29
![Page 40: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/40.jpg)
images/logos
Results
Shallow networks: (K = 1, l = 1E3)
0 1E+03 2E+03 5E+03 1E+04 2E+04 4E+04 49900
Size of unlabeled data (u)
11.0
11.5
12.0
12.5
13.0
13.5
14.0
14.5
Calssification error MNIST test (%)
Evaluation of the eloving weight schedules (size of labeled data l=1000), K=1
baseline
stairs100
lin100
lin
exp40
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 25/29
![Page 41: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/41.jpg)
images/logos
Results
Deep networks: exponential schedule (l = 1E3)
0 1E+03 2E+03 5E+03 1E+04 2E+04 4E+04 49900
Size of unlabeled data (u)
10.0
10.5
11.0
11.5
12.0
12.5
13.0
Calssifica
tion e
rror M
NIS
T test
(%
)
Evaluation of the exp40 eloving weight schedule (size of labeled data l=1000)
K=2
K=3
K=4
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 26/29
![Page 42: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/42.jpg)
images/logos
Conclusion and perspectives
Conclusion
An alternative method to the pre-training.Parallel transfer learning with evolving weights
Improve generalization easily.Reduce the number of hyper-parameters (t1, σ)
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 27/29
![Page 43: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/43.jpg)
images/logos
Conclusion and perspectives
Perspectives
Evolve the importance weight according to thetrain/validation error.Explore other evolving schedules (toward automaticschedule)Adjust the learning rate: Adadelta, Adagrad, RMSPropExtension to structured output problems
Train cost = supervised task+ Input unsupervised task+ Output unsupervised task
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 28/29
![Page 44: Deep multi-task learning with evolving weights · Machine learning What is machine learning (ML)? ML is programming computers (algorithms) to optimize a performance criterion using](https://reader034.fdocuments.net/reader034/viewer/2022042404/5f1b8ea7ebd9e038c656755c/html5/thumbnails/44.jpg)
images/logos
Conclusion and perspectives
Questions
Thank you for your attention,
Questions?
LITIS lab., Apprentissage team - INSA de Rouen, France Deep multi-task learning with evolving weights 29/29