Hubert Curien Laboratory, UMR CNRS 5516
University of Saint-Etienne
Marc Sebban and Alain Tremeau
Hubert Curien Laboratory, UMR CNRS 5516, University Jean Monnet - Saint-Etienne (France)
Induction week - September 7-11, 2015

Hubert Curien Laboratory (LAHC)
Synopsis of the LaHC
The LAHC (Head: Florent Pigeon) is a joint research unit, created in 2006, of Jean Monnet University, Saint-Etienne, and the CNRS.
100 researchers and research lecturers.
25 engineers and administrative staff.
100 PhD students and postdocs.
With more than 230 staff in total, the LAHC is the largest of all the research poles of the University of Saint-Etienne.

Synopsis of the LaHC
Two scientific departments
Optics, Photonics & Hyper-frequencies (Head: A. Boukenter)
Computer Science, Telecom & Image (Head: M. Sebban)

Project-Team Model
The Hubert Curien laboratory makes use of the notion of project-team:
made up of 6-10 permanent staff,
well defined scientific objectives,
well defined lifecycles,
own budget.

Project-Team Model
5 Project-Teams in Computer Science
Machine Learning
Data Mining & Information Retrieval
Knowledge Representation
Multi-agent systems
Virtual Communities and Social Networks
3 Project-Teams in Image Science and Computer Vision
Optical Design and Image Reconstruction
Macroscopic Modelisation of Images
Image analysis and understanding
with strong interdisciplinary collaborations in computer vision between Computer Science and Image Processing

Research activities in Machine Learning and Data Mining
13 permanent members - 12 PhD students
Permanent staff: Leonor Becerra, Marc Bernard, Catherine Combes, Elisa Fromont, Mathias Gery, Amaury Habrard, Francois Jacquenet, Baptiste Jeudy, Christine Largeron, Emilie Morvant, Marc Sebban, Remi Emonet, Michel Beigbeder
PhD students: Irina Nicolae, Valentina Zantedeschi, Michael Perrot, Maria Batista, Romain Deville, Adrien Dulac, Damien Fourure, O. Benyahia, Jordan Frery, Guillaume Metzler, Riadhy Benamar, Anil Goyal

What is Machine Learning?
Field of study that gives computers the ability to learn without being explicitly programmed.
Machine learning explores the construction and study of algorithms that can learn from training data and make predictions on test data.

Some Machine Learning applications

Main research topics in Machine Learning in Saint-Etienne
Main research topics
Domain Adaptation and Transfer Learning
Metric Learning
Ensemble methods - Theory of Boosting - PAC Bayesian Theory

Assumption in Machine Learning
Training and test data must be in the same feature space and have the same distribution. Otherwise, transfer learning is required.

Domain Adaptation
Definition
Domain adaptation is a subfield of transfer learning that makes use of some labeled source (training) data and many unlabeled target (test) data. Typically, it requires moving the source and target distributions closer to each other.

Matching between the main strategies in TL and our skills
Sample bias
Covariate shift
Instance weighting
Feature Representation
Domain invariant features
Latent features
Iterative Models
Self-training
EM-based methods
Statistical Learning Theory
PAC-Bayesian Theory
Boosting-based models
Metric Learning
Subspace Alignment
Latent Pattern Mining

Metric Learning

How to discriminate between humans and dogs?
Predicted label?

Limitations of standard metrics (e.g. Euclidean distance)
It's not what it looks like...

Limitations of standard metrics
Why are standard metrics unable to deal with such situations?
\[ d_2(x, x') = \sqrt{\sum_{i=1}^{d} (x_i - x'_i)^2}. \]
These distances cannot take the ground truth into account. Therefore, they often fail to capture the idiosyncrasies of the data of interest.
An important part of the research activity of the Machine Learning team in Saint-Etienne is dedicated to the design of new metric learning algorithms.
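As a small illustration of why the Euclidean distance can mislead, here is a minimal sketch using only the Python standard library. The toy points, labels, and feature scales are hypothetical and were chosen for illustration only; they do not come from the slides. Feature 0 is assumed to carry the label, while feature 1 is assumed to be an uninformative measurement on a much larger scale.

```python
import math

def euclidean(x, x_prime):
    """d_2(x, x') = sqrt(sum_i (x_i - x'_i)^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prime)))

# Hypothetical training set: feature 0 determines the class,
# feature 1 is noise measured on a larger scale.
X_train = [[0.0, 50.0],   # class 0
           [1.0, 10.0]]   # class 1
y_train = [0, 1]

# Feature 0 of the query (0.1) is much closer to the class-0 example.
query = [0.1, 12.0]

distances = [euclidean(query, x) for x in X_train]
nearest = distances.index(min(distances))
predicted = y_train[nearest]  # 1-nearest-neighbour prediction
```

On this toy data the nearest neighbour is picked by the noisy feature, so the query is assigned to class 1 although its informative feature points to class 0. The distance has no access to the labels, which is the gap metric learning is meant to close.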

Metric Learning in a Nutshell
Metric learning aims at optimizing parameterized distances such as the Mahalanobis distance.
It typically induces a change of representation space in order to satisfy constraints.

Very popular approach
Find the matrix $M \in \mathbb{R}^{d \times d}$ of the Mahalanobis distance
\[ d_M(x, x') = \sqrt{(x - x')^T M (x - x')}, \]
such that $d_M^2$ best satisfies the constraints, where $M$ is PSD (positive semidefinite).
Mahalanobis distance learning = Learning a linear projection
Using the Cholesky decomposition, one can rewrite $M$ as $L^T L$:
\[ d_M(x, x') = \sqrt{(x - x')^T L^T L (x - x')} = \sqrt{(Lx - Lx')^T (Lx - Lx')}. \]
A Mahalanobis distance implicitly corresponds to computing the Euclidean distance after the linear projection of the data by L.
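The equivalence between the Mahalanobis distance and the Euclidean distance after projection by L can be checked numerically. The following is a minimal sketch in pure Python, not the team's implementation: the matrix M and the two points are hypothetical values, and the helper names (`cholesky_lower`, `matvec`, `transpose`) are introduced here for illustration. The simple Cholesky routine assumes M is strictly positive definite; a merely PSD matrix would need a more careful factorization.

```python
import math

def cholesky_lower(M):
    """Return lower-triangular C with M = C C^T (M symmetric positive definite)."""
    d = len(M)
    C = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                C[i][j] = math.sqrt(M[i][i] - s)
            else:
                C[i][j] = (M[i][j] - s) / C[j][j]
    return C

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def mahalanobis(x, x_prime, M):
    """d_M(x, x') = sqrt((x - x')^T M (x - x'))."""
    diff = [a - b for a, b in zip(x, x_prime)]
    return math.sqrt(sum(a * b for a, b in zip(diff, matvec(M, diff))))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical positive definite matrix and points, for illustration only.
M = [[4.0, 2.0],
     [2.0, 3.0]]
x, x_prime = [1.0, 2.0], [3.0, -1.0]

# With C lower triangular and M = C C^T, taking L = C^T gives M = L^T L.
L = transpose(cholesky_lower(M))

d_direct = mahalanobis(x, x_prime, M)                        # uses M directly
d_projected = euclidean(matvec(L, x), matvec(L, x_prime))    # Euclidean after projection by L
```

Both quantities agree up to floating-point error, which is why learning M can be read as learning a linear projection L followed by the ordinary Euclidean distance.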

Illustration
[Video: animation_lunes.mov]

Main scientific topics in MLDM
Metric Learning: optimization of metrics to improve classification tasks.
Domain Adaptation: transfer learning from a source domain to a target domain.
Ensemble methods: Theory of Boosting - PAC Bayesian Theory
Pattern Mining: temporal motif mining, social network mining.
Internship proposals on these topics will be available soon!
