  • Hubert Curien Laboratory, UMR CNRS 5516

    University of Saint-Etienne

    Marc Sebban and Alain Tremeau

    Hubert Curien Laboratory, UMR CNRS 5516, University Jean Monnet - Saint-Etienne (France)

    Induction week - September 7-11, 2015

    Marc Sebban and Alain Tremeau (LaHC) LAHC September, 7-11 2015 1 / 22

  • Hubert Curien Laboratory (LAHC)

    Synopsis of the LaHC

    The LAHC (Head: Florent Pigeon) is a joint research unit of Jean Monnet University, Saint-Etienne, and the CNRS, created in 2006.

    100 researchers and research lecturers.

    25 engineers and administrative staff.

    100 PhD students and postdocs.

    With a total of more than 230 staff, the LAHC is the largest of all of Saint-Etienne's university research units.


  • Synopsis of the LaHC

    Two scientific departments

    Optics, Photonics & Hyper-frequencies (Head: A. Boukenter)

    Computer Science, Telecom & Image (Head: M. Sebban)


  • Project-Team Model

    Project-Team Model

    The Hubert Curien laboratory makes use of the notion of project-team:

    made up of 6-10 permanent staff,

    well defined scientific objectives,

    well defined lifecycles,

    own budget.


  • Project-Team Model

    5 Project-Teams in Computer Science

    Machine Learning

    Data Mining & Information Retrieval.

    Knowledge Representation

    Multi-agents systems

    Virtual Communities and Social Networks

    3 Project-Teams in Image Science and Computer Vision

    Optical Design and Image Reconstruction

    Macroscopic Modelisation of Images

    Image analysis and understanding

    with strong interdisciplinary collaborations in computer vision between Computer Science and Image Processing


  • Research activities in Machine Learning and Data Mining


  • Research activities in Machine Learning and Data Mining

    13 permanent members - 12 PhD students

    Permanent staff: Leonor Becerra, Marc Bernard, Catherine Combes, Elisa Fromont, Mathias Gery, Amaury Habrard, Francois Jacquenet, Baptiste Jeudy, Christine Largeron, Emilie Morvant, Marc Sebban, Remi Emonet, Michel Beigbeder

    PhD students: Irina Nicolae, Valentina Zantedeschi, Michael Perrot, Maria Batista, Romain Deville, Adrien Dulac, Damien Fourure, O. Benyahia, Jordan Frery, Guillaume Metzler, Riadhy Benamar, Anil Goyal


  • What is Machine Learning?

    Field of study that gives computers the ability to learn without being explicitly programmed.

    Machine learning explores the construction and study of algorithms that can learn from training data and make predictions on test data.
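
    As a minimal sketch of the train/predict split described above, the following uses a nearest-centroid classifier. The classifier choice, the function names (`fit_centroids`, `predict`) and the toy data are assumptions made for illustration; they do not come from the slides.

```python
import math

def fit_centroids(X_train, y_train):
    """Learn one mean vector (centroid) per class from labeled training data."""
    sums, counts = {}, {}
    for x, y in zip(X_train, y_train):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(centroids, x):
    """Predict the class whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

# Hypothetical toy data: two well-separated classes in 2D.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
           [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
y_train = ["a", "a", "a", "b", "b", "b"]
X_test = [[0.5, 0.5], [5.5, 5.5]]  # unseen points, no labels given

centroids = fit_centroids(X_train, y_train)              # learning step
predictions = [predict(centroids, x) for x in X_test]    # prediction step
print(predictions)
```

    Nothing about the two classes is hard-coded in `predict`: the decision rule is entirely determined by the centroids estimated from the training data.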


  • Some Machine Learning applications


  • Main research topics in Machine Learning in Saint-Etienne

    Main research topics

    Domain Adaptation and Transfer Learning

    Metric Learning

    Ensemble methods - Theory of Boosting - PAC Bayesian Theory


  • Domain Adaptation and Transfer Learning


  • Domain Adaptation and Transfer Learning

    Assumption in Machine Learning

    Training and test data must be in the same feature space and have the same distribution. Otherwise, transfer learning is required.


  • Domain Adaptation

    Definition

    Domain adaptation is a subfield of transfer learning that makes use of some labeled source (training) data and many unlabeled target (test) data.

    Typically, it requires moving the source and target distributions closer to each other.
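
    To give a feel for what "moving the two distributions closer" can mean, here is a deliberately simple sketch: each domain is standardized feature by feature using its own statistics, which needs no target labels. This per-domain standardization is a basic baseline chosen for illustration, not one of the methods developed by the team; the helper names (`standardize`, `align_domain`) and the data are hypothetical.

```python
import statistics

def standardize(column):
    """Shift one feature to zero mean and scale it to unit standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std if std > 0 else 0.0 for v in column]

def align_domain(X):
    """Standardize every feature of one domain with that domain's own statistics."""
    columns = list(zip(*X))
    return [list(row) for row in zip(*(standardize(c) for c in columns))]

# Hypothetical data: the same concept measured with a different offset and scale.
X_source = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]      # labeled in practice
X_target = [[101.0, 5.0], [102.0, 10.0], [103.0, 15.0]]  # unlabeled

Z_source = align_domain(X_source)
Z_target = align_domain(X_target)  # only unlabeled target data is used

def mean_of_feature(X, j):
    return statistics.fmean(row[j] for row in X)

# Gap between the domain means of feature 0, before and after alignment.
gap_before = abs(mean_of_feature(X_source, 0) - mean_of_feature(X_target, 0))
gap_after = abs(mean_of_feature(Z_source, 0) - mean_of_feature(Z_target, 0))
print(gap_before, gap_after)
```

    Matching only means and variances is far weaker than the strategies listed in the slides, but it shows the principle: a model trained on `Z_source` sees inputs on the same scale as `Z_target`.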


  • Matching between the main strategies in TL and our skills

    Sample bias

    Covariate shift

    Instance weighting

    Feature Representation

    Domain invariant features

    Latent features

    Iterative Models

    Self-training

    EM-based methods

    Statistical Learning Theory

    PAC-Bayesian Theory

    Boosting-based models

    Metric Learning

    Subspace Alignment

    Latent Pattern Mining


  • Metric Learning


  • How to discriminate between humans and dogs?

    Predicted label?


  • Limitations of standard metrics (e.g. Euclidean distance)

    It's not what it looks like...


  • Limitations of standard metrics

    Why are standard metrics unable to deal with such situations?

    $$d_2(x, x') = \sqrt{\sum_{i=1}^{d} (x_i - x'_i)^2}.$$

    These distances cannot take the ground truth into account. Therefore, they often fail to capture the idiosyncrasies of the data of interest.

    An important part of the research activity of the Machine Learning team in Saint-Etienne is dedicated to the design of new metric learning algorithms.
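
    The failure mode described above can be reproduced in a few lines. In this sketch, the points, their class labels and the assumption that only feature 0 is relevant to the class are all hypothetical; they are chosen so that the Euclidean distance is dominated by an irrelevant, large-scale feature.

```python
import math

def euclidean(x, y):
    """Standard Euclidean distance: every feature counts equally."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Hypothetical setting: feature 0 determines the class,
# feature 1 is irrelevant but has a much larger scale.
query = [0.0, 0.0]         # class "a"
same_class = [0.1, 50.0]   # class "a": close on feature 0, far on feature 1
other_class = [1.0, 1.0]   # class "b": far on feature 0, close on feature 1

d_same = euclidean(query, same_class)
d_other = euclidean(query, other_class)

# The point of the other class looks closer than the point of the same class.
print(d_same, d_other, d_same > d_other)
```

    The distance has no access to the labels, so it cannot know that feature 1 should be ignored. A nearest-neighbour rule based on it would assign the query to the wrong class.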


  • Metric Learning in a Nutshell

    Metric learning aims at optimizing parameterized distances like the Mahalanobis distance.

    It typically induces a change of representation space to satisfy constraints.


  • Very popular approach

    Find the matrix $M \in \mathbb{R}^{d \times d}$ of the Mahalanobis distance

    $$d_M(x, x') = \sqrt{(x - x')^T M (x - x')},$$

    such that $d_M^2$ best satisfies the constraints, where $M$ is positive semi-definite (PSD).

    Mahalanobis distance learning = Learning a linear projection

    Using a Cholesky decomposition, one can rewrite $M$ as $L^T L$:

    $$d_M(x, x') = \sqrt{(x - x')^T L^T L (x - x')} = \sqrt{(Lx - Lx')^T (Lx - Lx')}.$$

    A Mahalanobis distance implicitly corresponds to computing the Euclidean distance after the linear projection of the data by L.
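
    The equivalence stated above can be checked numerically. The sketch below builds M from a matrix L, then compares the Mahalanobis distance computed with M against the Euclidean distance between the projected points. The matrix L, the two points and the small helpers (`matvec`, `matmul`, `transpose`) are hypothetical, written in plain Python to keep the example self-contained.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    """Matrix-matrix product A B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(d * m for d, m in zip(diff, matvec(M, diff))))

# Hypothetical projection matrix L; M = L^T L is PSD by construction.
L = [[2.0, 0.0],
     [1.0, 0.5]]
M = matmul(transpose(L), L)

x, y = [1.0, 2.0], [3.0, -1.0]

d_direct = mahalanobis(x, y, M)                       # distance using M
d_projected = math.dist(matvec(L, x), matvec(L, y))   # Euclidean after projecting by L
print(d_direct, d_projected)
```

    Because any matrix of the form L^T L is PSD, learning L directly is one way to keep the PSD constraint satisfied without enforcing it explicitly.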


  • Illustration


    (Video: animation_lunes.mov)

  • Main scientific topics in MLDM

    Metric Learning: optimization of metrics to improve classification tasks.

    Domain Adaptation: transfer learning from a source domain to a target domain.

    Ensemble methods: Theory of Boosting - PAC Bayesian Theory

    Pattern Mining: temporal motif mining, social network mining.

    Internship proposals on these topics will soon be available!
