Hybrids of generative and discriminative methods for machine learning
![Page 1: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/1.jpg)
MSRC Summer School - 30/06/2009
Cambridge – UK
Hybrids of generative and discriminative methods for machine learning
![Page 2: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/2.jpg)
Motivation
Generative models
• prior knowledge
• handle missing data such as labels
Discriminative models
• perform well at classification
However
• no straightforward way to combine them
![Page 3: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/3.jpg)
Content
Generative and discriminative methods
A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data
![Page 5: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/5.jpg)
Generative methods
Answer: “what does a cat look like? And a dog?” => joint distribution of data and labels
$x$ : data
$c$ : label
$\theta$ : parameters
![Page 6: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/6.jpg)
Generative methods
Objective function:
$G(\theta) = p(\theta)\, p(X, C \mid \theta)$
$G(\theta) = p(\theta) \prod_n p(x_n, c_n \mid \theta)$
1 reusable model per class, can deal with incomplete data
Example: GMMs
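The generative objective can be sketched in a few lines of Python. This is a minimal illustration only: it assumes one spherical Gaussian per class (a simplified stand-in for the GMMs mentioned on the slide), writes the parameters as `theta`, and uses a flat prior by default. The dictionary layout and all numeric values are hypothetical.

```python
import math

def log_spherical_gaussian(x, mean, var):
    """Log density of an isotropic Gaussian N(x | mean, var * I)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * d * math.log(2 * math.pi * var) - 0.5 * sq_dist / var

def log_joint(x, c, theta):
    """log p(x, c | theta) = log p(c | theta) + log p(x | c, theta)."""
    comp = theta[c]
    return math.log(comp["weight"]) + log_spherical_gaussian(x, comp["mean"], comp["var"])

def generative_log_objective(X, C, theta, log_prior=0.0):
    """log G(theta) = log p(theta) + sum_n log p(x_n, c_n | theta)."""
    return log_prior + sum(log_joint(x, c, theta) for x, c in zip(X, C))

# Hypothetical two-class model and two labelled points (illustration only).
theta = {
    0: {"weight": 0.5, "mean": [0.0, 0.0], "var": 1.0},
    1: {"weight": 0.5, "mean": [3.0, 0.0], "var": 1.0},
}
X = [[0.0, 0.0], [3.0, 0.0]]
C = [0, 1]
print(generative_log_objective(X, C, theta))
```

Because each class has its own component, a class model can be evaluated or reused independently of the others.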
![Page 7: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/7.jpg)
Example of generative model
![Page 8: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/8.jpg)
Discriminative methods
Answer: “is it a cat or a dog?” => posterior distribution of the labels
$x$ : data
$c$ : label
$\theta$ : parameters
![Page 9: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/9.jpg)
Discriminative methods
The objective function is
$D(\theta) = p(\theta)\, p(C \mid X, \theta)$
$D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$
Focus on regions of ambiguity, make faster predictions
Example: neural networks, SVMs
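As a rough sketch of the discriminative objective, the following Python snippet evaluates the log of D for a softmax linear classifier, which can be seen as the simplest possible neural network. The model choice, the parameter layout (one weight vector and bias per class, written `theta`), the flat prior and the data are all assumptions made for illustration.

```python
import math

def log_softmax(scores):
    """Numerically stable log of the softmax of a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def class_scores(x, theta):
    """One linear score per class: w_c . x + b_c."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in theta]

def discriminative_log_objective(X, C, theta, log_prior=0.0):
    """log D(theta) = log p(theta) + sum_n log p(c_n | x_n, theta)."""
    total = log_prior
    for x, c in zip(X, C):
        total += log_softmax(class_scores(x, theta))[c]
    return total

# Hypothetical parameters: (weight vector, bias) for each of two classes.
theta = [([1.0, 0.0], 0.0), ([-1.0, 0.0], 0.0)]
X = [[2.0, 0.5], [-2.0, 0.5]]
C = [0, 1]
print(discriminative_log_objective(X, C, theta))
```

Only the label posterior is modelled here; nothing in the objective describes how the inputs themselves are distributed.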
![Page 10: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/10.jpg)
Example of discriminative model
SVMs / NNs
![Page 11: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/11.jpg)
Generative versus discriminative
No effect of the double mode on the decision boundary
![Page 12: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/12.jpg)
Content
Generative and discriminative methods
A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data
![Page 13: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/13.jpg)
Semi-supervised learning
Few labelled data / lots of unlabelled data
Discriminative methods overfit, generative models only help classify if they are “good”
Need the modelling power of generative models together with good discriminative performance => hybrid models
![Page 14: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/14.jpg)
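Discriminative training
Bach et al., ICASSP 05
Discriminative objective function:
$D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$
Using a generative model:
$D(\theta) = p(\theta) \prod_n \frac{p(x_n, c_n \mid \theta)}{p(x_n \mid \theta)}$
$D(\theta) = p(\theta) \prod_n \frac{p(x_n, c_n \mid \theta)}{\sum_c p(x_n, c \mid \theta)}$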
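The normalisation step, dividing the joint by its sum over classes, can be sketched in Python as follows. The 1-D Gaussian class-conditional model, the tuple layout of `theta`, the flat prior and the numbers are assumptions chosen to keep the example short; the sum over classes is computed with log-sum-exp for numerical stability.

```python
import math

def log_joint(x, c, theta):
    """log p(x, c | theta); theta[c] = (class weight, mean, variance)."""
    weight, mean, var = theta[c]
    return (math.log(weight) - 0.5 * math.log(2 * math.pi * var)
            - 0.5 * (x - mean) ** 2 / var)

def log_posterior(x, c, theta):
    """log p(c | x, theta) = log p(x, c | theta) - log sum_c' p(x, c' | theta)."""
    logs = [log_joint(x, k, theta) for k in range(len(theta))]
    m = max(logs)
    log_marginal = m + math.log(sum(math.exp(v - m) for v in logs))
    return logs[c] - log_marginal

def discriminative_log_objective(X, C, theta, log_prior=0.0):
    """log D(theta) evaluated through the generative model."""
    return log_prior + sum(log_posterior(x, c, theta) for x, c in zip(X, C))

# Hypothetical symmetric two-class model (illustration only).
theta = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
posteriors = [math.exp(log_posterior(0.0, k, theta)) for k in range(2)]
print(posteriors)
print(discriminative_log_objective([-1.0, 1.0], [0, 1], theta))
```

The same parameters define both the joint and the posterior, so the generative model is trained with a discriminative criterion.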
![Page 15: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/15.jpg)
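Convex combination
Bouchard et al., COMPSTAT 04
Generative objective function:
$G(\theta) = p(\theta) \prod_n p(x_n, c_n \mid \theta)$
Discriminative objective function:
$D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$
Convex combination:
$\log L(\theta) = \alpha \log D(\theta) + (1 - \alpha) \log G(\theta)$
$\alpha \in [0, 1]$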
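A small Python sketch of the convex combination is given below. The mixing weight is called `alpha` here (an assumed name), the model is a hypothetical 1-D Gaussian per class with parameters `theta`, and the prior is taken to be flat so that only the likelihood terms appear.

```python
import math

def log_joint(x, c, theta):
    """log p(x, c | theta); theta[c] = (class weight, mean, variance)."""
    weight, mean, var = theta[c]
    return (math.log(weight) - 0.5 * math.log(2 * math.pi * var)
            - 0.5 * (x - mean) ** 2 / var)

def log_marginal(x, theta):
    """log p(x | theta) = log sum_c p(x, c | theta), via log-sum-exp."""
    logs = [log_joint(x, c, theta) for c in range(len(theta))]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def log_G(X, C, theta):
    """Generative term: sum_n log p(x_n, c_n | theta)."""
    return sum(log_joint(x, c, theta) for x, c in zip(X, C))

def log_D(X, C, theta):
    """Discriminative term: sum_n log p(c_n | x_n, theta)."""
    return sum(log_joint(x, c, theta) - log_marginal(x, theta) for x, c in zip(X, C))

def log_L(X, C, theta, alpha):
    """alpha * log D + (1 - alpha) * log G, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * log_D(X, C, theta) + (1.0 - alpha) * log_G(X, C, theta)

# Hypothetical model and data (illustration only).
theta = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
X = [-1.2, -0.4, 0.9, 1.5]
C = [0, 0, 1, 1]
for alpha in (0.0, 0.5, 1.0):
    print(alpha, log_L(X, C, theta, alpha))
```

Setting `alpha` to 0 recovers the generative objective and setting it to 1 recovers the discriminative one; intermediate values interpolate linearly in log space.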
![Page 16: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/16.jpg)
A principled hybrid model
![Page 17: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/17.jpg)
A principled hybrid model
![Page 18: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/18.jpg)
A principled hybrid model
![Page 19: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/19.jpg)
A principled hybrid model
![Page 20: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/20.jpg)
A principled hybrid model
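$\theta$ : posterior distribution of the labels
$\theta'$ : marginal distribution of the data
$\theta$ and $\theta'$ communicate through a prior
Hybrid objective function:
$L(\theta, \theta') = p(\theta, \theta') \prod_n p(c_n \mid x_n, \theta) \prod_n p(x_n \mid \theta')$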
![Page 21: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/21.jpg)
A principled hybrid model
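$\theta = \theta' \Rightarrow p(\theta, \theta') = p(\theta)\, \delta(\theta - \theta')$
$L(\theta, \theta') = p(\theta)\, \delta(\theta - \theta') \prod_n p(c_n \mid x_n, \theta) \prod_n p(x_n \mid \theta')$
$L(\theta) = G(\theta)$ (generative case)
$\theta \perp \theta' \Rightarrow p(\theta, \theta') = p(\theta)\, p(\theta')$
$L(\theta, \theta') = \left[ p(\theta) \prod_n p(c_n \mid x_n, \theta) \right] \left[ p(\theta') \prod_n p(x_n \mid \theta') \right]$
$L(\theta, \theta') = D(\theta)\, f(\theta')$ (discriminative case)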
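The generative case skips one step, which can be written out as a short derivation. This is a sketch under the assumption that the second parameter set (written $\theta'$ here) is eliminated by integrating against the delta function, and that both products run over the same data points.

```latex
\begin{align*}
L(\theta, \theta') &= p(\theta)\, \delta(\theta - \theta')
    \prod_n p(c_n \mid x_n, \theta) \prod_n p(x_n \mid \theta') \\
\int L(\theta, \theta')\, d\theta' &= p(\theta)
    \prod_n p(c_n \mid x_n, \theta)\, p(x_n \mid \theta) \\
&= p(\theta) \prod_n p(x_n, c_n \mid \theta) = G(\theta)
\end{align*}
```

The last line uses the product rule $p(c_n \mid x_n, \theta)\, p(x_n \mid \theta) = p(x_n, c_n \mid \theta)$.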
![Page 22: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/22.jpg)
A principled hybrid model
Anything in between – hybrid case
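Choice of prior:
$p(\theta, \theta') = p(\theta)\, \mathcal{N}(\theta' \mid \theta, \sigma(\alpha))$
$\alpha \to 0 \Rightarrow \sigma \to 0 \Rightarrow \theta = \theta'$
$\alpha \to 1 \Rightarrow \sigma \to \infty \Rightarrow \theta \perp \theta'$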
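To make the coupling prior concrete, here is a hedged Python sketch of the hybrid log objective. Several things are assumptions rather than content of the slides: the names `theta`, `theta_prime`, `alpha` and `sigma`; the mapping `sigma_from_alpha`, which is a hypothetical choice that merely reproduces the two stated limits; reading the last argument of the Gaussian as a variance; a flat p(theta); a unit-variance 1-D Gaussian per class with equal class weights; and letting unlabelled points enter only the marginal term.

```python
import math

def sigma_from_alpha(alpha):
    """Hypothetical mapping: sigma -> 0 as alpha -> 0, sigma -> inf as alpha -> 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return (alpha / (1.0 - alpha)) ** 2

def log_coupling_prior(theta, theta_prime, sigma):
    """log N(theta' | theta, sigma * I), with sigma treated as a variance."""
    d = len(theta)
    sq_dist = sum((a - b) ** 2 for a, b in zip(theta, theta_prime))
    return -0.5 * d * math.log(2 * math.pi * sigma) - 0.5 * sq_dist / sigma

def log_joint(x, c, means):
    """log p(x, c | means): equal class weights, unit-variance Gaussians."""
    return (-math.log(len(means)) - 0.5 * math.log(2 * math.pi)
            - 0.5 * (x - means[c]) ** 2)

def log_marginal(x, means):
    """log p(x | means) = log sum_c p(x, c | means), via log-sum-exp."""
    logs = [log_joint(x, c, means) for c in range(len(means))]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def hybrid_log_objective(X_lab, C_lab, X_all, theta, theta_prime, alpha):
    """log L(theta, theta') = log prior + discriminative term + marginal term."""
    sigma = sigma_from_alpha(alpha)
    disc = sum(log_joint(x, c, theta) - log_marginal(x, theta)
               for x, c in zip(X_lab, C_lab))
    gen = sum(log_marginal(x, theta_prime) for x in X_all)
    return log_coupling_prior(theta, theta_prime, sigma) + disc + gen

# Hypothetical data: four labelled points plus three unlabelled ones.
X_lab, C_lab = [-1.2, -0.4, 0.9, 1.5], [0, 0, 1, 1]
X_all = X_lab + [0.1, -0.8, 1.1]
tied = hybrid_log_objective(X_lab, C_lab, X_all, [-1.0, 1.0], [-1.0, 1.0], 0.05)
untied = hybrid_log_objective(X_lab, C_lab, X_all, [-1.0, 1.0], [-1.0, 1.5], 0.05)
print(tied, untied)
```

With a small `alpha` the prior is sharply peaked, so letting the two parameter sets drift apart is heavily penalised; with `alpha` close to 1 the prior is nearly flat and the two terms are optimised almost independently.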
![Page 23: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/23.jpg)
Why principled?
Consistent with the likelihood of graphical models
=> one way to train a system
Everything can now be modelled => potential to be Bayesian
Potential to learn $\alpha$
![Page 24: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/24.jpg)
Learning
EM / Laplace approximation / MCMC: either intractable or too slow
Conjugate gradients: flexible, easy to check, BUT sensitive to initialisation and slow
Variational inference
![Page 25: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/25.jpg)
Content
Generative and discriminative methods
A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data
![Page 26: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/26.jpg)
Toy example
![Page 27: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/27.jpg)
Toy example
2 elongated distributions
Only spherical Gaussians allowed => wrong model
2 labelled points per class => strong risk of overfitting
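A data set of this kind could be generated along the following lines. This is only a sketch of the set-up described above: the centres, standard deviations, sample sizes and seed are hypothetical values, since the slides do not give them.

```python
import random

def sample_elongated(n, center, rng, long_std=3.0, short_std=0.3):
    """Draw n 2-D points from an axis-aligned Gaussian stretched along the y-axis."""
    return [(center[0] + rng.gauss(0.0, short_std),
             center[1] + rng.gauss(0.0, long_std)) for _ in range(n)]

def make_toy_dataset(n_per_class=200, n_labelled_per_class=2, seed=0):
    """Return points, true labels, and the indices of the few labelled points."""
    rng = random.Random(seed)
    centers = [(-1.0, 0.0), (1.0, 0.0)]  # hypothetical class centres
    X, C, labelled = [], [], []
    for c, center in enumerate(centers):
        start = len(X)
        X.extend(sample_elongated(n_per_class, center, rng))
        C.extend([c] * n_per_class)
        # Keep labels for only a handful of points in each class.
        labelled.extend(rng.sample(range(start, start + n_per_class),
                                   n_labelled_per_class))
    return X, C, labelled

X, C, labelled = make_toy_dataset()
print(len(X), [(X[i], C[i]) for i in labelled])
```

Fitting spherical Gaussians to such data is deliberately a model mismatch, and with only two labels per class a purely discriminative fit has very little to constrain it.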
![Page 28: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/28.jpg)
Toy example
![Page 29: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/29.jpg)
Decision boundaries
![Page 30: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/30.jpg)
Content
Generative and discriminative methods
A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data
![Page 31: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/31.jpg)
A real example
Images are a special case, as they contain several features each
2 levels of supervision: at the image level, and at the feature level
• Image label only => weakly labelled
• Image label + segmentation => fully labelled
![Page 32: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/32.jpg)
The underlying generative model
gaussian
multinomial
multinomial
![Page 33: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/33.jpg)
The underlying generative model
weakly – fully labelled
![Page 34: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/34.jpg)
Experimental set-up
3 classes: bikes, cows, sheep
1 Gaussian per class => poor generative model
75 training images for each category
![Page 35: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/35.jpg)
HF framework
![Page 36: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/36.jpg)
HF versus CC
![Page 37: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/37.jpg)
Results
When increasing the proportion of fully labelled data, the trend is:
generative → hybrid → discriminative
Weakly labelled data has little influence on the trend
With sufficient fully labelled data, HF tends to perform better than CC
![Page 38: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/38.jpg)
Experimental set-up
3 classes: lions, tigers and cheetahs
1 Gaussian per class => poor generative model
75 training images for each category
![Page 39: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/39.jpg)
HF framework
![Page 40: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/40.jpg)
HF versus CC
![Page 41: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/41.jpg)
Results
Hybrid models consistently perform better
However, generative and discriminative models haven’t reached saturation
No clear difference between HF and CC
![Page 42: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/42.jpg)
Conclusion
Principled hybrid framework
Possibility to learn the best trade-off
Helps for ambiguous datasets when labelled data is scarce
Problem of optimisation
![Page 43: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/43.jpg)
Future avenues
Bayesian version (posterior distribution of $\alpha$) under study
Replace $\sigma$ by a diagonal matrix to allow flexibility => need for the Bayesian version
Choice of priors
![Page 44: Hybrids of generative and discriminative methods for machine learning](https://reader036.fdocuments.net/reader036/viewer/2022062322/56814fd3550346895dbd96ed/html5/thumbnails/44.jpg)
Thank you!