Reconnaissance d’objets et vision artificielle
![Page 1: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/1.jpg)
Reconnaissance d’objets et vision artificielle (Object Recognition and Computer Vision)
http://www.di.ens.fr/willow/teaching/recvis09
Lecture 7
• A bit more on neural nets
• Optimization methods
• Part-based object models
![Page 2: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/2.jpg)
Convolutional Nets
Yann LeCun
The Courant Institute of Mathematical Sciences
New York University
http://yann.lecun.com
![Page 3: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/3.jpg)
An Old Idea for Local Shift Invariance
[Hubel & Wiesel 1962]:
• simple cells detect local features
• complex cells “pool” the outputs of simple cells within a retinotopic neighborhood.
[Figure: “simple cells” (multiple convolutions) feed “complex cells” (pooling/subsampling); both stages are retinotopic feature maps.]
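The simple-cell / complex-cell idea can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the toy image, the 1x2 edge kernel, and the use of max pooling are choices made for this sketch (the slides speak of pooling/subsampling in general), and the helper names `correlate2d_valid` and `max_pool` are hypothetical.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image (simple cells)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size):
    """Non-overlapping max pooling (complex cells); incomplete border blocks are dropped."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy input (assumption): a bright vertical bar, and the same bar shifted by one pixel.
kernel = np.array([[1.0, -1.0]])      # responds to a bright pixel followed by a dark one
image = np.zeros((4, 9))
image[:, 2] = 1.0
shifted = np.roll(image, 1, axis=1)

simple_a = correlate2d_valid(image, kernel)
simple_b = correlate2d_valid(shifted, kernel)
complex_a = max_pool(simple_a, 4)
complex_b = max_pool(simple_b, 4)

print(np.array_equal(simple_a, simple_b))    # simple-cell maps move with the input
print(np.array_equal(complex_a, complex_b))  # pooled maps agree for this small shift
```

The invariance is only local: a shift that carries the feature across a pooling-block boundary changes the pooled output as well.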
![Page 4: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/4.jpg)
The Multistage Hubel-Wiesel Architecture
Building a complete artificial vision system:
• Stack multiple stages of simple cells / complex cells layers
• Higher stages compute more global, more invariant features
• Stick a classification layer on top
[Fukushima 1971-1982] neocognitron
[LeCun 1988-2007] convolutional net
[Poggio 2002-2006] HMAX
[Ullman 2002-2006] fragment hierarchy
[Lowe 2006] HMAX
QUESTION: How do we find (or learn) the filters?
![Page 5: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/5.jpg)
Convolutional Net Architecture
input: 1@32x32
→ 5x5 convolution → Layer 1: 6@28x28
→ 2x2 pooling/subsampling → Layer 2: 6@14x14
→ 5x5 convolution → Layer 3: 12@10x10
→ 2x2 pooling/subsampling → Layer 4: 12@5x5
→ 5x5 convolution → Layer 5: 100@1x1
→ Layer 6: 10
Convolutional net for handwriting recognition (400,000 synapses)
• Convolutional layers (simple cells): all units in a feature plane share the same weights
• Pooling/subsampling layers (complex cells): for invariance to small distortions.
• Supervised gradient-descent learning using back-propagation
• The entire network is trained end-to-end. All the layers are trained simultaneously.
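The feature-map sizes on this slide follow from simple arithmetic, which the sketch below reproduces. It assumes 'valid' convolutions with stride 1 and non-overlapping pooling, which is what the sizes on the slide imply; the helper `trace_shapes` and the layer encoding are hypothetical names introduced here.

```python
def trace_shapes(input_size, layers):
    """Return the spatial size after each layer.

    Each layer is ("conv", k) for a k x k 'valid' convolution with stride 1,
    or ("pool", s) for non-overlapping s x s pooling/subsampling.
    """
    sizes = [input_size]
    size = input_size
    for kind, k in layers:
        if kind == "conv":
            size = size - k + 1
        elif kind == "pool":
            if size % k != 0:
                raise ValueError(f"size {size} is not divisible by pool size {k}")
            size = size // k
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        sizes.append(size)
    return sizes

# Handwriting-recognition net from this slide.
lenet_layers = [("conv", 5), ("pool", 2), ("conv", 5), ("pool", 2), ("conv", 5)]
print(trace_shapes(32, lenet_layers))   # [32, 28, 14, 10, 5, 1]

# Stereo object-recognition net shown later in the lecture.
norb_layers = [("conv", 5), ("pool", 4), ("conv", 6), ("pool", 3), ("conv", 6)]
print(trace_shapes(96, norb_layers))    # [96, 92, 23, 18, 6, 1]
```

The same rule explains why the last 5x5 convolution yields 1x1 maps: it covers the whole 5x5 input, so Layer 5 behaves like a fully connected layer on a 32x32 input but stays convolutional on larger images.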
![Page 6: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/6.jpg)
MNIST Handwritten Digit Dataset
Handwritten Digit Dataset MNIST: 60,000 training samples, 10,000 test samples
![Page 7: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/7.jpg)
Results on MNIST Handwritten Digits

| CLASSIFIER | DEFORMATION | PREPROCESSING | ERROR (%) | Reference |
|---|---|---|---|---|
| linear classifier (1-layer NN) | | none | 12.00 | LeCun et al. 1998 |
| linear classifier (1-layer NN) | | deskewing | 8.40 | LeCun et al. 1998 |
| pairwise linear classifier | | deskewing | 7.60 | LeCun et al. 1998 |
| K-nearest-neighbors (L2) | | none | 3.09 | Kenneth Wilder, U. Chicago |
| K-nearest-neighbors (L2) | | deskewing | 2.40 | LeCun et al. 1998 |
| K-nearest-neighbors (L2) | | deskew, clean, blur | 1.80 | Kenneth Wilder, U. Chicago |
| K-NN L3, 2 pixel jitter | | deskew, clean, blur | 1.22 | Kenneth Wilder, U. Chicago |
| K-NN, shape context matching | | shape context feature | 0.63 | Belongie et al. IEEE PAMI 2002 |
| 40 PCA + quadratic classifier | | none | 3.30 | LeCun et al. 1998 |
| 1000 RBF + linear classifier | | none | 3.60 | LeCun et al. 1998 |
| K-NN, Tangent Distance | | subsamp 16x16 pixels | 1.10 | LeCun et al. 1998 |
| SVM, Gaussian Kernel | | none | 1.40 | |
| SVM deg 4 polynomial | | deskewing | 1.10 | LeCun et al. 1998 |
| Reduced Set SVM deg 5 poly | | deskewing | 1.00 | LeCun et al. 1998 |
| Virtual SVM deg-9 poly | Affine | none | 0.80 | LeCun et al. 1998 |
| V-SVM, 2-pixel jittered | | none | 0.68 | DeCoste and Scholkopf, MLJ 2002 |
| V-SVM, 2-pixel jittered | | deskewing | 0.56 | DeCoste and Scholkopf, MLJ 2002 |
| 2-layer NN, 300 HU, MSE | | none | 4.70 | LeCun et al. 1998 |
| 2-layer NN, 300 HU, MSE | Affine | none | 3.60 | LeCun et al. 1998 |
| 2-layer NN, 300 HU | | deskewing | 1.60 | LeCun et al. 1998 |
| 3-layer NN, 500+150 HU | | none | 2.95 | LeCun et al. 1998 |
| 3-layer NN, 500+150 HU | Affine | none | 2.45 | LeCun et al. 1998 |
| 3-layer NN, 500+300 HU, CE, reg | | none | 1.53 | Hinton, unpublished, 2005 |
| 2-layer NN, 800 HU, CE | | none | 1.60 | Simard et al., ICDAR 2003 |
| 2-layer NN, 800 HU, CE | Affine | none | 1.10 | Simard et al., ICDAR 2003 |
| 2-layer NN, 800 HU, MSE | Elastic | none | 0.90 | Simard et al., ICDAR 2003 |
| 2-layer NN, 800 HU, CE | Elastic | none | 0.70 | Simard et al., ICDAR 2003 |
| Convolutional net LeNet-1 | | subsamp 16x16 pixels | 1.70 | LeCun et al. 1998 |
| Convolutional net LeNet-4 | | none | 1.10 | LeCun et al. 1998 |
| Convolutional net LeNet-5 | | none | 0.95 | LeCun et al. 1998 |
| Conv. net LeNet-5 | Affine | none | 0.80 | LeCun et al. 1998 |
| Boosted LeNet-4 | Affine | none | 0.70 | LeCun et al. 1998 |
| Conv. net, CE | Affine | none | 0.60 | Simard et al., ICDAR 2003 |
| Conv. net, CE | Elastic | none | 0.40 | Simard et al., ICDAR 2003 |

![Page 8: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/8.jpg)
Some Results on MNIST (from raw images: no preprocessing)

| CLASSIFIER | DEFORMATION | ERROR (%) | Reference |
|---|---|---|---|
| Knowledge-free methods (a fixed permutation of the pixels would make no difference) | | | |
| 2-layer NN, 800 HU, CE | | 1.60 | Simard et al., ICDAR 2003 |
| 3-layer NN, 500+300 HU, CE, reg | | 1.53 | Hinton, in press, 2005 |
| SVM, Gaussian Kernel | | 1.40 | Cortes 92 + Many others |
| Convolutional nets | | | |
| Convolutional net LeNet-5 | | 0.80 | Ranzato et al. NIPS 2006 |
| Convolutional net LeNet-6 | | 0.70 | Ranzato et al. NIPS 2006 |
| Training set augmented with Affine Distortions | | | |
| 2-layer NN, 800 HU, CE | Affine | 1.10 | Simard et al., ICDAR 2003 |
| Virtual SVM deg-9 poly | Affine | 0.80 | Scholkopf |
| Convolutional net, CE | Affine | 0.60 | Simard et al., ICDAR 2003 |
| Training set augmented with Elastic Distortions | | | |
| 2-layer NN, 800 HU, CE | Elastic | 0.70 | Simard et al., ICDAR 2003 |
| Convolutional net, CE | Elastic | 0.40 | Simard et al., ICDAR 2003 |

Note: some groups have obtained good results with various amounts of preprocessing, such as deskewing (e.g. 0.56% using an SVM with smart kernels [deCoste and Schoelkopf]) or hand-designed feature representations (e.g. 0.63% with “shape context” and nearest neighbor [Belo
![Page 9: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/9.jpg)
Face Detection and Pose Estimation: Results
![Page 10: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/10.jpg)
Face Detection with a Convolutional Net
![Page 11: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/11.jpg)
Applying a ConvNet on Sliding Windows is Very Cheap!
[Figure: a network with a 96x96 input window, replicated over a 120x120 input, produces a 3x3 output.]
• Traditional detectors/classifiers must be applied to every location on a large input image, at multiple scales.
• Convolutional nets can be replicated over large images very cheaply.
• The network is applied to multiple scales spaced by 1.5.
![Page 12: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/12.jpg)
Building a Detector/Recognizer: Replicated Convolutional Nets
Computational cost for replicated convolutional net:
• 96x96 -> 4.6 million multiply-accumulate operations
• 120x120 -> 8.3 million multiply-accumulate operations
• 240x240 -> 47.5 million multiply-accumulate operations
• 480x480 -> 232 million multiply-accumulate operations
Computational cost for a non-convolutional detector of the same size, applied every 12 pixels:
• 96x96 -> 4.6 million multiply-accumulate operations
• 120x120 -> 42.0 million multiply-accumulate operations
• 240x240 -> 788.0 million multiply-accumulate operations
• 480x480 -> 5,083 million multiply-accumulate operations
[Figure: 96x96 window, 12 pixel shift, 84x84 overlap]
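The non-convolutional costs can be roughly checked by counting window positions. The sketch below assumes a 96x96 window moved in 12-pixel steps with no padding, and takes the 4.6 million figure as the cost of one window; `num_windows` is a hypothetical helper, and the products are estimates rather than the slide's exact numbers.

```python
def num_windows(image_size, window=96, stride=12):
    """Number of window positions on a square image (no padding)."""
    per_axis = (image_size - window) // stride + 1
    return per_axis * per_axis

COST_PER_WINDOW_M = 4.6  # million multiply-accumulates per 96x96 window (from the slide)

for size in (96, 120, 240, 480):
    n = num_windows(size)
    print(f"{size}x{size}: {n} windows, about {n * COST_PER_WINDOW_M:.1f} million MACs")
```

The window counts are 1, 9, 169 and 1089, and the resulting estimates land within about 2% of the listed costs. The replicated convolutional net is far cheaper because the 84x84 overlap between neighbouring windows is computed once and shared, instead of being recomputed for every window.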
![Page 13: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/13.jpg)
Generic Object Detection and Recognition with Invariance to Pose and Illumination
• 50 toys belonging to 5 categories: animal, human figure, airplane, truck, car
• 10 instances per category: 5 instances used for training, 5 instances for testing
• Raw dataset: 972 stereo pairs of each object instance. 48,600 image pairs total.
For each instance:
• 18 azimuths: 0 to 340 degrees every 20 degrees
• 9 elevations: 30 to 70 degrees from horizontal every 5 degrees
• 6 illuminations: on/off combinations of 4 lights
• 2 cameras (stereo): 7.5 cm apart, 40 cm from the object
[Figure: training instances and test instances]
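The dataset totals quoted above follow from the sampling grid. A quick check, assuming azimuths start at 0 degrees and step by 20, and elevations run from 30 to 70 degrees in steps of 5 (variable names are illustrative):

```python
azimuths = list(range(0, 360, 20))     # 0, 20, ..., 340 degrees
elevations = list(range(30, 75, 5))    # 30, 35, ..., 70 degrees
illuminations = 6
instances = 5 * 10                     # 5 categories x 10 instances each

pairs_per_instance = len(azimuths) * len(elevations) * illuminations
total_pairs = pairs_per_instance * instances

print(len(azimuths), len(elevations))  # 18 9
print(pairs_per_instance)              # 972
print(total_pairs)                     # 48600
```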
![Page 14: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/14.jpg)
Data Collection, Sample Generation
Image capture setup
Objects are painted green so that:
- all features other than shape are removed
- objects can be segmented, transformed, and composited onto various backgrounds
[Figure panels: original image, object mask, shadow factor, composite image]
![Page 15: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/15.jpg)
Textured and Cluttered Datasets
![Page 16: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/16.jpg)
Convolutional Network
Stereo input: 2@96x96
→ 5x5 convolution (16 kernels) → Layer 1: 8@92x92
→ 4x4 subsampling → Layer 2: 8@23x23
→ 6x6 convolution (96 kernels) → Layer 3: 24@18x18
→ 3x3 subsampling → Layer 4: 24@6x6
→ 6x6 convolution (2400 kernels) → Layer 5: 100
→ fully connected (500 weights) → Layer 6: 5
• 90,857 free parameters, 3,901,162 connections.
• The architecture alternates convolutional layers (feature detectors) and subsampling layers (local feature pooling for invariance to small distortions).
• The entire network is trained end-to-end (all the layers are trained simultaneously).
• A gradient-based algorithm is used to minimize a supervised loss function.
![Page 17: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/17.jpg)
Alternated Convolutions and Subsampling
[Figure: “simple cells” (multiple convolutions) followed by “complex cells” (averaging subsampling)]
• Local features are extracted everywhere.
• The averaging/subsampling layer builds robustness to variations in feature locations.
• Hubel/Wiesel'62, Fukushima'71, LeCun'89, Riesenhuber & Poggio'02, Ullman'02, ...
![Page 18: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/18.jpg)
Normalized-Uniform Set: Error Rates
• Linear Classifier on raw stereo images: 30.2% error.
• K-Nearest-Neighbors on raw stereo images: 18.4% error.
• K-Nearest-Neighbors on PCA-95: 16.6% error.
• Pairwise SVM on 96x96 stereo images: 11.6% error.
• Pairwise SVM on 95 Principal Components: 13.3% error.
• Convolutional Net on 96x96 stereo images: 5.8% error.
[Figure: training instances and test instances]
![Page 19: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/19.jpg)
Jittered-Cluttered Dataset
• Jittered-Cluttered Dataset: 291,600 stereo pairs for training, 58,320 for testing
• Objects are jittered: position, scale, in-plane rotation, contrast, brightness, backgrounds, distractor objects, ...
• Input dimension: 98x98x2 (approx 18,000)
![Page 20: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/20.jpg)
Experiment 2: Jittered-Cluttered Dataset
• 291,600 training samples, 58,320 test samples
• SVM with Gaussian kernel: 43.3% error
• Convolutional Net with binocular input: 7.8% error
• Convolutional Net + SVM on top: 5.9% error
• Convolutional Net with monocular input: 20.8% error
• Smaller mono net (DEMO): 26.0% error
• Dataset available from http://www.cs.nyu.edu/~yann
![Page 21: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/21.jpg)
What's wrong with K-NN and SVMs?
• K-NN and SVM with Gaussian kernels are based on matching global templates
• Both are “shallow” architectures
• There is no way to learn invariant recognition tasks with such naïve architectures (unless we use an impractically large number of templates).
• The number of necessary templates grows exponentially with the number of dimensions of variations.
• Global templates are in trouble when the variations include: category, instance shape, configuration (for articulated object), position, azimuth, elevation, scale, illumination, texture, albedo, in-plane rotation, background luminance, background texture, background clutter, ...
[Figure: Input → Global Template Matchers (each training sample is a template) → Features (similarities) → Linear Combinations → Output]
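A back-of-the-envelope count makes the exponential growth concrete. As an illustration only (these numbers are assumptions of this note, not figures from the slides), suppose each of $d$ independent dimensions of variation must be covered by $k$ sampled values, with one template per combination:

$$N_{\text{templates}} = k^{d}, \qquad \text{e.g. } k = 10,\ d = 6 \ \Rightarrow\ N_{\text{templates}} = 10^{6}.$$

Each additional dimension of variation multiplies the count by another factor of $k$.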
![Page 22: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/22.jpg)
Examples (Monocular Mode)
![Page 23: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/23.jpg)
Examples (Monocular Mode)
![Page 24: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/24.jpg)
Examples (Monocular Mode)
![Page 25: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/25.jpg)
Examples (Monocular Mode)
![Page 26: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/26.jpg)
Examples (Monocular Mode)
![Page 27: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/27.jpg)
Examples (Monocular Mode)
![Page 28: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/28.jpg)
Natural Images (Monocular Mode)
![Page 29: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/29.jpg)
Supervised Convolutional Nets: Pros and Cons
• Convolutional nets can be trained to perform a wide variety of visual tasks.
• Global supervised gradient descent can produce parsimonious architectures
BUT: they require lots of labeled training samples
• 60,000 samples for handwriting
• 120,000 samples for face detection
• 25,000 to 350,000 for object recognition
Since low-level features tend to be non task specific, we should be able to learn them unsupervised.
Hinton has shown that layer-by-layer unsupervised “pre-training” can be used to initialize “deep” architectures [Hinton & Salakhutdinov, Science 2006]
Can we use this idea to reduce the number of necessary labeled examples?
![Page 30: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/30.jpg)
![Page 31: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/31.jpg)
or How can bad optimization be good in large-scale settings
See http://leon.bottou.org/slides/largescale/lstut.pdf
![Page 32: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/32.jpg)
![Page 33: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/33.jpg)
![Page 34: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/34.jpg)
![Page 35: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/35.jpg)
![Page 36: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/36.jpg)
![Page 37: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/37.jpg)
![Page 38: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/38.jpg)
![Page 39: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/39.jpg)
![Page 40: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/40.jpg)
![Page 41: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/41.jpg)
![Page 42: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/42.jpg)
![Page 43: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/43.jpg)
![Page 44: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/44.jpg)
statistical estimation rate
![Page 45: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/45.jpg)
![Page 46: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/46.jpg)
![Page 47: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/47.jpg)
![Page 48: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/48.jpg)
![Page 49: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/49.jpg)
![Page 50: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/50.jpg)
![Page 51: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/51.jpg)
Summary
![Page 52: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/52.jpg)
![Page 53: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/53.jpg)
![Page 54: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/54.jpg)
![Page 55: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/55.jpg)
![Page 56: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/56.jpg)
Generative part-based models
Many slides adapted from Svetlana Lazebnik, Fei-Fei Li, Rob Fergus, and Antonio Torralba
[Figure: Fischler & Elschlager’73]
![Page 57: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/57.jpg)
Bayesian approach
• Model: $P_\theta(f \mid c)$
• Learn the model by maximizing the likelihood of the training data:
$$\max_\theta \sum_{k=1}^{n} \log P_\theta(f_k \mid c)$$
• Recognize using Bayes rule:
$$P_\theta(c \mid f) = P_\theta(f \mid c)\, P(c) / P(f)$$
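A minimal numerical sketch of the two steps, under assumptions that are not in the slides: features are one-dimensional, each class-conditional model $P_\theta(f \mid c)$ is a Gaussian (so the maximum-likelihood parameters are the sample mean and variance), and the training values and priors below are made up for illustration.

```python
import numpy as np

def fit_gaussian_ml(samples):
    """Maximum-likelihood fit of a 1D Gaussian: sample mean and (biased) variance."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.var()

def log_likelihood(f, mean, var):
    """log P_theta(f | c) for a 1D Gaussian with theta = (mean, var)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (f - mean) ** 2 / var)

def posterior(f, params, priors):
    """Bayes rule: P(c | f) = P(f | c) P(c) / P(f), with P(f) = sum_c P(f | c) P(c)."""
    log_joint = np.array([log_likelihood(f, m, v) + np.log(p)
                          for (m, v), p in zip(params, priors)])
    log_joint -= log_joint.max()   # subtract the max for numerical stability
    joint = np.exp(log_joint)
    return joint / joint.sum()

# Hypothetical training features for two classes.
train = {0: [0.9, 1.1, 1.0, 1.2, 0.8], 1: [2.9, 3.2, 3.0, 3.1, 2.8]}
params = [fit_gaussian_ml(train[c]) for c in (0, 1)]
priors = [0.5, 0.5]

post = posterior(1.3, params, priors)
print(post.argmax())   # index of the most probable class for f = 1.3
```

Working in the log domain avoids underflow when likelihoods are tiny, which matters once the features are high-dimensional.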
![Page 58: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/58.jpg)
R. Fergus, P. Perona and A. Zisserman, Object Class Recognition by Unsupervised Scale-Invariant Learning, CVPR 2003
![Page 59: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/59.jpg)
Probabilistic model
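$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object})$$
$$= \max_h P(\text{appearance} \mid h, \text{object})\; p(\text{shape} \mid h, \text{object})\; p(h \mid \text{object})$$
[Figure: candidate parts, with part descriptors and part locations]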
![Page 60: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/60.jpg)
Probabilistic model
$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object})$$
$$= \max_h P(\text{appearance} \mid h, \text{object})\; p(\text{shape} \mid h, \text{object})\; p(h \mid \text{object})$$
h: assignment of features to parts
[Figure: features assigned to Part 1, Part 2, Part 3]
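The maximization over h can be sketched as an exhaustive search over assignments of candidate features to parts. This is a simplified illustration, not the model of Fergus et al.: it assumes isotropic Gaussian appearance and shape terms, a uniform $p(h \mid \text{object})$, no occluded parts, and shape measured as offsets from part 0. All names and toy values are hypothetical.

```python
import itertools
import numpy as np

def log_gauss(x, mean, var):
    """Log density of an isotropic Gaussian, summed over the components of x."""
    x, mean = np.asarray(x, float), np.asarray(mean, float)
    return float(np.sum(-0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)))

def best_assignment(descriptors, locations, part_app_means, part_offsets,
                    app_var=1.0, shape_var=1.0):
    """Brute-force max over h, where h[p] is the feature index assigned to part p.

    score(h) = log P(appearance | h) + log p(shape | h); p(h | object) is assumed
    uniform, so it does not change the arg max.
    """
    n_features, n_parts = len(descriptors), len(part_app_means)
    best_h, best_score = None, -np.inf
    for h in itertools.permutations(range(n_features), n_parts):
        app = sum(log_gauss(descriptors[f], part_app_means[p], app_var)
                  for p, f in enumerate(h))
        origin = np.asarray(locations[h[0]], float)   # part 0 anchors the shape
        shape = sum(log_gauss(np.asarray(locations[f], float) - origin,
                              part_offsets[p], shape_var)
                    for p, f in enumerate(h) if p > 0)
        if app + shape > best_score:
            best_h, best_score = h, app + shape
    return best_h, best_score

# Hypothetical toy data: 4 candidate features, 3 parts.
descriptors = [[0.0], [5.0], [10.0], [7.0]]
locations = [[2.0, 2.0], [4.0, 2.0], [2.0, 5.0], [9.0, 9.0]]
part_app_means = [[0.0], [5.0], [10.0]]
part_offsets = [None, [2.0, 0.0], [0.0, 3.0]]   # offsets of parts 1 and 2 from part 0

h, score = best_assignment(descriptors, locations, part_app_means, part_offsets)
print(h)   # features 0, 1, 2 match parts 0, 1, 2 exactly in this toy example
```

With N candidate features and P parts the search visits on the order of N^P assignments, which is why exhaustive search is only practical for a small number of parts.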
![Page 61: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/61.jpg)
![Page 62: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/62.jpg)
Probabilistic model
$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\; p(\text{shape} \mid h, \text{object})\; p(h \mid \text{object})$$
Distribution over patch descriptors
High-dimensional appearance space
![Page 63: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/63.jpg)
Probabilistic model
$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\; p(\text{shape} \mid h, \text{object})\; p(h \mid \text{object})$$
Distribution over joint part positions
2D image space
![Page 64: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/64.jpg)
Results: Faces
[Figure panels: face shape model, patch appearance model, recognition results]
![Page 65: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/65.jpg)
Results: Motorbikes and airplanes
![Page 66: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/66.jpg)
Note: The Fergus part-based model is very rigid
(Schmid & Mohr, 1996) (Lowe, 1999) (Fergus, Perona & Zisserman, 2003)
![Page 67: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/67.jpg)
Model Learning as Multi-Image Segmentation
(Lazebnik, Schmid, Ponce, BMVC’04)
Practical approach: two-image matching followed by validation
[Figure labels: initial pair, candidate part, validation set]
![Page 68: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/68.jpg)
Model ≡ loose assembly of parts
(Lazebnik, Ponce, Schmid, ICCV’05)
Part ≡ rigid assembly of features
(Fergus et al., 2003)
![Page 69: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/69.jpg)
![Page 70: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/70.jpg)
![Page 71: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/71.jpg)
![Page 72: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/72.jpg)
![Page 73: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/73.jpg)
(Gaston, Grimson, & Lozano-Perez, 1982; Ayache & Faugeras; Faugeras & Hebert, 1983; Huttenlocher, 1987)
![Page 74: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/74.jpg)
![Page 75: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/75.jpg)
Discriminative approach
• Model: $P_\theta(c \mid f)$
• Learn the model by maximizing the likelihood of the training data: $\max_\theta \sum_{k=1}^{n} \log P_\theta(c_k \mid f_k)$
• Recognize by maximizing the posterior probability of the class: $\max_c P_\theta(c \mid f)$
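As an illustration of the discriminative recipe, here is a minimal sketch that assumes a two-class logistic model on a scalar feature (the slides do not fix the form of the model). It maximizes the conditional log-likelihood by plain gradient ascent; the function names, learning rate, step count, and toy data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, labels, lr=0.5, steps=2000):
    """Gradient ascent on (1/n) sum_k log P_theta(c_k | f_k).

    Assumed model: P_theta(c = 1 | f) = sigmoid(w * f + b), theta = (w, b).
    """
    w, b = 0.0, 0.0
    n = len(features)
    for _ in range(steps):
        # Gradient of the log-likelihood: sum_k (c_k - P(c = 1 | f_k)) * [f_k, 1]
        grad_w = sum((c - sigmoid(w * f + b)) * f for f, c in zip(features, labels))
        grad_b = sum(c - sigmoid(w * f + b) for f, c in zip(features, labels))
        w += lr * grad_w / n
        b += lr * grad_b / n
    return w, b

def classify(f, w, b):
    """Recognition: return the class with the larger posterior P_theta(c | f)."""
    return 1 if sigmoid(w * f + b) >= 0.5 else 0

# Hypothetical, deliberately overlapping toy data (f = 1.5 and f = 2.0 are swapped).
features = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
labels = [0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(features, labels)
low, high = classify(0.0, w, b), classify(4.0, w, b)
```

Unlike the generative approach, nothing here models how the features themselves are distributed; only the class boundary is learned.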
![Page 76: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/76.jpg)
Complete Object Recognition System (ICCV’05)
[Figure: system diagram. Learning stage: training pairs, matching, candidate parts, validation images, response scores, part dictionary, classifier. Testing stage: test image, part detection, response vector, decision.]
![Page 77: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/77.jpg)
UIUC Bird Database
• 50 training images per class:
–20 initial images (50 largest candidate parts retained);
–30 validation (20 highest-scoring parts retained).
• 50 test images per class.
• 100 total.
Overall classification rate: 92.33%
Bag of features (Zhang et al., 2005): 83%
![Page 78: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/78.jpg)
Model ≡ locally rigid assembly of parts
Part ≡ locally rigid assembly of features
(Kushal, Schmid, Ponce, 2006)
A first attempt at handling:
• changes in viewpoint
• nonrigid shape
• noncharacteristic texture
![Page 79: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/79.jpg)
Model ≡ locally rigid assembly of parts
Part ≡ locally rigid assembly of features
(Kushal, Schmid, Ponce, 2006)
[Figure labels: base images, validation images]
A first attempt at handling:
• changes in viewpoint
• nonrigid shape
• noncharacteristic texture
![Page 80: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/80.jpg)
Model ≡ locally rigid assembly of parts
Part ≡ locally rigid assembly of features
(Kushal, Schmid, Ponce, 2006)
A first attempt at handling:
• changes in viewpoint
• nonrigid shape
• noncharacteristic texture
![Page 81: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/81.jpg)
Model ≡ locally rigid assembly of parts
Part ≡ locally rigid assembly of features
Qualitative experiments on Pascal VOC’07 (Kushal, Schmid, Ponce, 2008)
![Page 82: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/82.jpg)
Model ≡ locally rigid assembly of parts
Part ≡ locally rigid assembly of features
Quantitative experiments on Pascal VOC’07 (Kushal, Schmid, Ponce, 2008)
![Page 83: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/83.jpg)
Color histograms (S&B’91)
Local jets (Florack’93)
Spin images (J&H’99)
SIFT (Lowe’99)
Shape contexts (B&M’95)
Texton histograms (L&M’97)
Gist (O&T’05)
Spatial pyramids (LSP’06)
HOG (D&T’06)
PHOG (B&Z’07)
Convolutional nets (LC’70)
![Page 84: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/84.jpg)
Locally orderless structure of images (K&vD’99)
![Page 85: Reconnaissance d’objets et vision et vision ... · K-NN L3, 2 pixel jitter deskew, clean, blur 1.22 Kenneth Wilder, U. Chicago K-NN, shape context matching shape context feature](https://reader033.fdocuments.net/reader033/viewer/2022042320/5f09e1bf7e708231d428f31f/html5/thumbnails/85.jpg)
Felzenszwalb, McAllester, Ramanan (2007)
[Wins on 6 of the Pascal’07 classes; see Chum & Zisserman (2007) for the other big winner.]