Action Classification: An Integration of Randomization and Discrimination in a Dense Feature Representation
Bangpeng Yao, Aditya Khosla, and Li Fei-Fei
Computer Science Department, Stanford University
{bangpeng,aditya86,feifeili}@cs.stanford.edu
Outline
• Action Classification & Intuition
• Our Method
• Our Results
• Conclusion
Action Classification
Example actions: Phoning, RidingBike, Running
Object classification: presence of parts and their spatial configurations [Lazebnik et al., 2006][Fergus et al., 2003]…
Action classification is challenging:
• All images contain humans;
• Objects are small or absent;
• Large pose variation and occlusion;
• Background clutter.
Our Intuition
Focus on image regions that contain the most discriminative information.
How to represent the features? Dense feature space
How to explore this feature space? Randomization & Discrimination
Dense Feature Space
[Figure: a normalized image and its image regions; each region is described by its center and its size (region width and region height).]
How can we identify the discriminative regions efficiently and effectively?
Image size: N×N
Image regions: O(N^6)
Random Forests (RF)
Apply randomization to sample a subset of image patches
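The sampling step can be illustrated with a minimal sketch. It assumes a region is described by its center and size in a normalized image (coordinates in [0, 1]), as in the dense feature space figure. The `Region` type, the `sample_region` helper, and the `min_size` parameter are hypothetical names chosen for illustration; the slides do not specify the sampling distribution, so uniform sampling is an assumption.

```python
import random
from typing import List, NamedTuple


class Region(NamedTuple):
    cx: float      # center x in the normalized image, in [0, 1]
    cy: float      # center y in the normalized image, in [0, 1]
    width: float   # region width as a fraction of the image width
    height: float  # region height as a fraction of the image height


def sample_region(rng: random.Random, min_size: float = 0.1) -> Region:
    """Draw one rectangular region that lies fully inside the unit square.

    Uniform sampling and min_size are assumptions made for this sketch.
    """
    width = rng.uniform(min_size, 1.0)
    height = rng.uniform(min_size, 1.0)
    # Pick the center so that the rectangle stays inside the image.
    cx = rng.uniform(width / 2, 1.0 - width / 2)
    cy = rng.uniform(height / 2, 1.0 - height / 2)
    return Region(cx, cy, width, height)


def sample_regions(n: int, seed: int = 0) -> List[Region]:
    """Sample n candidate regions instead of enumerating the whole space."""
    rng = random.Random(seed)
    return [sample_region(rng) for _ in range(n)]


if __name__ == "__main__":
    for region in sample_regions(3):
        print(region)
```

Each tree node would then only evaluate a handful of such candidates rather than every possible region.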
RF with discriminative classifiers
[Figure legend: this class, other classes.]
Generalization Ability of RF
• Generalization error of a RF is upper-bounded by ρ(1 − s²)/s²
  ρ: correlation between decision trees
  s: strength of the decision trees
• Dense feature space: ρ decreases
• Discriminative classifiers: s increases
Better generalization
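As a small numeric illustration, the sketch below evaluates Breiman's upper bound on the generalization error of a random forest, ρ(1 − s²)/s², where ρ is the correlation between trees and s is their strength. The function name `rf_error_bound` and the sample values are hypothetical; they only show the direction of the effect, not measurements from this work.

```python
def rf_error_bound(rho: float, s: float) -> float:
    """Breiman's upper bound on random forest generalization error.

    rho: mean correlation between decision trees.
    s:   strength of the decision trees, with 0 < s <= 1.
    """
    if not 0.0 < s <= 1.0:
        raise ValueError("strength s must be in (0, 1]")
    return rho * (1.0 - s ** 2) / s ** 2


if __name__ == "__main__":
    baseline = rf_error_bound(rho=0.5, s=0.5)
    less_correlated = rf_error_bound(rho=0.3, s=0.5)  # lower correlation
    stronger = rf_error_bound(rho=0.5, s=0.7)         # higher strength
    print(baseline, less_correlated, stronger)
```

Lowering ρ or raising s both shrink the bound, which is the argument for combining a dense feature space with discriminative node classifiers.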
RF with Discriminative Classifiers
Train a binary SVM
BoW or SPM of SIFT-LLC features
[Figure: classes 1 to 5 with binary labels 0, 1, 1, 1, 0.]
Choose the split with the biggest information gain
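The selection by information gain can be sketched as follows. This is an illustrative example, not the authors' code: each candidate split is represented by a hypothetical list of booleans, `goes_left`, standing in for the binary SVM decision on every training sample at the node.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the class labels at a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels: Sequence[Hashable],
                     goes_left: Sequence[bool]) -> float:
    """Entropy reduction obtained by sending samples left or right."""
    left = [y for y, g in zip(labels, goes_left) if g]
    right = [y for y, g in zip(labels, goes_left) if not g]
    n = len(labels)
    return (entropy(labels)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))


def best_split(labels: Sequence[Hashable],
               candidates: List[Sequence[bool]]) -> int:
    """Index of the candidate split with the largest information gain."""
    gains = [information_gain(labels, c) for c in candidates]
    return max(range(len(gains)), key=gains.__getitem__)


if __name__ == "__main__":
    labels = ["Phoning", "Phoning", "Running", "Running"]
    candidates = [
        [True, False, True, False],  # mixes both classes on each side
        [True, True, False, False],  # separates the two classes
    ]
    print(best_split(labels, candidates))
```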
• We stop growing the tree if:
  - The maximum depth is reached;
  - There is only one class at the node;
  - The entropy of the training data at the node is low.
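The three stopping conditions above can be sketched as a single check. The threshold values (`max_depth=10`, `entropy_threshold=0.2`) and the function name `should_stop` are assumptions for illustration; the slides do not give the actual settings.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the class labels at a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def should_stop(labels: Sequence[Hashable],
                depth: int,
                max_depth: int = 10,              # assumed value
                entropy_threshold: float = 0.2    # assumed value
                ) -> bool:
    """Return True if the node should become a leaf."""
    if depth >= max_depth:          # the maximum depth is reached
        return True
    if len(set(labels)) <= 1:       # only one class at the node
        return True
    # The entropy of the training data at the node is low.
    return entropy(labels) < entropy_threshold


if __name__ == "__main__":
    print(should_stop(["Phoning", "Running"], depth=0))
    print(should_stop(["Phoning", "Phoning"], depth=0))
```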
Classification With RF
[Equation: the predicted class label is computed from the outputs of all trees; annotated terms: number of trees, class label.]
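The equation on this slide did not survive extraction, so the sketch below shows one common way to combine trees, stated as an assumption: each tree returns a class distribution for the test image, the distributions are averaged over the number of trees, and the class label with the highest average is returned. The `classify` function and the example distributions are hypothetical.

```python
from typing import Dict, List


def classify(tree_posteriors: List[Dict[str, float]]) -> str:
    """Average per-tree class distributions and return the top class label.

    tree_posteriors holds one {class label: probability} dict per tree.
    """
    if not tree_posteriors:
        raise ValueError("at least one tree is required")
    num_trees = len(tree_posteriors)
    totals: Dict[str, float] = {}
    for posterior in tree_posteriors:
        for label, prob in posterior.items():
            totals[label] = totals.get(label, 0.0) + prob
    return max(totals, key=lambda label: totals[label] / num_trees)


if __name__ == "__main__":
    forest_output = [
        {"Phoning": 0.7, "Running": 0.3},
        {"Phoning": 0.4, "Running": 0.6},
        {"Phoning": 0.6, "Running": 0.4},
    ]
    print(classify(forest_output))
```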
Results on VOC 2011 Actions
Action               Others’ Best   Our Method
Jumping              71.6           66.0
Phoning              50.7           41.0
Playing instrument   77.5           60.0
Reading              37.8           41.5
Riding bike          88.8           90.0
Riding horse         90.2           92.1
Running              87.9           86.6
Taking photo         25.7           28.8
Using computer       58.9           62.0
Walking              59.5           65.9
Our method ranks first in six out of ten classes.
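The six-out-of-ten count can be checked directly from the table. The short script below copies the numbers from the results table and lists the classes where the "Our Method" score is higher than the "Others' Best" score; the variable names are only for illustration.

```python
# (others' best, our method) per action, copied from the results table.
results = {
    "Jumping": (71.6, 66.0),
    "Phoning": (50.7, 41.0),
    "Playing instrument": (77.5, 60.0),
    "Reading": (37.8, 41.5),
    "Riding bike": (88.8, 90.0),
    "Riding horse": (90.2, 92.1),
    "Running": (87.9, 86.6),
    "Taking photo": (25.7, 28.8),
    "Using computer": (58.9, 62.0),
    "Walking": (59.5, 65.9),
}

# Classes where our method scores higher than the best competing entry.
wins = [action for action, (others, ours) in results.items() if ours > others]

if __name__ == "__main__":
    print(len(wins), "of", len(results), "classes:", ", ".join(wins))
```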
Generalization Ability of RF
• Dense feature space: tree correlation decreases
• Discriminative classifiers: tree strength increases
Better generalization
[Figure: mean average precision (y-axis, 0.2 to 0.7) against number of trees (x-axis, 0 to 400) for three settings: dense feature, weak classifier; SPM feature, strong classifier; dense feature, strong classifier. Results on PASCAL VOC 2010.]
Dense feature vs. SPM feature (spatial pyramid)
Strong classifier (train discriminative SVM classifiers) vs. weak classifier (generate feature weights randomly)
Conclusion
• Exploring dense image features can benefit action classification;
• Combining randomization and discrimination is an effective way to explore the dense image representation;
• Our method achieves very good performance based on only one type of image descriptor;
• Code will be available soon.
Reference:
Bangpeng Yao, Aditya Khosla, and Li Fei-Fei. “Combining Randomization and Discrimination for Fine-Grained Image Categorization.” CVPR 2011.
Acknowledgement
Thanks to Su Hao, Olga Russakovsky, and Carsten Rother.