Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept....
![Page 1: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/1.jpg)
Recognizing Human Actions by Attributes
CVPR2011
Jingen Liu, Benjamin Kuipers, Silvio Savarese
Dept. of Electrical Engineering and Computer Science
University of Michigan
![Page 2: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/2.jpg)
Outline
Introduction
Our Contributions
Attribute-Based Action Representation
Learning Data-Driven Attributes
Knowledge Transfer Across Classes
Experiments and Discussion
![Page 3: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/3.jpg)
Introduction
the traditional approaches for human action recognition
the action golf-swinging
human actions are better described by action attributes
![Page 4: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/4.jpg)
manually specified attributes
◦Subjective
◦2 – problem
Complete
◦Data-driven
Intra-class variability
◦Latent variable, SVM
![Page 5: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/5.jpg)
![Page 6: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/6.jpg)
Our Contributions
action attributes can be used to improve human action recognition
manually-specified attributes
latent variables
integrates manually-specified and data-driven attributes
![Page 7: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/7.jpg)
useful for recognizing novel action classes without training examples
significantly boost traditional action classification
![Page 8: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/8.jpg)
Attribute-Based Action Representation
previous works represent actions with low-level features
define an action attribute space
![Page 9: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/9.jpg)
Example
◦five attributes: “translation of torso”, “updown torso motion”, “arm motion”, “arm over shoulder motion”, “leg motion”
◦action class “walking” represented by a binary vector {1, 0, 1, 0, 1}
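The example above can be made concrete with a minimal Python sketch. The attribute names and the “walking” vector come from the slide; the helper name `active_attributes` is hypothetical and only illustrates how a binary signature reads against the attribute list.

```python
# Attribute names from the slide; their order fixes the layout of the vector.
ATTRIBUTES = [
    "translation of torso",
    "updown torso motion",
    "arm motion",
    "arm over shoulder motion",
    "leg motion",
]

# Binary signature for the action class "walking", as given on the slide.
WALKING = (1, 0, 1, 0, 1)


def active_attributes(signature):
    """Return the names of the attributes switched on in a binary signature."""
    return [name for name, bit in zip(ATTRIBUTES, signature) if bit == 1]


print(active_attributes(WALKING))
```

Read this way, “walking” is described by torso translation, arm motion, and leg motion.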
![Page 10: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/10.jpg)
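By introducing the attribute layer between the low-level features and the action class labels, the classifier f, which maps x to a class label, is decomposed into two mappings: one from features to attributes and one from attributes to class labels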
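One way to write a classifier with an attribute layer is as a composition of two mappings. The names S and L, and the space symbols, are assumed notation for illustration; the slide itself shows no formula.

$$
f(x) = L\big(S(x)\big), \qquad S : \mathcal{X} \rightarrow \mathcal{A}, \qquad L : \mathcal{A} \rightarrow \mathcal{Y}
$$

Here S maps a low-level feature vector x to a point in the attribute space, and L maps that attribute description to a class label.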
![Page 11: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/11.jpg)
Attributes as Latent Variables
want to learn a classification model for recognizing an unknown action x
Treating attributes as latent variables
consider each attribute in the space as a latent variable
ai ∈ [0, 1]
![Page 12: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/12.jpg)
Goal: learn a classifier fw to predict the class label of a new video x
![Page 13: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/13.jpg)
Raw feature: x
Class label: y
Attributes: a
Weight for each feature: w
![Page 14: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/14.jpg)
provides the score measuring how well the raw feature matches the action class
![Page 15: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/15.jpg)
provides the score of an individual attribute, and is used to indicate the presence of an attribute in the video x
![Page 16: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/16.jpg)
captures the co-occurrence of a pair of attributes aj and ak
![Page 17: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/17.jpg)
parameter vector w is learned from a training dataset
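The three scores described above (a class score on the raw feature, a per-attribute score, and a pairwise co-occurrence score) can be combined into one linear model over latent attributes. The form below is a sketch in the usual latent SVM style; the symbols Φ, φ1, φ2, φ3 and the subscripts on w are assumed notation and do not appear on the slides.

$$
f_w(x, y) = \max_{a} \; w^{\top} \Phi(x, y, a)
$$

$$
w^{\top} \Phi(x, y, a) = w_x^{\top} \phi_1(x) + \sum_{j} w_{a_j}^{\top} \phi_2(x, a_j) + \sum_{j,k} w_{j,k}^{\top} \phi_3(a_j, a_k)
$$

Under this reading, a new video x would be assigned the label y with the highest score f_w(x, y).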
![Page 18: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/18.jpg)
Learning Data-Driven Attributes
manual specification of attributes is subjective
data-driven attributes
![Page 19: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/19.jpg)
The Mutual Information (MI)
◦a good measurement to evaluate the quality of grouping
Given two random variables
◦X ∈ X = {x1, x2, ..., xn}
◦Y ∈ Y = {y1, y2, ..., ym}
◦where X represents a set of visual-words, and Y is a set of action videos
MI(X; Y)
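For reference, the standard definition of mutual information between two discrete random variables is given below. The slide's own formula did not survive, so this is the textbook form rather than a copy of it; p(x, y) denotes the joint distribution of visual-words and videos.

$$
MI(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \, \log \frac{p(x, y)}{p(x)\, p(y)}
$$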
![Page 20: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/20.jpg)
Given a set of features
Wish to obtain a set of clusters
The quality of clustering is measured by the loss of MI
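The idea of scoring a clustering by its loss of MI can be sketched in Python. This is an illustration under assumptions: the joint table of visual-word counts per video is a toy example, and the function names (`mutual_information`, `merge_rows`, `mi_loss`) are hypothetical. The loss is taken as the MI before clustering minus the MI after the features in each cluster are merged.

```python
import math


def mutual_information(joint):
    """MI in nats of a joint count table (rows: features X, columns: videos Y)."""
    total = sum(sum(row) for row in joint)
    p_x = [sum(row) / total for row in joint]
    p_y = [sum(col) / total for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, count in enumerate(row):
            if count > 0:  # zero cells contribute nothing to the sum
                p_xy = count / total
                mi += p_xy * math.log(p_xy / (p_x[i] * p_y[j]))
    return mi


def merge_rows(joint, assignment, n_clusters):
    """Sum the rows of `joint` that share a cluster id in `assignment`."""
    merged = [[0.0] * len(joint[0]) for _ in range(n_clusters)]
    for row, cluster in zip(joint, assignment):
        for j, count in enumerate(row):
            merged[cluster][j] += count
    return merged


def mi_loss(joint, assignment, n_clusters):
    """MI lost by replacing individual features with their clusters."""
    clustered = merge_rows(joint, assignment, n_clusters)
    return mutual_information(joint) - mutual_information(clustered)


# Toy table (hypothetical): 4 visual-words x 2 videos.
counts = [[4, 0], [4, 0], [0, 4], [0, 4]]
good = [0, 0, 1, 1]  # groups words that occur in the same video
bad = [0, 1, 0, 1]   # mixes words from different videos

print(mi_loss(counts, good, 2), mi_loss(counts, bad, 2))
```

In this toy case the grouping that keeps co-occurring words together loses no information, while the mixed grouping loses all of it, which is why a smaller MI loss indicates a better clustering.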
![Page 21: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/21.jpg)
integrate the discovery of data-driven attributes into the framework of latent SVM
h ∈ H
H is the data-driven attribute space
![Page 22: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/22.jpg)
Knowledge Transfer Across Classes
transferring knowledge from known classes (with training examples) to a novel class (without training examples)
using this knowledge to recognize instances of the novel class
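A simple way to picture this transfer is nearest-signature matching: attribute detectors trained on known classes score a new video, and the video is assigned to the novel class whose attribute signature is closest. The sketch below is an assumption for illustration, not necessarily the decision rule used in the paper; the function name, the second class, and the scores are hypothetical.

```python
def predict_novel_class(attribute_scores, signatures):
    """Return the class whose binary signature is closest to the scores."""
    def squared_distance(signature):
        return sum((s - b) ** 2 for s, b in zip(attribute_scores, signature))

    return min(signatures, key=lambda name: squared_distance(signatures[name]))


# Signatures over the five example attributes. "walking" is from the slides;
# the "hand-waving" signature is a made-up example.
NOVEL_CLASSES = {
    "walking": (1, 0, 1, 0, 1),
    "hand-waving": (0, 0, 1, 1, 0),
}

# Hypothetical attribute scores for one unseen video.
scores = [0.9, 0.1, 0.8, 0.2, 0.7]

print(predict_novel_class(scores, NOVEL_CLASSES))
```

No training video of the novel class is needed here; only its attribute signature is.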
![Page 23: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/23.jpg)
![Page 24: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/24.jpg)
Experiments and Discussion
Datasets and Action Attributes
Experimental Results
Experiments on Olympic Sports Dataset
![Page 25: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/25.jpg)
Datasets and Action Attributes
UIUC Dataset
◦532 videos of 14 actions such as walk, hand-clap, jump-forward …
Combining existing datasets into a larger one
◦KTH dataset: six classes and about 2,300 videos
◦Weizmann dataset: 10 classes and about 100 videos
◦UIUC
Olympic Sports dataset
◦collected from YouTube; it contains realistic human actions
![Page 26: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/26.jpg)
Experimental Results
Recognizing novel action classes
Attributes boosting traditional action recognition
![Page 27: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/27.jpg)
Recognizing novel action classes
use the leave-two-classes-out cross-validation strategy in experiments on the UIUC dataset
each run leaves two classes out as novel classes (|Z| = 2)
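The split protocol can be sketched as follows. The slides do not say how many runs were performed, so enumerating every pair of held-out classes is an assumption; the class names are placeholders and `leave_two_out_splits` is a hypothetical helper.

```python
from itertools import combinations


def leave_two_out_splits(classes):
    """Yield (known, novel) class lists with exactly two novel classes."""
    for held_out in combinations(classes, 2):
        known = [c for c in classes if c not in held_out]
        yield known, list(held_out)


# Placeholder names for the 14 UIUC action classes.
classes = [f"class_{i}" for i in range(14)]
splits = list(leave_two_out_splits(classes))

print(len(splits))  # 91 possible pairs of novel classes
```

Each run trains on the 12 known classes and tests only on videos of the two held-out classes.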
![Page 28: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/28.jpg)
The average accuracy of leave-two-classes-out cross-validation on the UIUC dataset for recognizing novel action classes.
![Page 29: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/29.jpg)
Divide the UIUC dataset into two disjoint sets
◦Y : training set contains 10 action classes
◦Z : testing set contains four classes
the testing and training classes share some common attributes
![Page 30: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/30.jpg)
![Page 31: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/31.jpg)
Example (a)
![Page 32: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/32.jpg)
Attributes boosting traditional action recognition
using our proposed framework to show that action attributes do improve the performance of traditional action recognition
Our results demonstrate that a significant improvement occurs with the use of manually-specified attributes.
![Page 33: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/33.jpg)
To further demonstrate the correlation between manually-specified attributes and data-driven attributes, a dissimilarity map between the two is used
This map is constructed from the training data
![Page 34: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/34.jpg)
Dissimilarity between 100 data-driven attributes (rows) and 34 manually-specified attributes (columns)
Colder colors indicate lower values
![Page 35: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/35.jpg)
The effect of removing a set of human-specified attributes
some specified attributes (e.g., the human-specified attribute set a = {1, 8, 9, 10, 11}, columns) are more correlated with data-driven attributes.
![Page 36: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/36.jpg)
◦“Specified attributes” means only using this type of attributes for recognition
◦“B” indicates the performance before attribute removal
◦“A” indicates the performance after removing the attributes.
◦“Mixed Attributes” means using both manually-specified and data-driven attributes for recognition
![Page 37: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/37.jpg)
Using manually-specified attributes only
Remove human-specified attribute set a = {1, 8, 9, 10, 11}
the performance drops from 72% to 64%
![Page 38: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/38.jpg)
Using both manually-specified and data-driven attributes
Remove human-specified attribute set a = {1, 8, 9, 10, 11}
doesn’t cause an obvious performance decrease
![Page 39: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/39.jpg)
Experiments on Olympic Sports Dataset
using the Olympic Sports dataset, which contains 16 action classes and about 781 videos, for both recognizing novel action classes and traditional training-based recognition
![Page 40: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/40.jpg)
The performance of recognizing novel testing classes
Five cases
◦4 classes are used for testing
◦12 classes are used for training
![Page 41: Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.](https://reader036.fdocuments.net/reader036/viewer/2022062712/56649c765503460f94929df6/html5/thumbnails/41.jpg)
THANK YOU !