Single Person Pose Recognition and Tracking
Defender: Javier Barbadillo Amor
Information and Communication Theory (ICT) Group
Delft University of Technology
Committee: Dr. Alan Hanjalic, Dr. Emile A. Hendriks, PhD. Feifei Huo, Dr. Pavel Paclik
25-06-2010
Outline:
Introduction: Single Person Pose Recognition and Tracking System
Theory
The goal of this research: improve body parts detection and pose recognition
Experiments and Results
• Improved Hand Detection
• Detection of a new class: Non-Pose Classification
Conclusions
Future Work
Introduction
Single Person Pose Recognition and Tracking System
• Real time
• One single camera
• Game control with detected poses
Theory: Background Subtraction by Mixture of Gaussians
• Compare the current frame with a model of the background.
• Obtain a binary image with the foreground pixels.
Theory: Background Subtraction by Mixture of Gaussians
• History of pixel intensity values:
• An intensity value belongs to a Gaussian distribution if it lies within 2.5σ of that distribution's mean μ, i.e. in [μ − 2.5σ, μ + 2.5σ].
• Each pixel is modeled by K Gaussians.
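The matching rule above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function names are hypothetical, and it assumes that every one of the K Gaussians of a pixel describes background, whereas a full mixture model would also rank the Gaussians by weight and update them over time.

```python
def matches(value, mean, sigma, k=2.5):
    """True if value lies within k standard deviations of the Gaussian's mean."""
    return abs(value - mean) <= k * sigma


def is_foreground(value, gaussians, k=2.5):
    """gaussians: list of (mean, sigma) pairs modelling one pixel.

    Assumption: all listed Gaussians are background modes, so a pixel
    is foreground only if it matches none of them.
    """
    return not any(matches(value, m, s, k) for m, s in gaussians)


def foreground_mask(frame, model, k=2.5):
    """Build a binary image (1 = foreground) from a frame and a per-pixel model.

    frame: 2-D list of intensities.
    model: 2-D list with the same shape, each cell a list of (mean, sigma).
    """
    return [
        [1 if is_foreground(v, g, k) else 0 for v, g in zip(row, model_row)]
        for row, model_row in zip(frame, model)
    ]
```

For example, a pixel modelled by Gaussians with means 100 and 30 is marked as background for an intensity of 100 and as foreground for an intensity of 200.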
Theory: Particle Filter for tracking the torso and head
• Torso and head region detection
• Hand Detection
Theory: Particle filter for tracking torso and head
• N particles are generated.
• Weights π are assigned according to the measured probability.
• Parent particles spread into G children.
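One possible reading of these three steps is sketched below. The helper names, the Gaussian perturbation and the weighted-mean state estimate are assumptions made for illustration; the slides do not specify how parents are selected or how children are spread.

```python
import random


def spread_particles(particles, weights, n_children, noise, rng):
    """Select parents in proportion to their weights and spread each into children.

    particles: list of (x, y, scale) states.
    weights:   one non-negative weight per particle (not all zero).
    noise:     standard deviations (sx, sy, sscale) of the perturbation (assumed Gaussian).
    """
    n_parents = len(particles) // n_children
    parents = rng.choices(particles, weights=weights, k=n_parents)
    children = []
    for x, y, scale in parents:
        for _ in range(n_children):
            children.append((
                x + rng.gauss(0.0, noise[0]),
                y + rng.gauss(0.0, noise[1]),
                scale + rng.gauss(0.0, noise[2]),
            ))
    return children


def estimate_state(particles, weights):
    """Weighted mean of the particle states (an assumed estimator)."""
    total = sum(weights)
    return tuple(
        sum(w * p[i] for p, w in zip(particles, weights)) / total
        for i in range(3)
    )
```

With N = 100 particles and G = 5, this keeps 20 parents per frame and regenerates 100 children around them, so the particle count stays constant.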
• Sample sets of particles are generated for 3 states: x, y and Scale
• The probability of the state of the torso is given by
Theory: Particle filter for tracking torso and head
Primitive for torso and head
• Hand Detection with general skin color model
Theory: General skin color detection
• Relative distances between hands and torso center.
• Angles of the hands with the torso center.
• r, l and t stand for right, left and torso.
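The distance and angle features can be illustrated with a short sketch. It assumes 2-D image coordinates for the two hands and the torso center and computes only one distance and one angle per hand; the function name and the ordering of the output are hypothetical, and any further features used by the system are not covered here.

```python
import math


def hand_features(right_hand, left_hand, torso):
    """Return [d_r, angle_r, d_l, angle_l] relative to the torso center.

    Each argument is an (x, y) point. Angles are in radians, measured with
    atan2 in the coordinate system of the input points (an assumption).
    """
    features = []
    for hx, hy in (right_hand, left_hand):
        dx, dy = hx - torso[0], hy - torso[1]
        features.append(math.hypot(dx, dy))   # distance hand to torso center
        features.append(math.atan2(dy, dx))   # angle of the hand around the torso center
    return features
```

Because both features are taken relative to the torso center, they do not change when the person moves sideways in the image.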
Theory: Feature extraction
• Incoming observations = 6-feature-set
• The classifier assigns one Pose class to each observation.
• Each Pose number is a different action in the game.
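As an illustration of how a feature vector can be mapped to a Pose class, the sketch below uses a plain k-nearest-neighbor vote with Euclidean distance. K-Nearest-Neighbor is one of the classifiers evaluated in this work, but the function name, the distance measure and the value of k here are assumptions.

```python
import math
from collections import Counter


def knn_classify(sample, training_set, k=5):
    """Return the majority label among the k training samples nearest to sample.

    training_set: list of (feature_vector, label) pairs.
    """
    neighbours = sorted(
        training_set, key=lambda item: math.dist(sample, item[0])
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

In the game setting, the returned label would be the Pose number that triggers the corresponding action.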
Theory: Pose Classification
The goal of this research
• Improve the system performance
– Hand detection: fails for short sleeves and “skin color clothes”
– Pose recognition: detect Non-Poses
Hands detected in the forearm
The 9 Predefined Poses
Experiments and Results
Skin color detection combined with human blob silhouette for hand detection
• Preliminary hand position is obtained from the center of gravity of the biggest skin blobs.
• First, general skin color detection is applied using this mask:
Skin color detection combined with human blob silhouette for hand detection
We check if the blob is:
- Below the hip
- Between the hip and the shoulder
- Over the shoulder
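The three-way check can be sketched as below. The sketch assumes image coordinates in which y grows downward and that the shoulder and hip heights are already known from the silhouette; the function names and region labels are hypothetical, and what the system does with the blob in each region is not shown here.

```python
def centre_of_gravity(pixels):
    """Mean (x, y) position of a blob given as a list of (x, y) pixel coordinates."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def blob_region(blob_y, shoulder_y, hip_y):
    """Classify the vertical position of a skin blob relative to hip and shoulder.

    Assumes y increases downward, so shoulder_y < hip_y.
    """
    if blob_y > hip_y:
        return "below_hip"
    if blob_y >= shoulder_y:
        return "between_hip_and_shoulder"
    return "above_shoulder"
```

For instance, with the shoulder at y = 100 and the hip at y = 250, a blob centered at y = 180 falls between the hip and the shoulder.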
Skin color detection combined with human blob silhouette for hand detection
• All the cases where people wear short sleeves or “skin color clothes” are now handled correctly.
• Non-Pose classification
DEFINITION: Everything different from a predefined Pose
More features needed!
• Clear Non-Pose: poses where one or both hand positions are in between positions corresponding to Poses
Non-Pose classification
Non-Pose classification
• 17 videos from 17 different people were recorded.
• Features were extracted from each frame by processing the videos.
• Multiple labeling with PRSD Studio, Matlab.
Non-Pose classification
Initial Dataset Improved Dataset
Non-Pose classification
• Experiments with Initial Dataset
First approach:
Second approach:
• “Leave one person out” (LOPO) method for realistic results.
• Tested on Parzen, K-Nearest-Neighbor and Gaussian classifiers.
Non-Pose classification: Initial Dataset
ROC curve
• Mean error from LOPO: errors from all people summed and divided by the number of people.
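The leave-one-person-out procedure and its mean error can be sketched as follows. The helper names are hypothetical, and `error_fn` stands in for whatever classifier training and testing routine is used; it is assumed to return the error rate on the held-out person's samples.

```python
def leave_one_person_out(samples_by_person):
    """Yield (held_out_person, training_samples, test_samples) for every person.

    samples_by_person: dict mapping a person's name to that person's samples.
    """
    for held_out, test in samples_by_person.items():
        train = [
            sample
            for person, samples in samples_by_person.items()
            if person != held_out
            for sample in samples
        ]
        yield held_out, train, test


def mean_lopo_error(samples_by_person, error_fn):
    """Sum the per-person errors and divide by the number of people."""
    errors = [
        error_fn(train, test)
        for _, train, test in leave_one_person_out(samples_by_person)
    ]
    return sum(errors) / len(errors)
```

Because no frame from the test person is ever seen during training, the resulting error reflects how the classifier behaves on a new player.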
• In general, K-Nearest-Neighbor gives the best results for detecting Poses.
• Parzen and Gaussian are considerably worse.
Non-Pose classification: Initial Dataset
K-Nearest-Neighbor
• Parzen classifier shows more interesting results for a particular person (Hasan).
Results with Hasan's samples as the test set, from a single experiment.
• Pose 9 has 404 samples in total.
- 120 from Hasan (Test)
- 171 from Saleem (Training)
Is there any relation?
Non-Pose classification: Initial Dataset
• Two correct ways of performing the same Pose result in quite different features.
• Errors in Parzen give us an idea of how to further improve K-NN performance.
Non-Pose classification: Initial Dataset
10-NN 5-NN
• All the samples from Carmen are misclassified as Non-Poses.
Non-Pose classification: Initial Dataset
Cascade of detector and classifier
• Results are much better with this approach than with the cascade. Single Pose classes seem to be better modelled than the whole Poses class with K-Nearest-Neighbor.
Second approach:
Non-Pose classification: Initial Dataset
• Experiments with Improved Dataset
– All classes have more than 100 samples
• For 10-NN the error on Poses decreased by 1.5% and the error on Non-Poses decreased by 3%.
• Having more samples from single Poses makes the whole Poses class more robust.
Mean errors for the detector trained on Poses.
Non-Pose classification: Improved Dataset
Mean errors for detector trained on Non-Poses.
• Training on Non-Poses doesn't improve detection.
• Non-Poses are more difficult to model than Poses.
Non-Pose classification: Improved Dataset
• Now, Carmen's samples of Pose 3 are correctly detected as Poses.
• Pose 3 class is more compact.
Initial Dataset Improved Dataset
Non-Pose classification
Non-Pose classification
Decreased from 2% to 0%
Increased from 0% to 1%!
• More samples of Poses 3 and 4 improved the detection of Poses and Non-Poses, but didn't improve classification of the Pose classes.
Conclusions
• The improved hand detection is a simple but robust method, and it solves the problem of wrong detection for short sleeves.
• The Non-Pose class is difficult to model because it overlaps with Poses and is not a compact class. Even so, almost 80% of Non-Poses can be detected.
• Having a good dataset might improve results drastically.
– Samples must represent different people and ways of performing poses
– Samples of wrong hand detections increase the error rate
• K-Nearest-Neighbor is the best of the tested methods for modelling these Pose classes.
• The more restrictive the system is, the better the results will be: compromise solution.
Future Work
• Make a new Dataset with the improved hand detection.
• Add a new feature for detecting more Non-Poses, e.g., face detection.
• Elbow detection.
I appreciate your attention
Questions?
Initial Dataset
Improved Dataset
Spatial Game