RECOGNIZING FACIAL EXPRESSIONS THROUGH TRACKING
Salih Burak Gokturk
OVERVIEW
• PROBLEM DESCRIPTION
• TRAINING STAGE
• TESTING STAGE
• EXPERIMENTS
• CONCLUSION
Components of the recognition system
- Analysis: Face Tracking
- Intelligence: Support Vector Machine Classifier
- Shape Parameters (output of the tracking, input to the classifier)
- Training with stereo: Data -> Classifier
- Testing with mono: New Data -> Classifier -> Output
PROBLEM DESCRIPTION (Tracking)
?
PROBLEM DESCRIPTION (Recognition)
X(t): [Rigid, Open Mouth, Smile]
?: [Rigid, Open Mouth, Smile]
Training: Data -> Classifier
Testing: New Data -> Output
TRAINING STAGE
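Stereo Tracking -> Data -> Learn Shape -> Monocular Tracking and Classification
Learn Shape
Neutral shape of point u, from the first frame:
$$X_0^u = X^u(1)$$
Displacement of point u at frame n:
$$\Delta X^u(n) = X^u(n) - X_0^u$$
Shape with p degrees of freedom:
$$X^u = X_0^u + \sum_{i=1}^{p} \alpha_i X_i^u$$
$$X^u = X_0^u + \begin{bmatrix} X_1^u & X_2^u & \cdots & X_p^u \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_p \end{bmatrix}^T$$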
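A minimal sketch of the shape model, assuming each shape is stored as a flat list of 3-D coordinates and that a shape is the neutral shape plus a weighted sum of p deformation modes (X = X_0 + sum_i alpha_i X_i). The function name `deform_shape` and the toy basis values are hypothetical and only illustrate the arithmetic; they are not taken from the slides.

```python
def deform_shape(neutral, basis, alpha):
    """Return X = X_0 + sum_i alpha_i * X_i for flattened 3-D shapes."""
    if len(basis) != len(alpha):
        raise ValueError("need exactly one coefficient per deformation mode")
    shape = list(neutral)
    for coeff, mode in zip(alpha, basis):
        if len(mode) != len(neutral):
            raise ValueError("every mode must have the same length as the neutral shape")
        for k, value in enumerate(mode):
            shape[k] += coeff * value
    return shape


# Toy example: two points (x, y, z each) and p = 2 hypothetical modes.
neutral = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
basis = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],  # mode 1: both points move along y
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.0],  # mode 2: second point moves along x
]
shape = deform_shape(neutral, basis, [2.0, -1.0])
print(shape)
```

With p modes, the whole face shape is described by only p numbers per frame, which is what the classifier later consumes.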
Support Vector Machines (SVM)
- Best discriminating hypersurface between two classes of objects
- Map the data to a high dimension using a map function
- The hypersurface in the feature space corresponds to a hyperplane in the mapped space
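To make the hypersurface idea concrete, here is a small sketch of a kernel SVM decision function with an RBF kernel, f(x) = sum_i c_i K(s_i, x) + b. The support vectors, coefficients and bias below are hand-picked, hypothetical values rather than a trained model; in practice they come out of SVM training.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, coeffs, bias=0.0, gamma=1.0):
    """Signed score; the sign picks the class, f(x) = 0 is the hypersurface."""
    score = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, coeffs))
    return score + bias


# Hypothetical two-class setup: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
coeffs = [1.0, -1.0]  # each coefficient is (multiplier * class label)

print(svm_decision([0.1, 0.0], support_vectors, coeffs))  # positive side
print(svm_decision([1.9, 0.0], support_vectors, coeffs))  # negative side
```

The decision function is linear in the kernel values, so it is a hyperplane in the mapped space even though the boundary in the original feature space is curved.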
Training: Data -> Classifier
Testing: New Data -> (Classifier) -> Output
TESTING STAGE
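LUCAS TOMASI KANADE OPTICAL FLOW TRACKER EXTENDED TO 3D
Time t: X(t), I(x(t))
Time t+1: I(t+1), X(t+1) = ?
Optical flow constraint for the image velocity (u, v):
$$\begin{bmatrix} I_x & I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -I_t$$
Shape model, with unknowns (\alpha, R, T):
$$X = X_0 + \sum_{i=1}^{p} \alpha_i X_i$$
Image velocity in terms of the parameter updates:
$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \frac{\partial u}{\partial \alpha} & \frac{\partial u}{\partial R} & \frac{\partial u}{\partial T} \\ \frac{\partial v}{\partial \alpha} & \frac{\partial v}{\partial R} & \frac{\partial v}{\partial T} \end{bmatrix} \begin{bmatrix} d\alpha \\ dR \\ dT \end{bmatrix} = J \begin{bmatrix} d\alpha \\ dR \\ dT \end{bmatrix}$$
Substituting into the optical flow constraint:
$$\begin{bmatrix} I_x & I_y \end{bmatrix} J \begin{bmatrix} d\alpha \\ dR \\ dT \end{bmatrix} = -I_t$$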
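The constraint [I_x I_y] J d = -I_t gives one linear equation per tracked point in the unknown parameter update d. A hedged sketch of how such a system could be solved in the least-squares sense follows; the helper names, the two-parameter Jacobians and the synthetic gradients are all hypothetical, and a real tracker would use the full (deformation, rotation, translation) parameter vector and iterate.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough independent constraints")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def parameter_update(gradients, jacobians, temporal):
    """Least-squares d for the stacked equations [Ix Iy] J_k d = -It_k."""
    size = len(jacobians[0][0])
    ata = [[0.0] * size for _ in range(size)]
    atb = [0.0] * size
    for (ix, iy), jac, it in zip(gradients, jacobians, temporal):
        row = [ix * jac[0][k] + iy * jac[1][k] for k in range(size)]
        for i in range(size):
            atb[i] += row[i] * (-it)
            for j in range(size):
                ata[i][j] += row[i] * row[j]
    return solve_linear(ata, atb)


# Synthetic check: build It from a known update and recover it.
true_d = [0.5, -0.25]
gradients = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
jacobians = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 2.0], [0.0, 1.0]],
    [[2.0, 0.0], [1.0, 1.0]],
]
temporal = []
for (ix, iy), jac in zip(gradients, jacobians):
    row = [ix * jac[0][k] + iy * jac[1][k] for k in range(2)]
    temporal.append(-sum(r * d for r, d in zip(row, true_d)))

estimate = parameter_update(gradients, jacobians, temporal)
print(estimate)
```

Stacking the constraints over many points is what makes the problem well posed: a single point gives one equation, while the parameter vector has several unknowns.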
One to Many Application of Support Vector Machines (SVM)
- One hypersurface per class is calculated
- Each new data point is tested against every hypersurface
- A different probability is assigned to the ith class, where z_k is the output for class k:
$$P(i) = \frac{e^{z_i}}{\sum_{k} e^{z_k}}$$
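The class probability P(i) = e^{z_i} / sum_k e^{z_k} is a softmax over the per-class outputs. A short sketch, assuming z_k is the raw output of the kth one-versus-rest classifier; the score values below are made up for illustration.

```python
import math


def class_probabilities(scores):
    """Softmax: P(i) = exp(z_i) / sum_k exp(z_k)."""
    shift = max(scores)  # subtracting the max avoids overflow, result is unchanged
    exps = [math.exp(z - shift) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["neutral", "open mouth", "close mouth", "smile", "raise eyebrow"]
scores = [-1.2, 0.4, 2.1, -0.3, 0.0]  # hypothetical per-class outputs
probs = class_probabilities(scores)
best = labels[probs.index(max(probs))]
print(best, probs)
```

The class with the largest output always gets the largest probability, so the decision is the same as taking the maximum score; the probabilities add a confidence measure on top.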
EXPERIMENTS
- Training (stereo) with 2 people, 240 frames in total
- Testing with 3 people
- 5 expressions: neutral, open mouth, close mouth, smile, raise eyebrow
- A velocity term is added to the shape vector:
$$\alpha_n^{new} = \begin{bmatrix} \alpha_n & \alpha_n - \alpha_{n-3} \end{bmatrix}$$
- Two other classifiers were tested: (1) Clustering, (2) N-Nearest Neighbor
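The sketch below illustrates two of the items above: appending a velocity term to each shape vector, and classifying by majority vote among the N nearest training samples. The three-frame lag is an assumption about how the velocity is computed, and the function names and toy data are hypothetical.

```python
from collections import Counter


def add_velocity(alphas, lag=3):
    """Append (alpha_n - alpha_{n-lag}) to alpha_n for every frame n >= lag."""
    augmented = []
    for n in range(lag, len(alphas)):
        current, past = alphas[n], alphas[n - lag]
        velocity = [c - q for c, q in zip(current, past)]
        augmented.append(list(current) + velocity)
    return augmented


def nearest_neighbor_label(query, samples, labels, n_neighbors=5):
    """Majority vote among the n_neighbors closest samples (squared Euclidean)."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(samples[i], query))

    order = sorted(range(len(samples)), key=squared_distance)
    votes = Counter(labels[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy one-dimensional shape coefficients over five frames.
alphas = [[0.0], [0.25], [0.5], [1.0], [1.5]]
features = add_velocity(alphas)
print(features)

samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["neutral", "neutral", "neutral", "smile", "smile"]
print(nearest_neighbor_label([4.5, 5.0], samples, labels, n_neighbors=3))
```

The velocity term lets the classifier tell apart frames with the same shape but different motion, such as a mouth that is opening versus one that is closing.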
MOVIE (1)
MOVIE (2)
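Table 1: Performance of the system for different expressions
Rows: input expression (number of test frames). Columns: decision of the system.

| Input | Neutral | Open mouth | Close mouth | Smile | Raise eyebrow |
|---|---|---|---|---|---|
| Neutral (44) | 32 | 6 | 3 | 0 | 3 |
| Open mouth (80) | 0 | 76 | 4 | 0 | 0 |
| Close mouth (50) | 0 | 1 | 49 | 0 | 0 |
| Smile (87) | 2 | 0 | 0 | 81 | 4 |
| Raise eyebrow (21) | 3 | 0 | 0 | 0 | 18 |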
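As a worked check of Table 1, the sketch below computes the per-expression recognition rate (diagonal entry over row total) and the overall rate from the confusion matrix. The numbers are copied from the table; the function name is illustrative.

```python
labels = ["neutral", "open mouth", "close mouth", "smile", "raise eyebrow"]
confusion = [
    [32, 6, 3, 0, 3],   # input: neutral
    [0, 76, 4, 0, 0],   # input: open mouth
    [0, 1, 49, 0, 0],   # input: close mouth
    [2, 0, 0, 81, 4],   # input: smile
    [3, 0, 0, 0, 18],   # input: raise eyebrow
]


def recognition_rates(matrix):
    """Return (per-class rates, number correct, number of test frames)."""
    per_class = [row[i] / sum(row) for i, row in enumerate(matrix)]
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return per_class, correct, total


per_class, correct, total = recognition_rates(confusion)
for name, rate in zip(labels, per_class):
    print(f"{name}: {rate:.1%}")
print(f"overall: {correct}/{total} = {correct / total:.1%}")
```

The overall count comes out to 256 correct frames out of 282, with neutral the hardest class and close mouth the easiest.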
Table 2: Comparison Between Different Methods

| | SVM with kernel erbf | SVM with kernel rbf | Clustering | N-Nearest with N=9 | N-Nearest with N=5 |
|---|---|---|---|---|---|
| Same person | 176/182 | 170/182 | 161/182 | 173/182 | 173/182 |
| Total | 256/282 | 253/282 | 242/282 | 255/282 | 253/282 |
- Training (stereo) with 1 person, 130 frames in total
- Testing with 3 people
- 5 expressions: neutral, open mouth, close mouth, smile, raise eyebrow
Table 3: Comparison Between Different Methods with only one person in the training set

| | SVM with kernel erbf | SVM with kernel rbf | Clustering | N-Nearest with N=9 | N-Nearest with N=5 |
|---|---|---|---|---|---|
| Same person | 98/110 | 99/110 | 109/110 | 109/110 | 110/110 |
| Total | 216/282 | 207/282 | 233/282 | 231/282 | 229/282 |
- Training (stereo) with 2 people, 240 frames in total
- Testing with 3 people
- 3 emotional expressions: neutral, happy, surprise
- Transitions between expressions are separated
Table 4: Comparison Between Different Methods with three emotional expressions

| | SVM with kernel erbf | SVM with kernel rbf | Clustering | N-Nearest with N=9 | N-Nearest with N=5 | N-Nearest with N=3 | N-Nearest with N=1 |
|---|---|---|---|---|---|---|---|
| Same person | 164/165 | 165/165 | 152/165 | 163/165 | 164/165 | 164/165 | 164/165 |
| Total | 222/228 | 223/228 | 213/228 | 225/228 | 224/228 | 223/228 | 223/228 |
Performance Comparison with Previous Expression Recognition Work

| Work | Recognition Rate | Pose Change | Number of Expressions | Test/Train Subject | Number of Data | Comments |
|---|---|---|---|---|---|---|
| Chen et al., ICME 2000 | 89% | Direct camera view | 7 | Different subject | 470 images | Problem with different people |
| Wang et al., AFGR 1998 | 96% | Direct camera view | 3 | Different subject | 29 image sequences | Sequence classification (easier) |
| Lien et al., AFGR 1998 | 85%-93% | ~10 degrees rotation | 4 | Different subject | ~130 images | Only the upper part of the face is classified |
| Hiroshi et al., ICPR 1996 | 70% | ~45-60 degrees rotation | 5 | Same subject | 900 images | Permits rotations, but rates are not as good |
| Chang et al., IJCNN 1999 | 92% | Direct camera view | 3 | Different subject | 38 images | Small test and training set |
| Matsuno et al., ICCV 1995 | 80% | Direct camera view | 4 | Different subject | 45 images | Small test and training set |
| Hong et al., AFGR 1998 | 65%-85% | Direct camera view | 7 | Same and different subject | ~250 images | 85% with known person, 65% with unknown person |
| Hong et al., AFGR 1998 | 81%-97% | Direct camera view | 3 | Same and different subject | ~250 images | 97% with known person, 81% with unknown person |
| Sakaguchi et al., ICPR 1996 | 84% | Direct camera view | 6 | Same subject | - | The test and training set are not mentioned |
| Our Work | 91% | ~70-80 degrees rotation | 5 | Different subject | 282 images | Table 2 |
| Our Work | 98% | ~70-80 degrees rotation | 3 | Different subject | 228 images | Table 4 - Emotional Expressions |
CONCLUSION
Conclusions
- High facial expression recognition rates: 91% for 5 expressions and 98% for 3 emotional expressions, with ~70-80 degrees of rotation.
- 3-D is the right way to go…
Future Work
- Test with more subjects and expressions.
- Further application to face recognition (?)