Real-time Emotion Recognition

1
Traditional Approaches EigenFace FisherFace Technique Convolution Neural Network with Inception V3 Simon Fraser University PROBLEM Facial expression classification: Capture live stream from a video camera attached to a laptop for our experiments. Apply and Benchmark different machine learning models for facial expression recognition Classify three different facial expressions: Neutral, Happy and Surprise. Final predicted facial expression is displayed via a live feed using the laptop camera. Applications of facial expression classification: Customer Engagement Virtual Reality Avatar Countenance Classifier : How are you feeling today? Liam Bui, Alexandre Lopes Simple Convolution Neural Network EXPERIMENTS Summary of Results Validation Accuracy by Epoch DATASET SOURCES Initial models trained on the Kaggle Dataset with 35,000 facial expression images Limited computation resources makes it infeasible to experiment with Kaggle Dataset CK Dataset supplemented with our self- created images is used in our final experiments Class such as Neutral and Sad appeared to be very similar to each other. LIVE DEMO Happy Expression Neutral Expression Surprise Expression Method Accuracy without data augmentation Accuracy with Data Augmentation EigenFace 66% 68% FisherFace 87% 88.2% Simple CNN 94% 94% Inception V3 (Inception training disabled) 91% 91% Inception V3 (Inception training enabled) 99% 99% CK Dataset Supplemented with self created images CONCLUSIONS Out of the two traditional approaches, FisherFace technique outperforms EigenFace technique. Convolutional Neural Network outperforms traditional techniques. Enabling Inception V3 training improves model performance significantly. Input Layer Convolutio n Layer 1 Convoluti on Layer 2 Pooling Layer Dense Layer Output Layer Dropout Layer 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 10 20 30 40 50 60 70 80 90 100 Inception V3 vs Simple CNN(with Data Augmentation) Validation Accuracy - CNN Validation Accuracy - Inception V3 Validation % LIMITATIONS Due to computing resource constraints, training the Inception V3 model on CPU’S is time consuming. Hence, small data set is used in our experiments Live demo is more challenging because of different conditions (lightning, blur, angles, glasses) Input Layer Inceptio n V3 Dense Layer Output Layer Dropout Layer

Transcript of Real-time Emotion Recognition

Page 1: Real-time Emotion Recognition

Traditional ApproachesEigenFace FisherFace Technique

Convolution Neural Network with Inception V3

Simon Fraser University

PROBLEMFacial expression classification: • Capture live stream from a video camera attached to a

laptop for our experiments.• Apply and Benchmark different machine learning models

for facial expression recognition• Classify three different facial expressions: Neutral, Happy

and Surprise.• Final predicted facial expression is displayed via a live

feed using the laptop camera.

Applications of facial expression classification: • Customer Engagement• Virtual Reality Avatar

Countenance Classifier : How are you feeling today?Liam Bui, Alexandre Lopes

Simple Convolution Neural Network

EXPERIMENTS Summary of Results

Validation Accuracy by Epoch

DATASET SOURCES• Initial models trained on the Kaggle Dataset with 35,000

facial expression images• Limited computation resources makes it infeasible to

experiment with Kaggle Dataset• CK Dataset supplemented with our self-created images is

used in our final experiments• Class such as Neutral and Sad appeared to be very similar

to each other.

LIVE DEMO• Happy Expression

• Neutral Expression

• Surprise Expression

Method Accuracy without data augmentation

Accuracy with Data Augmentation

EigenFace 66% 68%

FisherFace 87% 88.2%

Simple CNN 94% 94%

Inception V3(Inception training

disabled)91% 91%

Inception V3(Inception training

enabled)99% 99%

CK DatasetSupplemented with self created images

CONCLUSIONS• Out of the two traditional approaches, FisherFace

technique outperforms EigenFace technique.• Convolutional Neural Network outperforms traditional

techniques. • Enabling Inception V3 training improves model

performance significantly.

Input Layer Convolution Layer 1

Convolution Layer 2

Pooling LayerDense LayerOutput

LayerDropout

Layer

1 2 3 4 5 6 7 8 9 10 11 12 13 14 150

10

20

30

40

50

60

70

80

90

100

Inception V3 vs Simple CNN(with Data Augmentation)

Validation Accuracy - CNN Validation Accuracy - Inception V3

Valid

ation

%

LIMITATIONS• Due to computing resource constraints, training the

Inception V3 model on CPU’S is time consuming.• Hence, small data set is used in our experiments• Live demo is more challenging because of different

conditions (lightning, blur, angles, glasses)

Input Layer Inception V3 Dense Layer

Output Layer

Dropout Layer