Facial keypoint recognition
![Page 1: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/1.jpg)
Facial Keypoint Recognition
By
Akrita Agarwal
&
Srivathsava Sista
![Page 2: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/2.jpg)
Introduction
The goal of this project is to accurately locate the keypoints on a greyscale photograph of a human face.
We are given labelled training data consisting of 7049 images.
We applied several methods to this data to predict the keypoints for a test set of 1783 images.
The test data was also labelled, which allowed us to measure the accuracy of each method.
All methods were implemented in R.
![Page 3: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/3.jpg)
Format of the Data
Each image is a 96 x 96 pixel greyscale image.
Each pixel is described by an integer from 0 to 255 giving its grey intensity, with 0 being pure black and 255 being pure white.
Each training image is labelled with the (x, y) coordinates of 15 facial keypoints, which include the centres and corners of the eyes, the eyebrows, the lips, the tip of the nose, etc.
The labels for each image are followed by 9216 integers (96 x 96), which are the pixel values of the greyscale image itself.
The entire dataset is provided as a CSV file.
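The row layout described above can be sketched in code. The project itself was written in R; this Python sketch is only an illustration, not the authors' code. It assumes that the 30 coordinate fields come first and that the 9216 pixel values sit in one final field separated by spaces. The function name `parse_row` and the synthetic row are hypothetical.

```python
import csv
import io

N_KEYPOINTS = 15   # from the slides: 15 keypoints, each with an x and a y
IMAGE_SIDE = 96    # from the slides: 96 x 96 pixels


def parse_row(row):
    """Split one CSV row into 15 (x, y) pairs and a 96 x 96 pixel grid.

    Assumed layout: 30 coordinate fields, then one field holding the
    9216 pixel values separated by spaces.
    """
    coords = [float(v) for v in row[:2 * N_KEYPOINTS]]
    keypoints = list(zip(coords[0::2], coords[1::2]))
    pixels = [int(v) for v in row[2 * N_KEYPOINTS].split()]
    expected = IMAGE_SIDE * IMAGE_SIDE
    if len(pixels) != expected:
        raise ValueError("expected %d pixels, got %d" % (expected, len(pixels)))
    # Cut the flat pixel list into 96 rows of 96 values each.
    image = [pixels[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]
    return keypoints, image


# Synthetic one-row file standing in for the real CSV (all values are made up).
fake_coords = ",".join(str(float(i)) for i in range(30))
fake_pixels = " ".join(str(i % 256) for i in range(9216))
reader = csv.reader(io.StringIO(fake_coords + "," + fake_pixels + "\n"))
keypoints, image = parse_row(next(reader))
print(len(keypoints), len(image), len(image[0]))  # 15 96 96
```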
![Page 4: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/4.jpg)
Features of an Image:
![Page 5: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/5.jpg)
Evaluation of a Predictor
We compare the predicted keypoints with the labelled test data and calculate the root mean square error (RMSE) of the predictions.
Because the errors are squared before they are averaged, the RMSE penalizes large errors heavily and gives a good reflection of the accuracy of the predictor.
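As a small illustration of why the RMSE punishes large errors, the Python sketch below (not the authors' R code) compares two sets of predictions with the same total absolute error. The function name `rmse` and the toy numbers are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Both cases have a total absolute error of 4 pixels.
even = rmse([1, 1, 1, 1], [0, 0, 0, 0])   # four errors of 1 pixel
spiky = rmse([4, 0, 0, 0], [0, 0, 0, 0])  # one error of 4 pixels
print(even, spiky)  # 1.0 2.0
```

The single large error doubles the RMSE even though the total absolute error is unchanged.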
![Page 6: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/6.jpg)
Simple Means
Calculated the mean of every keypoint coordinate in the training data.
Used these means as the prediction for every test image.
No real analysis of the data was done; this is a very simplistic method.
Resulted in an RMSE of 3.96244.
Obviously not a very refined approach.
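The simple means method can be sketched as follows. This is a Python illustration rather than the authors' R code; the function name `column_means` and the toy label values are hypothetical, and only two keypoints (four coordinates) are shown instead of the full 15.

```python
def column_means(rows):
    """Mean of each column across a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Toy training labels: each row is (x1, y1, x2, y2) for one image.
train_labels = [
    [66.0, 39.0, 30.0, 36.0],
    [64.0, 35.0, 29.0, 33.0],
    [65.0, 37.0, 31.0, 36.0],
]

mean_prediction = column_means(train_labels)

# Every test image receives the same prediction, whatever its pixels are.
test_predictions = [mean_prediction for _ in range(2)]
print(mean_prediction)  # [65.0, 37.0, 30.0, 35.0]
```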
![Page 7: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/7.jpg)
![Page 8: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/8.jpg)
Image Patches
This method is similar to the simple means method, but instead of averaging only the keypoint coordinates, it averages a patch of the image centered on the keypoint across the training images.
A patch size of about 10 to 15 pixels is reasonable.
With this method the prediction adapts to each image, because it searches the area around the expected keypoint position for the region that best matches the mean patch.
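A rough sketch of the patch idea is given below in Python (the project itself used R). The slides do not say how a candidate region is compared with the mean patch, so the sum of squared differences used here is an assumption. The names `extract_patch`, `mean_patch`, `best_match`, `half`, and `search` are hypothetical, and the 12 x 12 toy images stand in for the real 96 x 96 ones.

```python
def extract_patch(image, cx, cy, half):
    """Square patch of side 2*half + 1 centered on column cx, row cy."""
    return [row[cx - half:cx + half + 1] for row in image[cy - half:cy + half + 1]]


def mean_patch(patches):
    """Pixel-wise mean of equally sized patches."""
    n = len(patches)
    size = len(patches[0])
    return [[sum(p[r][c] for p in patches) / n for c in range(size)]
            for r in range(size)]


def best_match(image, template, cx, cy, search):
    """Best-matching position within +/- search pixels of (cx, cy).

    Assumption: the match score is the sum of squared differences.
    """
    size = len(template)
    half = size // 2
    height, width = len(image), len(image[0])
    best, best_score = (cx, cy), float("inf")
    # Clamp the search window so every candidate patch lies inside the image.
    for y in range(max(half, cy - search), min(height - half, cy + search + 1)):
        for x in range(max(half, cx - search), min(width - half, cx + search + 1)):
            patch = extract_patch(image, x, y, half)
            score = sum((patch[r][c] - template[r][c]) ** 2
                        for r in range(size) for c in range(size))
            if score < best_score:
                best, best_score = (x, y), score
    return best


def make_image(cx, cy, side=12):
    """Toy image: black background with a bright 3 x 3 blob at (cx, cy)."""
    image = [[0] * side for _ in range(side)]
    for y in range(cy - 1, cy + 2):
        for x in range(cx - 1, cx + 2):
            image[y][x] = 255
    return image


# Two toy training images with the keypoint at (5, 5) and (7, 7).
train = [(make_image(5, 5), (5, 5)), (make_image(7, 7), (7, 7))]
template = mean_patch([extract_patch(img, x, y, 1) for img, (x, y) in train])

# Start the search at the mean training position (6, 6).
test_image = make_image(7, 5)
found = best_match(test_image, template, 6, 6, 3)
print(found)  # (7, 5)
```

Unlike the simple means method, which would predict (6, 6) for every image, the search moves the prediction to where the test image actually matches the mean patch.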
![Page 9: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/9.jpg)
Figures: Mean Right Eye Patch; Mean Nose Tip Patch
![Page 10: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/10.jpg)
Evaluation of Mean Patches
The RMSE varied with the size of the patch.
Testing patch sizes between 10 and 15, we found the optimal size to be 14, which gave an RMSE of 3.75538.
![Page 11: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/11.jpg)
Artificial Neural Networks
We then used a neural network to predict the keypoints.
However, the data was too large to process in full, as each of the more than 7000 images contributes 9216 pixel values as inputs. This led to infeasible execution times when training the neural network.
Using a decimation filter, we reduced the 96 x 96 images to 24 x 24 (576 pixel values per image), and used only half of the original training set.
The training data was still sizable, so the plots of the neural networks remained unreadable, but execution time was reduced.
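The reduction from 96 x 96 to 24 x 24 corresponds to a decimation factor of 4. The slides do not specify which decimation filter was used, so the Python sketch below (not the authors' R code) assumes the simplest option: replacing each 4 x 4 block with its average. The function name `decimate` and the toy image are hypothetical.

```python
def decimate(image, factor=4):
    """Downsample by averaging each factor x factor block.

    Assumption: block averaging stands in for the unspecified filter.
    """
    height, width = len(image), len(image[0])
    small = []
    for r in range(0, height - factor + 1, factor):
        row = []
        for c in range(0, width - factor + 1, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


# Toy 96 x 96 image whose value is constant within each 4 x 4 block.
image = [[(r // 4) + (c // 4) for c in range(96)] for r in range(96)]
small = decimate(image)
print(len(small), len(small[0]))  # 24 24
```

Each image then supplies 24 x 24 = 576 inputs to the network instead of 9216.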
![Page 12: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/12.jpg)
Neural Network Plot with 2 Hidden Layers
![Page 13: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/13.jpg)
Evaluation of Neural Networks
![Page 14: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/14.jpg)
Conclusion
The improvement in RMSE over the simple methods when using neural networks indicates that the features in the data are not independent of each other.
Our earlier methods did not consider the interdependency of the features.
It makes intuitive sense that the features are interdependent, as the features of a face generally follow a certain pattern.
Facial keypoint recognition is a very important field, as it forms the initial step towards more advanced applications such as facial recognition and facial expression identification.
![Page 15: Facial keypoint recognition](https://reader033.fdocuments.net/reader033/viewer/2022061218/54825b8eb4af9faa0d8b4769/html5/thumbnails/15.jpg)
References
All project data was obtained from kaggle.com.
The dataset was in turn obtained from Dr. Yoshua Bengio, University of Montreal.
R packages and tutorials came from the official site: http://www.r-project.org