Research Article Automated Facial Expression...

Hindawi Publishing CorporationChinese Journal of EngineeringVolume 2013 Article ID 831747 8 pageshttpdxdoiorg1011552013831747

Research ArticleAutomated Facial Expression Recognition UsingGradient-Based Ternary Texture Patterns

Faisal Ahmed1 and Emam Hossain2

1 Department of Computer Science and Engineering Islamic University of Technology Board Bazar Gazipur 1704 Bangladesh2Department of Computer Science and Engineering Ahsanullah University of Science and Technology Dhaka 1208 Bangladesh

Correspondence should be addressed to Faisal Ahmed fahmediut-dhakaedu

Received 17 October 2013 Accepted 24 November 2013

Academic Editors K Ariyur and H Lu

Copyright copy 2013 F Ahmed and E Hossain This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Recognition of human expression from facial image is an interesting research area which has received increasing attentionin the recent years A robust and effective facial feature descriptor is the key to designing a successful expression recognitionsystem Although much progress has been made deriving a face feature descriptor that can perform consistently under changingenvironment is still a difficult and challenging task In this paper we present the gradient local ternary pattern (GLTP)mdashadiscriminative local texture feature for representing facial expression The proposed GLTP operator encodes the local texture ofan image by computing the gradient magnitudes of the local neighborhood and quantizing those values in three discriminationlevels The location and occurrence information of the resulting micropatterns is then used as the face feature descriptor Theperformance of the proposedmethod has been evaluated for the person-independent face expression recognition task Experimentswith prototypic expression images from the Cohn-Kanade (CK) face expression database validate that the GLTP feature descriptorcan effectively encode the facial texture and thus achieves improved recognition performance than some well-known appearance-based facial features

1 Introduction

Over the last two decades automated recognition of humanfacial expression has been an active research area with awide variety of potential applications in human-computerinteraction data-driven animation surveillance and cus-tomized consumer products [1 2] Since the classification rateis heavily dependent on the information contained in thefeature representation an effective and discriminative featureset is the most important constituent of a successful facialexpression recognition system [3] Even the best classifierwill fail to attain satisfactory performance if supplied withinconsistent or inadequate features However in real-worldapplications facial images can easily be affected by differentfactors such as variations in lighting condition pose agingalignment and occlusion [4] Hence designing a robustfeature extraction method that can perform consistently inchanging environment is still a challenging task

Based on the types of features used facial feature extrac-tion methods can be roughly divided into two different cat-egories geometric feature-based methods and appearance-based methods [1 2] In geometric feature-based methodsthe feature vector is formed based on the geometric rela-tionships such as positions angles or distances betweendifferent facial components [2] Among the different tech-niques introduced so far one of the most popular geometricmethods is the facial action coding system (FACS) [5] thatrecognizes facial expression with the help of a set of actionunits (AU) Each of these action units corresponds to thephysical aspect of a particular facial muscle Later fiducialpoint-based representations [6ndash8] were also investigated byseveral researchers However the effectiveness of geometricmethods is heavily dependent on the accurate detectionof facial components which is a difficult task in changingenvironment Hence geometric feature-based methods aredifficult to accommodate in many real-world scenarios [2]

2 Chinese Journal of Engineering

90 32 27

58 50 12

61 38 46

1 0 0

1 0

1 0 0

ThresholdC

(a)

LBP operator

(b)

Figure 1 (a) Illustration of the basic LBP81

encoding scheme Here the LBP81

code 01110000 is used to label the center pixel 119862 (b) SampleLBP encoded image

Appearance-based methods extract the facial appearanceby convoluting the whole face image or some specific facialregionswith image filter or filter bank [1 2] Somewidely usedappearance-based methods include principal componentanalysis (PCA) [9] independent component analysis (ICA)[10 11] and Gabor wavelets [12 13] Although PCA and ICAfeature descriptors can effectively capture the variability of thetraining images their performances deteriorate in changingenvironment [14 15] On the other hand extraction of Gaborfeatures by convoluting face images with multiple Gaborfilters of various scales and orientations is computationallyexpensive Recently local appearance descriptors based onlocal binary pattern (LBP) [16] and its variants [17] haveattained much attention due to their robust performance inuncontrolled environment The LBP operator encodes thelocal texture of an image by quantizing the neighbor graylevels of a local neighborhoodwith respect to center value andthus forms a binary pattern that acts as a template for microlevel information such as edges spots or corners Howeverthe LBP method performs weakly under the presence oflarge illumination change and random noise [4] since alittle variation in the gray level can easily change the LBPcode Later local ternary pattern (LTP) [4] was introduced toincrease the robustness of LBP in uniform and near-uniformregions by adding an extra intensity discrimination level andextending the binary LBP value to a ternary code Morerecently Sobel-LBP [15] has been proposed to improve theperformance of LBP by applying Sobel operator to enhancethe edge information prior to applying LBP for featureextraction However in uniform and near-uniform regionsSobel-LBP generates inconsistent patterns as it uses onlytwo discrimination levels just like LBP Local directionalpattern (LDP) [2 14] employed a different texture encodingapproach where directional edge response values arounda position are used instead of gray levels Although thisapproach achieves better recognition performance than localbinary pattern LDP tends to produce inconsistent patternsin uniform and near-uniform facial regions and is heavilydependent on the selection of the number of prominent edgedirections parameter [3]

Considering the limitations of the existing local tex-ture descriptors this paper presents a new texture patternnamely the gradient local ternary pattern (GLTP) for person-independent facial expression recognition The proposedGLTP operator encodes the local texture information byquantizing the gradient magnitude values of a local neigh-borhood using three different discrimination levels Theproposed encoding scheme is able to differentiate between

smooth and high-textured facial parts which ensure the for-mation of texture micropatterns that are consistent with thelocal image characteristics (smooth or high-textured) Theperformance of the GLTP feature descriptor is empiricallyevaluated using a support vector machine (SVM) classifierExperiments with seven prototypic expression images fromthe Cohn-Kanade (CK) face expression database [18] validatethat the GLTP feature descriptor can effectively encodethe facial texture and thus achieves improved recognitionperformance than some widely used appearance-based facialfeature representation

2 LBP and LTP A Review

Local binary pattern (LBP) is a simple yet effective localtexture description technique LBP was originally introducedby Ojala et al [19] for grayscale and rotation-invarianttexture analysis Later many researchers have successfullyadopted LBP in different face-related problems such as facerecognition [20] and facial expression analysis [16]The basicLBP method operates on a local neighborhood around eachpixel of an image and thresholds the neighbor gray levelswith respect to the center The result is then concatenatedbinomially and the center pixel is labeled with the resultantvalue Formally the LBP operator can be represented a

LBP119875119877(119909119888 119910119888) =

119875minus1

sum

119901=0

119904 (119894119901minus 119894119888) 2119901 (1)

119904 (119909) = 1 119909 ge 0

0 119909 lt 0(2)

Here 119894119888is the gray value of the center pixel (119909

119888 119910119888)

119894119901is the gray value of the surrounding neighbors 119875 is

the total number of neighbors and 119877 is the radius of theneighborhood Bilinear interpolation is used to estimate thegray level of a neighbor if it does not fall exactly on a pixelposition The histogram of the LBP encoded image or imageblock is then used as the feature descriptor The basic LBPencoding process is illustrated in Figure 1

One limitation of the LBP encoding is that the LBP codesare susceptible to noise since a little change in the intensitiesof the neighbors can entirely alter the resulting binary codeTo address this issue Tan and Triggs [4] proposed the localternary pattern (LTP) which extends the binary LBP code toa 3-valued ternary code in order to provide more consistencyin uniform and near-uniform regions In the LTP encoding

Chinese Journal of Engineering 3

minus1 minus2 minus1

0 0 0

1 2 1

(a)

minus1 0 1

minus2 0 2

minus1 0 1

(b)

Figure 2 Sobel masks (a) horizontal mask and (b) vertical mask

process gray values in a zone of width plusmn119905 about the centerpixel are quantized to 0 and those above +119905 and below minus119905 arequantized to+1 andminus1 respectively Hence the indicator 119904(119909)in (2) is substituted by a 3-valued function

1199041015840(119894119901 119894119888) =

1 119894119901ge 119894119888+ 119905

010038161003816100381610038161003816119894119901minus 119894119888

10038161003816100381610038161003816lt 119905

minus1 119894119901le 119894119888minus 119905

(3)

Here 119905 is a user-specified threshold The combination ofthese three discrimination levels in a local neighbourhoodyields the final LTP value

3 Proposed Method Gradient LocalTernary Pattern (GLTP)

In practice the LBP operator encodes the local textureprimitives such as edges or spots by thresholding the localneighborhood at the value of the center pixel into a binarypattern Zhao et al [15] argued that applying Sobel operatorprior to LBP feature extraction further enhances the texturedetails and thus facilitates more accurate texture encodingHence they proposed the Sobel-LBP method [15] where theSobel operator is first applied on the image to compute thegradient magnitude values and then the basic LBP methodis used to encode the gradient values However both LBPand Sobel-LBP employ two discrimination levels (0 and 1)for texture encoding and thus fail to generate consistentpatterns in uniform and near-uniform regions where thedifference between the center and the neighbor gray levels isnegligible To address this limitation we propose the gradientlocal ternary pattern (GLTP) a new texture descriptor thatcombines the advantages of Sobel-LBP and LTP operatorsOur proposed method utilizes the more robust gradientmagnitude values instead of gray levels with a three-levelencoding scheme to discriminate between smooth and high-textured facial regions Thus the proposed method ensuresgeneration of robust texture patterns which are consistentwith the local image property (smooth or high-texturedregion) even under the presence of illumination variations

31 GLTP Encoding The proposed GLTP operator first cal-culates the gradient magnitudes of each pixel position of

an image which enhances the local texture features suchas edges spots or corners The gradient magnitude 119866

119909119910at

(119909 119910) position of an image 119891(119909 119910) can be computed usingthe following equation

119866119909119910

=1003816100381610038161003816119866119909

1003816100381610038161003816 +10038161003816100381610038161003816119866119910

10038161003816100381610038161003816 (4)

Here 119866119909and 119866

119910are the two elements of the gradient

vector and can be obtained by applying Sobel operator on theimage119891(119909 119910)The Sobel operator convolves an image119891(119909 119910)with a horizontalmask and a verticalmask to obtain the valueof 119866119909and 119866

119910 The two Sobel masks are shown in Figure 2

In a uniform or near-uniform local region the gradientmagnitude of all the pixels will be the same or almost similarHowever in high-textured regions pixels located on anedge or spot will have relatively higher gradient magnitudesthan the other pixels in the local neighborhood Hencethe GLTP operator employs a threshold region plusmn119905 aroundthe center gradient value of a 3 times 3 local neighborhoodin order to differentiate between smooth and high-texturedfacial regions Here neighbor gradient values falling in theplusmn119905 threshold region around the center gradient value 119866

119888are

quantized to 0 those below 119866119888minus 119905 and those above 119866

119888+ 119905 are

quantized tominus1 and +1 respectively as shown in the followingequation

119878GLTP (119866119888 119866119894) =

minus1 119866119894lt 119866119888minus 119905

0 119866119888minus 119905 le 119866

119894le 119866119888+ 119905

+1 119866119894gt 119866119888+ 119905

(5)

Here 119866119888is the gradient magnitude of the center (119909

119888 119910119888)

of a 3 times 3 neighborhood 119866119894is the gradient magnitude of

the surrounding neighbors and 119905 is a threshold Finally theGLTP code is obtained by concatenating the resultsThe basicgradient LTP encoding scheme is illustrated in Figure 3

32 Positive and Negative GLTP Codes One consequenceof using three-level encoding is that the number of possi-ble GLTP patterns (38) is much higher than the numberof possible LBP patterns (28) which results in a high-dimensional feature vector Different approaches [4 21] havebeen proposed to reduce the number of possible ternarypatterns Here we have adopted the approach proposed byTan and Triggs [4] where each ternary code is split into its


Expression image

Gradient operator

GLTP code

90 95 98

75 80 82

61 65 60

Gradient image

0 1 1

0 0


Gradient magnitude values

GLTP encodingC

of a 3 times 3 neighborhood

Figure 3 Illustration of the proposedGLTP encoding schemeHerethe GLTP code is 100(minus1)(minus1)(minus1)01 for threshold 119905 = 10

0 1 1

0 0


0 1 1

0 0

0 0 0

0 0 0

0 0

1 1 1

PGLTP code

Original GLTP code

NGLTP code

C

C

C

Figure 4 Generation of 119875GLTP and 119873GLTP codes from the originalGLTP code

corresponding positive (119875GLTP) and negative (119873GLTP) partsand treated as individual binary patterns as shown in

119875GLTP =7

sum

119894=0

119878119875(119878GLTP (119894)) times 2

119894

119878119875(V) =

1 if V gt 00 otherwise

119873GLTP =7

sum

119894=0

119878119873(119878GLTP (119894)) times 2

119894

119878119873(V) =

1 if V lt 00 otherwise

(6)

Here 119875GLTP and 119873GLTP are the corresponding positiveand negative parts of the GLTP code 119878GLTP The process isillustrated in Figure 4

4 Facial Feature Description Based onGLTP Codes

Applying the GLTP operator on a facial image will producetwo encoded image representations one for the 119875GLTP andthe other for the119873GLTP First histograms are computed fromthese two encoded images using

119867119875GLTP

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119875GLTP (119903 119888) 120591) (7)

119867119873GLT119875

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119873GLTP (119903 119888) 120591) (8)

119891 (119886 120591) = 1 119886 = 120591

0 otherwise(9)

Here 120591 is the positive or negative GLTP code valueHistograms computed from the 119875GLTP and 119873GLTP encodedimages are then concatenated spatially to produce the GLTPhistogram which represents the occurrence information ofthe 119875GLTP and 119873GLTP binary patterns The flowchart forcomputing the GLTP histogram is shown in Figure 5

Spatial histograms computed from the whole encodedimage do not reflect the location information of themicropat-terns only their occurrence frequencies are represented [1]However it is understandable that a histogram representationthat combines the location information of the GLTP micro-patterns with their occurrence frequencies is able to describethe local texture more accurately and effectively [22 23]Therefore in order to incorporate some degree of locationinformation with the GLTP histogram each facial imageis divided into a number of regions and individual GLTPhistograms (representing the occurrence information of themicro-patterns from the corresponding local region) com-puted from each of the regions are concatenated to obtain aspatially combined GLTP histogram In the facial expressionrecognition system this combined GLTP histogram is usedas the facial feature vector The process of generating thecombined GLTP histogram is illustrated in Figure 6

5 Expression Recognition UsingSupport Vector Machine (SVM)

Shan et al [16] presented a comparative analysis of fourdifferent machine learning techniques for the facial expres-sion recognition task namely template matching linear dis-criminant analysis linear programming and support vectormachine Among these methods support vector machine(SVM) achieved the best recognition performance Hence inour study we use SVM to classify facial expressions based onthe GLTP features

Support vector machine (SVM) is a well-establishedmachine learning approach which has been successfullyadopted in different data classification problemsThe conceptof SVM is based on the modern statistical learning theoryFor data classification SVM first implicitly maps the datainto a higher dimensional feature space and then constructs


Expression image Gradient image

0 100 2000

500

1000

1500

0 100 2000

500

1000

0 200 4000

500

1000

1500

GLTP histogram

GLTPoperator

PGLTP image NGLTP image

PGLTP histogram NGLTP histogram

Figure 5 Pictorial illustration of the GLTP histogram generationprocess

a hyperplane in such a way that the separating marginbetween the samples of two classes is optimalThis separatinghyperplane then functions as the decision surface

Given a set of labeled training samples 119879 = (x119894 119897119894) 119894 =

1 2 119871 where x119894isin RP and 119897

119894isin minus1 1 a new test data x

is classified by

119891 (x) = sign(119871

sum

119894=1

120572119894119897119894119870(x119894 x) + 119887) (10)

Here 120572119894are Lagrange multipliers of dual optimization

problem 119887 is a threshold parameter and 119870 is a kernelfunction SVM constructs a hyperplane which lies on themaximum separating margin with respect to the trainingsamples with 120572

119894gt 0 These samples are called the support

vectorsSVM takes binary decisions by constructing the separat-

ing hyperplane between the positive and negative examplesTo achieve multiclass classification we can adopt either theone-against-rest or several two-class decision problems Inthis study the one-against-rest approach was employed We

Expression image Image partitioned

GLTPhistograms

Extended GLTP histogram (GLTP feature vector)

Partition

middot middot middot


into 3 times 3 subregions

Figure 6 Each expression image is partitioned into a number ofregions and individual GLTPhistograms generated from each of theregions are concatenated to form the feature vector

used radial basis function (RBF) kernel for the classificationproblem The radial basis function119870 can be defined as

119870(x119894 x) = exp (minus1205741003817100381710038171003817x119894 minus x1003817100381710038171003817

2) 120574 gt 0

1003817100381710038171003817x119894 minus x10038171003817100381710038172= (x119894minus x)119905 (x

119894minus x)

(11)

Here 120574 is a kernel parameterWe carried out a grid searchfor selecting appropriate parameter value as suggested in[24]

6 Experiments and Results

61 Experimental Setup and Dataset Description To evaluatethe effectiveness of the proposed face feature descriptorexperiments were conducted on images collected from awell-known image database namely theCohn-Kanade (CK) facialexpression database [18] In the CK database a sample set of100 students aging from 18 to 30 during image acquisitionwere included A majority of the subjects (65) were female15 of the samples were African-American and 3 wereAsian or of Latin descent Each of the students displayedfacial expressions starting from nonexpressiveness to one ofthe aforementioned six prototypic emotional expressions inthe image acquisition process These image sequences werethen digitized into 640 times 480 or 640 times 690 pixel resolutionsIn our setup a set of 1224 facial image sequences wereselected from 96 subjects and each of the images was givena label describing the subjectrsquos facial expression The datasetcontaining the 6 classes of expressions was then extendedby 408 images of neutral facial images to obtain the 7-class expression dataset Figure 7 shows sample prototypicexpression images from the CK database


Joy Disgust Anger

Fear Sad Surprise

Figure 7 Sample 6-class expression images from the CK database[18]

Figure 8 Cropping of a sample face image from the original one

We cropped the selected images from the original onesbased on the ground truth of the positions of two eyes whichwere then normalized to 150 times 110 pixels Figure 8 showsa sample cropped facial image from CK database A tenfoldcross-validation was carried out to compute the classificationrate of the proposed method In tenfold cross-validation tensubsets comprising equal number of instances are formed bypartitioning the whole dataset randomlyThe classifier is firsttrained on the nine subsets and then the remaining set isused for testingThis process is repeated for 10 times and theaverage classification rate is computed The threshold value 119905was set to 10 empirically

62 Experimental Results The classification rate of the pro-posed method can be influenced by adjusting the numberof regions into which the expression images are to be split[2] We have considered three cases in our experiments asopted in [2] where images were divided into 3 times 3 5 times 5and 7 times 6 regions We have compared our proposed methodwith 3 widely used local texture descriptors namely localbinary pattern (LBP) [16] local ternary pattern (LTP) [4]and local directional pattern (LDP) [2] Tables 1 and 2 showthe classification rates of these local texture descriptors forthe 6-class and the 7-class expression recognition problemrespectively It can be observed that dividing an image withhigher number of regions will produce higher classificationrate since the feature descriptor then contains more locationand spatial information of the local patterns However thefeature vector length will also be higher in such cases whichaffects the computational efficiency Hence selection of thenumber of regions is a trade-off between computationalefficiency and classification rate

Table 1 Recognition rate () for the CK 6-class expression datasetusing different local texture descriptors

OperatorClassification rate () for different

number of regions3 times 3 5 times 5 7 times 6

LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917

Table 3 Confusion matrix of CK 6-class recognition using GLTPfor images partitioned into 7 times 6 regions

Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Anger 984 0 0 08 0 08Disgust 05 944 0 0 51 0Fear 0 07 971 0 04 18Joy 11 11 0 978 0 0Sad 0 45 0 0 955 0Surprise 0 0 0 0 0 100

For both the 6-class and the 7-class expression recog-nition problems the proposed GLTP feature descriptorachieves the highest recognition rate for images partitionedinto different number of regions For the 6-class datasetGLTP achieves an excellent recognition rate of 972 Onthe other hand for the 7-class dataset the recognition rateis 917 Here inclusion of neutral expression images resultsin a decrease in the accuracy For both the 6-class and the7-class recognition problems the highest classification rateis obtained for images partitioned into 7 times 6 regions Theconfusion matrix of recognition using the GLTP descriptorfor the 6-class and the 7-class datasets is shown in Tables3 and 4 respectively which provides a better picture of therecognition accuracy of individual expression types It can beobserved that for the 6-class recognition all the expressionscan be recognized with high accuracy For the 7-class datasetwhile anger disgust fear joy and surprise can be recognizedwith high accuracy the recognition rates of sadness andneutral expressions are lower than the average Evidentlyinclusion of neutral expression images results in a decrease inthe accuracy since many sad expression images are confusedwith the neutral expression images and vice versa



Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()

Anger 956 18 0 0 0 26 0Disgust 0 937 21 0 15 27 0Fear 0 25 940 0 0 13 22Joy 06 06 0 947 0 35 06Sad 0 0 0 0 851 0 149Surprise 18 18 0 52 05 907 0Neutral 0 0 17 0 98 0 885

The reason behind the superiority of GLTP face descrip-tor is the utilization of robust gradient magnitude values witha three-level encoding approach which facilitates the dis-crimination between smooth and high-textured face regionsand thus ensures generation of consistent texture micropat-terns even under the presence of illumination variation andrandom noise

7 Conclusion

This paper presents a new local texture pattern the gradientlocal ternary pattern (GLTP) for robust facial expressionrecognition Since gradientmagnitude values aremore robustthan gray levels under the presence of illumination variationsthe proposed method encodes the gradient values of a localneighborhood with respect to a threshold region aroundthe center gradient which facilitates robust description oflocal texture primitives such as edges spots or cornersunder different lighting conditions In addition with thehelp of the threshold region defined around the centerthe proposed method can effectively differentiate betweensmooth and high-textured facial areas which enable theformation of GLTP codes consistent with the local textureproperty Experiments with prototypic expression imagesfrom the Cohn-Kanade expression database demonstratethat the proposed GLTP operator can effectively representfacial texture and thus achieves superior performance thansome widely used local texture patterns In the future weplan to apply the GLTP feature descriptor in other face-related recognition problems such as face recognition andgender classification for intelligent consumer products andapplications development

References

[1] F Ahmed H Bari and E Hossain ldquoPerson-independent facialexpression recognition based on Compound Local BinaryPattern (CLBP)rdquo International Arab Journal of InformationTechnology vol 11 no 2 2013

[2] T Jabid M H Kabir and O Chae ldquoRobust facial expressionrecognition based on local directional patternrdquo ETRI Journalvol 32 no 5 pp 784ndash794 2010

[3] F Ahmed andM H Kabir ldquoDirectional ternary pattern ( DTP)for facial expression recognitionrdquo in Proceedings of the IEEE

International Conference on Consumer Electronics (ICCE rsquo12)pp 265ndash266 Las Vegas Nev USA January 2012

[4] X Tan and B Triggs ldquoEnhanced local texture feature sets forface recognition under difficult lighting conditionsrdquo in IEEEInternational Workshop on Analysis and Modeling of Faces andGestures vol 4778 of Lecture Notes in Computer Science pp168ndash182 2007

[5] P Ekman and W Friesen Facial Action Coding System ATechnique for Measurement of Facial Movement ConsultingPsychologists Press Palo Alto Calif USA 1978

[6] Z Zhang ldquoFeature-based facial expression recognition sensi-tivity analysis and experiments with a multilayer perceptronrdquoInternational Journal of Pattern Recognition and Artificial Intel-ligence vol 13 no 6 pp 893ndash911 1999

[7] G D Guo and C R Dyer ldquoSimultaneous feature selection andclassifier training via linear programming a case study for faceexpression recognitionrdquo in Proceedings of the IEEE ComputerSociety Conference on Computer Vision and Pattern Recognitionpp 346ndash352 June 2003

[8] M Valstar I Patras andM Pantic ldquoFacial action unit detectionusing probabilistic actively learned support vector machines ontracked facial point datardquo in IEEE CVPR Workshop vol 3 pp76ndash84 2005

[9] C Padgett and G Cottrell ldquoRepresentation face images foremotion classificationrdquoAdvances in Neural Information Process-ing Systems vol 9 pp 894ndash900 1997

[10] M S Bartlett J R Movellan and T J Sejnowski ldquoFace recog-nition by independent component analysisrdquo IEEE Transactionson Neural Networks vol 13 no 6 pp 1450ndash1464 2002

[11] C C Fa and F Y Shih ldquoRecognizing facial action units usingindependent component analysis and support vector machinerdquoPattern Recognition vol 39 no 9 pp 1795ndash1798 2006

[12] M J Lyons ldquoAutomatic classification of single facial imagesrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 21 no 12 pp 1357ndash1362 1999

[13] Y Tian ldquoEvaluation of face resolution for expression analysisrdquoin IEEE Workshop on Face Processing in Video 2004

[14] T Jabid H Kabir and O Chaei ldquoLocal Directional Pattern(LDP) for face recognitionrdquo in Proceedings of the InternationalConference onConsumer Electronics (ICCE rsquo10) pp 329ndash330 LasVegas Nev USA January 2010

[15] S Zhao Y Gao and B Zhang ldquoSobel-LBPrdquo in IEEE Interna-tional Conference on Image Processing pp 2144ndash2147 2008

[16] C Shan S Gong and P W McOwan ldquoFacial expression recog-nition based on Local Binary Patterns a comprehensive studyrdquoImage and Vision Computing vol 27 no 6 pp 803ndash816 2009

[17] G Zhao and M Pietikainen ldquoBoosted multi-resolution spa-tiotemporal descriptors for facial expression recognitionrdquo Pat-tern Recognition Letters vol 30 no 12 pp 1117ndash1127 2009

[18] T Kanade J Cohn and Y Tian ldquoComprehensive database forfacial expression analysisrdquo in IEEE International Conference onAutomated Face and Gesture Recognition pp 46ndash53 2000

[19] T Ojala M Pietikainen and T Maenpaa ldquoMultiresolutiongray-scale and rotation invariant texture classificationwith localbinary patternsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 24 no 7 pp 971ndash987 2002

[20] T Ahonen A Hadid and M Pietikainen ldquoFace descriptionwith local binary patterns application to face recognitionrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol28 no 12 pp 2037ndash2041 2006


[21] D He and N Cercone ldquoLocal triplet pattern for content-basedimage retrievalrdquo in International Conference on Image Analysisand Recognition pp 229ndash238 2009

[22] S Gundimada and V K Asari ldquoFacial recognition usingmultisensor images based on localized kernel eigen spacesrdquoIEEE Transactions on Image Processing vol 18 no 6 pp 1314ndash1325 2009

[23] F Ahmed ldquoGradient directional pattern a robust featuredescriptor for facial expression recognitionrdquo IET ElectronicsLetters vol 48 no 19 pp 1203ndash1204 2012

[24] C-W Hsu and C-J Lin ldquoA comparison of methods for mul-ticlass support vector machinesrdquo IEEE Transactions on NeuralNetworks vol 13 no 2 pp 415ndash425 2002

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


Active and Passive Electronic Components

Control Scienceand Engineering

Journal of



RotatingMachinery


Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics


Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of


Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation



DistributedSensor Networks



90 32 27

58 50 12

61 38 46

1 0 0

1 0

1 0 0

ThresholdC

(a)

LBP operator

(b)

Figure 1 (a) Illustration of the basic LBP81

encoding scheme Here the LBP81

code 01110000 is used to label the center pixel 119862 (b) SampleLBP encoded image

Appearance-based methods extract the facial appearanceby convoluting the whole face image or some specific facialregionswith image filter or filter bank [1 2] Somewidely usedappearance-based methods include principal componentanalysis (PCA) [9] independent component analysis (ICA)[10 11] and Gabor wavelets [12 13] Although PCA and ICAfeature descriptors can effectively capture the variability of thetraining images their performances deteriorate in changingenvironment [14 15] On the other hand extraction of Gaborfeatures by convoluting face images with multiple Gaborfilters of various scales and orientations is computationallyexpensive Recently local appearance descriptors based onlocal binary pattern (LBP) [16] and its variants [17] haveattained much attention due to their robust performance inuncontrolled environment The LBP operator encodes thelocal texture of an image by quantizing the neighbor graylevels of a local neighborhoodwith respect to center value andthus forms a binary pattern that acts as a template for microlevel information such as edges spots or corners Howeverthe LBP method performs weakly under the presence oflarge illumination change and random noise [4] since alittle variation in the gray level can easily change the LBPcode Later local ternary pattern (LTP) [4] was introduced toincrease the robustness of LBP in uniform and near-uniformregions by adding an extra intensity discrimination level andextending the binary LBP value to a ternary code Morerecently Sobel-LBP [15] has been proposed to improve theperformance of LBP by applying Sobel operator to enhancethe edge information prior to applying LBP for featureextraction However in uniform and near-uniform regionsSobel-LBP generates inconsistent patterns as it uses onlytwo discrimination levels just like LBP Local directionalpattern (LDP) [2 14] employed a different texture encodingapproach where directional edge response values arounda position are used instead of gray levels Although thisapproach achieves better recognition performance than localbinary pattern LDP tends to produce inconsistent patternsin uniform and near-uniform facial regions and is heavilydependent on the selection of the number of prominent edgedirections parameter [3]

Considering the limitations of the existing local tex-ture descriptors this paper presents a new texture patternnamely the gradient local ternary pattern (GLTP) for person-independent facial expression recognition The proposedGLTP operator encodes the local texture information byquantizing the gradient magnitude values of a local neigh-borhood using three different discrimination levels Theproposed encoding scheme is able to differentiate between

smooth and high-textured facial parts which ensure the for-mation of texture micropatterns that are consistent with thelocal image characteristics (smooth or high-textured) Theperformance of the GLTP feature descriptor is empiricallyevaluated using a support vector machine (SVM) classifierExperiments with seven prototypic expression images fromthe Cohn-Kanade (CK) face expression database [18] validatethat the GLTP feature descriptor can effectively encodethe facial texture and thus achieves improved recognitionperformance than some widely used appearance-based facialfeature representation

2 LBP and LTP A Review

Local binary pattern (LBP) is a simple yet effective localtexture description technique LBP was originally introducedby Ojala et al [19] for grayscale and rotation-invarianttexture analysis Later many researchers have successfullyadopted LBP in different face-related problems such as facerecognition [20] and facial expression analysis [16]The basicLBP method operates on a local neighborhood around eachpixel of an image and thresholds the neighbor gray levelswith respect to the center The result is then concatenatedbinomially and the center pixel is labeled with the resultantvalue Formally the LBP operator can be represented a

LBP119875119877(119909119888 119910119888) =

119875minus1

sum

119901=0

119904 (119894119901minus 119894119888) 2119901 (1)

119904 (119909) = 1 119909 ge 0

0 119909 lt 0(2)

Here 119894119888is the gray value of the center pixel (119909

119888 119910119888)

119894119901is the gray value of the surrounding neighbors 119875 is

the total number of neighbors and 119877 is the radius of theneighborhood Bilinear interpolation is used to estimate thegray level of a neighbor if it does not fall exactly on a pixelposition The histogram of the LBP encoded image or imageblock is then used as the feature descriptor The basic LBPencoding process is illustrated in Figure 1

One limitation of the LBP encoding is that the LBP codesare susceptible to noise since a little change in the intensitiesof the neighbors can entirely alter the resulting binary codeTo address this issue Tan and Triggs [4] proposed the localternary pattern (LTP) which extends the binary LBP code toa 3-valued ternary code in order to provide more consistencyin uniform and near-uniform regions In the LTP encoding



0 0 0

1 2 1

(a)

minus1 0 1

minus2 0 2

minus1 0 1

(b)



1199041015840(119894119901 119894119888) =

1 119894119901ge 119894119888+ 119905

010038161003816100381610038161003816119894119901minus 119894119888

10038161003816100381610038161003816lt 119905

minus1 119894119901le 119894119888minus 119905

(3)






119909119910at


119866119909119910

=1003816100381610038161003816119866119909

1003816100381610038161003816 +10038161003816100381610038161003816119866119910

10038161003816100381610038161003816 (4)

Here 119866119909and 119866





119888are


119888+ 119905 are


119878GLTP (119866119888 119866119894) =

minus1 119866119894lt 119866119888minus 119905

0 119866119888minus 119905 le 119866

119894le 119866119888+ 119905

+1 119866119894gt 119866119888+ 119905

(5)


119888 119910119888)





Expression image

Gradient operator

GLTP code

90 95 98

75 80 82

61 65 60

Gradient image

0 1 1

0 0



GLTP encodingC



0 1 1

0 0


0 1 1

0 0

0 0 0

0 0 0

0 0

1 1 1

PGLTP code

Original GLTP code

NGLTP code

C

C

C



119875GLTP =7

sum

119894=0

119878119875(119878GLTP (119894)) times 2

119894

119878119875(V) =


119873GLTP =7

sum

119894=0

119878119873(119878GLTP (119894)) times 2

119894

119878119873(V) =


(6)




119867119875GLTP

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119875GLTP (119903 119888) 120591) (7)

119867119873GLT119875

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119873GLTP (119903 119888) 120591) (8)

119891 (119886 120591) = 1 119886 = 120591

0 otherwise(9)








0 100 2000

500

1000

1500

0 100 2000

500

1000

0 200 4000

500

1000

1500

GLTP histogram

GLTPoperator








is classified by

119891 (x) = sign(119871

sum

119894=1

120572119894119897119894119870(x119894 x) + 119887) (10)







GLTPhistograms


Partition







2) 120574 gt 0


119894minus x)

(11)





Joy Disgust Anger

Fear Sad Surprise








LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917


Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()





Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











0 0 0

1 2 1

(a)

minus1 0 1

minus2 0 2

minus1 0 1

(b)



1199041015840(119894119901 119894119888) =

1 119894119901ge 119894119888+ 119905

010038161003816100381610038161003816119894119901minus 119894119888

10038161003816100381610038161003816lt 119905

minus1 119894119901le 119894119888minus 119905

(3)






119909119910at


119866119909119910

=1003816100381610038161003816119866119909

1003816100381610038161003816 +10038161003816100381610038161003816119866119910

10038161003816100381610038161003816 (4)

Here 119866119909and 119866





119888are


119888+ 119905 are


119878GLTP (119866119888 119866119894) =

minus1 119866119894lt 119866119888minus 119905

0 119866119888minus 119905 le 119866

119894le 119866119888+ 119905

+1 119866119894gt 119866119888+ 119905

(5)


119888 119910119888)





Expression image

Gradient operator

GLTP code

90 95 98

75 80 82

61 65 60

Gradient image

0 1 1

0 0



GLTP encodingC



0 1 1

0 0


0 1 1

0 0

0 0 0

0 0 0

0 0

1 1 1

PGLTP code

Original GLTP code

NGLTP code

C

C

C



119875GLTP =7

sum

119894=0

119878119875(119878GLTP (119894)) times 2

119894

119878119875(V) =


119873GLTP =7

sum

119894=0

119878119873(119878GLTP (119894)) times 2

119894

119878119873(V) =


(6)




119867119875GLTP

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119875GLTP (119903 119888) 120591) (7)

119867119873GLT119875

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119873GLTP (119903 119888) 120591) (8)

119891 (119886 120591) = 1 119886 = 120591

0 otherwise(9)








0 100 2000

500

1000

1500

0 100 2000

500

1000

0 200 4000

500

1000

1500

GLTP histogram

GLTPoperator








is classified by

119891 (x) = sign(119871

sum

119894=1

120572119894119897119894119870(x119894 x) + 119887) (10)







GLTPhistograms


Partition







2) 120574 gt 0


119894minus x)

(11)





Joy Disgust Anger

Fear Sad Surprise








LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917


Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()





Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










Expression image

Gradient operator

GLTP code

90 95 98

75 80 82

61 65 60

Gradient image

0 1 1

0 0



GLTP encodingC



0 1 1

0 0


0 1 1

0 0

0 0 0

0 0 0

0 0

1 1 1

PGLTP code

Original GLTP code

NGLTP code

C

C

C



119875GLTP =7

sum

119894=0

119878119875(119878GLTP (119894)) times 2

119894

119878119875(V) =


119873GLTP =7

sum

119894=0

119878119873(119878GLTP (119894)) times 2

119894

119878119873(V) =


(6)




119867119875GLTP

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119875GLTP (119903 119888) 120591) (7)

119867119873GLT119875

(120591) =

119872

sum

119903=1

119873

sum

119888=1

119891 (119873GLTP (119903 119888) 120591) (8)

119891 (119886 120591) = 1 119886 = 120591

0 otherwise(9)








0 100 2000

500

1000

1500

0 100 2000

500

1000

0 200 4000

500

1000

1500

GLTP histogram

GLTPoperator








is classified by

119891 (x) = sign(119871

sum

119894=1

120572119894119897119894119870(x119894 x) + 119887) (10)







GLTPhistograms


Partition







2) 120574 gt 0


119894minus x)

(11)





Joy Disgust Anger

Fear Sad Surprise








LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917


Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()





Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











0 100 2000

500

1000

1500

0 100 2000

500

1000

0 200 4000

500

1000

1500

GLTP histogram

GLTPoperator








is classified by

119891 (x) = sign(119871

sum

119894=1

120572119894119897119894119870(x119894 x) + 119887) (10)







GLTPhistograms


Partition







2) 120574 gt 0


119894minus x)

(11)





Joy Disgust Anger

Fear Sad Surprise








LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917


Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()





Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










Joy Disgust Anger

Fear Sad Surprise








LBP 793 897 901LTP 873 923 936LDP 802 919 937GLTP 905 964 972




LBP 738 809 833LTP 813 885 889LDP 757 863 884GLTP 840 906 917


Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()





Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











Anger()

Disgust()

Fear()

Joy()

Sad()

Surprise()

Neutral()



7 Conclusion


References





























RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation
















RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation









Research Article Automated Facial Expression...

Documents

Transcript of Research Article Automated Facial Expression...