Robust regression for face recognition


Pattern Recognition 45 (2012) 104–118

Contents lists available at ScienceDirect

Pattern Recognition

journal homepage: www.elsevier.com/locate/pr

Robust regression for face recognition

Imran Naseem a,*,1, Roberto Togneri b, Mohammed Bennamoun c

a College of Engineering, Karachi Institute of Economics and Technology (KIET), Karachi 75190, Pakistan
b School of EE&C Engineering, University of Western Australia, WA 6009, Australia
c School of CSS Engineering, University of Western Australia, WA 6009, Australia

Article info

Article history:

Received 11 October 2010

Received in revised form

18 April 2011

Accepted 5 July 2011
Available online 14 July 2011

Keywords:

Robust face recognition

Random pixel contamination

Robust estimation

0031-3203/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.patcog.2011.07.003

* Corresponding author.
E-mail addresses: [email protected], [email protected] (I. Naseem).
1 The presented research was conducted when the author was with the University of Western Australia.

Abstract

In this paper we address the problem of robust face recognition by formulating the pattern recognition task as a problem of robust estimation. Using the fundamental concept that, in general, patterns from a single object class lie on a linear subspace (Basri and Jacobs, 2003 [1]), we develop a linear model representing a probe image as a linear combination of class-specific galleries. In the presence of noise, the well-conditioned inverse problem is solved using robust Huber estimation and the decision is ruled in favor of the class with the minimum reconstruction error. The proposed Robust Linear Regression Classification (RLRC) algorithm is extensively evaluated for two important cases of robustness, i.e. illumination variations and random pixel corruption. Illumination-invariant face recognition is demonstrated on three standard databases under exemplary evaluation protocols reported in the literature. Comprehensive comparative analysis with state-of-the-art illumination-tolerant approaches indicates a comparable performance index for the proposed RLRC algorithm. The efficiency of the proposed approach in the presence of severe random noise is validated under several exemplary noise models such as the dead-pixel problem, salt and pepper noise, speckle noise and Additive White Gaussian Noise (AWGN). The RLRC algorithm is found to compare favorably with the benchmark generative approaches.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Increasing security threats have highlighted the need of efficient and foolproof authentication systems for sensitive facilities. In this regard, biometrics have demonstrated good performance. Among the other available biometrics, such as speech, iris, fingerprints, hand geometry and gait, face seems to be the most natural choice [2]. It is nonintrusive, requires a minimum of user cooperation and is cheap to implement. The importance of face recognition is highlighted for widely used video surveillance systems where we typically have facial images of suspects [3]. With the recent development in multimedia signal transmission and processing, we witness emerging applications in the paradigm of face recognition; face recognition through the Internet, for instance, is one of the latest implementations. However, these relatively new applications tend to equally signify the problem of robustness. Although training is conducted offline in a controlled laboratory environment, probe images are always vulnerable to distortion due to ambient luminance, sensor malfunctioning,

channel noise, compression noise over the Internet medium, etc. [4,5].

In general, face recognition systems critically depend on manifold learning methods. A gray-scale face image of order a×b can be represented as an ab-dimensional vector in the original image space. Typically, in pattern recognition problems, it is believed that high-dimensional data vectors are redundant measurements of an underlying source. The objective of manifold learning is therefore to uncover this "underlying source" by a suitable transformation of high-dimensional measurements to low-dimensional data vectors. Therefore, at the feature extraction stage, images are transformed to low-dimensional vectors in a face space. The main objective is to find a basis function for this transformation which could distinguishably represent faces in the face space. In the presence of noise, however, this is an extremely challenging task [3,6]. It follows from coding theory that iterative measurements are more likely to safely recover information in the presence of noise [7]; therefore, working in a low-dimensional feature space while maintaining robustness is in fact a hard problem in object recognition. A number of approaches have been reported in the literature for dimensionality reduction. In the context of robustness, these approaches have been broadly classified in two categories, namely generative/reconstructive and discriminative methods [8]. Reconstructive approaches (such as PCA [9], ICA [10] and NMF [11,12])


are reported to be robust for problems related to missing and contaminated pixels; these methods essentially exploit the redundancy in the visual data to produce representations with sufficient reconstruction property. Formally, given an input x and label y, generative classifiers learn a model of the joint probability p(x,y) and classify using p(y|x), which is determined using Bayes' rule. The discriminative approaches (such as LDA [13]), on the other hand, are known to yield better results in "clean" conditions [14] owing to their flexible decision boundaries. The optimal decision boundaries are determined using the posterior p(y|x) directly from the data [8] and are consequently more sensitive to outliers. Apart from these traditional approaches, it has been shown recently that unorthodox features such as downsampled images and random projections can serve equally well. In fact, the choice of the feature space may no longer be so critical [15-17]; what really matters is the dimensionality of the feature space and the design of the classifier.

In the paradigm of face recognition, illumination variation is a major robustness issue [3]. Several approaches have been proposed in the literature to tackle the problem. A sophisticated approach in [18] models images of a subject with a fixed pose but different illumination conditions as a convex cone in the space of images. Consequently, a small number of training images of each face taken under various lighting conditions are used for the reconstruction of the shape and albedo of the face. Although the approach has demonstrated some good results, in practice the computation of an exact illumination cone for a given subject is quite expensive and tedious due to a large number of extreme rays. Studies have shown that facial images under varying luminance conditions can be modeled as low-dimensional linear subspaces [19]. Basis images for this purpose may be obtained using a 3D model under diffuse lighting based on spherical harmonics. The arrangement of physical lighting can be harnessed so as to obtain images which can directly be used as basis vectors for the low-dimensional linear space. Another line of action is to normalize/compensate the illumination effect by some kind of preprocessing such as histogram equalization, gamma correction and logarithm transform. However, these elementary global processing techniques are not of much help in the presence of nonuniform illumination variations [20]. Moreover, some recent approaches such as the Line Edge Map (LEM) [21] and Face-ARG matching [22] have shown good tolerance under adverse luminance alterations; the use of geometrical/structural information of the face region justifies the implicit robustness of these approaches.

Apart from the illumination problem, it has also been shown in the literature that traditional face recognition approaches do not cope well in the presence of severe random noise [17,23-26]. Most of the approaches in the literature that are robust to random pixel noise are variants of neural network classification. An important work is presented in [24], incorporating a robust kernel approach in the presence of severe noise. Encouraging results have been shown for two important problems of additive noise (salt and pepper) and multiplicative noise (speckle) compared to traditional SVM approaches. Similarly, in [26] it has been shown that a neural network classifier outperforms the traditional PCA [9], 2DPCA [27], LDA [13] and Laplacianfaces [28] approaches for the case of severe additive Gaussian noise. Apart from these neural network approaches, sparse representation classification (SRC) has recently been presented [17,25]. In the presence of noise (modeled as a uniform random variable) this approach has been shown to outperform the traditional approaches of PCA, ICA I, LNMF and L2+NS; however, other important noise models such as speckle and salt and pepper noise are not addressed.

In this research we propose a robust classification algorithm for the problem of face recognition in the presence of random pixel distortion. Samples from a specific object class are known to lie on a linear subspace [1,13]. In our previous work [15,16] we proposed to develop class-specific models of the registered users, thereby defining the task of face recognition as a problem of linear regression. In the work presented here, we extend our investigations to the problem of noise-contaminated probes, where the inverse problem is solved using a novel application of the robust linear Huber estimation [29,30] and the class label is decided based on the subspace with the most precise estimation. The proposed approach, although simple in architecture, has shown promising results for two critical robustness issues: severe illumination variations and random pixel noise.

Although, in principle, compensation for contaminated/missing pixels can be addressed at the learning or/and classification stages, robust regression for classification has rarely been addressed in the literature [31]. Efforts in this direction mainly focus on the development of robust learning methods which aim to detect outliers in the learning phase and work on potentially uncorrupted learning data, thereby resulting in a high breakdown point. The majority of these robust learning approaches are based on adapting a discriminative model by replacing the classical location and scatter matrix estimators by their robust counterparts such as MVE estimators [32], MCD estimators [33-35] and the projection pursuit approach [36]. Some robust variants of generative methods have been proposed which make use of robust metrics rather than a standard least-squares computation [31,37]. An important implementation for the problem of robust object recognition is presented in [38], which improves the learning process of the traditional eigenspace method by introducing a hypothesize-and-test paradigm. Fundamentally, a subsampling approach is introduced which works on subsets of image points, generating each hypothesis by the robust solution of a set of linear equations. Based on the Minimum Description Length (MDL) principle, competing hypotheses are further selected for the determination of the eigenspace coefficients. Recently, a more sophisticated approach has been proposed in [31] which essentially combines the discriminative and reconstructive models to construct a subspace which characterizes the discrimination power and reconstruction property of the two methodologies simultaneously. An important relevant work is presented in [39], where generative probabilistic image formation models are developed for the inlier and outlier processes; the task of face recognition is thereby formulated as a Maximum-a-Posteriori (MAP) estimation problem, and the approach is applied to the problem of contiguous occlusion. To the best of our knowledge, however, this is the first time that the problem of randomly missing/corrupted pixels has simply been formulated as a linear robust regression task.

The rest of the paper is organized as follows: the fundamental problem of robust estimation is discussed in Section 2, followed by the face recognition problem formulation in Section 3. Section 4 demonstrates the efficacy of the proposed approach for the problem of severely varying illumination, followed by the experiments for random pixel corruption in Section 5. The paper finally concludes in Section 6.

2. The problem of robust estimation

Consider a linear model

$$y = X\beta + e \qquad (1)$$

where $y \in \mathbb{R}^{q\times 1}$ is the dependent or response variable, $X \in \mathbb{R}^{q\times p}$ the regressor or predictor variable, $\beta \in \mathbb{R}^{p\times 1}$ the vector of parameters and $e \in \mathbb{R}^{q\times 1}$ the error term. The problem of robust estimation is to


estimate the vector of parameters $\beta$ so as to minimize the residual

$$r = y - \hat{y}, \qquad \hat{y} = X\beta \qquad (2)$$

$\hat{y}$ being the predicted response variable. In classical statistics the error term $e$ is conventionally taken as zero-mean Gaussian noise [40]. A traditional method to optimize the regression is to minimize the least squares (LS) problem

$$\arg\min_{\beta} \sum_{j=1}^{q} r_j^2(\beta) \qquad (3)$$

where $r_j(\beta)$ is the jth component of the residual vector $r$. However, in the presence of outliers, least squares estimation is inefficient and can be biased. Although it has been claimed that classical statistical methods are robust, they are only robust in the sense of type I error. A type I error corresponds to the rejection of the null hypothesis when it is in fact true. It is straightforward to note that the type I error rate of classical approaches in the presence of outliers tends to be lower than the nominal value; this is often referred to as the conservatism of classical statistics. Due to contaminated data, however, the type II error increases drastically. A type II error occurs when the null hypothesis is not rejected when it is in fact false; this drawback is often referred to as the inadmissibility of the classical approaches. Additionally, classical statistical methods are known to perform well under the homoskedastic data model. In many real scenarios, however, this assumption does not hold and heteroskedasticity must be accommodated, thereby emphasizing the need for robust estimation.

Several approaches to robust estimation have been proposed, such as R-estimators and L-estimators. However, M-estimators have shown superiority due to their generality and high breakdown point [29,40]. Primarily, M-estimators are based on minimizing a function of the residuals

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{p}} \left\{ F(\beta) \equiv \sum_{j=1}^{q} \rho\bigl(r_j(\beta)\bigr) \right\} \qquad (4)$$

where $\rho(r)$ is a symmetric function with a unique minimum at zero [29,30]:

$$\rho(r) = \begin{cases} \dfrac{1}{2\gamma}\, r^2 & \text{for } |r| \le \gamma \\[6pt] |r| - \dfrac{\gamma}{2} & \text{for } |r| > \gamma \end{cases} \qquad (5)$$

$\gamma$ being a tuning constant called the Huber threshold. Many algorithms have been developed for calculating the Huber M-estimate in Eq. (4); some of the most efficient are based on Newton's method [41]. M-estimators have been found to be robust and statistically efficient compared to classical methods [42-44]. Although robust methods, in general, are superior to their classical counterparts, they have rarely been addressed in applied fields [31,40]. Several reasons have been discussed in [40] for this paradox; the computational expense of robust methods has been a major hindrance [42]. However, with recent developments in computational power, this reason has become insignificant. The reluctance to use robust regression methods may also be credited to the belief of many statisticians that classical methods are robust.
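The Huber M-estimate of Eq. (4) can also be minimized by iteratively reweighted least squares (IRLS), a common alternative to the Newton-type solvers cited above. The following is a minimal NumPy sketch, not the paper's implementation; the function names, the IRLS scheme and the default value of the threshold $\gamma$ are our own choices:

```python
import numpy as np

def huber_rho(r, gamma):
    """Huber loss of Eq. (5): quadratic for |r| <= gamma, linear beyond."""
    a = np.abs(r)
    return np.where(a <= gamma, r**2 / (2 * gamma), a - gamma / 2)

def huber_fit(X, y, gamma=1.0, n_iter=100):
    """Minimise sum_j rho(r_j(beta)) by IRLS.

    Each residual receives weight min(1, gamma/|r_j|): full weight in the
    quadratic region of the loss, downweighted in the linear (outlier) region.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        a = np.maximum(np.abs(r), 1e-12)          # avoid division by zero
        sw = np.sqrt(np.minimum(1.0, gamma / a))  # sqrt of IRLS weights
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta
```

With a fraction of grossly corrupted responses, the Huber fit stays close to the true parameters while ordinary least squares is pulled toward the contamination, which is the behavior Eq. (5) is designed to deliver.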

3. Robust Linear Regression Classification (RLRC) for robust face recognition

Consider $N$ distinguished classes with $p_i$ training images from the $i$th class such that $i = 1,2,\ldots,N$. Each grayscale training image is of order $a\times b$ and is represented as $u_i^{(m)} \in \mathbb{R}^{a\times b}$, $i = 1,2,\ldots,N$ and $m = 1,2,\ldots,p_i$. Each gallery image is downsampled to an order $c\times d$ and transformed to a vector through column concatenation such that $u_i^{(m)} \in \mathbb{R}^{a\times b} \rightarrow w_i^{(m)} \in \mathbb{R}^{q\times 1}$, where $q = cd$, $cd \ll ab$. Each image vector is normalized so that the maximum pixel value is 1. Using the concept that patterns from the same class lie on a linear subspace [1], we develop a class-specific model $X_i$ by stacking the $q$-dimensional image vectors,

$$X_i = \left[ w_i^{(1)} \; w_i^{(2)} \; \cdots \; w_i^{(p_i)} \right] \in \mathbb{R}^{q\times p_i}, \quad i = 1,2,\ldots,N \qquad (6)$$

Each vector $w_i^{(m)}$, $m = 1,2,\ldots,p_i$, spans a subspace of $\mathbb{R}^q$, also called the column space of $X_i$. Therefore, at the training level each class $i$ is represented by a vector subspace $X_i$, which is also called the regressor or predictor for class $i$. Let $z$ be an unlabeled test image; our problem is to classify $z$ as one of the classes $i = 1,2,\ldots,N$. We transform and normalize the grayscale image $z$ to an image vector $y \in \mathbb{R}^{q\times 1}$ as discussed for the gallery. If $y$ belongs to the $i$th class it should be represented as a linear combination of the training images from the same class (lying in the same subspace), i.e.

$$y = X_i \beta_i + e, \quad i = 1,2,\ldots,N \qquad (7)$$

where $\beta_i \in \mathbb{R}^{p_i\times 1}$. From the perspective of face recognition, the training of the system corresponds to the development of the explanatory variable $X_i$, which is normally done in a controlled environment; therefore the explanatory variable can safely be regarded as noise free. The issue of robustness comes into play when a given test pattern is contaminated with noise, which may arise due to luminance, malfunctioning of the sensor, channel noise, etc. Given that $q \ge p_i$, the system of equations in Eq. (7) is well-conditioned and $\beta_i$ is estimated using robust Huber estimation as discussed in Section 2 [30]:

$$\hat{\beta}_i = \arg\min_{\beta_i \in \mathbb{R}^{p_i}} \left\{ F(\beta_i) \equiv \sum_{j=1}^{q} \rho\bigl(r_j(\beta_i)\bigr) \right\}, \quad i = 1,2,\ldots,N \qquad (8)$$

where $r_j(\beta_i)$ is the jth component of the residual

$$r(\beta_i) = y - X_i \beta_i, \quad i = 1,2,\ldots,N \qquad (9)$$

The estimated vector of parameters $\hat{\beta}_i$, along with the predictor $X_i$, is used to predict the response vector for each class $i$:

$$\hat{y}_i = X_i \hat{\beta}_i, \quad i = 1,2,\ldots,N \qquad (10)$$

We now calculate the distance measure between the predicted response vector $\hat{y}_i$, $i = 1,2,\ldots,N$, and the original response vector $y$,

$$d_i(y) = \| y - \hat{y}_i \|_2, \quad i = 1,2,\ldots,N \qquad (11)$$

and rule in favor of the class with the minimum distance, i.e.

$$\min_i \; d_i(y), \quad i = 1,2,\ldots,N \qquad (12)$$

4. Case study: face recognition in presence of severe illumination variations

The proposed RLRC algorithm is extensively evaluated on various databases incorporating several modes of luminance variations. In particular we address three standard databases, namely the Yale Face Database B [18], the CMU-PIE database [45] and the AR database [46]. For all experiments, images are histogram equalized and transformed to the logarithm domain.
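The two preprocessing steps named above can be sketched as follows; this is our illustrative implementation (the paper does not give one), assuming 8-bit grayscale inputs:

```python
import numpy as np

def hist_equalize(img):
    """Histogram equalization: map uint8 intensities through the empirical CDF."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min() + 1e-12)
    return (cdf[img] * 255).astype(np.uint8)

def log_transform(img):
    """Logarithm-domain transform: compress dynamic range via log(1+x),
    rescaled so the output lies in [0, 1]."""
    x = np.log1p(img.astype(np.float64))
    return x / np.log1p(255.0)
```

On a low-contrast image, equalization stretches the occupied intensity range, while the logarithm transform compresses large luminance differences; both mitigate (but, as noted in Section 1, do not fully remove) nonuniform illumination effects.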

4.1. Yale Face Database B

Yale Face Database B consists of 10 individuals with 9 poses incorporating 64 different illumination alterations for each


Fig. 1. Yale Face Database B: starting from top, each row represents typical images from subsets 3, 4 and 5, respectively. Note that subset 5 (third row) characterizes the worst illumination variations.

Table 1. Details of the subsets for Yale Face Database B with respect to light source directions.

Subsets                    1      2      3      4      5
Lighting angle (degrees)   0-12   13-25  26-50  51-77  >77
Number of images           70     120    120    140    190

Table 2. Recognition results for Yale Face Database B.

Methods                                 Subset 3 (%)  Subset 4 (%)  Subset 5 (%)
No normalization [20]                   89.20         48.60         22.60
Histogram equalization [20]             90.80         45.80         58.90
Linear subspace [18]                    100.00        85.00         N/A
Cones-attached [18]                     100.00        91.40         N/A
Cones-cast [18]                         100.00        100.00        N/A
Gradient Angle [47]                     100.00        98.60         N/A
Harmonic images [48]                    99.70         96.90         N/A
Illumination Ratio Images [49]          96.70         81.40         N/A
Quotient Illumination Relighting [50]   100.00        90.60         82.50
9PL [51]                                100.00        97.20         N/A
Method in [20]                          100.00        99.82         98.29
RLRC                                    100.00        100.00        100.00


pose [18]. The database has been used by researchers as a test-bed for the evaluation of robust face recognition algorithms. Since we are concerned with the illumination-tolerant face recognition problem, only the frontal images of the subjects are considered. The images are divided into five subsets with respect to the angle between the light source direction and the camera axis (see Fig. 1); refer to Table 1. Interested readers may also refer to [18] for further details of the database. All images are downsampled to an order of 50×50.

We follow the evaluation protocol as reported in [18,20,47-51]. Training is conducted using subset 1 and the system is validated on the remaining subsets. A detailed comparison of the results with some recent approaches is shown in Table 2; all results are as reported in [20]. Note that the error rates have been converted to recognition success rates. Since subset 3 incorporates moderate luminance variations, most of the state-of-the-art algorithms report error-free recognition, as shown in Table 2. For subset 4, with more adverse illumination variations, the proposed algorithm achieves 100% recognition, which is either better than or comparable to all the results reported in the literature. In particular, the proposed approach outperforms the Cones-attached, Illumination Ratio Images and Quotient Illumination Relighting methods by 8.60%, 18.60% and 9.40%, respectively. It is also found to be fairly comparable to the recent Cones-cast and Gradient Angle approaches. Subset 5 represents the worst-case scenario, with the angle between the light source direction and the camera axis being greater than 77°. The proposed RLRC algorithm consistently achieves 100% recognition for these severe alterations, comparing favorably with all the reported results in the literature and beating the Quotient Illumination Relighting method by more than 17%. Noteworthy is the fact that results for this subset are not available in the literature for most of the contemporary approaches.

4.2. CMU-PIE face database

4.2.1. Evaluation Protocol 1 (EP 1)

Extensive experiments were conducted on the CMU-PIE database [45]. We follow the evaluation protocol as proposed in [52], randomly selecting a subset of the database consisting of 65 subjects with 21 illumination variations per subject; all images are resized to an order 50×50. Fig. 2 represents the 21 different alterations for a typical subject, each image being accordingly labeled. We follow two experimental setups as proposed in [52]. In the first set of experiments the system is trained using images with near-frontal lighting and validation is conducted across the whole database. A detailed comparison of the performance with state-of-the-art approaches is depicted in Table 3. The proposed RLRC algorithm is found to be closely comparable with the recent approaches of MACE filters and Corefaces; it also comprehensively outperforms the IPCA, 3D linear subspace and Fisherfaces methods for various case-studies of training sessions. For instance, with training images labeled 7, 10 and 19, the proposed RLRC algorithm achieves 99.93% recognition, which is 63.83%, 49.03% and 26.63% better than the IPCA, 3D linear subspace and Fisherfaces methods, respectively.

For the second set of experiments, training is conducted on images captured under extreme lighting conditions; the system is again validated across the whole database. The proposed RLRC algorithm is found to be comparable with the latest approaches as


Fig. 2. The 21 different illumination variations for a typical subject from the CMU-PIE database. These images were captured without any ambient lighting, thereby demonstrating more severe luminance alterations.

Table 3. Performance comparison with state-of-the-art algorithms characterizing training images captured from near-frontal lighting. All results are as reported in [52].

Training images                    IPCA (%)  3D linear subspace (%)  Fisherfaces (%)  MACE filters (%)  Corefaces (%)  RLRC (%)
5, 6, 7, 8, 9, 10, 11, 18, 19, 20  97.60     97.30                   97.30            100.00            100.00         100.00
5, 6, 7, 8, 9, 10                  91.40     97.10                   89.30            99.90             100.00         99.41
5, 7, 9, 10                        72.40     93.20                   71.40            99.90             99.90          99.85
7, 10, 19                          36.10     50.90                   73.30            99.10             99.90          99.93
8, 9, 10                           78.00     97.80                   82.10            99.90             99.90          99.41
18, 19, 20                         91.00     98.40                   94.20            99.90             100.00         100.00

Table 4. Performance comparison with state-of-the-art algorithms characterizing training images with severe lighting conditions. All results are as reported in [52].

Training images  IPCA (%)  3D linear subspace (%)  Fisherfaces (%)  MACE filters (%)  Corefaces (%)  RLRC (%)
3, 7, 16         95.90     99.90                   99.90            100.00            100.00         100.00
1, 10, 16        90.70     99.90                   99.90            100.00            100.00         100.00
2, 7, 16         88.57     99.85                   100.00           100.00            100.00         100.00
4, 7, 13         91.40     98.90                   99.10            100.00            100.00         100.00
3, 10, 16        91.70     100.00                  99.90            100.00            100.00         100.00
3, 16            44.30     N/A                     49.90            99.90             99.90          99.93

Fig. 3. Performance curves (recognition accuracy over the leave-one-out experiments) for the CMU-PIE database under EP 2, comparing the S-QI and RLRC methods.


shown in Table 4. The only erroneous recognition trial was for the case with the training images labeled 3 and 16. The error may be attributed to the fact that the system was trained using only two images, which accounts for only a couple of regressor or predictor observations for each class in the context of the RLRC algorithm. Apart from this insufficient information, it has to be noted that images 3 and 16 (Fig. 2) have adverse luminance conditions.

4.2.2. Evaluation Protocol 2 (EP 2)

Under Evaluation Protocol 2 (EP 2) we follow the leave-one-out strategy on the 68 subjects of the CMU-PIE database as proposed in a recent work on the generalized quotient image [53]. A detailed comparison with the best results in [53] is shown in Fig. 3. The proposed RLRC approach consistently attained high recognition accuracy for all leave-one-out experiments. In particular, apart from one recognition trial, we attained an error-free performance index with 100% recognition accuracy. The only error was reported for the seventh leave-one-out experiment, where we achieved a recognition rate of 98.53%. It is appropriate to point out that the performance curve for the S-QI method in Fig. 3 is an approximation to the curve shown in [53].


Fig. 4. Various luminance variations for a typical subject of the AR database; the two rows represent two different sessions.

Table 5. Results for the AR database under EP 1.

Method  Recognition accuracy (%)
LPP     65.25
DLPP    96.89
RLRC    95.76

Table 6. Results for the AR database under EP 2.

Method             Recognition accuracy (%)
PCA                25.90
PCA+HE             37.70
PCA+BHE            71.30
PCA+2D face model  81.80
RLRC               94.49

Table 7. Results for the AR database under EP 3.

Method         Left-light (%)  Right-light (%)  Both-lights (%)
1-NN [22]      22.20           17.80            3.70
PCA [22]       7.40            7.40             2.20
LEM [21]       92.90           91.10            74.10
Face-ARG [22]  98.50           96.30            91.10
RLRC           96.30           94.07            94.07

Fig. 5. ROC curves (verification rate vs. false accept rate) for the FERET database, comparing RLRC with the usc mar 97, umd mar 97 and ef hist dev m12 baselines.


4.3. AR database

The AR face database contains over 4000 color images taken in two sessions separated by two weeks [46]. The database characterizes various deviations from the ideal conditions, including facial expressions, luminance conditions and occlusion modes. In particular, there are three lighting modes: left light on, right light on and both lights on. Fig. 4 represents these variations for the two sessions.

4.3.1. Evaluation Protocol 1 (EP 1)

We follow the evaluation protocol as proposed in [54]; a subset of the database consisting of 118 randomly selected individuals is used. Training is performed on images with nominal lighting conditions (Fig. 4(a) and (e)) while validation is conducted on images with adverse ambient lighting (Fig. 4(b), (c), (d), (f), (g) and (h)). Therefore, altogether we have 236 (118×2) gallery images and 708 (118×6) probes.

All images are downsampled to an order of 180×180. The results are detailed in Table 5; the proposed RLRC algorithm outperforms the Locality Preserving Projections (LPP) method by a margin of 30.51% and is quite comparable to the Discriminant Locality Preserving Projections (DLPP) method. All results are as reported in [54].


4.3.2. Evaluation Protocol 2 (EP 2)

Under EP 2 we follow the experimental setup as proposed in [55]. We now have a subset of 121 subjects; training is done on Fig. 4(a) and the system is validated for the adverse luminance variations of the same session, i.e. Fig. 4(b), (c) and (d). Therefore we have 121 gallery images and 363 (121×3) probes. The results are tabulated in Table 6; all results are as reported in [55].

Fig. 6. First row illustrates some gallery images from subsets 1 and 2, while the second row shows some probes from subset 3.

Fig. 7. Probe images corrupted with (a) 20%, (b) 40%, (c) 60% and (d) 80% dead pixels.
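The corrupted probes of the experiments below (dead pixels as in Fig. 7, salt and pepper noise as in Fig. 9) follow standard noise models. A minimal sketch of how such probes can be generated (our code, assuming probe images normalized to [0, 1] as in Section 3):

```python
import numpy as np

def dead_pixels(img, density, rng):
    """Dead-pixel model: set a random fraction `density` of pixels to zero."""
    out = img.copy()
    mask = rng.random(img.shape) < density
    out[mask] = 0.0
    return out

def salt_and_pepper(img, density, rng):
    """Salt and pepper model: replace a fraction `density` of pixels
    with 0 (pepper) or 1 (salt), equiprobably."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < density / 2] = 0.0
    out[(u >= density / 2) & (u < density)] = 1.0
    return out
```

Both corruptions produce gross, randomly located outliers rather than small perturbations of every pixel, which is precisely the regime in which the Huber loss of Eq. (5) is expected to help.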

Fig. 8. Recognition accuracy of various approaches (PCA+NN, ICA I+NN, RLRC, M-PCA, PCA+L1, M-RLRC) for a range of dead-pixel noise densities.

Table 8. Verification results for dead-pixel noise.

20% noise density                 40% noise density
Approach  EER (%)  Verif. (%)     Approach  EER (%)  Verif. (%)
PCA       0.70     99.78          PCA       1.00     97.81
ICA I     0.60     100            ICA I     1.50     96.93
RLRC      0.20     100            RLRC      0.20     100

The proposed RLRC algorithm achieves a high recognition accuracy of 94.49%, outperforming the latest 2D face model approach by a margin of 12.69%.

4.3.3. Evaluation Protocol 3 (EP 3)

Under Evaluation Protocol 3 (EP 3) we follow the experimental setup as proposed in the recent works of Face-ARG matching [22] and the Line Edge Map (LEM) [21]. These approaches use the geometric quantities and structural information of the human face and have therefore been shown to be robust to severe illumination variations. We select a subset of the AR database consisting

Table 8 (continued).

60% noise density                 80% noise density
Approach  EER (%)  Verif. (%)     Approach  EER (%)  Verif. (%)
PCA       1.80     94.30          PCA       5.80     75.44
ICA I     3.70     92.11          ICA I     11.02    51.54
RLRC      0.20     100            RLRC      0.40     99.78

Fig. 9. Probes with (a) 20%, (b) 40%, (c) 70% and (d) 90% salt and pepper noise density.

Fig. 10. Recognition accuracy curves (PCA+NN, ICA I+NN, RLRC, M-PCA, PCA+L1, M-RLRC) in the presence of varying density of salt and pepper noise.

Fig. 11. Probe images corrupted with speckle noise of variance (a) 4, (b) 6, (c) 8 and (d) 10.


of 135 subjects. The system is trained using Fig. 4(a) while Fig. 4(b)-(d) serve as probes; altogether we have 135 gallery images and 405 (135×3) probes. The results are tabulated in Table 7; noteworthy is the fact that the results in [21] are reported for 112 subjects.


Fig. 12. Dead-pixel noise: (a)–(d) elaborate rank-recognition profiles while (e)–(h) sh

densities, respectively.

The proposed RLRC approach shows a consistent performanceacross all illumination modes of the AR database. For the cases of‘‘left light on’’ and ‘‘right light on’’, recognition accuracies of96.30% and 94.07% are achieved which are fairly comparable tothe latest LEM and Face-ARG approaches as shown in Table 7. For

1 2 3 4 5 6 7 8 9 100.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

Rank

Rec

ogni

tion

Rat

e

PCA+NNICA 1+NNRLRC

1 2 3 4 5 6 7 8 9 100.5

0.550.6

0.650.7

0.750.8

0.850.9

0.951

Rank

Rec

ogni

tion

Rat

ePCA+NNICA I+NNRLRC

0.00

4

0.00

6

0.00

8

0.01

0.01

2

0.01

4

0.01

6

0.01

8

0.02

0.80.820.840.860.88

0.90.920.940.960.98

1

False Accept Rate

Verif

icat

ion

Rat

e

PCA+NNICA I+NNRLRC

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09 0.1

0.11

0.12

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

False AcceptRate

Verif

icat

ion

Rat

e

PCA+NN+ICA I+NNRLRC

ow the Receiver Operating Characteristics (ROC) for 20%, 40%, 60% and 80% noise


the most challenging problem of illumination with "both lights on", the proposed RLRC approach attains 94.07% recognition, which is favorably comparable with the Face-ARG approach and outperforms

the LEM approach by a margin of approximately 20%. The conventional methods of PCA and 1-NN reported in [22] are out of the discussion, as they lag far behind these latest approaches.

Fig. 13. Salt and pepper noise: (a)–(d) represent rank-recognition curves while (e)–(h) show Receiver Operating Characteristics (ROC) for 50%, 60%, 70% and 80% noise densities, respectively.


4.4. FERET database

The FERET database is arguably one of the largest publicly available face databases, with two versions [56]: the gray FERET database and the color FERET database. The database addresses several challenging issues such as expression variations, pose alterations and the aging factor. For the case of varying illumination there is only one evaluation protocol, recognized as "fafc", within the framework of the gray FERET database. The methodology utilizes only one gallery image for each of the 1196 subjects; the gallery size is therefore 1196. The probe set consists of 194 images; refer to [56] for further details on the FERET evaluation methodology. It is worth noting that recognizing a person from a single gallery image is itself an independent, arduous issue within the paradigm of face recognition [57], and as such is not the focus of the presented research. However, to evaluate the efficacy of the proposed algorithm with a single gallery image per subject, we conducted extensive experiments as shown in Fig. 5. The proposed RLRC algorithm outperformed 13 of the 14 reported algorithms, lagging only the algorithm tagged as USC MAR 97 in [56]. Fig. 5 illustrates the receiver operating characteristics for the three best reported algorithms (in the sense of recognition accuracy). The proposed RLRC algorithm achieves a verification accuracy of 70.10% at 0.001 FAR, which lags by 9.28% compared to the best result reported for USC MAR 97; the proposed RLRC algorithm, however, comprehensively outperforms the other 13 algorithms, beating UMD MAR 97 and EF HIST DEV M12 by margins of approximately 30% and 50% (at 0.001 FAR), respectively. It has to be noted that for higher values of FAR the proposed RLRC algorithm is better than the best reported method. In particular, the RLRC algorithm achieves better performance from 0.017 FAR onwards, with a good equal error rate of approximately 0.03, which is better than the 0.05 reported for the USC MAR 97 method [56].

Fig. 14. Recognition accuracy of various approaches in the presence of speckle noise for different variances (curves: PCA+NN, ICA I+NN, RLRC, M-PCA, PCA+L1, M-RLRC). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

5. Case study: face recognition in the presence of random pixel noise

Extensive experiments were carried out using the Extended Yale B database [18,19]. The database consists of 2414 frontal-face images of 38 subjects under various lighting conditions. Subsets 1 and 2, consisting of 719 images under normal-to-moderate lighting conditions, were used as the gallery. Subset 3, consisting of 456 images under severe luminance alterations, was designated as probes. Sample gallery and probe images are shown in Fig. 6. This choice of training and testing images is made specifically to isolate the effect of noise.

The proposed approach was validated for a range of exemplary noise models specific to image data. For all experiments the locations of the noisy pixels are unknown to the algorithm. Fig. 7 shows probe images corrupted with various degrees of dead-pixel noise.
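As a concrete illustration, dead-pixel corruption of the kind shown in Fig. 7 can be simulated by zeroing a random fraction of pixels. A minimal sketch (the helper name and the flat stand-in image are our own, not from the paper):

```python
import numpy as np

def add_dead_pixels(img, density, seed=None):
    """Zero out a random fraction `density` of pixels (dead pixels).
    The corrupted locations are not revealed to the classifier,
    matching the experimental protocol."""
    rng = np.random.default_rng(seed)
    out = img.astype(float).copy()
    mask = rng.random(img.shape) < density   # each pixel dies independently
    out[mask] = 0.0
    return out

probe = np.full((48, 42), 128.0)             # flat stand-in "face" image
corrupted = add_dead_pixels(probe, 0.90, seed=0)
```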

The proposed Robust Linear Regression Classification approach comprehensively outperforms the benchmark reconstructive algorithms of PCA [9] and ICA I [10], showing a high breakdown point, as shown in Fig. 8. Even for the worst-case scenario of 90%

Table 9. Verification results for salt and pepper noise.

50% noise density                  60% noise density
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        2.12      95.61         PCA        4.10      87.28
ICA I      2.16      93.86         ICA I      4.25      83.99
RLRC       0.21      99.78         RLRC       2.18      97.37

corrupted pixels, the proposed approach achieves 94.52% recognition accuracy, outperforming PCA and ICA I by 49.13% and 75.44%, respectively. In the presence of outliers, the L1-norm computation is reported to be more efficient than the usual Euclidean distance measure [42,43]. Also, median filtering has been shown in the literature to improve image understanding in the presence of noise [5]. Therefore, for an appropriate indication of the performance of the proposed RLRC algorithm, we also conducted experiments using median filtering and L1-norm calculations for PCA. Note that median preprocessing is indicated by the letter "M" before the corresponding approach. The performance curves are shown in Fig. 8; the proposed RLRC algorithm consistently outperforms these robust variants of the benchmark approaches.

In particular, for the case of 90% corruption we note a major performance difference, with the RLRC algorithm outperforming the L1-norm and M-PCA methods by 89.04% and 91.67%, respectively. Noteworthy is the fact that standard preprocessing and robust calculations are of no use in such severe noise conditions.

For a comprehensive comparison with the benchmark generative approaches, extensive verification experiments were conducted; the similarity scores were normalized to the scale [0, 1]. Results are detailed in Fig. 12 and Table 8. Rank-recognition profiles for various degrees of dead-pixel contamination show an excellent performance index for the proposed RLRC approach. With increasing noise intensity the proposed approach shows good tolerance compared to PCA and ICA I.

Specifically, with 80% noise density the proposed RLRC approach achieves an excellent rank-1 recognition accuracy of 99.78%, comprehensively beating the PCA and ICA I rank-1 recognition results by 28.29% and 48.68%, respectively. The RLRC approach achieves an excellent equal error rate (EER) of 0.40%, as compared to 5.80% and 11.02% for PCA and ICA I, respectively. A verification rate of 99.78% at a typical 0.01 FAR, as indicated in Table 8, also comprehensively outperforms the benchmark approaches.
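The robust estimation at the heart of RLRC can be sketched in a few lines of numpy: each class gallery X yields a Huber M-estimate of the regression vector, here via iteratively reweighted least squares (IRLS), and the decision is ruled in favor of the class with minimum reconstruction error. This is an illustrative sketch only; the paper's solver follows the finite Huber algorithm of Madsen and Nielsen [30,41], and the function names and synthetic data below are our own:

```python
import numpy as np

def huber_irls(X, y, c=1.345, iters=50):
    """Huber M-estimate of beta in y ~ X @ beta via iteratively
    reweighted least squares; c is the standard Huber tuning constant."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]                   # LS start
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12  # MAD scale
        w = np.minimum(1.0, c * s / np.maximum(np.abs(r), 1e-12)) # Huber weights
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)                # weighted LS
    return beta

def rlrc_classify(y, galleries):
    """Rule in favor of the class whose robust fit gives the
    minimum reconstruction error (cf. the paper's decision rule)."""
    errors = [np.linalg.norm(y - X @ huber_irls(X, y)) for X in galleries]
    return int(np.argmin(errors))

rng = np.random.default_rng(0)
X0 = rng.standard_normal((200, 3))          # gallery of class 0
X1 = rng.standard_normal((200, 3))          # gallery of class 1
beta_true = np.array([2.0, -1.0, 0.5])

# Robustness: 20% of the probe's entries grossly corrupted,
# yet the Huber fit still recovers the regression vector.
y = X1 @ beta_true
y[:40] = 100.0
beta_hat = huber_irls(X1, y)

# Classification of a clean probe generated from class 1.
pred = rlrc_classify(X1 @ beta_true, [X0, X1])
```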

Table 9 (continued). Verification results for salt and pepper noise.

70% noise density                  80% noise density
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        6.36      75.22         PCA        15.13     53.07
ICA I      8.83      62.28         ICA I      17.24     42.54
RLRC       5.48      85.75         RLRC       19.30     39.91


In the next set of experiments we contaminate the probe images with data drop-out and snow in the image simultaneously, commonly referred to as salt and pepper noise [58]. This fat-tail

distributed noise, also called impulsive noise or spike noise [58], can be caused by analog-to-digital converter errors and by bit errors in transmission [59,60]. Fig. 9 reflects probes distorted with various

Fig. 15. Speckle noise: (a)–(d) represent rank-recognition curves and (e)–(h) show receiver operating characteristics for noise densities with variances 2, 4, 6, and 8, respectively.


degrees of salt and pepper noise. In the overall sense, the proposed RLRC approach compares favorably with the benchmark reconstructive approaches, as depicted in Fig. 10.

At a noise density of 70%, for instance, the RLRC algorithm gives 9.21% and 18.64% better recognition accuracy than PCA and ICA I, respectively. However, under high noise densities of 80% and 90%, PCA seems to be better than either ICA I or RLRC. It should be noted that under such severe salt and pepper noise, although PCA gives a better comparative performance index, the recognition accuracy achieved (e.g., 48.25% at 80% noise density) is itself far from satisfactory. For salt and pepper noise we note that median preprocessing results in a significant improvement for both the PCA and RLRC approaches. For instance, at 70% noise density M-PCA achieves 93.86% recognition, which is 22.15% better than plain PCA. Similarly, M-RLRC shows an improvement of almost 20% compared to plain RLRC, achieving a maximum recognition of 100%. The L1-norm calculation is of little benefit, as M-RLRC consistently performs better than all competing approaches.
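The "M-" preprocessing discussed above is plain median filtering applied before classification. A small sketch of the corruption model and the filtering step (illustrative helpers; the paper does not specify the filter window, so the 3x3 window here is an assumption):

```python
import numpy as np
from scipy.ndimage import median_filter

def salt_and_pepper(img, density, seed=None):
    """Corrupt a fraction `density` of pixels, each flipped to either
    0 (pepper / data drop-out) or 255 (salt / 'snow')."""
    rng = np.random.default_rng(seed)
    out = img.astype(float).copy()
    mask = rng.random(img.shape) < density
    out[mask] = rng.choice([0.0, 255.0], size=int(mask.sum()))
    return out

probe = np.full((48, 42), 128.0)              # flat stand-in image
noisy = salt_and_pepper(probe, 0.70, seed=0)
denoised = median_filter(noisy, size=3)       # the 'M-' preprocessing step

mae_noisy = float(np.abs(noisy - probe).mean())
mae_denoised = float(np.abs(denoised - probe).mean())
```

Even at 70% density the median filter removes a large share of the impulsive error, which is consistent with the roughly 20% accuracy gains reported above for M-PCA and M-RLRC.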

Verification experiments were also conducted for salt and pepper noise; results are shown in Fig. 13 and Table 9. For noise densities up to 70%, the proposed RLRC algorithm shows a better performance index, both in recognition and verification, compared to PCA and ICA I. For instance, at a 60% contamination level RLRC achieves a high rank-1 recognition accuracy of 94.52%, outperforming PCA and ICA I by differences of 12.50% and 16.45%, respectively (refer to Fig. 13(b)). A low EER of 2.18% for the proposed RLRC approach also compares favorably with the 4.10% and 4.25% of PCA and ICA I, respectively. Note the major performance difference in the receiver operating characteristics in Fig. 13(f). An excellent verification rate of 97.37% at the standard 0.01 FAR outstandingly outperforms the benchmark approaches (refer to Table 9). However, for noise densities greater than 70% the verification results for PCA are better than those of either the RLRC or ICA I approaches. For instance, at 80% noise contamination PCA achieves a verification rate of 53.07% at a typical 0.01 FAR, which is better than ICA I and RLRC by 12.53% and 13.16%, respectively; the EER performance of PCA is also superior to both approaches, as shown in Table 9. The superior performance of PCA under severe noise density is, however, undone by the fact that it is unable to reach satisfactory performance in the absolute sense, as a 53.07% success rate is not reliable by any standard. For low to moderate salt and pepper noise, the proposed RLRC remains the best choice.

Speckle noise is regarded as a major interference in digital imaging and therefore forms another important robustness issue. The proposed approach was extensively evaluated by adding multiplicative speckle noise of varying intensity to the probes, as shown in Fig. 11. Speckle noise is efficiently modeled as a zero-mean uniform random variable.
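Following that model, the speckle corruption can be sketched as J = I + I·n, with n a zero-mean uniform variable of the stated variance (the common imnoise-style convention; the helper below is our own illustration):

```python
import numpy as np

def speckle(img, variance, seed=None):
    """Multiplicative speckle noise: J = I + I * n, where n is
    zero-mean uniform with the requested variance."""
    rng = np.random.default_rng(seed)
    half = np.sqrt(3.0 * variance)        # uniform(-h, h) has variance h**2 / 3
    n = rng.uniform(-half, half, size=img.shape)
    return img * (1.0 + n)

probe = np.full((256, 256), 0.5)          # stand-in image on a [0, 1] scale
noisy = speckle(probe, 6.0, seed=0)
sample_var = float((noisy / 0.5 - 1.0).var())   # empirical variance of n
```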

The proposed RLRC approach showed a good performance index, as shown in Fig. 14, consistently achieving a high recognition accuracy over a wide range of error variances. The effect of speckle noise with a variance of 6 is shown in Fig. 11(b); the image is badly distorted and traditional reconstructive approaches fail to produce competitive results. The proposed RLRC approach attains a high

Table 10. Verification results for speckle noise.

Noise variance = 2                 Noise variance = 4
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        3.20      87.94         PCA        4.09      81.80
ICA I      4.60      86.84         ICA I      5.57      77.63
RLRC       0.52      99.34         RLRC       1.00      98.90

recognition accuracy of 91.67%, outperforming the PCA and ICA I approaches by 16.67% and 23.69%, respectively. Noteworthy are the tolerance and consistency of the proposed approach on highly corrupted data. The best recognition results are reported for the RLRC approach with median filtering (M-RLRC, indicated by the red dashed line in Fig. 14). For instance, for the worst case of speckle noise with variance 10, the M-RLRC approach achieves a high recognition success of 97.37%, comprehensively outperforming all competing approaches, the best competitor being RLRC without any preprocessing at 86.40% (solid blue line in Fig. 14). The median filtering variants again show a significant improvement of approximately 12% over the unprocessed computations. Note that the L1-norm robust calculations are of not much help in such adverse conditions.

Results for the verification experiments are shown in Fig. 15 and Table 10. In particular, in the presence of noise with variance 8, the RLRC approach achieves a high rank-1 recognition accuracy of 89.04%, outperforming PCA and ICA I by margins of 19.52% and 22.15%, respectively. At a typical 0.01 FAR the proposed RLRC reaches a verification rate of 94.74% with an EER of only 3.07%, substantially outperforming both contesting approaches (refer to Fig. 15(d) and (h)).

Since all imaging systems acquire images by counting photons, detector noise (modeled as Additive White Gaussian Noise, AWGN) is always an important case-study in the context of robustness [61]. The probes were distorted by adding zero-mean Gaussian noise with a wide range of error variances, as shown in Fig. 18. Since classical statistical methods are known to be efficient in the presence of Gaussian noise, we also conducted experiments by solving Eq. (7) using the least squares (LS) approach. To harness redundant measurements in the presence of noise, all experiments for LS and RLRC were conducted in the original image space.
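The LS baseline amounts to solving the same linear model with an ordinary least-squares fit. A small sketch of both solvers on synthetic AWGN data (illustrative only; the data sizes are our own, and SciPy's generic `least_squares` with a Huber loss stands in for the paper's solver [30,41]):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))                    # stand-in gallery matrix
beta_true = np.array([0.5, 0.3, 0.2])
y = X @ beta_true + rng.normal(0.0, 0.1, size=500)   # mild zero-mean AWGN

# Classical least-squares solve of the linear model (the LS baseline)
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]

# Huber-loss robust fit of the same model
res = least_squares(lambda b: X @ b - y, x0=beta_ls, loss="huber", f_scale=0.1)
beta_huber = res.x
```

For mild Gaussian noise the two estimates nearly coincide, matching the observation below that LS is comparable to RLRC at low variance; the robust fit only pays off once the noise becomes severe.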

The results shown in Fig. 17 reflect the superiority of the proposed method. The RLRC approach consistently outperformed all other approaches over a wide range of error variances. In particular, with an error variance of 0.8 the RLRC approach beats the PCA, ICA I and LS methods by margins of 7.89%, 16.88% and 48.48%, respectively. Even with severe additive noise of 0.9 variance, a reasonable 93.64% recognition accuracy was achieved. The LS approach showed an interesting behavior: for low-variance noise the performance of LS is quite comparable to the RLRC approach; however, at low SNR the LS method substantially lags the Robust Linear Regression Classification. The best recognition results are obtained for RLRC with median filtering (M-RLRC); for instance, with 0.8 error variance a recognition accuracy of 99.78% is reported, which is 3.73% better than the plain RLRC approach. Median filtering also substantially improved the performance of PCA; however, the two top performance curves are obtained for the M-RLRC and RLRC methods.

Verification results for various SNR case-studies of AWGN are shown in Fig. 16 and Table 11. In particular, for the worst-case scenario of 0.9 variance Gaussian noise, the proposed RLRC approach achieves a high verification rate of 98.68% at 0.01 FAR, comprehensively outperforming the PCA, ICA I and LS methods by 12.50%, 12.28% and 51.75%, respectively (see Fig. 16(h)). The huge

Table 10 (continued). Verification results for speckle noise.

Noise variance = 6                 Noise variance = 8
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        5.71      79.82         PCA        6.62      73.90
ICA I      6.36      70.61         ICA I      8.27      67.76
RLRC       2.61      96.27         RLRC       3.07      94.74


Fig. 16. Gaussian noise: (a)–(d) represent rank-recognition curves while (e)–(h) show Receiver Operating Characteristics (ROC) for noise densities with variances 0.5, 0.7, 0.8, and 0.9, respectively.


performance difference of more than 50% compared to the LS approach signifies the importance of robust regression for this particular case-study of face recognition. Also, in terms of EER the proposed approach attains an excellent performance index of 1.63%, while the other approaches lag substantially behind (refer to Table 11).

A careful analysis of the above reported results highlights an interesting behavior in the presence of high-intensity noise. For


Table 11. Verification results for Gaussian noise.

Noise variance = 0.5               Noise variance = 0.7
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        2.19      95.61         PCA        3.28      92.11
ICA I      2.06      94.30         ICA I      3.50      89.47
LS         7.80      78.73         LS         13.49     56.36
RLRC       0.21      99.78         RLRC       0.43      99.78

Noise variance = 0.8               Noise variance = 0.9
Approach   EER (%)   Verif. (%)    Approach   EER (%)   Verif. (%)
PCA        2.23      93.20         PCA        4.05      86.18
ICA I      4.16      86.62         ICA I      4.17      86.40
LS         15.88     50.00         LS         18.20     46.93
RLRC       1.30      98.68         RLRC       1.63      98.68

Fig. 17. Recognition accuracy of various approaches in the presence of Gaussian noise for different variances (curves: PCA+NN, ICA I+NN, LS, RLRC, M-PCA, PCA+L1, M-RLRC).

Fig. 18. Probe images corrupted with (a) 0.2, (b) 0.4, (c) 0.6 and (d) 0.8 variance zero-mean Gaussian noise.


instance, in Fig. 17 (Gaussian noise with 0.9 variance) the proposed RLRC approach gives almost equivalent performance compared to the benchmark subspace techniques. This relates to the issue of the low breakdown point of robust estimation in the presence of severe noise. Classically speaking, for severely corrupted data adequate preprocessing increases the breakdown point [5]. Therefore the median-processed RLRC (referred to as M-RLRC) shows excellent performance, achieving approximately 100% recognition accuracy even for the worst-case scenario of 0.9 variance in Fig. 17.

6. Conclusion

In this paper we present a novel robust face recognition algorithm based on the robust Huber estimation approach. It is the first time that the problem of robust face recognition has been formulated as a robust Huber estimation task. The proposed Robust Linear Regression Classification (RLRC) algorithm has been evaluated for two case-studies, i.e., severe illumination variation and random pixel corruption. For the case of illumination-invariant face recognition, we have demonstrated results on three standard databases incorporating adverse luminance alterations. A comprehensive comparison with state-of-the-art robust approaches indicates a comparable performance index for the proposed RLRC approach. We demonstrate, for the first time, error-free recognition for the most challenging

subset 5 of the Yale Face Database B. In addition, we report a comparable evaluation of the RLRC algorithm on the CMU-PIE, AR and FERET databases under the standard evaluation protocols reported in the literature.

In addition, the problem of random pixel corruption has also been addressed. The proposed RLRC approach has shown good results for various noise models, comprehensively outperforming the reconstructive benchmark approaches. In particular, the proposed approach attains a high verification rate of 99.78% at 0.01 FAR for the important case-study of probes contaminated with 80% dead-pixel noise. This performance is appreciable considering that the benchmark approaches are unable to provide satisfactory results under such severe noise conditions. Similarly, in the presence of severe AWGN the proposed RLRC approach beats the traditional generative approaches by a margin on the order of 12%. It has also been experimentally shown that the classical LS approach is extremely inefficient in the presence of severe AWGN and lags the proposed RLRC algorithm by more than 50%. For a fair comparison, the robust variants of the base systems were also evaluated, and a comprehensive comparison demonstrates the efficacy of the proposed approach.

It is worth pointing out that the proposed RLRC algorithm has shown convincing results for severe illumination variation and random pixel corruption. Accordingly, it should be considered an important module of a hierarchical, unconstrained face recognition system, in which other challenges such as severe pose variations and contiguous occlusion are addressed separately.

Apart from the good performance index of the proposed approach, there are several interesting outcomes of the presented research. In the paradigm of view-based face recognition, the choice of features for a given case-study has been a debatable topic. Recent research has, however, shown the competency of unorthodox features such as downsampled images and random projections, indicating a divergence from the conventional ideology [15–17]. The proposed RLRC approach in fact conforms to this emerging belief. The good results for randomly distributed noisy pixels are encouraging enough to extend the proposed algorithm to the problem of contiguous occlusion, where contaminated pixels are known to have a connected neighborhood. Moreover, due to the good noise tolerance of reconstructive approaches, an important future direction is to explore an efficient fusion of RLRC with PCA.

Acknowledgment

The research was partially funded by the Australian Research Council (ARC) Grants DP0771294 and DP110103336.

References

[1] R. Barsi, D. Jacobs, Lambertian reflection and linear subspaces, IEEE Transactions on PAMI 25 (3) (2003) 218–233.
[2] A. Jain, A. Ross, S. Prabhakar, An introduction to biometric recognition, IEEE Transactions on Circuits and Systems for Video Technology 14 (1) (2004) 4–20.
[3] A.F. Abate, M. Nappi, D. Riccio, G. Sabatino, 2D and 3D face recognition: a survey, Pattern Recognition Letters 28 (2007) 1885–1906.
[4] H. Daum, Audio- and video-based biometric person authentication, in: Lecture Notes in Computer Science, vol. 3546, Springer-Verlag, 2005, pp. 900–908.
[5] W. Zhao, R. Chellappa (Eds.), Face Processing: Advanced Modelling and Methods, Academic Press, 2006.
[6] W. Zhao, R. Chellappa, P. Phillips, A. Rosenfeld, Face recognition: a literature survey, ACM Computing Surveys (2000) 399–458.
[7] R. Macwilliams, N. Sloane, The Theory of Error-Correcting Codes, North Holland, 1981.
[8] M.P. Roath, M. Winter, Survey of appearance-based methods for object recognition, Technical Report, Inst. for Computer Graphics and Vision, Graz University, Austria, 2008.
[9] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71–86.
[10] P.C. Yuen, J.H. Lai, Face representation using independent component analysis, Pattern Recognition 35 (6) (2002) 1247–1257.
[11] D.D. Lee, H.S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature 401 (1999) 788–791.
[12] D.D. Lee, H.S. Seung, Algorithms for non-negative matrix factorization, Advances in Neural Information Processing Systems (2001) 556–562.
[13] V. Belhumeur, J. Hespanha, D. Kriegman, Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Transactions on PAMI 17 (7) (1997) 711–720.
[14] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, John Wiley & Sons, Inc., 2000.
[15] I. Naseem, R. Togneri, M. Bennamoun, Linear regression for face recognition, IEEE Transactions on PAMI 32 (11) (2010) 2106–2112.
[16] I. Naseem, R. Togneri, M. Bennamoun, Face identification using linear regression, in: IEEE ICIP, 2009.
[17] J. Wright, A. Yang, A. Ganesh, S.S. Sastri, Y. Ma, Robust face recognition via sparse representation, IEEE Transactions on PAMI 31 (2) (2009) 210–227.
[18] A. Georghiades, P. Belhumeur, D. Kriegman, From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Transactions on PAMI 23 (6) (2001) 643–660.
[19] K.C. Lee, J. Ho, D. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Transactions on PAMI 27 (5) (2005) 684–698.
[20] C. Weilong, M.E. Joo, S. Wu, Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain, IEEE Transactions on Systems, Man and Cybernetics 36 (2) (2006) 458–464.
[21] Y. Gao, M.K.H. Leung, Face recognition using Line Edge Map, IEEE Transactions on PAMI 24 (6) (2002) 764–779.
[22] B.G. Park, K.M. Lee, S.-U. Lee, Face recognition using face-ARG matching, IEEE Transactions on PAMI 27 (12) (2005) 1982–1988.
[23] L.M.A. Levada, D.C. Correa, D.H.P. Salvadeo, J.H. Saito, D.A. Nelson, Novel approaches for face recognition: template-matching using dynamic time warping and LSTM neural network supervised classification, in: International Conference on Systems, Signals and Image Processing, 2008, pp. 241–244.
[24] C.T. Liao, S.-H. Lai, A novel robust kernel for appearance-based learning, in: International Conference on Pattern Recognition, 2008, pp. 1–4.
[25] J. Wright, Y. Ma, J. Mairal, G. Sapiro, T. Huang, S. Yan, Sparse representation for computer vision and pattern recognition, in: IEEE International Conference on Computer Vision and Pattern Recognition, 2009.
[26] X. Xu, M. Ahmadi, A human face recognition system using neural classifiers, in: International Conference on Computer Graphics, Imaging and Visualization, 2007.
[27] J. Yang, D. Zhang, A.F. Frangi, J. Yang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Transactions on PAMI 26 (1) (2004) 131–137.
[28] X. He, S. Yan, Y. Hu, P. Niyogi, H.J. Zhang, Face recognition using Laplacianfaces, IEEE Transactions on PAMI 27 (3) (2005) 328–340.
[29] P.J. Huber, Robust Statistics, John Wiley, New York, 1981.
[30] H.B. Nielsen, Computing a minimizer of a piecewise quadratic—implementation, Technical Report, Informatics and Mathematical Modelling, Technical University of Denmark, DTU, 1998.
[31] S. Fidler, D. Skocaj, A. Leonardis, Combining reconstructive and discriminative subspace methods for robust classification and regression by subsampling, IEEE Transactions on PAMI 28 (3) (2006) 337–350.
[32] C.V. Chork, P.J. Rousseeuw, Integrating a high-breakdown option into discriminant analysis in exploration geochemistry, Journal of Geochemical Exploration 43 (1992) 191–203.
[33] D.M. Hawkins, G.J. McLachlan, High-breakdown linear discriminant analysis, Journal of the American Statistical Association 92 (1997) 136–143.
[34] M. Hubert, K.V. Driessen, Fast and robust discriminant analysis, Computational Statistics and Data Analysis 45 (2003) 301–320.
[35] P.J. Rousseeuw, Multivariate estimation with high breakdown point, Mathematical Statistics and Applications B (1985).
[36] A.M. Pires, Robust linear discriminant analysis and the projection pursuit approach: practical aspects, in: Proceedings of the International Conference on Robust Statistics, 2001.
[37] F.D. la Torre, M.J. Black, A framework for robust subspace learning, International Journal of Computer Vision 54 (1) (2003) 117–142.
[38] A. Leonardis, Robust recognition using eigenimages, Computer Vision and Image Understanding 78 (2000) 99–118.
[39] R. Fransens, C. Strecha, L. Van Gool, Robust estimation in the presence of spatially coherent outliers, in: CVPRW, 2006.
[40] F.R. Hampel, E.M. Ronchetti, P.J. Rousseeuw, W.A. Stahel, Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, 1986, 2005.
[41] K. Madsen, H.B. Nielsen, Finite algorithms for robust linear regression, BIT Computer Science and Numerical Mathematics 30 (4) (1990) 682–699.
[42] R.D. Maronna, D. Martin, V. Yohai, Robust Statistics: Theory and Methods, Wiley, 2006.
[43] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, Wiley, 2003.
[44] R.G. Staudte, S.J. Sheather, Robust Estimation and Testing, Wiley, 1990.
[45] T. Sim, S. Baker, M. Bsat, The CMU pose, illumination and expression (PIE) database of human faces, Technical Report CMU-RT-TR-01-02, Robotics Institute, Carnegie Mellon University, 2001.
[46] A. Martinez, R. Benavente, The AR face database, Technical Report 24, CVC, 1998.
[47] H.F. Chen, P.N. Belhumeur, D.J. Kriegman, In search of illumination invariants, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, 2000, pp. 13–15.
[48] L. Zhang, D. Samaras, Face recognition under variable lighting using harmonic image exemplars, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, 2003, pp. 19–25.
[49] J. Zhao, Y. Su, D. Wang, S. Luo, Illumination ratio images: synthesizing and recognition with varying illumination, Pattern Recognition Letters 24 (2003) 2703–2710.
[50] S. Shan, W. Gao, B. Cao, D. Zhao, Illumination normalization for robust face recognition against varying lighting conditions, in: IEEE Workshop on AMFG, 2003, pp. 157–164.
[51] K.C. Lee, J. Ho, D.J. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Transactions on PAMI 27 (5) (2005) 684–698.
[52] M. Savvides, B.V.K. Vijay Kumar, P.K. Khosla, Corefaces—robust shift invariant PCA based correlation filter for illumination tolerant face recognition, in: IEEE Conference on Computer Vision and Pattern Recognition, 2004.
[53] H. Wang, S.Z. Li, Y. Wang, Generalized quotient image, in: IEEE Conference on Computer Vision and Pattern Recognition.
[54] W. Yu, X. Teng, C. Liu, Face recognition using discriminant locality preserving projections, Image and Vision Computing 24 (2006) 239–248.
[55] X. Xie, K.-M. Lam, Face recognition under varying illumination based on a 2D face shape model, Pattern Recognition 38 (2005).
[56] P.J. Phillips, H. Moon, S. Rizvi, P. Rauss, The FERET evaluation methodology for face recognition algorithms, IEEE Transactions on PAMI 22 (10) (2000) 1090–1104.
[57] A. Martínez, Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class, IEEE Transactions on PAMI 24 (6) (2002) 748–763.
[58] R.C. Gonzalez, R.E. Woods, Digital Image Processing, Pearson Prentice Hall, 2007.
[59] L.G. Shapiro, G.C. Stockman, Computer Vision, Prentice Hall, 2001.
[60] C. Boncelet, Handbook of Image and Video Processing, 2005.
[61] J. Nakamura, Image Sensors and Signal Processing for Digital Still Cameras, CRC Press, 2005.