
Pergamon Pattern Recognition, Vol. 30, No. 3, pp. 403-419, 1997

© 1997 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved

0031-3203/97 $17.00+.00

PII: S0031-3203(96)00086-6

REGION-BASED TEMPLATE DEFORMATION AND MASKING FOR EYE-FEATURE EXTRACTION AND DESCRIPTION

JYH-YUAN DENG and FEIPEI LAI*

Department of Electrical Engineering and Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

(Received 8 August 1995; in revised form 20 May 1996; received for publication 11 June 1996)

Abstract--We propose an improved method for eye-feature extraction, description, and tracking using deformable templates. Existing algorithms are used to locate the initial position of the eye features, and deformable templates are then used to extract and describe them. Rather than matching the templates by the original energy minimization, we propose a region-based approach to template deformation. Because it relies on region properties, the new strategy avoids problems such as template shrinking, the need to adjust the weights of energy terms, and failure of orientation adjustment in some exceptional cases. Our strategies are also coupled with the Canny edge operator to give a new back-end processing stage. By integrating the local edge information from edge detection with the global collector from our region-based template deformation, this processing stage can generate accurate eye-feature descriptions. Finally, the template deformation process is applied to tracking eye features. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.

Deformable templates Energy minimization Facial features Feature extraction

1. INTRODUCTION

Face analysis is an important topic in computer vision research, and many important applications build on it. Face recognition is among the best known of these applications; it provides the infrastructure for applications such as security systems based on face identification. Recently, there has been much research on face recognition systems. (1-10) Brunelli and Poggio (1) investigate in depth two typical approaches to face recognition: feature-based matching and template matching. Samal and Iyengar (8) provide a detailed survey of work on the automatic recognition and analysis of human faces. The Karhunen-Loeve expansion has also been applied to face recognition, (4) and Turk and Pentland (9) have developed a real system based on this approach, called "eigenfaces". Beymer (11) proposed a face recognition system that works under varying poses, and Moghaddam and Pentland (7) have extended their face recognition system to be view based. Connectionist models, (10) such as Kohonen's associative memory, (5) have also been applied to face processing and recognition. Chen and Huang (2) used deformable templates and an active contour model to locate facial features for face recognition. Lanitis et al. (12) developed an automatic face identification system based on both shape and grey-level information, which can suppress the effects of variations in expression, viewing angle, and so on.

Recently, efforts have concentrated on other applications of face recognition such as facial expression understanding, (8,13,14) gender classification, age classification, (15) and so on. Samal and Iyengar (8) discuss some research on facial expression analysis. Mazurski and Bond (13) provide a series of rated pictures of different facial expressions, e.g. sadness, happiness, and surprise. Yacoob and Davis (14) propose an approach for the analysis and representation of facial dynamics for recognizing facial expressions from image sequences. Using cranio-facial changes in feature-position ratios and skin wrinkle analysis, Kwon and Lobo (15) present a system that classifies pictures of people of different ages into three age groups. Ballard and Stockman (16) develop a system for controlling the computer cursor by facial aspect. Lanitis et al. (17) use flexible appearance models for tracking, coding, and reconstructing human faces.

* Author to whom correspondence should be addressed. E-mail: flai@cc.ee.ntu.edu.tw.

Most of the systems mentioned above require salient features to be extracted from the images. These features include the mouth, nose, eyes, and so on. Eye features in particular provide very significant and reliable information for recognizing and understanding faces. (1,7,8,11,18-20) Because most work on facial-feature detection and extraction serves to provide information for face recognition, (1,3,6) few scholars address the fine description and characterization of these features. (18,19,21,22) Vincent et al. (20) use multi-layered perceptrons for the automatic location of visual features. Chow and Li (18) reported an improved deformable template (21,22) method for automatic facial-feature detection. The deformable templates proposed by Yuille and Hallinan (21,22) are composed of parameterized curves that describe the eye features and the mouth geometrically.

Unlike other approaches, the descriptions of the features from the deformable templates provide both the information needed for face recognition and the characterization of the features. For example, the state of the templates gives the relationship among the subfeatures, such as the iris being partially hidden by the boundary of the eye. We can also obtain details about the movement of the features from motion pictures, e.g. the eyelids closing and opening. This is our motivation for discussing deformable templates for the extraction and description of the eye features.

Our purpose is to find the eye features in nearly frontally-viewed face images and to get fine descriptions of the eye features. The descriptions include the size of the iris, the orientation of the eye, the boundary of the eye, and so on. Moreover, we hope that the descriptions we obtain can be used for tracking the motion of these features. Our assumptions and purposes are the same as those of Yuille et al. (21) and Hallinan. (19)

Many factors influence the result, so we must put some constraints on the problem. First, the face must be viewed almost completely frontally, because when the head is turned to one side the eye features cannot be described by planar curves alone. In brief, the input of our system is a face image (see, e.g. Fig. 1). The first stage separates the eye features from the other features in the face image, such as the mouth and nose, for further processing. After this stage, we obtain an eye window image like that in Fig. 2(a). We can then apply further processing such as the deformable template method proposed by Yuille et al. (21) The result we expect is similar to Fig. 2(b). In this example, the eye is described by one circle for the iris and two parabolic sections for the boundary of the eye. Once we have appropriate parameters for the eye template, we can use them for tracking.

Second, following Hallinan's (19) concept, we assume that the range of the input image lies at the middle level, i.e. the appropriate level. The illumination must also be ordinary, i.e. the eyes must be recognizable. Other extraneous factors, such as glasses, eyebrows, hair, and makeup, must be eliminated.

In this paper, we develop a region-based scheme for template deformation that overcomes several problems of the original energy minimization, (21,22) such as template shrinking, (23) orientation adjustment, the choice of weighting coefficients, (21-23) and the time-consuming fine-tuning process. (23) Figure 3 compares the flow diagrams of Yuille's approach and our own. We add a back-end processing stage to integrate the local information after matching the template, because it is hard to get global information from the local edge information obtained from conventional edge detectors. By using our region-based strategy and the back-end processing, we can replace the whole energy minimization process and get more accurate descriptions. Detailed comparisons of Yuille's approach and the proposed approach are given in Sections 3 and 4.

Fig. 1. The original face image.

This paper is organized as follows. In Section 2, we summarize the preprocessing stage that gives the initial values of the eye template. In Section 3, we describe our whole template deformation process and, for each substage, compare our results with those obtained by Yuille's energy minimization. In Section 4, we discuss the back-end processing stage and show the accurate descriptions of eye features that it generates. In Section 5, we provide examples of tracking the movement of eyes using deformable templates. Finally, we draw our conclusions.

2. PREPROCESSING

The purpose of this stage is to provide the initial parameters of the eye template for deformation. Yuille et al. (21) assume that the parameters of the eye template have been set approximately to the position of the eyes.


Fig. 2. The eye window image (a) and the eye description (b).


Fig. 3. Comparison of Yuille's approach with ours (flow diagrams with two columns, "Yuille's approach" and "Our approach"; the legible stage labels include "Preprocessing" and "Energy Minimization").

This can be achieved by hand initialization. Algorithms have also been used to find the parameters, although under some constraints. For example, Craw et al. (3) utilized the positions of the eyes to locate an area below the eyebrows. Brunelli and Poggio (1) proposed an improved version of the integral projection method for feature extraction, which relies on some prior constraints. For example, the face must be bilaterally symmetrical, and there are two eyes, one nose, and one mouth in almost every face. These constraints ease the task of feature extraction. Brunelli and Poggio (1) suggest performing directional edge-projection analysis: the edge map of the face image is partitioned into two parts, the horizontal edge map and the vertical edge map. With prior knowledge, the horizontal edge projection provides information for detecting the boundaries of the face and nose, and the vertical edge map is useful for locating the eyes and mouth. Xie et al. (23) also find the eye window using a search strategy based on prior knowledge about the geometry of the face.
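The projection idea mentioned above can be sketched in a few lines. This is only an illustration of integral projections on an edge map, not Brunelli and Poggio's implementation; the function names and the toy edge map are assumptions introduced here.

```python
def integral_projections(edge_map):
    """Return (row_sums, col_sums) for a 2D list of edge magnitudes."""
    row_sums = [sum(row) for row in edge_map]
    col_sums = [sum(col) for col in zip(*edge_map)]
    return row_sums, col_sums


def strongest_index(projection):
    """Index of the largest projection value (the first one on ties)."""
    return max(range(len(projection)), key=projection.__getitem__)


# Hypothetical toy edge map: a strong response along row 1,
# standing in for the line on which the eyes lie.
toy_edges = [
    [0, 0, 0, 0],
    [9, 8, 9, 7],
    [0, 1, 0, 0],
]
rows, cols = integral_projections(toy_edges)
eye_row = strongest_index(rows)
```

A peak in one projection narrows the search to a band of rows or columns, which is how such projections can supply an initial eye window.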

Hallinan (19) proposes a robust template that can be used to find appropriate candidates for the features while searching in scale-space. Recently, Moghaddam and Pentland (7) have extended the concept of eigenfaces to eigentemplates. This method directly uses the gray level information from training to get eigenfeatures such as "eigeneyes", "eigennoses", and "eigenmouths". Unlike simple template matching, this approach utilizes the principal component representation. From the measurement of the residual error, we can detect distinct facial features. In the shape-free appearance model of Lanitis et al., (12) the 3D orientation and lighting conditions can be normalized for further processing.

3. TEMPLATE DEFORMATION PROCESS

In this section, we discuss how to extract the full eye features and obtain fine descriptions of them. After obtaining the eye window, we use the deformable templates proposed by Yuille et al. (21) The original deformable templates are active models that interact with varying image properties in order to reach a stable state. This approach can be regarded as an energy minimization problem, and it is similar to the snake model proposed by Kass et al. (24) By using various energy terms, deformable templates provide a more flexible method for extracting salient features in different environments. For example, if we want to extract and describe the eyes and mouth in a face, conventional edge detectors seem unable to provide reliable global descriptions. The deformable templates, however, are flexible enough to adjust their size, position, orientation, and so on, so the method provides an evolution process in which these parameters change gradually towards the best fit. The deformation process depends on energy functions; the purpose is to find the minimum, i.e. the best fit to the salient features. Energy minimization is a flexible way of reaching our goal because the energy function is composed not only of the edge energy but also of other energy terms such as valleys and peaks. Yuille et al. (25) devise an automatic stage-by-stage scheme for minimizing the energy function, but some problems remain during energy minimization. First, how can we determine the weights for each energy term? It is time-consuming to adjust the weights of the energy terms in each stage by experiment, and the weights are not applicable to other cases. Moreover, we cannot guarantee a good fit because of local minima.

In order to avoid the problems described above and to improve the behaviour of the deformable templates, Xie et al. (26) proposed a normalization method for energy terms. Here we propose a template deformation process that depends mainly on regional properties. By combining the flexibility of the original energy minimization with our modification, we obtain a more reliable method for extracting and describing the eye features in face images.

3.1. The eye template

We use the same eye template proposed by Yuille et al. (21) except for two points located at the center of the whites. The template is illustrated in Fig. 4 and consists of one circle and two parabolas.

Our reason for not using the two points in the white area is similar to that of Xie et al. (23) In our experiments, while adjusting the orientation of the template, we found that it is more robust to use the whole white area than only the two points. Moreover, when the iris moves to one extreme side of the eye, only one peak field remains and the original template runs into problems. Our eye template is therefore composed of nine parameters: three parameters (cx, cy, r) for the circle, and six other parameters (tx, ty, a, b, c, θ) for the two parabolic sections.
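The nine parameters can be collected in a small data structure. The sketch below is an assumption-laden illustration: the text names the parameters but does not restate the parabola equations, so the form used here (upper boundary of height a, lower boundary of depth c, half-width b, in template coordinates) follows the usual parameterization of Yuille's eye template and should be checked against reference (21). The class and method names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class EyeTemplate:
    cx: float     # iris circle centre, x
    cy: float     # iris circle centre, y
    r: float      # iris circle radius
    tx: float     # template centre, x
    ty: float     # template centre, y
    a: float      # assumed: height of the upper parabola
    b: float      # assumed: half-width of the eye
    c: float      # assumed: depth of the lower parabola
    theta: float  # template orientation in radians

    def boundary_point(self, u, upper=True):
        """Image coordinates of the boundary at local abscissa u in [-b, b]."""
        if upper:
            v = self.a - self.a * u * u / (self.b * self.b)
        else:
            v = -self.c + self.c * u * u / (self.b * self.b)
        cos_t, sin_t = math.cos(self.theta), math.sin(self.theta)
        # Rotate the local point (u, v) by theta, then translate to the centre.
        x = self.tx + u * cos_t - v * sin_t
        y = self.ty + u * sin_t + v * cos_t
        return x, y
```

Both parabolas meet at u = ±b, so the boundary is closed; whether positive v points up or down in the image depends on the image axis convention, which this sketch leaves open.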

3.2. Potential fields

We describe the deformable template as a parameterized active model; it interacts with the image in both the geometric aspect and the intensity aspect. The measure of fitness determines the actions of the template, and the measurement is modeled as energy minimization. These energy terms can be classified as edge fields $\Phi_e$, valley fields $\Phi_v$, peak fields $\Phi_p$, and intensity fields $\Phi_i$.

Fig. 4. Yuille's deformable template of eye. The eye template is composed of a circle and two bounded parabolae. The circle has center (cx, cy) and radius r. The whole template has center (tx, ty) and orientation θ.

In Yuille et al., (21,22,25) these fields are computed by using morphological filters. (27) The fields need not be very precise, so this approach gives an easy and straightforward way of building various potential fields. Here we use the same potential fields as Yuille et al.
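As a rough illustration of how morphological filters can yield such fields, the sketch below uses a 3x3 grayscale dilation and erosion on nested lists. The specific operators (closing minus image for valleys, image minus opening for peaks, dilation minus erosion for edges) are a common choice and an assumption here; the text does not say which filters Yuille et al. used. All function names are hypothetical.

```python
def _window(img, y, x):
    """Values in the 3x3 neighbourhood of (y, x), clipped at the border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def _filter(img, reducer):
    return [[reducer(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    return _filter(img, max)


def erode(img):
    return _filter(img, min)


def _subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def valley_field(img):
    """Closing minus image: large where the image has dark spots."""
    return _subtract(erode(dilate(img)), img)


def peak_field(img):
    """Image minus opening: large where the image has bright spots."""
    return _subtract(img, dilate(erode(img)))


def edge_field(img):
    """Dilation minus erosion: large near intensity changes."""
    return _subtract(dilate(img), erode(img))


# Hypothetical toy image: bright background with one dark pixel.
toy = [[200] * 5 for _ in range(5)]
toy[2][2] = 50
```

On the toy image the valley field responds only at the dark pixel, the peak field is zero everywhere, and the edge field responds in the ring around the dark pixel, which is the qualitative behaviour the energy terms rely on.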

3.3. Iris localization

When we have finished building the potential fields and energy terms, we must localize the circle of the eye template on the iris of the eye image. In this stage, we use energy functions similar to those in Yuille's approach. We let the valley energy term dominate the whole energy function, i.e. $E = E_{valley}$. The valley energy term is given by equation (1). The meaning of the energy function is that the circle of the eye template should be located at the position where $E_{valley}$ is low, i.e. the area of ordinary eyes.

$$E_{valley} = -\frac{k_1}{A_{circle}} \int_{A_{circle}} \Phi_v(\vec{x})\,dA \qquad (1)$$

In equation (1), the term $A_{circle}$ is the area of the circle of the template. It can be calculated as the number of pixels in the circle. Its effect is to normalize the energy term.
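A discrete version of the valley energy can be sketched as below, with the integral replaced by a sum over pixels and the area by the pixel count, as the text describes. The negative sign assumes that the valley field is large in dark regions, so that a low energy corresponds to a circle sitting on the iris. The function names and the default weight are hypothetical.

```python
def circle_pixels(cx, cy, r, height, width):
    """Integer pixel coordinates (x, y) inside both the circle and the image."""
    return [(x, y)
            for y in range(height) for x in range(width)
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]


def valley_energy(valley, cx, cy, r, k1=1.0):
    """Normalized valley energy of a circle on a 2D valley field."""
    pixels = circle_pixels(cx, cy, r, len(valley), len(valley[0]))
    if not pixels:
        raise ValueError("circle covers no pixels")
    total = sum(valley[y][x] for x, y in pixels)
    # Dividing by the pixel count plays the role of 1 / A_circle.
    return -k1 * total / len(pixels)
```

Because of the normalization, the energy is the (negated, weighted) mean valley response inside the circle, so circles of different sizes can be compared directly.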

We show two examples of this stage. The initial positions of the eye templates are in Fig. 5(a) and (c), and the results after iris localization are in Fig. 5(b) and (d).

3.4. Eye template and iris size adjustment

At the second stage, we want to adjust the size of the circle to the correct size of the iris. In Section 3.4.1, we discuss Yuille's original method for resizing the circle. After some experiments, we found that there are some problems in this approach. For example, the circle often shrank to a point due to bad coefficients of energy terms (see Fig. 6). In order to solve this problem, we propose a new approach called "regional forces approach". We first define the regional forces and then show how to apply this method to the original problem.

3.4.1. Yuille's approach. In Yuille's approach, the main energy terms for this stage are $E_{valley}$, $E_{edge}$, and $E_{intensity}$. $E_{valley}$ has been described in equation (1). $E_{edge}$ is defined as follows:

$$E_{edge} = -\frac{k_2}{L_{circle}} \int_{L_{circle}} \Phi_e(\vec{x})\,ds - \frac{k_3}{L_{para}} \int_{L_{para}} \Phi_e(\vec{x})\,ds. \qquad (2)$$

This edge energy term includes the edge energy of the circle and of the two parabolas. $L_{circle}$ is the length of the boundary of the circle. Similarly, $L_{para}$ is the length of the two parabolas. The two constants $k_2$, $k_3$ correspond to the weights of the energy factors. $E_{intensity}$ can be written as follows,

$$E_{intensity} = \frac{k_4}{A_{circle}} \int_{A_{circle}} I(\vec{x})\,dA - \frac{k_5}{A_{white}} \int_{A_{white}} I(\vec{x})\,dA, \qquad (3)$$

where $A_{white}$ is the white area bounded by the two parabolas and the circle, and it can be calculated similarly to $A_{circle}$. The function $I$ is the gray level intensity function of the original eye images. The sign of the coefficient $k_4$ is positive because we use the image intensity as the energy: the gray level value of the iris area is low in the intensity images.

Fig. 5. Localization of eye templates: (a) initial position (b) localization (c) initial position (d) localization.

Fig. 6. The eye template will shrink because of bad coefficients.

Now we can write the energy function as

$$E = E_{valley} + E_{edge} + E_{intensity}. \qquad (4)$$

After the iris localization, we turn on the terms $E_{edge}$ and $E_{intensity}$ of the energy function. During this stage, the valley energy term and the intensity energy term drag the template to the dark area and adjust the circle size, whereas the edge energy term prevents the circle from shrinking to a point. If the coefficients of these energy terms are appropriately chosen, the circle first adjusts its size, and then, when the circle is near the iris, the edge energy term begins to play a major role in dragging the circle onto the boundary of the iris.

The system works well after the ad hoc values of the coefficients were found. However, even with an elaborate adjustment, we cannot guarantee the eye template will get into a good position and size (see Fig. 7). Although we have put some efforts into adjusting the coefficients, we do not get very good results.

Some problems arise in this approach. For example, when we overweight the valley energy term or the intensity energy term, the circle will easily shrink to a point at the darkest part of the iris. When the initial size of the circle is much smaller than the size of the iris in the images, there is no chance to recover it to the proper size. In Fig. 6, we show the shrinking process of one eye template.
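The shrinking behaviour can be reproduced on a synthetic example. The sketch below evaluates only the circle part of the intensity term, with the edge term deliberately left out, on an artificial image that is darkest at the centre; the image, the weight, and the function name are all assumptions made for illustration.

```python
import math


def mean_intensity_in_circle(image, cx, cy, r):
    """Mean gray level of the pixels inside the circle."""
    values = [image[y][x]
              for y in range(len(image)) for x in range(len(image[0]))
              if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]
    return sum(values) / len(values)


# Synthetic "iris": darkest at the centre, brighter with distance.
SIZE, CENTRE = 21, 10
image = [[min(255, 20 * math.hypot(x - CENTRE, y - CENTRE))
          for x in range(SIZE)] for y in range(SIZE)]

k4 = 1.0  # hypothetical weight of the circle intensity term
energies = {r: k4 * mean_intensity_in_circle(image, CENTRE, CENTRE, r)
            for r in (1, 3, 5, 7)}
smallest_energy_radius = min(energies, key=energies.get)
```

The normalized intensity energy decreases monotonically as the radius decreases, so without a sufficiently weighted edge term the minimizer prefers the smallest circle, which is the collapse shown in Fig. 6.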

The key problem is how to determine the coefficients of each energy term. The only solution is to spend much time finding these values by experiment. With the ad hoc coefficients, we may avoid these problems. Yet another problem arises: the weights of these energy terms cannot be generalized. That is, we must adjust the weights case by case and perform experiments. To overcome these problems, we propose a strategy called "regional forces", which utilizes the regional properties of the images for resizing.

3.4.2. Definition of regional forces. In order to avoid the adjustment of these coefficients of the energy terms, we develop a scheme utilizing the regional properties of the images. We call this "regional forces for resizing and locating the circle". More specifically, there are two kinds of forces, the resizing force and the movement force. We call them regional forces because we calculate these forces from the regional properties of the images. For a quick interpretation, see Fig. 8. We utilize a small window along the boundary of the circle to probe the regional properties. From this information we generate these forces. The actions of these forces interact with the eye template to do deformation.

The motivation for this method came from the work by Berger, (28) the balloon model by Cohen et al. (29,30) and the anticipating snake model by Ronfard. (31) Unlike the balloon model, the template can expand and shrink automatically under the regional forces. We do not need to supply an artificial external force as in the balloon model. Like Ronfard, (31) we utilize the regional properties to improve the behavior of the original model.

Fig. 7. The eye template deformed to better position because of ad hoc coefficients.

Fig. 8. Definitions and actions of regional forces: (a) definitions (b) actions. Panel (a): (i) use a small window to probe the region property and to estimate the region force; (ii) amalgamate the total forces along the iris edge; (iii) resizing force and movement forces, where the x-direction force is the summation of (each probed force × cos θ) and the y-direction force is the summation of (each probed force × sin θ). Panel (b): (a) shrink; (b) expand; (c) shrink and move right; (d) expand and move left.

In implementation, the small window for probing the regional properties returns a value, which we call the window force, denoted $f_w$. The region window is a rectangle with length $l$ and width $w$. The center of the window is on the circle of the template, and its orientation is that of the vector from the center of the circle to the center of the window. We therefore write the window force with a parameter $\theta$, i.e. $f_w(\theta)$. Given the window forces, we can define the resizing force and the movement force as follows. The definition of the resizing force is

$$f_{resize} = \int_0^{2\pi} f_w(\theta)\,d\theta \qquad (5)$$

The movement force can be divided into x direction and y direction components, which can be written as

$$f_{move_x} = \int_0^{2\pi} f_w(\theta)\cos(\theta)\,d\theta, \qquad f_{move_y} = \int_0^{2\pi} f_w(\theta)\sin(\theta)\,d\theta. \qquad (6)$$

In our implementation, the integration is replaced by summation along the circle contour, with $\theta$ incremented by a small step.
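A minimal sketch of this discretization is given below. It treats the window force as a black-box function of the angle and approximates equations (5) and (6) by Riemann sums; weighting each sample by the angular step is an assumption (the text only says that summation replaces integration), as are the function name and the default number of steps.

```python
import math


def regional_forces(window_force, steps=36):
    """Approximate equations (5) and (6) by sums over `steps` angles."""
    d_theta = 2.0 * math.pi / steps
    f_resize = f_move_x = f_move_y = 0.0
    for k in range(steps):
        theta = k * d_theta
        fw = window_force(theta)
        f_resize += fw * d_theta
        f_move_x += fw * math.cos(theta) * d_theta
        f_move_y += fw * math.sin(theta) * d_theta
    return f_resize, f_move_x, f_move_y
```

For example, a window force that is the same at every angle gives a pure resizing force with no movement component, while a force that is larger on one side of the circle produces a net movement force towards that side.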

The values of the forces determine the actions of the eye template. We show some examples in Fig. 8(b). The force-based deformation stops when the absolute values of these forces are smaller than a positive threshold $\epsilon$. The circle will shrink, expand, or move depending on the situation.

The main task in our approach is to obtain the region window force. There are several ways of doing this. For example, we can separate the window into an upper and a lower part and then calculate the contrast of the two subwindows, or we may probe the homogeneity of the region window using algorithms such as that of Chakraborty et al. (32)

Because we have the valley fields, and we know a priori that the interior of the iris should be a valley region, we utilize the valley field $\Phi_v$ to obtain the region window force for resizing the circle. First, we binarize the valley field $\Phi_v$. Then we take the mean value of each region window and normalize it to give the window force $f_w$. Specifically,

$$f_w = \frac{\sum (\text{each pixel value in the window})}{\text{number of pixels in the window}} - \frac{255}{2}, \qquad (7)$$

where we normalize $f_w$ into the range $(-255/2, 255/2)$ because we binarize the valley field into two levels, zero and 255.

After getting the region window force $f_w$, we can easily establish the resizing force $f_{resize}$ and the movement force $f_{move}$. When the absolute values of these forces are above the threshold $\epsilon$, we change the parameters of the template. These parameters, namely the position and radius of the circle, are increased or decreased by a small value as long as these forces are active.
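The window force and one update of the circle can be sketched as follows. Several details are assumptions rather than the authors' stated method: the window is sampled on a small grid and rounded to the nearest pixel, positive forces are taken to mean "expand" or "move in the positive direction", and the window size, threshold, and step values are arbitrary. All function names are hypothetical.

```python
import math


def window_force(binary_valley, cx, cy, r, theta, length=5, width=3):
    """Equation (7): mean of a small oriented window minus 255/2."""
    h, w_img = len(binary_valley), len(binary_valley[0])
    ux, uy = math.cos(theta), math.sin(theta)   # radial direction
    vx, vy = -uy, ux                            # tangential direction
    px, py = cx + r * ux, cy + r * uy           # window centre on the circle
    values = []
    for i in range(length):
        for j in range(width):
            s = i - (length - 1) / 2.0          # offset along the radius
            t = j - (width - 1) / 2.0           # offset along the tangent
            x = int(round(px + s * ux + t * vx))
            y = int(round(py + s * uy + t * vy))
            if 0 <= x < w_img and 0 <= y < h:
                values.append(binary_valley[y][x])
    if not values:
        return 0.0
    return sum(values) / len(values) - 255 / 2.0


def circle_forces(binary_valley, cx, cy, r, steps=36):
    """Resizing and movement forces as sums around the circle."""
    d_theta = 2.0 * math.pi / steps
    f_resize = f_x = f_y = 0.0
    for k in range(steps):
        theta = k * d_theta
        fw = window_force(binary_valley, cx, cy, r, theta)
        f_resize += fw * d_theta
        f_x += fw * math.cos(theta) * d_theta
        f_y += fw * math.sin(theta) * d_theta
    return f_resize, f_x, f_y


def update_circle(binary_valley, cx, cy, r, eps=1.0, step=0.5):
    """One update; the sign conventions are assumptions of this sketch."""
    f_resize, f_x, f_y = circle_forces(binary_valley, cx, cy, r)

    def nudge(force):
        if force > eps:
            return step
        if force < -eps:
            return -step
        return 0.0

    return cx + nudge(f_x), cy + nudge(f_y), max(step, r + nudge(f_resize))
```

Calling `update_circle` repeatedly until it returns its input unchanged corresponds to the stop criterion above; on real images a cap on the number of iterations would be a sensible safeguard against oscillation around the threshold.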

3.4.3. Improving behavior of iris size adjustment. Here we show two deformation examples using the regional force approach. In Fig. 9, the eye template shrinks from its initial position to a reasonable size and position. Compared with Yuille's approach, our method provides an easier way to get reasonable results. It is hard to predict the results with Yuille's approach because different coefficients create different energy functions, which generate different local minima. In contrast, we can guarantee that the template will deform to a reasonable size and position via the binarized valley field. This makes the further processing more direct and easier. For example, we can fix the position and the size of the eye template and then adjust only the orientation in the next step.

Fig. 9. Using region force to shrink to proper size.

Fig. 10. Using region force to expand to proper size.

We found that, via our approach, the eye template still has a chance to expand and move to the proper size even if the initial size is much smaller than the size of the real eyes in the images (see Fig. 10). The eye template expands gradually to reach the final configuration. If we use Yuille's approach, there will be no chance to recover it. This is due to local minima because when the size of the template is much smaller than the expected size, the valley and intensity energy terms will dominate and make the template shrink to a point.

3.5. Orientation of the eye template

In this stage, we consider the orientation of the eye template. Note that the size and position of the iris are fixed in this stage; we only allow the orientation of the template to change. This requires that a reasonable position and size of the template have been established, and our regional-forces approach provides a reliable way to do this. In Yuille's approach, the template interacts with the peak field to rotate and translate the parabolic sections. Our previous stage for locating the circle therefore also provides a better starting point for adjusting the orientation with Yuille's original approach.

While experimenting with Yuille's approach for adjusting the orientation, we also found some problems. First, using the two center points of the white areas to adjust the orientation of the template is easily disturbed by noise. Next, when the iris moves to one extreme side, only one peak field exists. In this case, the method of Yuille et al. meets an exceptional situation not mentioned in the original paper. (21) We also found it more stable to use the whole white area to adjust the orientation than to use only a small neighborhood of the two points proposed by Yuille et al. (21) Motivated by our regional forces, we propose a new strategy called the "regional torque" for rotating the eye template. We utilize the regional properties of the whole white region and the valley region to generate the regional torque. As the name indicates, the regional torque provides a natural and easy way to adjust the orientation.

3.5.1. Yuille's approach. In Yuille's approach, (21) the energy function in this stage is dominated by the peak energy term $E_{peak}$, i.e. $E = E_{peak}$. Other energy terms, such as $E_{valley}$ or $E_{edge}$, are turned off. The template interacts with the peak field in order to adjust its orientation. The advantage of this approach is that it can help the template locate the centers of the white regions. However, it only uses the energy at these two points, which makes the approach inaccurate in some cases.

In Fig. 11, we show the ordinary case in which there are two peaks. The two peaks attract the two points $p_1$ and $p_2$ to the peaks of the white area. This method works well in this situation. We found, however, that when the peaks of the white region are not obvious, the template is attracted by noise peaks caused by the illumination conditions (see Fig. 12). The template is trapped at the false peaks.

When the iris moves to one extreme side, another problem occurs: there is only one peak field in the image, as illustrated in Fig. 12(c). The approach cannot deal with this situation very well, although we may omit the other peak point and let one peak point be active. In this case, the peak point will not locate at the center of the white region. So using only the peak energy cannot recover the proper orientation.


Fig. 11. Orientation adjustment using Yuille's approach, case (I) ordinary situation, (a) initial orientation (b) final orientation.

Fig. 12. Orientation adjustment using Yuille's approach, case (II) with weak peak field and case (III) one peak field, (a) and (c) initial orientation; (b) and (d) final orientation.

3.5.2. Regional torque. In order to solve the problems mentioned above, we first utilize the properties of the whole white region for orientation adjustment. Because of our success in using the regional-forces approach for tuning the iris size and position, we extend it to "the regional torque". By using the properties of the whole region, we get reasonable results with this new scheme in the orientation adjustment.

As in the previous stage, we use a small window to probe the regional properties. Instead of using the regional forces directly, however, we generate a regional torque for rotating the template (see Fig. 13). We evaluate the regional forces along the whole contour of the two parabolic sections and then use these forces to generate the regional torque. As with the regional forces, the torque drives the template to rotate until it reaches the balance condition. Here we use the whole eye region, which includes the white region and the valley region, because the iris might sometimes be hidden by the parabolic boundary. In that case, the shape of the iris region also provides an important cue for the orientation of the template. Another reason is that the iris occupies a large area of the eye, so considering both the white region and the valley (iris) region gives a more reasonable strategy.

Fig. 13. Regional torque for rotation. (i) Use a small window to probe the region property and to estimate the region force. (ii) Torque for rotation: select the iris center as pivot point, and use the regional force to calculate the torque.

We choose the center of the eye template $\vec{x}_t$ to be the pivot point for calculating the torque and for rotating. We divide the eye template into two parts, left and right, by the line through the center $\vec{x}_t$ perpendicular to the current orientation. Then we calculate the torque from the window forces on both sides of the eye template. The definition of the regional torque $T$ is written as

$$T_{clockwise} = \sum_i \left(\text{each } f_{w_i} w_i \text{ in the left}\right) - \sum_j \left(\text{each } f_{w_j} w_j \text{ in the right}\right) \qquad (8)$$

The weighting factor w is equivalent to the effective distance to the pivot point for calculating the torque.


Summation of each torque generated by the small window along the parabolic boundary constitutes the total regional torque. When the magnitude of the torque is greater than a threshold, the eye template will continue rotating. The template will rotate clockwise if the value of the torque is positive. Otherwise, the template will rotate counterclockwise.
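Equation (8) and the rotation rule can be sketched as below. The representation of each window as a (window force, lever arm) pair, the function names, and the returned labels are assumptions made for illustration; how the windows are assigned to the left and right halves is left to the caller.

```python
def regional_torque(left, right):
    """Clockwise torque from (window_force, lever_arm) pairs, as in equation (8)."""
    return (sum(fw * arm for fw, arm in left)
            - sum(fw * arm for fw, arm in right))


def rotation_direction(torque, threshold):
    """Rotate only while the torque magnitude exceeds the threshold."""
    if torque > threshold:
        return "clockwise"
    if torque < -threshold:
        return "counterclockwise"
    return "stop"
```

Windows far from the pivot have a larger lever arm and therefore more influence on the rotation, which matches the role of the weighting factor w described above.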

3.5.3. Experiments. We show three cases demonstrating the effectiveness of the regional torque approach. One is the ordinary case (see Fig. 14), in which both Yuille's original approach and our approach work well. In the other two cases, as shown in the previous section, Yuille's approach fails to find the proper orientation, whereas our approach still works well.

We now discuss the second case, in which the peak fields of the eye image are weak. If we use the energy function defined on the two peak points, the template will be dragged toward other, false strong peak points generated by illumination. However, if we use the whole eye region to help adjust the orientation, the template will not be trapped in such false local minima. In Fig. 15, we show the process of orientation adjustment with weak peak fields. Similarly, when there is only one peak region, we can use the same strategy to adjust the orientation. Figure 16 shows the evolution process.

3.6. Parabolic boundary adjustment

Now the parameters of the two parabolas are updated during deformation so that the description of the eye image becomes more refined. Clearly, the parameters of the parabolic sections must be corrected when the iris is located at one extreme side of the eye. Indeed, even in ordinary cases, we need to break the symmetry of the eye template in order to get more accurate descriptions. All five parameters of the parabolic boundaries, (x_t, y_t, a, b, c), are changed simultaneously. We extend the regional forces used for resizing and locating the circle to solve the current problem.

Figure 17 shows the synopsis for the definitions and actions of the regional forces used in this stage. These forces can also be categorized into two classes. One is for movement and the other is for updating the three parameters (a, b, c). We also apply small windows on the parabolic boundaries to calculate the region window forces. The definition of the regional force fw for the

Fig. 14. Orientation adjustment using our approach, case (I) ordinary situation.

Fig. 15. Orientation adjustment using our approach, case (II) with weak peak field.

Fig. 16. Orientation adjustment using our approach, case (III) only one peak field.


Fig. 17. Parabolic area adjustment (movement force and resizing force): (i) use a small window to probe the region property and to estimate the region force; (ii) movement and resizing: as for the iris size and position adjustment, we can use a similar concept for the whole template adjustment.

small window is the same as in previous sections. We show some examples of this stage in Fig. 18.

3.7. Individual component deformation

From the previous stages, we get approximate descriptions of the eye images. To get more refined descriptions, we have to fine tune the individual components. For example, the iris might be partially hidden by the parabolic boundaries. Moreover, the binarized region fields are noisy, so we cannot get very accurate results using only these forces. To deal with the iris problem, we again use energy minimization, but we let the edge energy term dominate the whole energy function. This is because the iris may be partially hidden, in which case the edge field information is the most important cue. Hence we let the edge energy term dominate and begin the energy minimization process.

Because the iris is sometimes partially hidden, we calculate the energy terms only in the region bounded by the parabolic contours. With this restriction, the circle is easily moved and scaled to the proper local energy minimum. This is also because our previous stages have provided a reasonable configuration of the eye template (see Fig. 19 for examples). We get a more accurate description after the individual deformation.
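The restriction of the edge energy to the region bounded by the parabolas can be sketched as below. This is a hypothetical illustration rather than the energy function used in the paper: the region is passed in as a boolean array `inside`, and the energy is taken, by assumption, as the negative mean of the edge field over the visible samples of the circle. The function name and the sampling density are likewise assumptions.

```python
import numpy as np

def masked_circle_edge_energy(edge_field, inside, center, radius, n_samples=64):
    """Edge energy of a circle, counting only samples inside the eye region.

    edge_field: 2-D array of edge strength (larger means stronger edge).
    inside:     2-D boolean array, True inside the region bounded by the
                two parabolas (assumed to be computed elsewhere).
    center:     (x, y) of the circle; radius: circle radius in pixels.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    xs = np.rint(center[0] + radius * np.cos(angles)).astype(int)
    ys = np.rint(center[1] + radius * np.sin(angles)).astype(int)
    h, w = edge_field.shape
    in_image = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    xs, ys = xs[in_image], ys[in_image]
    visible = inside[ys, xs]          # drop samples hidden by the eyelids
    if not visible.any():
        return 0.0
    # Lower energy means a better fit, so strong edges give negative energy.
    return -float(edge_field[ys, xs][visible].mean())
```

A minimizer would compare this value over small changes of the center and radius and keep the configuration with the lowest energy.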

Next we consider the boundary adjustment of the eye template. We could also use energy minimization for fine tuning the parabolic sections. However, we found that the fine-tuned parabolic boundaries (see Fig. 20) may not give a reasonable description, for the following reasons. First, the energy function contains not only the edge term but also many other terms, so the choice of coefficients again becomes a problem. This makes it hard to determine which form of the energy function will generate the best description. Second, the local minima of the energy function pose a serious problem for finding the best representation. Moreover, because our parabolic boundary adjustment stage has already provided a good and reasonable description of the eye template, it is more difficult to obtain a more accurate representation by this means. So we add a back-end stage for generating accurate representations of eye images. This new stage utilizes the locally reliable edge information for grouping and for generating globally accurate descriptions. We discuss this method in the next section. The purpose of the new back-end processing stage is to replace the energy minimization process used here. We show several examples to illustrate

Fig. 18. Parabolic area adjustment process.

Fig. 19. Fine tuning for the iris: (a) before fine tuning; (b) after fine tuning.


Fig. 20. Variant results of parabolic boundary fine tuning using energy minimization.

the simulation results of using energy minimization for parabolic boundary fine tuning. These results demonstrate the difficulties of fine tuning through energy minimization.

4. BACK-END PROCESSING

Here we combine the edge operator and the deformable template to provide a back-end processing stage. In general, a typical edge operator cannot provide global information. We expect the deformable templates to help select the edge pixels from the edge detection to form more meaningful and global data. From these post-processed edge data, we can then fit parameterized curves, e.g. conics.

4.1. Edge filtering

To begin, let us check the results from the Canny edge operator. (33) From Fig. 21, we find that the edge detection results are still very noisy, even for the clear eye image in the illustration. We show the Canny edge detection results under different noise margins. We therefore cannot depend only on the edge map from edge detection to evaluate the global description; we have to take another approach.

Here we use the eye template after deformation to generate masks that select the edge pixels we want. After deformation, the eye template fits the real eye image well. It provides the appropriate size, position, and orientation information, so we use the template to generate the masks for selecting the edge pixels. For example, we increase and decrease the radius of the circle a little, so that a region mask is generated around the circle template. We demonstrate the results in Fig. 22: (a) and (c) are the iris mask and the edge pixels selected; (b) and (d) are the results for the eyelids. Similarly, these masks are generated by changing the parameters of the two parabolas

Fig. 21. Filtered iris edge map and para-edge map: (a) original image; (b) noise margin = 20; (c) 30; (d) 50.

Fig. 22. Edge mask and filtered results: (a) iris mask; (b) parabola edge mask; (c) iris edgels; (d) parabola edgels.


of the template. We can separate the edge pixels for the lower eyelid or the upper eyelid easily because the template is composed of two parabolas.
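The iris mask described above can be illustrated with a short sketch. It assumes the edge map is a binary image (for example, the output of a Canny detector) and that the fitted circle is given by a center and a radius. The function names and the `margin` parameter are hypothetical, chosen only for this example.

```python
import numpy as np

def iris_ring_mask(shape, center, radius, margin):
    """Boolean mask of pixels whose distance to `center` is within `margin`
    of `radius`, i.e. the ring between a slightly smaller and a slightly
    larger circle than the fitted one."""
    rows, cols = np.indices(shape)
    dist = np.hypot(cols - center[0], rows - center[1])  # center is (x, y)
    return np.abs(dist - radius) <= margin

def select_edgels(edge_map, mask):
    """Keep only the edge pixels that fall inside the mask.

    Returns an array of (x, y) rows, one per selected edge pixel."""
    ys, xs = np.nonzero(edge_map.astype(bool) & mask)
    return np.column_stack([xs, ys])
```

Masks for the eyelids could be built in the same spirit by varying the parabola parameters instead of the radius.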

4.2. Rule-based grouping

When we get the edge data filtered by the template masks, we can directly apply the fitting algorithm to these data. However, the fitting seems unreasonable, especially when the iris is hidden partially by the boundary of the eyelids. Due to this situation, the edge data from the iris mask may contain the edges of the eyelids. These data will decrease the effectiveness of fitting. So we have to select the correct edges.

For solving this problem, we use an extended mask filter for extracting the iris edges. From Fig. 23 we know the edge data between the iris and the white area should be the correct choice. By utilizing the parabolic masks, we can generate the extended iris mask as in Fig. 24. The edge data selected from the extended masks can be directly used for fitting.

As illustrated in Fig. 26, the edge data for the upper eyelid clearly contain data points from two distinct groups of curves. Yet there is only one parabolic contour for the upper eyelid in our eye template, because there is an undefined region between the boundary of the white region and the skin. Accordingly, there are two groups of edges representing the boundary of the upper eyelid. The lower eyelid has the same problem. This is one of the reasons that Hallinan (19) proposed the robust eye template. Here we use some simple heuristic rules for grouping. See Fig. 25 for a brief interpretation. The rules are very simple and are summarized as follows:

1. Separate the edge segments into nonoverlapped components with respect to y direction.

2. Collect the upper edge segments or lower segments into one group.

We show two examples in Fig. 26 using these simple rules for grouping. These two examples use the same two eye images as in the previous section. Although one image is clearer than the other, our method works well in both cases.
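One simplified reading of the two grouping rules is sketched below. It works on individual edge pixels rather than on edge segments, which is an assumption of this example and not the procedure stated in the paper: the pixels are separated column by column along the y direction, and the uppermost or lowermost pixel of each column is collected into one group. Image coordinates, with y growing downward, are also assumed.

```python
from collections import defaultdict

def split_upper_lower(edgels):
    """Split (x, y) edge pixels into an upper and a lower group.

    For each column x, the upper group takes the pixel with the smallest y
    and the lower group takes the pixel with the largest y. A column with a
    single edge pixel is ambiguous and appears in both groups.
    """
    columns = defaultdict(list)
    for x, y in edgels:
        columns[x].append(y)
    upper = [(x, min(ys)) for x, ys in sorted(columns.items())]
    lower = [(x, max(ys)) for x, ys in sorted(columns.items())]
    return upper, lower
```

Either group can then be passed on to curve fitting, depending on which eyelid boundary is wanted.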

4.3. Conic section fitting

Once we have reliable data points for the eyelid edge and the iris edge, the next step is to fit conic curves to these data. These conic curves represent the descriptions of the eye images. The algorithm we use here was proposed by Bookstein. (34) See also Sampson's paper (35) for a refined algorithm. We show the fitting results using Bookstein's algorithm. These figures show that the deformable template technique can help the edge operator group its local edge information into more global descriptions. The results of typical edge operators alone cannot provide global information. If the local

Fig. 23. Grouping for the edge data of the iris: (a) this segment cannot be grouped into edge data for the iris; (b) only these two segments are correct choices of the edge data for the iris.

Fig. 24. Extended iris mask: (a) original iris mask; (b) extended iris mask; (c) original selected edgels; (d) correct edgels.


Fig. 25. Grouping for the edge data of the eyelids. There are two edge data groups; the grouping rules select the two segments into one group.

edge information is available and reliable, then the fitting results are very accurate descriptions.

Figure 27 shows the result of fitting a circle to the edge data of the iris. From these experimental results, we find that the circles are located accurately on the iris edge. In contrast, the results from template deformation or energy minimization show only a good fit. In Fig. 28 we show the fitting results on the eyelids. We restrict the shape of the eyelids to general conic sections rather than parabolic curves. These results also demonstrate accurate descriptions of the eyelids. Figure 28(b) and (d) exhibit strong similarities with the results from the robust templates proposed by Hallinan. (19) Instead of using a statistical estimator, the results shown here are produced by edge analysis and fitting in our back-end processing. These results show a feasible way to integrate deformable templates and edge analysis to get globally accurate descriptions.
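To make the circle-fitting step concrete, the sketch below fits a circle to selected iris edgels by linear least squares on the algebraic form x^2 + y^2 + Dx + Ey + F = 0. This is a simpler technique substituted here for illustration; it is not Bookstein's algorithm, which fits general conics under a different normalization. The function name is hypothetical.

```python
import numpy as np

def fit_circle(points):
    """Algebraic least-squares circle fit.

    points: sequence of (x, y) edge pixels.
    Returns (cx, cy, r) for the circle x^2 + y^2 + D*x + E*y + F = 0.
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x ** 2 + y ** 2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(cx ** 2 + cy ** 2 - F)
    return cx, cy, r
```

Because only an arc of the iris edge is needed, a fit of this kind can still recover the circle when part of the iris is hidden by the eyelids, provided the selected edgels are reliable.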


5. TRACKING

The deformable templates suggest a straightforward mechanism for tracking movement of the eyes. Much research has been done on tracking. (21,22,24,25,36-38) The active contour model, snakes, proposed by Kass et al. (24) has been applied to tracking applications. It was first applied for tracking a speaker's lip through a series of images. Terzopoulos and Szeliski (38) discuss several examples of tracking being used in active contour models, including tracking facial features using snakes. Also the original snake model (24) has been extended to the Kalman snake (38) for the purpose of tracking. A snake is an active contour used to minimize its objective energy function. The work by Yuille et al. (21,22,25) also discussed the tracking applications using the deformable templates.

In this section, we will discuss tracking applications using deformable templates for movement of eyes. Instead of using the pure energy minimization process for tracking, we repeat our template deformation process. With a continuous series of images, the result from the previous frame can be used to set the initial values of the parameters of the template in the next frame. Then we reinitialize a new deformation process. Using such an approach, the task of tracking will be completed under some constraints.

For tracking eyes, there are some other constraints. First, because the template is a planar model, the movement of the head cannot be so large as to produce a side view from the camera. In other words, the face images must remain nearly frontal. Next, the eyes must be clear. Further, as mentioned in the introduction, if the eyes are too small to be described by

Fig. 26. Parabolic boundaries after rule-based grouping: (a) and (c) original selected edgels; (b) and (d) separated and selected by rules.

Fig. 27. Fitting a circle to the iris: (a) and (c) selected edgels; (b) and (d) fitting results.

Fig. 28. Fitting conic sections to the eyelids: (a) and (b) case I, (c) and (d) case II.

the template, the method will fail. Also, there cannot be an obstacle, like hair, in front of the eyes.

If these conditions are satisfied and we have a series of images, we then can begin to track. First, we have to find and describe the eyes in the first frame. We have discussed this problem in the previous sections. Now assume that the eye template has been fitted well in the first frame and the parameters are available. We then use these parameters as the initial values of the template in the next frame. Then we can begin the deformation process for the template as discussed previously.

Since the deformation is expected to be small from frame to frame, we can stipulate that the radius of the iris be allowed to change only a little compared with the other parameters. Because our region-based approach provides a good fit for the eyes and the purpose is now tracking, we need not fine tune these parameters by energy minimization. The whole deformation process, based on the regional forces and the regional torque for orientation, provides a quick way of tracking.
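The frame-to-frame scheme can be summarized in a short sketch. The `EyeTemplate` fields, the `deform` callable (standing in for the region-based deformation process), and the `max_radius_change` bound are all hypothetical names introduced for this example; the sketch only shows how the result of one frame initializes the next and how the iris radius can be restricted.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class EyeTemplate:
    x_t: float    # template center, x
    y_t: float    # template center, y
    a: float      # parabola parameters
    b: float
    c: float
    theta: float  # orientation
    r: float      # iris radius

def track(frames, first_fit, deform, max_radius_change=1.0):
    """Track the eye through `frames`, the images after the first one.

    first_fit: template already fitted to the first frame.
    deform:    callable (frame, template) -> template, assumed to run the
               region-based deformation from the given initial template.
    """
    results = [first_fit]
    for frame in frames:
        prev = results[-1]
        new = deform(frame, prev)  # previous result is the initial value
        # Allow the iris radius to change only a little between frames.
        low, high = prev.r - max_radius_change, prev.r + max_radius_change
        results.append(replace(new, r=min(max(new.r, low), high)))
    return results
```

Clamping after the deformation is one simple way to express the restriction; limiting the radius update inside the deformation itself would serve the same purpose.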

Figure 29 shows the results of tracking the eye of a baby. The sequence contains four frames. The results

Fig. 29. Tracking the baby eye sequence.


Fig. 30. Tracking sequence 1, iris movement.

Fig. 31. Tracking sequence 2, movement of the eyelids.

demonstrate the feasibility of using deformable templates for tracking.

If we have more constraints on the movement of the eyes, for example, if the position and the size of the eye are almost fixed, the tracking task can be done more quickly. In such cases, we need only track the movement of the iris and the eyelids. Figure 30 shows the results of tracking the movement of the iris in a series of images.

First, we used the regional forces defined in Section 3.4.2 for adjusting the position of the iris. This provides a quick way of dragging the iris to an approximate position. When the iris is partially hidden by the boundary of the eye, we let the edge energy term dominate the energy function for energy minimization. Only the parts of the edge field located inside the region bounded by the two parabolas are considered in the edge energy. The experimental results are shown in Fig. 30.

Next we consider the system applied to tracking the movement of the eyelids. As in the previous example, the position and the size of the eyes in this series of images are regarded as almost fixed. In this series, the movements of the eyelids are very small in the first four frames, whereas the upper eyelid closes and opens again in the next four frames. The experimental results are shown in Fig. 31.

In this tracking process, we use only the regional forces for adjusting the parameters of the parabolas. We do not use energy minimization in this example, because energy minimization is time consuming when fine tuning the whole shape of the eye template. Moreover, during energy minimization for fine tuning these parameters, the coefficient problem arises again. We have discussed this situation in previous sections.

For the purpose of tracking, it is sufficient to allow only some parameters to be changed. As illustrated in Fig. 31, the upper parabola will move up and down following the closing and opening of the upper eyelid.

6. CONCLUSION

In this paper we have described an improved strategy for eye-feature extraction and description. We have also demonstrated tracking of eye features from face images based on the template deformation process. The contributions of our work are summarized as follows.

1. We propose a regional force approach to overcome the template shrinking problem. It does not depend on energy minimization and so it avoids the problem of choosing the weighting coefficients of various energy terms. Due to this, it provides a good chance for further processing to get a good match.

2. Using original energy minimization on two peak points for adjusting orientation of the eye template will succeed in the "archetypal" eye images, (21) but will fail in some exceptional cases such as the eye images with weak peak fields or one peak field. We extend the concept of regional forces to the regional torque for orientation adjustment. This strategy succeeds in these exceptional cases. By using the whole region properties of the eye image, our method is more robust and less sensitive to noise due to lighting variations.

3. The regional forces can also be applied on adjusting the whole shape of the eye template, i.e. the parameters of the boundary of the template. Using this approach provides a reasonable description of the eye. The scheme is no longer a pure energy minimization process and can be named the template deformation process based on region properties.

4. We add a back-end processing stage after template deformation. The deformation templates will help to integrate the local information from edge detection into a more reliable and global description.

5. The regional torque and forces are also beneficial for tracking.

Acknowledgements--This work was supported in part by NSC contract 84-2215-E002-022.

REFERENCES

1. R. Brunelli and T. Poggio, Face recognition features versus templates, IEEE Trans. Pattern Analysis Mach. Intell. 15(10), 1042-1052 (1993).

2. C.-W. Chen and C.-L. Huang, Human face recognition from a single front view, Int. J. Pattern Recognition and Artificial Intell. 6(4), 571-593 (1992).

3. I. Craw, H. Ellis and J. R. Lishman, Automatic extraction of face-features, Pattern Recognition Lett. 5(2), 183-187 (1987).

4. M. Kirby and L. Sirovich, Applications of the Karhunen-Loeve procedure for the characterization of human faces, IEEE Trans. Pattern Analysis Mach. Intell. 12(1), 103-108 (1990).

5. T. Kohonen, Self-Organization and Associative Memory, Springer, Berlin, (1989).

6. B. S. Manjunath, A feature based approach to face recognition, Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 373-378 (1992).

7. B. Moghaddam and A. Pentland, Face recognition using view-based and modular eigenspaces, Automatic Systems for Identification and Inspection of Humans SPIE, 2277 (1994), also MIT Media Lab. Tech. Report No. 301.

8. A. Samal and P. A. Iyengar, Automatic recognition and analysis of human faces and facial expressions: A survey, Pattern Recognition 25(1), 65-77 (1992).

9. M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3(1), 71-86 (1991).

10. D. Valentin, H. Abdi, A.J. O'Toole and G. W. Cottrell, Connectionist models of face processing: A survey, Pattern Recognition 27(9), 1209-1230 (1994).

11. D. J. Beymer, Face recognition under varying pose, A.I. Memo No. 1461, Massachusetts Institute of Technology, Dec. 1993 also Proc. IEEE Conf. Comput. Vision and Pattern Recognition (1994).

12. A. Lanitis, C. J. Taylor and T. F. Cootes, Automatic identification of human faces using flexible appearance models, Proc. 5th British Machine Conference, 65-74 (1994).

13. E. J. Mazurski and N. W. Bond, A new series of slides depicting facial expressions of affect: A comparison with the pictures of facial affect series, Australian J. Psychol. 45(1), 41-47 (1993).

14. Y. Yacoob and L. S. Davis, Recognizing human facial expression, Tech. Report No. CS-TR-3265 University of Maryland May 1994 also Proc. IEEE Conf. Comput. Vision and Pattern Recognition (1994).

15. Y. H. Kwon and N. V. Lobo, Age classification from facial images, Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 762-767 (1994).

16. P. Ballard and G. C. Stockman, Controlling a computer via facial aspect, IEEE Trans. Systems, Man and Cybernetics 25(4), 669-677 (1995).

17. A. Lanitis, C. J. Taylor and T. F. Cootes, Automatic tracking coding and reconstruction of human faces using flexible appearance models, IEE Electronic Lett. 30(19), 1578-1579 (1994).

18. G. Chow and X. Li, Towards a system for automatic facial feature detection, Pattern Recognition 26(12), 1739-1755 (1993).

19. P. W. Hallinan, Recognizing human eyes, SPIE Proc. Geometric Methods in Computer Vision 1570, 214-226 (1991).

20. J. M. Vincent, J. B. Waite and D. J. Myers, Automatic location of visual features by a system of multilayered perceptrons, IEE Proceedings-F 139(6) (1992).

21. A. Yuille, D. Cohen and P. Hallinan, Feature extraction from faces using deformable templates, Proc. CVPR, 104-109 (1989).

22. A. Yuille and P. Hallinan, Deformable templates, Active Vision, A. Blake and A. Yuille, eds., pp. 21-38. MIT Press, Cambridge, MA (1992).

23. X. Xie, R. Sudhaker and H. Zhuang, On improving eye feature extraction using deformable templates, Pattern Recognition 27(6), 791-799 (1994).

24. M. Kass, A. Witkin and D. Terzopoulos, Snakes: Active contour models, Int. J. Computer Vision 1(4), 321-331 (1988).

25. A. Yuille, P. Hallinan and D. Cohen, Detecting facial features using deformable templates, Int. J. Computer Vision, 104-109 (1992).

26. X. Xie, R. Sudhaker and H. Zhuang, Corner detection by a cost minimization approach, Pattern Recognition 26(8), 1235-1243 (1993).

27. P. Maragos, Tutorial on advances in morphological image processing and analysis, Optical Engineering 26(7), 623-632 (1987).

28. M. O. Berger, Snake growing, Proc. European Conf. Comput. Vision, 570-572 (1990).

29. L. D. Cohen, On active contour models and balloons, CVGIP: Image Understanding 54(2), 211-218 (1991).

30. L. D. Cohen and I. Cohen, Finite-element methods for active contour models and balloons for 2D and 3D images, IEEE Trans. Pattern Analysis and Machine Intell. 15(11), 1131-1147 (1993).

31. R. Ronfard, Region-based strategies for active contour models, Int. J. Computer Vision 13(2), 229-251 (1994).

32. A. Chakraborty, L. H. Staib and J. S. Duncan, Deformable boundary finding influenced by region homogeneity, Proc. IEEE Conf. Comput. Vision. Pattern Recognition, 624-627 (1994).

33. J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Analysis Mach. Intell. 8(6), 679-698 (1986).

34. F. L. Bookstein, Fitting conic sections to scattered data, Computer Graphics and Image Processing 9, 56-71 (1979).


35. P. D. Sampson, Fitting conic sections to very scattered data: An iterative refinement of the Bookstein algorithm, Computer Graphics and Image Processing 18, 97-108 (1982).

36. I. A. Essa, T. Darrell and A. Pentland, Tracking facial motion, Proc. IEEE Workshop on Nonrigid and Articulate Motion, (November 1994) also, MIT Media Lab. Tech. Report No. 272 (19.82).

37. D. Terzopoulos, A. Witkin and M. Kass, Constraints on deformable models: Recovering 3D shape and nonrigid motion, Artificial Intell. 36, 91-123 (1988).

38. D. Terzopoulos and R. Szeliski, Tracking with Kalman Snakes, Active Vision, A. Blake, and A. Yuille, eds., pp. 21-38, MIT Press, Cambridge, MA (1992).

About the Author--FEIPEI LAI received a B.S.E.E. degree from National Taiwan University in 1980, and M.S. and Ph.D. degrees in computer science from the University of Illinois at Urbana-Champaign in 1984 and 1987, respectively. Since 1987, he has been involved in the architecture design and hardware implementation of the MARC and IAS-S multiprocessor systems. He is a professor in the Department of Electrical Engineering and in the Department of Computer Science and Information Engineering at National Taiwan University. He was also a visiting senior computer engineer in the Center for Supercomputing Research and Development at the University of Illinois at Urbana-Champaign. Dr Lai currently holds two Taiwan patents. He served as a consultant at ERSO, ITRI during 1988 and at Faraday Technology Corp. from 8/94 to 7/95. His current research interests are high performance microprocessor chip design, computer architecture, optimizing compilers, VLSI design, pattern recognition, and artificial neural network applications. Dr Lai is a member of Phi Kappa Phi, Phi Tau Phi, SPEC, ACM, The Chinese Institute of Engineers, The Chinese Institute of Electrical Engineers, The American Association for the Advancement of Science, and the New York Academy of Sciences. He received Acer awards four times in 1989, 1991, 1992 and 1993 and The Taiwan Fuji Xerox Research award in 1991. Dr Lai is a Senior member of IEEE and is included in "Who's Who in the World".

About the Author--JYH-YUAN DENG was born in Taiwan, R.O.C. on June 9, 1971. He received B.S. and M.S. degrees, both in electrical engineering, from National Taiwan University in 1993 and 1995. His research interests include computer vision, image processing, pattern recognition and signal processing.
