w04-P. R. Biometrics - Summer 2006 1
Biometrics Technology: Image Processing & Pattern Recognition
(by Dr. Dickson Tong)

References:
[1] http://homepages.inf.ed.ac.uk/rbf/HIPR2/index.htm
[2] http://www.cs.wisc.edu/~dyer/cs540/notes/vision.html
Image Processing & Pattern Recognition

Image Enhancement Techniques
• Convolution filter
• Gaussian filter
• Laplacian / Laplacian of Gaussian filter
• Unsharp filter
• Contrast Stretching
• Histogram Equalization

Feature Extraction Techniques
• Roberts Cross
• Sobel Edge Detector
• Canny Edge Detector
• Binarization

Pattern Recognition Techniques
• Eigenspace Representation of Images
• PCA (Principal Component Analysis)
• Example Face Recognition Algorithm
Image Enhancement Techniques – Convolution Filter [1]

A Convolution Filter is
• a simple and fundamental mathematical operation underlying many common image processing operators
• a way of `multiplying together' two arrays of numbers, generally of different sizes but of the same dimensionality
• used in image processing to implement operators whose output pixel values are simple linear combinations of certain input pixel values
• performed by sliding the kernel over the image, generally starting at the top left corner, moving the kernel through all the positions where the kernel fits entirely within the boundaries of the image
Image Enhancement Techniques – Convolution Filter [1]

The convolution output is calculated by multiplying the kernel value and the underlying image pixel value for each of the cells in the kernel, and then adding all these numbers together.

Mathematically, for an input image I and an m x n kernel K, the output O is

O(i,j) = sum_{k=1..m} sum_{l=1..n} I(i+k-1, j+l-1) * K(k,l)
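The kernel sweep described above can be sketched in a few lines of NumPy (a minimal illustration, not a production implementation; the "kernel fits entirely within the image" convention from the previous slide is kept, so the output is smaller than the input):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position, multiply each
    kernel cell by the pixel beneath it and sum the products.
    Only positions where the kernel fits entirely inside the image are
    computed, so the output is smaller than the input."""
    m, n = kernel.shape
    rows = image.shape[0] - m + 1
    cols = image.shape[1] - n + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i+m, j:j+n] * kernel)
    return out

# 3x3 mean (box) kernel applied to a 4x4 image
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0
print(convolve2d(image, kernel))
```

Note that, following the slide's formula, this is the unflipped ("correlation") form; for symmetric kernels such as the Gaussian the two forms coincide.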
Image Enhancement Techniques – Gaussian filter [1]

The Gaussian filter
• is a 2-D smoothing convolution operator
• is used to blur images and remove detail and noise
• has sigma as the standard deviation of the distribution
• assumes the distribution has a mean of zero, i.e. is centered on the line x = 0

1-D form:

G(x) = (1 / (sqrt(2*pi) * sigma)) * exp(-x^2 / (2*sigma^2))
Image Enhancement Techniques – Gaussian filter [1]

The idea of Gaussian smoothing is to use this 2-D distribution as a `point-spread' function.

2-D form:

G(x,y) = (1 / (2*pi*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))

In practice the kernel is a discrete approximation to the Gaussian function with, e.g., sigma = 1.0.
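A discrete Gaussian kernel like the one the slide refers to can be generated by sampling the 2-D form on a grid and normalizing the weights (a sketch; the 5×5 size and the normalization step are choices, not prescribed by the source):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Sample the 2-D Gaussian G(x,y) on a size x size grid centred on
    zero, then normalise so the weights sum to 1 (so smoothing does not
    change the overall image brightness)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()

k = gaussian_kernel(5, 1.0)
print(k.round(4))
```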
Image Enhancement Techniques - Gaussian filter [1]
Example: original image, sigma=1.0, sigma=2.0
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter [1]

The Laplacian
• is a 2-D isotropic measure of the 2nd spatial derivative of an image
• highlights regions of rapid intensity change
• is often used for edge detection
• takes a graylevel image as input and produces another graylevel image as output

The Laplacian L(x,y) of an image with pixel intensity values I(x,y) is

L(x,y) = d^2I/dx^2 + d^2I/dy^2
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter [1]

The Laplacian
• can be calculated using a convolution filter
• has two commonly used discrete approximations, for example:

   0  1  0        1  1  1
   1 -4  1        1 -8  1
   0  1  0        1  1  1
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter [1]

Convolution kernels approximating a second derivative measurement on the image are very sensitive to noise. The image is therefore often Gaussian smoothed before applying the Laplacian filter, which reduces the high frequency noise components prior to the differentiation step.

Laplacian of Gaussian (LoG): since the convolution operation is associative
– we can convolve the Gaussian smoothing filter with the Laplacian filter first, and
– then convolve this hybrid filter with the image to achieve the required result
– advantages:
  • requires far fewer arithmetic operations
  • the LoG kernel can be precalculated in advance, so only one convolution needs to be performed
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter [1]

The 2-D LoG function centered on zero and with Gaussian standard deviation sigma has the form

LoG(x,y) = -(1 / (pi*sigma^4)) * (1 - (x^2 + y^2) / (2*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))
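Sampling this 2-D function directly gives the precalculated kernel the previous slide mentions (a sketch assuming a 7×7 window, matching the example that follows):

```python
import numpy as np

def log_kernel(size, sigma):
    """Sample the Laplacian-of-Gaussian
    LoG(x,y) = -(1/(pi*sigma^4)) * (1 - (x^2+y^2)/(2*sigma^2))
               * exp(-(x^2+y^2)/(2*sigma^2))
    on a size x size grid centred on zero."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    return (-1.0 / (np.pi * sigma**4)) \
        * (1 - r2 / (2 * sigma**2)) * np.exp(-r2 / (2 * sigma**2))

k = log_kernel(7, 1.0)   # negative centre, positive surround
```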
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter

Response of the LoG to a step edge.
Image Enhancement Techniques – Laplacian / Laplacian of Gaussian filter

Example: LoG filter with Gaussian sigma = 1.0 and a 7×7 kernel. The result image contains negative and non-integer values, so the output image is normalized to the range 0–255 for display purposes.
Image Enhancement Techniques – Unsharp filter [1]

The unsharp filter
• is a simple sharpening operator for edge enhancement
• subtracts an unsharp, or smoothed, version of an image from the original image
• produces an edge image g(x,y) from an original image f(x,y):

g(x,y) = f(x,y) - fsmooth(x,y)

where fsmooth(x,y) is a smoothed version of f(x,y)
Image Enhancement Techniques – Unsharp filter [1]

The operator can be understood by examining its frequency response characteristics (a), (b), (c): calculate the edge image with the unsharp filter, then add the edge image back to the original image.
Image Enhancement Techniques – Unsharp filter [1]

The complete unsharp sharpening operator is

fsharp(x,y) = f(x,y) + k * g(x,y)

where k varies between 0.2 and 0.7, with the larger values providing increasing amounts of sharpening.
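The full pipeline (smooth, subtract to get the edge image g, add k·g back) can be sketched as follows; a 3×3 mean filter is assumed as the smoothing function, since the source does not fix a particular smoother:

```python
import numpy as np

def unsharp(f, k=0.5):
    """Sharpen f: smooth with a 3x3 mean filter, form the edge image
    g = f - fsmooth, then add k*g back (k typically 0.2 to 0.7).
    Edge padding keeps the output the same size as the input."""
    f = f.astype(float)
    padded = np.pad(f, 1, mode='edge')
    smooth = np.zeros_like(f)
    for di in range(3):
        for dj in range(3):
            smooth += padded[di:di + f.shape[0], dj:dj + f.shape[1]]
    smooth /= 9.0
    g = f - smooth          # edge image
    return f + k * g        # sharpened image
```

On flat regions g is zero, so the image passes through unchanged; near edges the operator overshoots on both sides, which is what makes edges look crisper.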
Image Enhancement Techniques –Unsharp filter [1]
Example of Unsharp filter: original image, edge image, sharpened image
Image Enhancement Techniques – Contrast Stretching [1]

Contrast stretching, often called normalization, improves the contrast in an image by `stretching' the range of intensity values it contains to span a desired range of values:

Pout = (Pin - c) * ((b - a) / (d - c)) + a

where
• a and b are the lower and upper limits of the image type, e.g. (0, 255) for an 8-bit grayscale image
• c and d are the lowest and highest pixel values in the original image
• Pin is the pixel value in the original image and Pout is the output pixel value
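The stretching formula translates directly into code (a minimal sketch; c and d are taken from the image itself, as defined above):

```python
import numpy as np

def contrast_stretch(p_in, a=0, b=255):
    """Linearly map the image's own range [c, d] onto the full output
    range [a, b]:  P_out = (P_in - c) * (b - a) / (d - c) + a."""
    c, d = p_in.min(), p_in.max()
    return (p_in - c) * ((b - a) / (d - c)) + a

img = np.array([[50, 100], [150, 200]], dtype=float)
print(contrast_stretch(img))   # darkest pixel -> 0, brightest -> 255
```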
Image Enhancement Techniques – Contrast Stretching [1]

Example:
Image Enhancement Techniques – Histogram Equalization [1]

Histogram Equalization
• provides a sophisticated method for modifying the dynamic range and contrast of an image by altering that image such that its intensity histogram has a desired shape
• employs a monotonic, non-linear mapping which re-assigns the intensity values of pixels in the input image such that the output image contains a uniform distribution of intensities
• is used in image comparison processes
• is usually introduced using continuous, rather than discrete, process functions
• suppose that the images of interest contain continuous intensity levels (in the interval [0,1])
• suppose the transformation function f which maps an input image A(x,y) onto an output image B(x,y) is continuous within this interval
Image Enhancement Techniques - Histogram Equalization [1]
It is assumed that the transfer law, which may also be written in terms of intensity density levels as

DB = f(DA)

is single-valued and monotonically increasing, so that its inverse exists:

DA = f^-1(DB)
Image Enhancement Techniques - Histogram Equalization [1]
All pixels in the input image with densities in the region DA to DA + dDA will have their pixel values replaced by output density values in the region DB to DB + dDB. The surface areas hA(DA)dDA and hB(DB)dDB will therefore be equal, yielding:

hB(DB) = hA(DA) * (dDA / dDB)

where DA = f^-1(DB).
Image Enhancement Techniques –Histogram Equalization
The result can be written in the language of probability theory if the histogram h is regarded as a continuous probability density function p describing the distribution of the (assumed random) intensity levels:

pB(DB) = pA(DA) * (dDA / dDB)

For histogram equalization, the output probability densities should all be an equal fraction of the maximum number of intensity levels in the input image (where the minimum level considered is 0). The transfer function (or point operator) necessary to achieve this result is simply:

f(DA) = DM * integral from 0 to DA of pA(u) du

where DM is the maximum intensity level.
Image Enhancement Techniques – Histogram Equalization

Therefore

f(DA) = DM * cA(DA)

where cA(DA) is simply the cumulative probability distribution (i.e. cumulative histogram) of the original image. Thus, an image which is transformed using its cumulative histogram yields an output histogram which is flat!

A digital implementation of histogram equalization is usually performed by defining a transfer function of the form:

f(DA) = max(0, round(DM * nk / N) - 1)

where N is the number of image pixels, DM is the number of intensity levels, and nk is the number of pixels at intensity level k or less.
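This digital transfer function can be sketched as a lookup table built from the cumulative histogram (a sketch; 256 levels are assumed for 8-bit images):

```python
import numpy as np

def equalize(image, levels=256):
    """Digital histogram equalization: map each level k to
    f(k) = max(0, round(levels * n_k / N) - 1), where N is the pixel
    count and n_k the number of pixels with intensity <= k
    (the cumulative histogram)."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cum = np.cumsum(hist)                                  # n_k
    N = image.size
    lut = np.maximum(0, np.rint(levels * cum / N).astype(int) - 1)
    return lut[image]                                      # apply per pixel
```

A constant image maps to the top level, and an image whose histogram is already uniform is left (up to rounding) unchanged.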
Image Enhancement Techniques – Histogram Equalization

Example
Feature Extraction Techniques – Roberts Cross [1]

The Roberts Cross operator
• is a simple, quick to compute, 2-D spatial gradient measurement on an image
• highlights regions of high spatial frequency, which often correspond to edges
• consists of a pair of 2x2 convolution kernels:

  +1  0        0 +1
   0 -1       -1  0
Feature Extraction Techniques – Roberts Cross

The kernels are designed to respond maximally to edges running at 45° to the pixel grid, one kernel for each of the two perpendicular orientations.
• the gradient magnitude is given by:

|G| = sqrt(Gx^2 + Gy^2)

• an approximate magnitude is computed using:

|G| = |Gx| + |Gy|

• the angle of orientation of the edge giving rise to the spatial gradient is given by:

theta = arctan(Gy / Gx) - 3*pi/4

• with pseudo-convolution kernels over the pixels P1..P4 of each 2x2 neighbourhood, the approximate magnitude is given by:

|G| = |P1 - P4| + |P2 - P3|
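The pseudo-convolution form vectorizes neatly (a sketch; P1..P4 are taken row-major from each 2x2 neighbourhood, following the convention above):

```python
import numpy as np

def roberts(image):
    """Approximate Roberts Cross gradient magnitude via the
    pseudo-convolution form |G| = |P1 - P4| + |P2 - P3|, where
    P1..P4 are the pixels of each 2x2 neighbourhood (row-major)."""
    p1 = image[:-1, :-1].astype(float)
    p2 = image[:-1, 1:].astype(float)
    p3 = image[1:, :-1].astype(float)
    p4 = image[1:, 1:].astype(float)
    return np.abs(p1 - p4) + np.abs(p2 - p3)

# vertical step edge: zero response in flat regions, strong on the edge
img = np.array([[0, 0, 255, 255],
                [0, 0, 255, 255],
                [0, 0, 255, 255]])
print(roberts(img))
```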
Feature Extraction Techniques - Roberts Cross
Example
Feature Extraction Techniques - Sobel Edge Detector [1]
The Sobel operator
• performs a 2-D spatial gradient measurement on an image
• emphasizes regions of high spatial frequency that correspond to edges
• approximates the absolute gradient magnitude at each point in an input grayscale image
• consists of a pair of 3×3 convolution kernels:

  -1  0 +1       +1 +2 +1
  -2  0 +2        0  0  0
  -1  0 +1       -1 -2 -1
Feature Extraction Techniques – Sobel Edge Detector

The kernels are designed to
• respond maximally to vertical and horizontal edges, one kernel for each of the two perpendicular orientations
• be applied separately to the input image, to produce separate measurements of the gradient component in each orientation (call these Gx and Gy)

The gradient magnitude, its approximation and the angle of orientation of the edge are given by:

|G| = sqrt(Gx^2 + Gy^2)
|G| = |Gx| + |Gy|
theta = arctan(Gy / Gx)

With pseudo-convolution kernels over the pixels P1..P9 of each 3x3 neighbourhood, the approximate magnitude is

|G| = |(P1 + 2*P2 + P3) - (P7 + 2*P8 + P9)| + |(P3 + 2*P6 + P9) - (P1 + 2*P4 + P7)|
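A direct (unoptimized) sketch of the two-kernel measurement with the |Gx| + |Gy| approximation:

```python
import numpy as np

def sobel(image):
    """Convolve with the two 3x3 Sobel kernels to get Gx and Gy, then
    use the approximate magnitude |G| = |Gx| + |Gy|.  Only positions
    where the kernels fit entirely inside the image are computed."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)
    rows = image.shape[0] - 2
    cols = image.shape[1] - 2
    gx = np.zeros((rows, cols))
    gy = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = image[i:i+3, j:j+3].astype(float)
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.abs(gx) + np.abs(gy)
```

On a vertical step edge the horizontal kernel responds strongly while the vertical one is silent, illustrating the "one kernel per orientation" design.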
Feature Extraction Techniques - Sobel Edge Detector
Sobel Edge Detector: Example
Feature Extraction Techniques – Canny Edge Detector [1]

The Canny operator
• is designed to be an optimal edge detector (according to particular criteria)
• works in a multi-stage process:
– the image is smoothed by Gaussian convolution
– a simple 2-D first derivative operator (somewhat like the Roberts Cross) is applied to the smoothed image to highlight regions of the image with high first spatial derivatives
– edges give rise to ridges in the gradient magnitude image
• the algorithm tracks along the top of these ridges and sets to zero all pixels that are not on a ridge top, to give a thin line in the output, a process known as non-maximal suppression
Feature Extraction Techniques - Canny Edge Detector [1]
The Canny operator
• the tracking process exhibits hysteresis controlled by two thresholds, T1 and T2, with T1 > T2
• tracking can only begin at a point on a ridge higher than T1
• tracking then continues in both directions out from that point until the height of the ridge falls below T2
• this hysteresis ensures that noisy edges are not broken up into multiple edge fragments
Feature Extraction Techniques – Canny Edge Detector

Example (four parameter settings):
• sigma = 1.0, upper threshold = 255, lower threshold = 1
• sigma = 1.0, upper threshold = 255, lower threshold = 200
• sigma = 1.0, upper threshold = 128, lower threshold = 1
• sigma = 2.0, upper threshold = 128, lower threshold = 1
Feature Extraction Techniques – Binarization [1]
Binarization
• is used to extract features from a feature magnitude map
• is used to label features
• converts an intensity map to binary values, such as 0 or 255
• the color of the object (usually white) is referred to as the foreground color; the rest (usually black) is referred to as the background color
• is produced by thresholding:
– global threshold – apply a threshold to the whole image
– local/adaptive threshold – apply thresholds locally
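Global thresholding is essentially a one-liner in NumPy (a sketch; the threshold value and the 0/255 output convention follow the example that comes next):

```python
import numpy as np

def binarize(image, threshold):
    """Global thresholding: pixels at or above the threshold become
    foreground (255, usually the object), the rest background (0)."""
    return np.where(image >= threshold, 255, 0)

edge_map = np.array([[10, 200], [149, 150]])
print(binarize(edge_map, 150))
```

A local/adaptive variant would compute a separate threshold per neighbourhood instead of one global value.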
Feature Extraction Techniques – Binarization

Example: original image; edge map computed by the Sobel operator; binarization by thresholding with value 150.
Pattern Recognition Techniques – Eigenspace Representation of Images [2]

Image representation in N^2 dimensions
• an N x N image can be "represented" as a point in an N^2-dimensional image space
• each dimension is associated with one of the pixels in the image, and the possible values in each dimension are the possible gray levels of each pixel
• e.g. for a 512 x 512 image where each pixel is an integer in the range 0, ..., 255 (i.e., a pixel is stored in one byte), image space is a 262,144-dimensional space and each dimension has 256 possible values
Pattern Recognition Techniques - Eigenspace Representation of Images
Example: the case of M training face images
• Suppose we represent our M training images as M points in image space.
• One way of recognizing the person in a new test image would be to find its nearest neighbor training image in image space.
• However
– this approach would be very slow since the size of image space is so large
– it does not exploit the fact that since all of our images are of faces, they will likely be clustered relatively near one another in image space
• So let's represent each image in a lower-dimensional feature space, called face space or eigenspace.
Pattern Recognition Techniques – Eigenspace Representation of Images

Suppose we have M' images, E1, E2, ..., EM', called eigenfaces or eigenvectors. These images define a basis set, so that each face image can be described in terms of how similar it is to each of these basis images; i.e., we can represent an arbitrary image I as a weighted (linear) combination of these eigenvectors as follows:

1. Compute the average image, A, from all of the training images I1, I2, ..., IM:

A = (1/M) * sum_{i=1..M} Ii

2. For k = 1, ..., M', compute a real-valued weight, wk, indicating the similarity between the input image, I, and the kth eigenvector, Ek:

wk = EkT * (I - A)

where I is a given image represented as a column vector of length N^2, Ek is the kth eigenface image and is a column vector of length N^2, A is a column vector of length N^2, * is the dot product operation, and - is pixel-by-pixel subtraction. Thus wk is a real-valued scalar.
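Steps 1-2 can be sketched with flattened images treated as vectors; the tiny 4-pixel "images" and the two eigenfaces below are illustrative values, not from the source:

```python
import numpy as np

def project_to_eigenspace(I, A, E):
    """Compute w_k = E_k^T (I - A) for every eigenface at once.
    I and A are images flattened to vectors of length N^2;
    E is an (N^2 x M') matrix whose columns are the eigenfaces."""
    return E.T @ (I - A)

# toy example: 4-pixel "images" and two orthonormal eigenfaces
I = np.array([3.0, 1.0, 0.0, 0.0])       # input image
A = np.array([1.0, 1.0, 0.0, 0.0])       # average image
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])               # columns are E1, E2
print(project_to_eigenspace(I, A, E))    # the weight vector W
```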
Pattern Recognition Techniques – Eigenspace Representation of Images

3. W = [w1, w2, ..., wM']T is a column vector of weights that indicates the contribution of each eigenface image in representing image I.
• instead of representing image I in image space, we represent it as a point W in the M'-dimensional weight space, which we call face space or eigenspace
• each image is projected from a point in the high-dimensional image space down to a point in the much lower-dimensional eigenspace
• in terms of compression, each image is represented by M' real numbers, which means that for a typical value of M' = 10 and 32 bits per weight, we need only 320 bits/image to encode it in face space. (Of course, we must also store the M' eigenface images, which are each N^2 pixels, but this cost is amortized over all of the training images, so it can be considered a small additional cost.)
Pattern Recognition Techniques – Eigenspace Representation of Images

Notice that image I can be approximately reconstructed from W as follows:

I ≈ A + sum_{i=1..M'} (wi * Ei)

This reconstruction will be exact if M' = min(M, N^2). For smaller M', representing an image in eigenspace won't be exact, in that the image won't be perfectly reconstructible, but it will be a pretty good approximation that's sufficient for differentiating between faces.

Question: How do we select a value for M' and then determine the M' "best" eigenvector images (i.e., eigenfaces)?
Answer: Use the statistical technique called Principal Components Analysis (also called the Karhunen-Loeve transform in communications theory). Intuitively, this technique selects the M' images that maximize the information content in the compressed (i.e., eigenspace) representation.
Pattern Recognition Techniques – PCA (Principal Component Analysis) [2]

The best M' eigenface images can be computed as follows:
1. For each training image Ii, normalize it by subtracting the mean (i.e., the "average image"): Yi = Ii - A
2. Compute the N^2 x N^2 covariance matrix:

C = (1/M) * sum_{i=1..M} (Yi * YiT)

3. Find the eigenvectors of C that are associated with the M' largest eigenvalues. Call the eigenvectors E1, E2, ..., EM'. These are the eigenface images used by the algorithm given above.
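A direct sketch of steps 1-3 on toy data (illustrative values only). Note that for real N×N images the N²×N² covariance matrix is far too large to form explicitly; practical implementations work with the much smaller M×M matrix instead, a detail the slide omits:

```python
import numpy as np

def eigenfaces(images, m_prime):
    """Steps 1-3: subtract the average image, form the covariance
    matrix C = (1/M) * sum Yi Yi^T over the mean-centred images, and
    keep the eigenvectors with the M' largest eigenvalues.
    images: (M x N^2) array, one flattened training image per row.
    Returns (A, E) where E is (N^2 x M'), columns = eigenfaces."""
    A = images.mean(axis=0)
    Y = images - A                       # mean-centred images, one per row
    C = (Y.T @ Y) / images.shape[0]      # N^2 x N^2 covariance matrix
    vals, vecs = np.linalg.eigh(C)       # eigh: C is symmetric
    order = np.argsort(vals)[::-1]       # largest eigenvalues first
    return A, vecs[:, order[:m_prime]]

# four 4-pixel "training images", flattened
images = np.array([[1., 0., 0., 0.],
                   [0., 1., 0., 0.],
                   [1., 1., 0., 0.],
                   [0., 0., 0., 0.]])
A, E = eigenfaces(images, 2)
```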
Pattern Recognition Techniques – Face Recognition Algorithm [2]
Example: face recognition
1. Given a training set of face images, compute the M' largest eigenvectors, E1, E2, ..., EM'. M' = 10 or 20 is a typical value used. Notice that this step is done once "offline".
2. For each different person in the training set, compute the point associated with that person in eigenspace, that is, compute W = [w1, ..., wM']. Note that this step is also done once offline.
3. Given a test image, Itest, project it to the M'-dimensional eigenspace by computing the point Wtest.
4. Find the closest training face to the given test face:

d = min_k || Wtest - Wk ||

where Wk is the point in eigenspace associated with the kth person in the training set, and || X || denotes the Euclidean norm defined as (x1^2 + x2^2 + ... + xn^2)^(1/2) where X is the vector [x1, x2, ..., xn].
Pattern Recognition Techniques - Face Recognition Algorithm
5. Find the distance of the test image from eigenspace, i.e., compute the projection distance (so that we can estimate the likelihood that the image contains a face):

dffs = || Y - Yf ||

where Y = Itest - A, and

Yf = sum_{i=1..M'} (Wtest,i * Ei)

6. If dffs < Threshold1 (the test image is "close enough" to the eigenspace associated with all of the training faces to believe that this test image is likely to be some face, and not a house or a tree or something other than a face)
   then if d < Threshold2
        then classify Itest as containing the face of person k, where k is the closest face in eigenspace to Wtest, the projection of Itest to eigenspace
        else classify Itest as an unknown person
   else classify Itest as not containing a face
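Steps 4-6 can be collected into one decision function (a sketch with illustrative toy values; the function and variable names are assumptions, not from the source):

```python
import numpy as np

def classify(W_test, Y, E, train_W, t1, t2):
    """Steps 4-6: nearest training point in eigenspace, distance from
    face space (d_ffs), then the two-threshold decision."""
    # step 4: closest training face in eigenspace
    dists = np.linalg.norm(train_W - W_test, axis=1)
    k = int(np.argmin(dists))
    d = dists[k]
    # step 5: distance from eigenspace (columns of E are eigenfaces)
    y_f = E @ W_test                    # projection of Y onto eigenspace
    d_ffs = np.linalg.norm(Y - y_f)
    # step 6: two-threshold decision
    if d_ffs >= t1:
        return "not a face"
    return f"person {k}" if d < t2 else "unknown person"

# toy setup: two orthonormal eigenfaces, one eigenspace point per person
E = np.array([[1., 0.], [0., 1.], [0., 0.], [0., 0.]])
train_W = np.array([[2., 0.], [5., 5.]])
Y = np.array([2., 0., 0., 0.])          # mean-subtracted test image
W_test = E.T @ Y
print(classify(W_test, Y, E, train_W, t1=1.0, t2=1.0))
```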