Phd Synapsis- Format

8/10/2019 Phd Synapsis- Format


Novel Multi-Algorithm Based Speech and Face Recognition (Multimodal) System: Design and Implementation

INTRODUCTION

Biometrics refers to authentication techniques that rely on measurable physiological and individual characteristics that can be verified automatically. Depending on the application context, a biometric system may operate in verification mode or identification mode. As security breaches and transaction fraud increase, the need for highly secure identification and personal verification techniques is becoming apparent. Biometric-based solutions can provide confidential transactions and personal data privacy. A multimodal biometric system integrates different biometric systems for verification when making a personal identification.

A biometric recognition system can be used in two different modes: identification (1:N matching) or verification (1:1 matching). Identification is the process of determining a person's identity by comparing the person who is present against a database of biometric patterns or templates. The system is pre-loaded with the biometric patterns or templates of multiple individuals. During the enrolment stage, a biometric sample is processed, encrypted and stored for each individual.

A pattern to be identified is matched against every known template, yielding either a score or a distance that describes the similarity between the pattern and the template. The system assigns the pattern to the person with the most similar biometric template. To prevent impostor patterns (in this case, all patterns of persons not known to the system) from being wrongly accepted, the similarity has to exceed a certain level. If this level is not reached, the pattern is rejected.

With verification, a person's identity is claimed a priori, so the template to search against is already known. The pattern being verified is compared with that person's individual template only. As in identification, the system checks whether the similarity between pattern and template is sufficient to grant access to the secured system or area.
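The two matching modes described above can be sketched as follows. This is a minimal illustration, not the proposed system: the Euclidean distance, the `max_distance` threshold, the two-dimensional feature vectors and the names `identify` and `verify` are all assumptions introduced for the example.

```python
import math

def euclidean(a, b):
    """Distance between two equal-length feature vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, templates, max_distance):
    """1:N matching: return the closest enrolled identity, or None if rejected."""
    best_id, best_dist = None, float("inf")
    for person_id, template in templates.items():
        dist = euclidean(probe, template)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    # Reject the probe when even the best match is not similar enough.
    return best_id if best_dist <= max_distance else None

def verify(probe, claimed_id, templates, max_distance):
    """1:1 matching: compare the probe with the claimed identity's template only."""
    template = templates.get(claimed_id)
    if template is None:
        return False
    return euclidean(probe, template) <= max_distance

# Hypothetical enrolled templates (illustrative values only).
templates = {"alice": [0.0, 0.0], "bob": [3.0, 4.0]}
```

Identification searches every template, so its cost grows with the number of enrolled users, whereas verification performs a single comparison against the claimed identity.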



Statement of the problem:

Most biometric systems deployed in real-world applications are unimodal: they rely on the evidence of a single source of information for authentication (e.g. fingerprint, face or voice). These systems are vulnerable to a variety of problems such as noisy data, intra-class variations, inter-class similarities, non-universality and spoofing. These problems lead to a considerably high false acceptance rate (FAR) and false rejection rate (FRR), limited discrimination capability, an upper bound on performance and a lack of permanence.

Some of the limitations of unimodal biometric systems can be overcome by including multiple sources of information for establishing identity. Systems that integrate two or more types of biometric systems are known as multimodal biometric systems. They are more reliable because multiple, independent biometrics are present, and they are able to meet the stringent performance requirements imposed by various applications. They address the problem of non-universality, since multiple traits ensure sufficient population coverage. They also deter spoofing, since it would be difficult for an impostor to spoof multiple biometric traits of a genuine user simultaneously.

Objectives of the proposed study:

The main purpose of the proposed system is to keep the error rate as low as possible and to improve system performance by achieving a good acceptance rate during identification and authentication.

To replace the existing computationally intensive algorithms with multiple computationally efficient algorithms, designed to perform on par with the highly complex algorithms and procedures.

To replace the complex procedures of multimodal biometrics with an optimized multi-algorithm approach that can make use of parallel-architecture signal processing hardware to meet real-time challenges.



LITERATURE SURVEY

Biometrics refers to the use of a person's physiological or behavioural characteristics to authenticate his or her identity [1]. The increasing demand for enhanced security systems has led to an unprecedented interest in biometric-based person authentication. Biometric systems based on a single source of information are called unimodal systems. Although some unimodal systems [2] have improved considerably in reliability and accuracy, they often suffer from enrolment problems due to non-universal biometric traits, susceptibility to biometric spoofing, or insufficient accuracy caused by noisy data [3].

The multi-algorithm approach employs a single biometric sample acquired from a single sensor. Two or more different algorithms process this sample, and the individual results are combined to obtain an overall recognition result. This approach is attractive from both an application and a research point of view, because using a single sensor reduces the data acquisition cost. The 2002 Face Recognition Vendor Test showed increased performance in 2D face recognition when the results of different commercial recognition systems were combined [4]. Gokberk et al. [5] combined multiple algorithms for 3D face recognition. Xu et al. [6] also combined different algorithmic approaches for 3D face recognition.

Many different ways of combining the face and voice modalities have been presented in the literature [7]-[12], [17], [18].

For speech, many classifier approaches, such as vector quantization (VQ), Bayesian discriminant, dynamic time warping (DTW), Gaussian mixture model (GMM), hidden Markov model (HMM) and neural network (NN), have been studied for speaker recognition. Among these approaches, the GMM yields the best performance, especially for text-independent applications [13]. The GMM is a powerful way to model a speaker's characteristics because of its flexibility in approximating the underlying probability distribution in a high-dimensional space.
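To make the GMM scoring step concrete, the sketch below evaluates the average log-likelihood of a few feature frames under two speaker models and prefers the model with the higher score. It assumes diagonal covariances and uses hand-picked parameters; the models `speaker_a` and `speaker_b`, the frame values and the function names are hypothetical, and model training (for example by expectation-maximization) is not shown.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) = log sum_k w_k N(x | mean_k, var_k), computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def score_utterance(frames, model):
    """Average per-frame log-likelihood of an utterance under one speaker model."""
    weights, means, variances = model
    total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)

# Hypothetical two-component models: (weights, means, variances). Illustrative values only.
speaker_a = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
speaker_b = ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]])
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best = max(("speaker_a", speaker_a), ("speaker_b", speaker_b),
           key=lambda item: score_utterance(frames, item[1]))[0]
```

Because the score averages over frames regardless of what was said, this kind of model suits the text-independent setting mentioned above.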

Principal component analysis (PCA) applies an orthogonal transform, derived from the covariance matrix of the original data, to compute uncorrelated components [15]. Linear discriminant analysis (LDA) searches for the vectors in the underlying space that best discriminate among the classes, and it also reduces the dimensionality of the original data [16]. The majority of biometric systems use the singular value decomposition (SVD) method, which plays a vital role in analyzing biometric traits.
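As a small worked illustration of PCA, the sketch below builds the covariance matrix of a toy data set and finds its dominant eigenvector, which is the first principal component. Power iteration is used here only to keep the example free of external libraries; it is a stand-in, not the method of the cited work, and a practical system would more likely call an eigen-decomposition or SVD routine. The sample values and function names are assumptions.

```python
import math

def covariance_matrix(samples):
    """Return (mean, covariance) of row vectors; each row is one observation."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov

def first_principal_component(cov, iterations=200):
    """Dominant eigenvector of the covariance matrix by power iteration.

    Assumes the start vector is not orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(sample, mean, component):
    """Coordinate of one sample along the given component (dimensionality reduction)."""
    return sum((s - m) * c for s, m, c in zip(sample, mean, component))

# Toy data lying on the line y = 2x, so one component captures all of the variance.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, cov = covariance_matrix(samples)
pc1 = first_principal_component(cov)
reduced = [project(s, mean, pc1) for s in samples]
```

Each two-dimensional sample is reduced to a single coordinate along `pc1`, which is the dimensionality reduction that eigenface-style methods apply to much larger image vectors.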



Types of Biometrics:

Biometric systems can be classified into two types:

1. Unimodal Biometric System:

A unimodal biometric system employs a single biometric trait (either a physical or a behavioural trait) to identify the user. Examples: biometric systems based on face, iris, palm print, voice or gait.

2. Multimodal Biometric System:

A biometric system that consolidates information from multiple sources is known as a multimodal biometric system. For example:

Speech and signature

Face and iris

Face recognition, fingerprint verification and speaker verification

Fingerprint and hand geometry

Limitations of unimodal biometric systems:

Noise in sensed data: Noise in the sensed data may result from a defective or improperly maintained sensor, or from changes in the trait itself, e.g. a fingerprint image with a scar or a voice sample altered by a cold.

Intra-class variation: Caused by an individual who interacts incorrectly with the sensor. This increases the False Rejection Rate (FRR).

Inter-class similarities: Refers to the overlap of feature spaces corresponding to multiple classes or individuals. This may increase the False Acceptance Rate (FAR) of the system.

Non-universality: The biometric system may not be able to acquire meaningful biometric data from a subset of users.

Spoof attacks: Involve the deliberate manipulation of one's biometric traits in order to avoid recognition. This type of attack is especially relevant when behavioural traits are used.

Multimodal Biometrics:

The term multimodal refers to the combination of two or more different biometric sources of a person (such as face and fingerprint) sensed by different sensors.



The Benefits of Multimodal Biometrics:

A multimodal biometric system exhibits a number of advantages over a unimodal biometric system:

Since a multimodal biometric system acquires more than one type of information, it offers a substantial improvement in matching accuracy over a unimodal system.

Multimodal biometric systems are capable of addressing the non-universality issue by accommodating a large population of users.

Multimodal biometric systems are less sensitive to impostor attacks. It is very difficult to spoof a legitimate user enrolled in a multimodal biometric system.

Multimodal biometric systems are less sensitive to noise in the sensed data: when the information acquired from a single biometric trait is corrupted by noise, another trait of the same user can be used to perform the verification.

These systems also help in continuously monitoring or tracking a person in situations where a single biometric trait is not enough, for example tracking a person using face and gait simultaneously.

Challenges in designing multimodal biometric systems:

Since a multimodal biometric system relies on multiple sources of information, combining that information plays an important role in its design. The following challenges are involved in designing a multimodal biometric system.

Selecting the multimodal biometric sources is very challenging, as the choice depends on the application and on the cost of acquiring each source.

In a multimodal biometric system, the information acquired from different sources can be processed either in sequence or in parallel. It is therefore challenging to decide which processing architecture to employ, as it depends on the application and the choice of sources. Processing is generally complex in terms of memory, computation, or both.

Information obtained from different biometric sources can be combined at four different levels: sensor, feature, match score and decision. The choice of fusion level has a direct impact on the performance and the cost of developing a system. Thus, it is challenging to decide which level of fusion to employ for the given sources and application.

Given the biometric sources and the level of fusion, a number of techniques are available for fusing the multiple sources of information. Hence, it is challenging to find the optimal one for the given application.

Multi Algorithm Approach:

The multi-algorithm approach employs a single biometric sample acquired from a single sensor. Two or more different algorithms process this sample, and the individual results are combined to obtain an overall recognition result.

This approach is attractive from both an application and a research point of view, because using a single sensor reduces the data acquisition cost.
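A minimal sketch of the multi-algorithm idea: the same sample is passed to two different matchers and their scores are averaged. The two matchers (cosine similarity and a distance-based similarity), the averaging rule and the feature values are assumptions chosen for illustration; the synopsis does not specify which algorithms or combination rule will be used.

```python
import math

def cosine_similarity(a, b):
    """Similarity based on the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def distance_similarity(a, b):
    """Map Euclidean distance into (0, 1]; 1.0 means identical vectors."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)

def multi_algorithm_score(sample, template, matchers):
    """Run every matcher on the same sample and average the resulting scores."""
    scores = [matcher(sample, template) for matcher in matchers]
    return sum(scores) / len(scores)

matchers = [cosine_similarity, distance_similarity]
# One acquired sample compared with a matching and a non-matching template.
genuine = multi_algorithm_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], matchers)
impostor = multi_algorithm_score([1.0, 2.0, 3.0], [3.0, -1.0, 0.5], matchers)
```

Only one sample is acquired, so adding a further algorithm costs computation but no additional sensor hardware or user interaction.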

Multi Sample Approach:

Multi-sample or multi-instance algorithms use multiple samples of the same biometric. The same algorithm processes each of the samples, and the individual results are fused to obtain an overall recognition result.

In comparison with the multi-algorithm approach, the multi-sample approach has the advantage that using multiple samples may overcome poor performance caused by one sample that has unfortunate properties. Acquiring multiple samples requires either multiple copies of the sensor or the user's availability for a longer period of time.

Compared with the multi-algorithm approach, the multi-sample approach therefore seems to require higher expense for sensors, greater cooperation from the user, or a combination of both.

Modes of Operation:

A multimodal system can operate in one of three different modes:

Serial mode: In the serial mode of operation, the output of one modality is typically used to narrow down the number of possible identities before the next modality is used. Therefore, multiple sources of information (e.g., multiple traits) do not have to be acquired simultaneously. Further, a decision could be made before all the traits have been acquired. This can reduce the overall recognition time.

Parallel mode: In the parallel mode of operation, the information from multiple modalities is used simultaneously in order to perform recognition.

Hierarchical mode: In the hierarchical scheme, individual classifiers are combined in a treelike structure. This mode is relevant when the number of classifiers is large.
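The serial mode can be sketched as a two-stage cascade in which one modality shortlists candidates and the next modality decides among them. The choice of face as the first stage, the `shortlist_size` parameter and the score values are assumptions made for illustration only.

```python
def serial_identify(face_scores, voice_scores, shortlist_size):
    """Serial mode: the face matcher shortlists identities, the voice matcher decides.

    Both arguments map identity -> similarity score (higher = more similar).
    """
    # Stage 1: keep only the identities ranked best by the first modality.
    shortlist = sorted(face_scores, key=face_scores.get, reverse=True)[:shortlist_size]
    # Stage 2: the second modality is consulted only for the shortlisted identities.
    return max(shortlist, key=lambda identity: voice_scores[identity])

# Hypothetical similarity scores for four enrolled users.
face_scores = {"alice": 0.91, "bob": 0.88, "carol": 0.35, "dave": 0.20}
voice_scores = {"alice": 0.40, "bob": 0.75, "carol": 0.95, "dave": 0.10}

narrowed = serial_identify(face_scores, voice_scores, shortlist_size=2)
```

With a shortlist of two, the voice matcher never evaluates `carol`, which shows both the time saving of the serial mode and its dependence on the first stage being reliable.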

Multimodal biometrics in terms of FAR & FRR:

FAR (false acceptance rate): the probability of an impostor being accepted as a genuine individual.

FRR (false rejection rate): the probability of a genuine individual being rejected as an impostor.
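Both rates can be estimated empirically from matcher scores at a chosen decision threshold. The sketch below assumes that higher scores mean greater similarity and that a score at or above the threshold is accepted; the score lists are made-up values, not experimental results.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)

# Hypothetical similarity scores from genuine and impostor attempts.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

for threshold in (0.3, 0.55, 0.8):
    far, frr = far_frr(genuine_scores, impostor_scores, threshold)
    print(f"threshold={threshold:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR but raises FRR, so the operating point has to be chosen according to the security needs of the application.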

Applications:

The applications of biometrics can be divided into the following three main groups.

Commercial applications such as computer network login, electronic data security, e-commerce, Internet access, ATM, credit card etc.

Government applications such as national ID card, correctional facility, driver's license, social security, welfare disbursement, border control, and passport control.

Forensic applications such as criminal investigation, terrorist identification, parenthood determination, and missing children.

Traditionally, commercial applications have used knowledge-based systems (e.g., PINs and passwords), government applications have used token-based systems (e.g., ID cards and badges), and forensic applications have relied on human experts to match biometric features.



METHODOLOGY

Multi algorithm approach:

The multi-algorithm approach employs a single biometric sample acquired from a single sensor. Two or more different algorithms process this sample, and the individual results are combined to obtain an overall recognition result. This approach is attractive because using a single sensor reduces the data acquisition cost.

Parallel architecture approach:

In the parallel mode of operation, the information from multiple modalities is

used simultaneously in order to perform recognition.
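As a rough software analogue of the parallel mode, the sketch below submits a face matcher and a speech matcher to a thread pool so that both run concurrently before their scores are used. The matcher functions are placeholders with a toy similarity measure, and all names and values are assumptions; the parallel signal processing hardware targeted by the proposal is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor

def match_face(image_features, template):
    """Placeholder face matcher: similarity from the mean absolute difference."""
    diff = sum(abs(a - b) for a, b in zip(image_features, template)) / len(template)
    return 1.0 / (1.0 + diff)

def match_speech(speech_features, template):
    """Placeholder speech matcher using the same toy similarity."""
    diff = sum(abs(a - b) for a, b in zip(speech_features, template)) / len(template)
    return 1.0 / (1.0 + diff)

def recognize_parallel(face_input, speech_input, face_template, speech_template):
    """Run both matchers concurrently and return their scores as a pair."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        face_future = pool.submit(match_face, face_input, face_template)
        speech_future = pool.submit(match_speech, speech_input, speech_template)
        return face_future.result(), speech_future.result()

face_score, speech_score = recognize_parallel(
    [1.0, 2.0], [0.5, 0.5], [1.0, 2.0], [0.5, 1.5]
)
```

In CPython, threads mainly help when the matchers wait on I/O or call native code that releases the interpreter lock; the structure, two independent branches joined before fusion, is what carries over to dedicated parallel hardware.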



Recognition System:

Any recognition system involves several stages, and its final output is the recognized person or identity. The first task is data collection, which acquires the data for the system. For the fusion of face and speech, a camera photographs the person while a microphone captures his or her voice. Because the image and speech can be acquired simultaneously, the system is simple for the user to operate.

The next step is pre-processing, which removes noise and highlights the features. For the face, the input is an image that requires noise removal operators and binarization. For speech, the input is a signal that can be freed from noise by applying noise removal filters.

The next task is segmentation of the image and of the speech signal. For the image, this involves applying a gradient mask, dilation, filling of holes, etc. For speech, each word of the spoken sentence is segmented.

Feature extraction follows and serves to reduce dimensionality. The extracted features must lead to large inter-class distances and small intra-class distances. They must also remain relatively constant when the same face is photographed numerous times or the same person speaks on different occasions.

Levels of fusion:

The information in a multimodal system can be fused at any of four levels.



Fusion at the sensor level:

At this level, the raw data from different sensors are fused. The system can use either samples of the same biometric trait obtained from multiple compatible sensors or multiple instances of the same biometric trait obtained using a single sensor. Because the data are fused at a very early stage, they carry a lot of information compared with the other fusion levels.

Fusion at the Feature Extraction Level:

At this level, the feature sets originating from multiple sensors or sources are fused together. The features extracted from each sensor form a feature vector, and these feature vectors are then concatenated to form a single new vector. In feature-level fusion, either the same feature extraction algorithm or different feature extraction algorithms can be applied to the modalities whose features are to be fused.
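The concatenation step can be sketched in a few lines. The normalization applied before concatenation is an assumption added here so that features on very different numeric scales remain comparable; the text above only specifies concatenation, and real systems often normalize each feature over a training set instead. The feature values are made up.

```python
def z_normalize(features):
    """Scale one feature vector to zero mean and unit variance."""
    n = len(features)
    mean = sum(features) / n
    var = sum((f - mean) ** 2 for f in features) / n
    std = var ** 0.5 or 1.0  # fall back to 1.0 for constant vectors
    return [(f - mean) / std for f in features]

def fuse_features(face_features, speech_features):
    """Feature-level fusion: normalize each vector, then concatenate them."""
    return z_normalize(face_features) + z_normalize(speech_features)

# Hypothetical face and speech feature vectors on different scales.
face_features = [120.0, 135.0, 150.0]
speech_features = [0.2, 0.4, 0.9, 0.5]
fused = fuse_features(face_features, speech_features)
```

The fused vector has as many dimensions as both inputs combined, which is why feature-level fusion is frequently followed by a dimensionality reduction step.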

Fusion at the Matcher Score Level:

Each system provides a matching score indicating the proximity of the feature vector to the template vector. These scores can be combined to assert the veracity of the claimed identity. Because the scores obtained from different matchers are not homogeneous, score normalization is applied to map them onto a common range. After the raw data and the feature vectors, these scores contain the richest information about the input.

Fusion at the Decision Level:

The final outputs of the multiple classifiers are combined. A majority vote scheme can be used to make the final decision. Decision-level fusion works with a very abstract level of information, so it is less preferred in designing multimodal biometric systems.
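Score-level and decision-level fusion can be contrasted with a short sketch. Min-max normalization and a weighted sum are used for the score level, and a simple majority vote for the decision level; the score ranges, the weights 0.6 and 0.4, and the function names are assumptions chosen for illustration rather than values from the proposed system.

```python
def min_max_normalize(score, lowest, highest):
    """Map a raw matcher score onto [0, 1] given that matcher's score range."""
    return (score - lowest) / (highest - lowest)

def weighted_sum_fusion(normalized_scores, weights):
    """Score-level fusion: weighted sum of scores already on a common range."""
    return sum(s * w for s, w in zip(normalized_scores, weights))

def majority_vote(decisions):
    """Decision-level fusion: accept only if more than half of the matchers accept."""
    return sum(decisions) > len(decisions) / 2

# Hypothetical raw scores: the face matcher reports 0..100, the speech matcher -50..0.
face = min_max_normalize(82.0, 0.0, 100.0)
speech = min_max_normalize(-10.0, -50.0, 0.0)
fused = weighted_sum_fusion([face, speech], [0.6, 0.4])

# Decision level: each matcher has already thresholded its own score.
accepted = majority_vote([True, True, False])
```

The fused score keeps how strongly each matcher agreed, whereas the vote only sees accept or reject, which illustrates why decision-level fusion carries less information.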

POSSIBLE OUTCOME

To achieve a multimodal, multi-algorithm approach for the recognition of face and speech.

Computationally efficient algorithms based on the multi-algorithm approach for multimodal biometrics.

To achieve procedures optimized for power efficiency as well as enhanced performance.

Improved FAR and FRR compared with those of existing methodologies.



REFERENCES

1. A. K. Jain, A. Ross and S. Prabhakar, "An introduction to biometric recognition", IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 4-20, Jan. 2004.

2. Chander Kant, Rajender Nath, "Reducing Process-Time for Fingerprint Identification System", International Journals of Biometric and Bioinformatics, vol. 3, issue 1, pp. 1-9, 2009.

3. A. K. Jain, A. Ross, "Multibiometric systems", Communications of the ACM, vol. 47, pp. 34-40, 2004.

4. Phillips, P. J., P. Grother, R. J. Michaels, D. M. Blackburn, E. Tabassi and J. M. Bone, "FRVT 2002: overview and summary", March 2003.

5. Gokberk, B., A. A. Salah and L. Akarun, "Rank-Based Decision Fusion for 3D Shape-Based Face Recognition", LNCS 3546: AVBPA, pp. 1019-1028, July 2005.

6. Xu, C., Y. Wang, T. Tan and L. Quan, "Automatic 3D face recognition combining global geometric features with local shape variation information", Aut. Face and Gesture Recog., pp. 308-313, 2004.

7. Aleksic, P. S. and Katsaggelos, A. K., "Audio-visual biometrics", Proc. IEEE, vol. 94, no. 11, pp. 2025-2044, Nov. 2006.

8. Sanderson, C., "Automatic person verification using speech and face information", Ph.D. Thesis, Griffith University, Queensland, Australia, 2003.

9. Chetty, G. and Wagner, M., "Face-voice authentication based on 3D face models", Proc. ACCV, pp. 559-568, Jan. 2006.

10. Chetty, G. and Wagner, M., "Speaking faces for face-voice speaker identity verification", Proc. Interspeech, pp. 513-516, Sept. 2006.

11. Erzin, E., Yemez, Y. and Tekalp, A. M., "Multimodal speaker identification using an adaptive classifier cascade based on modality reliability", IEEE Trans. Multimedia, vol. 7, no. 5, pp. 840-852, Oct. 2005.

12. Chetty, G. and Wagner, M., "Audio-visual speaker verification based on hybrid fusion of cross modal features", Proc. PReMI, pp. 469-478, Dec. 2007.

13. Sanderson, C., "Biometric person recognition: face, speech, and fusion", VDM Verlag, June 2008.

14. Reynolds, D. A., Quatieri, T. and Dunn, R., "Speaker verification using adapted Gaussian mixture models", Digital Signal Processing, vol. 10, pp. 19-41, 2000.

15. M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

16. P. N. Belhumeur, J. P. Hespanha and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection", IEEE Trans. Pattern Anal. Machine Intell., vol. 19, pp. 711-720, May 1997.

17. Ibiyemi T. S., Ogunsakin J. and Daramola S. A., "Bi-Modal Biometric Authentication by Face Recognition and Signature Verification", International Journal of Computer Applications, vol. 42, no. 20, pp. 17-21, 2012.

18. Ibiyemi T. S. and Akintola A. G., "Speaker Authentication and Speech Recognition Enabled Telephone Auto-Dial in Yoruba", International Journal of Science and Advanced Technology, vol. 12, no. 4, pp. 88-187, 2012.
