
MEE-2010-2012

Acoustic Beamforming for Hearing Aids Using

Multi Microphone Array by Designing

Graphical User Interface

Master’s Thesis

S S V SUMANTH KOTTA

BULLI KOTESWARARAO KOMMINENI

This thesis is presented as a part of Degree of Master of Science in Electrical

Engineering with Emphasis on Signal Processing

Blekinge Institute of Technology

January-2012

Blekinge Institute of Technology

School of Engineering

Department of Electrical Engineering

Supervisor : Dr. Benny Sällberg

Examiner : Dr. Nedelko Grbic

Blekinge Tekniska Högskola

SE 371 41 Karlskrona.


Contact Information:

Author 1:

S S V Sumanth Kotta (880617-4499)

Email: [email protected]

Author 2:

Bulli Koteswararao Kommineni (880812-4690)


Supervisor:

Dr. Benny Sällberg


School of Engineering, BTH

Blekinge Institute of Technology, Sweden


Examiner:

Dr. Nedelko Grbic


School of Engineering, BTH

Blekinge Institute of Technology, Sweden







ABSTRACT

Hearing-impaired persons lose much of their ability to distinguish speech in ambient noise. The human hearing system is sensitive to interfering noise, which decreases the quality and intelligibility of the speech signal and in turn makes speech communication difficult. To make speech useful for the hearing impaired, it needs to be enhanced from the noisy speech signal. Speech enhancement is an emerging and useful branch of signal processing that reduces noise and improves the perceptual quality and intelligibility of the speech signal.

Several signal processing techniques have been widely used in hearing aids to enhance speech in noisy environments. The microphone array is one such technique; it offers a better solution to the problem encountered by hearing-impaired persons when listening to speech in the presence of background noise. The Generalized Sidelobe Canceller (GSC) is a powerful technique that enhances the signal of interest while suppressing interference and noise at the output of the microphone array. The main focus of the thesis is to implement a GSC using a microphone array, with the blocking matrix of the GSC replaced by Elko's algorithm. Elko's algorithm is used to track and attenuate interference or background noise located in the back half-plane of the microphone array.

The proposed system is implemented and validated. Clean speech is corrupted by various background noises, namely multi-talker babble noise, wind noise, car interior noise, destroyer engine room noise, tank noise, and interfering male and female voices, at five Signal-to-Noise Ratio (SNR) levels: 0 dB, 5 dB, 10 dB, 15 dB and 20 dB. Objective tests such as SNR, Signal-to-Noise Ratio Improvement (SNRI), Perceptual Evaluation of Speech Quality (PESQ), Speech Distortion (SD) and Noise Distortion (ND) are performed on the test set. The platform is built as a Matlab Graphical User Interface (GUI), and all results are shown as plots produced from Matlab code.


To Our Parents


Acknowledgment

We owe our deepest gratitude to our supervisor, Dr. Benny Sällberg, for his encouragement and guidance. He provided us with the advice and support we needed to complete the thesis, and his deep knowledge of the field allowed us to learn many things that were helpful during the thesis work. We would like to express our utmost gratitude to our examiner, Dr. Nedelko Grbić, for giving us the opportunity to pursue this Master's thesis. We would also like to thank Dr. Gary W. Elko for his suggestions during the thesis work.

We would like to thank our parents and families for their support and encouragement in completing the thesis. They helped us throughout our educational careers, motivated us, and supported us both morally and financially. We would also like to thank all of our friends who supported us during the thesis work.

Lastly, we offer our regards to all of those who supported us in any respect during the completion of the thesis.

S S V Sumanth Kotta,

Bulli Koteswararao Kommineni,

Karlskrona, January, 2012, Sweden.


Table of Contents

ABSTRACT
ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTION
1.1 Objective of the Thesis
1.2 Problem Statement
1.3 Aim of the Thesis Work
1.4 Overview of the Proposed System
1.5 Outline of the Thesis
CHAPTER 2 BACKGROUND
2.1 Microphone Array in Hearing Aids
2.2 Beamforming in Hearing Aids
2.3 Noise Reduction in Hearing Aids
2.4 Feedback Cancellation in Hearing Aids
CHAPTER 3 MICROPHONE ARRAY
3.1 Basics of Microphone
3.2 Microphone Array
3.3 Microphone array structure and connections
3.4 Physical Preliminaries
3.5 Trigonometric Solution
CHAPTER 4 ELKO ALGORITHM
4.1 Introduction
4.2 Aim of Elko Algorithm
4.3 Derivation of the Adaptive First-Order Array
4.4 LMS Version of Back-to-Back Cardioid Microphone
CHAPTER 5 BEAMFORMING AND GUI
5.1 Basic Structure of Beamforming
5.2 Types of Beamformers
5.2.1 Fixed Beamforming
5.2.2 Adaptive Beamforming
5.2.3 Acoustic Beamforming
5.3 Generalized Sidelobe Canceller
5.4 Elko Based Generalized Sidelobe Canceller
5.5 Graphical User Interface
CHAPTER 6 EVALUATION
6.1 Test Data
6.1.1 Clean Speech Data
6.1.2 Noise Data
6.2 Objective Measures
6.2.1 Signal to Noise Ratio
6.2.2 SNR Improvement
6.2.3 Perceptual Evaluation of Speech Quality
6.2.4 Measurement of Speech Distortion
6.2.5 Measure of Noise Distortion
6.3 Test Results with Various Noise Signals
6.3.1 Evaluation of Babble Noise
6.4.2 Evaluation of Car Interior Noise
6.4.3 Evaluation of Tank Noise
6.4.4 Evaluation of Wind Noise
6.4.5 Evaluation of Man Voice as Interference Noise
6.4.6 Evaluation of Destroyer Engine Noise
6.4.7 Evaluation of Female Voice as Interference Noise
CHAPTER 7 SUMMARY AND FUTURE WORK
7.1 Conclusion
7.2 Future work
BIBLIOGRAPHY


List of Figures

Fig. 1. 1. Basic overview of microphone array for recording the signals
Fig. 1. 2. Basic overview of the speech enhancement system
Fig. 1. 3. Block diagram of the proposed system
Fig. 2. 1. Head simulator with the three element microphone array
Fig. 3. 1. Electronic symbol of microphone
Fig. 3. 2. Microphone constellation in an array
Fig. 3. 3. Physical set up of the microphones
Fig. 3. 4. Diagram showing the possible situation of microphone and source
Fig. 4. 1. First-order sensor composed of two zero-orders and a delay
Fig. 4. 2. Directional response of the array for
Fig. 4. 3. Schematic implementation of an adaptive first-order differential microphone using the combination of forward and backward facing cardioids
Fig. 4. 4. Directional response of the array for
Fig. 4. 5. Directional response of the back-to-back cardioid microphone
Fig. 4. 6. Directional response of the adaptive array for
Fig. 5. 1. Block diagram of beamforming
Fig. 5. 2. Signal model for microphone array and beamforming
Fig. 5. 3. An Adaptive beamforming system
Fig. 5. 4. Block diagram of generalized sidelobe canceller
Fig. 5. 5. Detailed structure of generalized sidelobe canceller
Fig. 5. 6. Structure of Proposed Elko Based Generalized Sidelobe Canceller Model
Fig. 5. 7. GUI layout used for the design of the proposed model
Fig. 6. 1. Power spectrum of babble noise
Fig. 6. 2. Power spectrum of car interior noise
Fig. 6. 3. Power spectrum of destroyer engine noise
Fig. 6. 4. Power spectrum of tank noise
Fig. 6. 5. Power spectrum of wind noise
Fig. 6. 6. Structure of perceptual evaluation of speech quality
Fig. 6. 7. Graph represents the position of microphones, source and noise signal
Fig. 6. 8. Graph represents a clean speech signal
Fig. 6. 9. Graph represents babble corrupted speech signal at 10dB, enhanced signal
Fig. 6. 10. Graph represents the SNR value for babble noise
Fig. 6. 11. Graph represents the PESQ value for babble noise
Fig. 6. 12. Graph represents the SD value for babble noise at 10dB
Fig. 6. 13. Graph represents the ND value for babble noise at 10dB
Fig. 6. 14. GUI Layout with babble as input noise at 10dB of input SNR
Fig. 6. 15. Graph represents car noise corrupted speech signal at 0dB, enhanced signal
Fig. 6. 16. Graph represents the SNR value for car noise
Fig. 6. 17. Graph shows the PESQ value for car noise
Fig. 6. 18. Graph represents the SD value for car noise at 0dB
Fig. 6. 19. Graph represents the ND value for car noise at 0dB
Fig. 6. 20. GUI Layout with car noise as input noise at 0dB of input SNR
Fig. 6. 21. Graph represents tank noise corrupted speech signal at 5dB, enhanced signal
Fig. 6. 22. Graph represents the SNR value for tank noise
Fig. 6. 23. Graph represents the PESQ value for tank noise
Fig. 6. 24. Graph represents the SD value for tank noise at 5dB
Fig. 6. 25. Graph represents the ND value for tank noise at 5dB
Fig. 6. 26. GUI Layout with tank noise as input noise at 5dB of input SNR
Fig. 6. 27. Graph represents wind noise corrupted speech signal at 0dB, enhanced signal
Fig. 6. 28. Graph represents the SNR value for wind noise
Fig. 6. 29. Graph shows the PESQ value for wind noise
Fig. 6. 30. Graph represents the SD value for wind noise at 0dB
Fig. 6. 31. Graph represents the ND value for wind noise at 0dB
Fig. 6. 32. GUI Layout with wind noise as input noise at 0dB of input SNR
Fig. 6. 33. Graph represents man noise corrupted speech signal at 5dB, enhanced signal
Fig. 6. 34. Graph shows the SNR value for man voice as interference noise
Fig. 6. 35. Graph shows the PESQ value for man voice as interference noise
Fig. 6. 36. Graph shows the SD value for man voice as interference noise at 5dB
Fig. 6. 37. Graph shows the ND value for man voice as interference noise at 5dB
Fig. 6. 38. GUI Layout with man noise as input noise at 5dB of input SNR
Fig. 6. 39. Graph represents engine noise corrupted speech signal at 0dB, enhanced signal
Fig. 6. 40. Graph shows the SNR value for destroyer engine noise
Fig. 6. 41. Graph shows the PESQ value for destroyer engine noise
Fig. 6. 42. Graph shows the SD value for destroyer engine noise at 0dB
Fig. 6. 43. Graph shows the ND value for destroyer engine noise at 0dB
Fig. 6. 44. GUI Layout with destroyer engine noise as input noise at 0dB of input SNR
Fig. 6. 45. Graph represents female noise corrupted signal at 10dB, enhanced signal
Fig. 6. 46. Graph shows the SNR value for female voice as interference noise
Fig. 6. 47. Graph shows the PESQ value for female voice as interference noise
Fig. 6. 48. Graph shows the SD value for female voice as interference noise at 10dB
Fig. 6. 49. Graph shows the ND value for female voice as interference noise at 10dB
Fig. 6. 50. GUI Layout with female noise as input noise at 10dB of input SNR


List of Tables

Table 5.1 Basic components used in GUI
Table 6.1 Type of male and female sentences used for evaluation
Table 6.2 SNRI, speech distortion and noise distortion for babble noise
Table 6.3 SNRI, speech distortion and noise distortion for car interior noise
Table 6.4 SNRI, speech distortion and noise distortion for tank noise
Table 6.5 SNRI, speech distortion and noise distortion for wind noise
Table 6.6 SNRI, speech distortion and noise distortion for man interference noise
Table 6.7 SNRI, speech distortion and noise distortion for destroyer engine noise
Table 6.8 SNRI, speech distortion and noise distortion for female interference noise

List of Abbreviations

GSC Generalized Sidelobe Canceller

SNR Signal-to-Noise Ratio

SNRI Signal-to-Noise Ratio Improvement

PESQ Perceptual Evaluation of Speech Quality

ANC Adaptive Noise Canceller

ADMA Adaptive Differential Microphone Array

NLMS Normalized Least Mean Square

GUI Graphical User Interface

VoIP Voice over Internet Protocol

DAT Digital Audio Tape

SD Speech Distortion

ND Noise Distortion

dB Decibels


Chapter 1 Introduction

Hearing impairments affect 10% of the world population. Surveys in Sweden have estimated that about 1.2 million people aged 18 years and older have mild hearing loss, 495,000 have moderate hearing loss and 120,000 have severe hearing loss [1]. About 367,000 Swedes with hearing damage use hearing aids. A hearing aid amplifies the acoustic signal so that a person with hearing loss can understand it more easily. Most hearing-impaired people who use hearing aids, however, are not satisfied with them because of background noise.

The poor performance of conventional hearing aids in background noise motivated the use of microphone arrays to create directionally sensitive hearing aids that amplify the signal arriving from a particular direction. A microphone array improves the desired speech signal when the interference arrives from other directions. The microphone array is considered a preprocessor, followed by conventional hearing aid processing [2]. It is used to improve the SNR and speech intelligibility. Microphone arrays are used in applications such as audio, teleconferencing and voice recognition [3]. Fig. 1. 1 shows a basic overview of a microphone array for recording signals.

Fig. 1. 1. Basic overview of microphone array for recording the signals

The speech signal recorded by the microphone array is of poor quality because of the various interfering noises recorded with it and the distance between the speaker and the microphones. The output of the microphone array must therefore be processed further to enhance the speech signal. Speech enhancement is a key technology used to enhance the speech signal and suppress unwanted noise while maintaining the quality of the speech. Fig. 1. 2 shows a basic overview of a speech enhancement system.

Fig. 1. 2. Basic overview of the speech enhancement system

1.1 Objective of the Thesis

The objective of the thesis is to improve perceptual aspects, such as the quality and intelligibility, of the degraded signal in hearing aids by using a microphone array. The project analyzes the achievable speech enhancement performance for a microphone with a 1 cm aperture. The speech enhancement paradigm used throughout the project is the GSC combined with the Elko algorithm.

1.2 Problem Statement

With a microphone array as a preprocessor to hearing aids, the problem is to design a system that enhances the desired speech signal from interfering noise signals. The interfering noise may be random noise, wind, background sounds from offices, car noise or babble noise. The noise may affect the original signal in an additive, multiplicative or convolutive manner. This thesis is concerned with:

Firstly, how to design and implement a microphone array that suits the system, and how to determine the angle of arrival of the speech and noise signals at the microphones.

Secondly, how to suppress the noise by implementing a new speech enhancement method that uses a GSC in which the blocking matrix is replaced with the Elko algorithm, and how to develop a GUI layout that suits the proposed method.


1.3 Aim of the Thesis Work

The main aim of this thesis is to overcome the problem of interfering noise signals in hearing aids and to improve the quality and intelligibility of the speech signal by using a microphone array as a preprocessor to the hearing aid. The work is divided into four major parts:

Designing a microphone array that suits the system and determining the angle of arrival of the speech and noise signals at the microphones.

Implementing the Elko algorithm and the GSC, with the blocking matrix in the GSC replaced by the Elko algorithm.

Analyzing the performance of the proposed system with different interfering noises and performing the objective tests on the system.

Implementing the proposed system in a Matlab GUI.

1.4 Overview of the Proposed System

The block diagram of the proposed system used in our thesis is as shown in Fig. 1. 3.

Fig. 1. 3. Block diagram of the proposed system

The proposed system takes a speech signal and an interfering noise signal as inputs. An array of microphones is placed along an arc, as shown in Fig. 1. 3. The speech and noise signals are passed individually through the microphone array, whose outputs are delayed versions of the original signals. The Elko algorithm is applied to the signals obtained from the microphone array. The Elko algorithm is a linear system that is used to improve the SNR [4]. The speech and noise signals obtained from the Elko algorithm are added together and applied to an adaptive beamformer, the GSC [5]. The GSC is a popular adaptive beamformer used to enhance the speech signal. To improve the performance of the system, the blocking matrix in the GSC is replaced with the Elko algorithm. The modified GSC suppresses the noise in the noise-contaminated speech signal, and its output is the speech signal presented to the listener. The Elko algorithm and the GSC are described in detail in later chapters. The system is implemented as a Matlab GUI. To validate the system, several objective measures are computed: SNR, SNRI, PESQ, SD and ND.
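To give a first feel for the Elko stage before the detailed treatment in Chapter 4, the following Python sketch combines a forward-facing and a backward-facing cardioid and adapts a single weight with a normalized LMS update. It is only an illustration under stated assumptions: the two-microphone geometry, the one-sample travel time between the microphones, the step size `mu` and the function names are hypothetical, and the thesis implementation (in Matlab) may differ in all of these details.

```python
import math

def delay1(x):
    """Delay a signal by one sample (assumed travel time between the two microphones)."""
    return [0.0] + x[:-1]

def power(x):
    return sum(v * v for v in x) / len(x)

def adaptive_differential_mic(x_front, x_rear, mu=0.05, eps=1e-3):
    """Back-to-back cardioids combined as y = c_f - beta * c_b, beta adapted by NLMS."""
    xf_d, xr_d = delay1(x_front), delay1(x_rear)
    beta, out = 0.5, []
    for n in range(len(x_front)):
        c_f = x_front[n] - xr_d[n]     # forward-facing cardioid (null toward the rear)
        c_b = x_rear[n] - xf_d[n]      # backward-facing cardioid (null toward the front)
        y = c_f - beta * c_b
        beta += mu * y * c_b / (c_b * c_b + eps)   # step that reduces the output power
        beta = min(1.0, max(0.0, beta))            # keeps the null in the rear half-plane
        out.append(y)
    return out, beta

# Illustrative test tone (an assumption, not thesis data).
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]

# Source in front: the front microphone hears it one sample before the rear one.
out_front, beta_front = adaptive_differential_mic(tone, delay1(tone))
# Source behind: the rear microphone hears it first.
out_rear, beta_rear = adaptive_differential_mic(delay1(tone), tone)

print("front output power:", power(out_front[-500:]))
print("rear output power: ", power(out_rear[-500:]))
```

In this toy setting the frontal source passes through the forward cardioid, while the source directly behind the array is removed. A practical version would normally normalize the update by a smoothed power estimate rather than by the instantaneous value used here.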

1.5 Outline of the Thesis

This document is the report of a thesis presented as part of the Degree of Master of Science in Electrical Engineering with Emphasis on Signal Processing. It is made up of seven chapters and is organized as follows. Chapter 1 introduces the subjects handled in this thesis; its sections cover the objective of the thesis, the problem statement, the aim of the thesis and an overview of the proposed system. Chapter 2 provides background on hearing aids and the enhancement technologies used in them. Chapter 3 covers microphones, the arrangement of the microphones and the angle of arrival of the speech signal at the microphones. Chapter 4 describes the Elko algorithm; its sections deal with first-order differential microphones, the derivation of the adaptive first-order and second-order arrays, and the LMS version of the differential microphone. Chapter 5 covers beamforming: the different types of beamforming, the generalized sidelobe canceller, the model proposed in the thesis and the GUI. Chapter 6 covers testing and the presentation of the results, and describes the SNR, SNRI, PESQ, SD and ND values. Chapter 7 gives the final conclusion and recommendations for future work. The references used in the thesis are listed in the bibliography.


Chapter 2 Background

It is difficult even for a person with normal hearing to understand speech in background noise. The problem is far more severe for a person with hearing impairment. Many algorithms have been developed to enhance speech in background noise, and several signal processing techniques are used to increase the quality and intelligibility of the speech signal in hearing aids. Signal processing has a wide range of applications in hearing aids, and hearing aids have been an active research application of signal processing for more than 25 years.

In the past decades, the development of hearing aids has advanced together with the development of sophisticated signal processing algorithms such as beamforming, noise reduction and feedback cancellation. Other signal processing technologies such as adaptive filtering, echo cancellation and array processing have also been widely used in hearing aids.

2.1 Microphone Array in Hearing Aids

A microphone array consists of multiple microphones arranged in space. Microphone-array hearing aids provide a better solution for the hearing-impaired person listening to speech in the presence of background noise. Their aim is to increase the speech-to-interference ratio when the interference arrives from directions other than that of the desired speech signal. Functionally, a microphone-array hearing aid consists of three interconnected components: the microphone array, the processing unit and the receiver. The microphone array acts as a preprocessor to the system, followed by the speech enhancement system [2]. A microphone array is capable of maintaining a high signal-to-noise ratio in a noisy environment. The advantage of a microphone array is its ability to exploit knowledge of the position of the speech source to reduce noise. Fig. 2. 1. shows an example of a head simulator wearing a hearing aid with a three-element microphone array [13].


Fig. 2. 1. Head simulator with the three element microphone array

2.2 Beamforming in Hearing Aids

Beamforming is a signal processing technique used for directional signal transmission or reception. It creates constructive interference in a particular direction and destructive interference in other directions. A hearing-impaired person has difficulty with noise sources that arrive from different directions. Beamforming can place a null in the direction of the noise source and pass only the signal coming from a particular direction. In hearing aids, beamforming is performed to enhance the SNR and to increase speech intelligibility [6].

Several beamforming techniques have been developed for hearing aids to enhance the desired speech signal in various types of noise. A fixed beamformer forms a beam in a particular direction and does not change that direction when the direction of the incident source changes. An adaptive beamformer is used in hearing aids to steer the directional pattern toward the desired source and to maximize the attenuation of the noise source.
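The constructive and destructive interference described in this section can be illustrated with a delay-and-sum (fixed) beamformer. The Python sketch below is a minimal example under assumed conditions: a hypothetical four-microphone line array, an inter-microphone delay of exactly one sample, and a single test tone. None of these values are taken from the thesis, whose own implementation is in Matlab.

```python
import math

def delay(signal, k):
    """Delay a signal by k whole samples, padding the start with zeros."""
    return [0.0] * k + signal[:len(signal) - k]

def delay_and_sum(channels, steering_delays):
    """Fixed beamformer: delay each channel by its steering delay, then average."""
    n = len(channels[0])
    aligned = [delay(ch, d) for ch, d in zip(channels, steering_delays)]
    return [sum(ch[i] for ch in aligned) / len(channels) for i in range(n)]

def power(signal):
    return sum(v * v for v in signal) / len(signal)

# Assumed setup: 4 microphones in a line, tone with a period of 4 samples.
N_MICS, N = 4, 800
tone = [math.sin(2 * math.pi * 0.25 * n) for n in range(N)]

# Desired source along the array axis: microphone m hears the tone m samples late.
desired = [delay(tone, m) for m in range(N_MICS)]
# Interferer at broadside: every microphone hears it at the same instant.
interferer = [list(tone) for _ in range(N_MICS)]

# Steering delays that time-align the arrivals from the desired direction.
steer = [N_MICS - 1 - m for m in range(N_MICS)]

out_desired = delay_and_sum(desired, steer)        # channels add in phase
out_interferer = delay_and_sum(interferer, steer)  # channels add out of phase

print("desired power:   ", power(out_desired))
print("interferer power:", power(out_interferer))
```

The tone frequency is chosen so that the four misaligned copies of the broadside interferer cancel almost completely; at other frequencies the same fixed beamformer attenuates the interferer only partially, which is one motivation for the adaptive beamformers discussed later.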

2.3 Noise Reduction in Hearing Aids

Noise is an unwanted signal that plays a major role in many applications. It exists in several forms and creates problems in fields such as telecommunications, radar, sonar and medical applications. Hearing-impaired people have difficulty understanding speech in the presence of noise, because the SNR is an important factor for the hearing impaired. A person with normal hearing can understand speech at an SNR as low as -5 dB, whereas a hearing-impaired person needs an SNR of at least +5 dB. Several noise reduction techniques have been developed to enhance the speech signal in hearing aids. When a reference signal is available, it is used to reduce the noise.
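The SNR values quoted in this section can be made concrete with a short calculation. The Python sketch below computes the SNR in dB from the signal and noise powers and scales a noise signal so that the mixture has a chosen SNR. The sine wave standing in for speech, the white Gaussian noise and the helper names are assumptions made for illustration; they are not the test data or code of this thesis.

```python
import math
import random

def power(x):
    return sum(v * v for v in x) / len(x)

def snr_db(speech, noise):
    """SNR in dB: 10 * log10(speech power / noise power)."""
    return 10.0 * math.log10(power(speech) / power(noise))

def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise so that speech + scaled noise has the requested SNR."""
    gain = math.sqrt(power(speech) / (power(noise) * 10.0 ** (target_snr_db / 10.0)))
    scaled = [gain * v for v in noise]
    noisy = [s + v for s, v in zip(speech, scaled)]
    return noisy, scaled

rng = random.Random(0)                                            # fixed seed for repeatability
speech = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]  # stand-in for speech
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]                # stand-in for noise

# -5 dB and +5 dB are the two intelligibility limits quoted in the text.
measured = {}
for level in (-5, 5):
    noisy, scaled = mix_at_snr(speech, noise, level)
    measured[level] = snr_db(speech, scaled)
    print(level, "dB requested, measured:", measured[level])
```

A 10 dB difference between the two limits corresponds to a factor of ten in noise power relative to the speech power.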


Reliable and intelligent signal detection plays an important role in the success of noise reduction. Hearing aids are sensitive to the presence of noise. Amplitude modulation is the key feature used to separate speech from noise. It relies on the principle that the desired speech signal has a harmonic structure and that the amplitude of these harmonic components changes over time, which produces amplitude modulation. The amplitude modulation of speech and of noise differs. Speech has higher amplitude modulation than stationary and pseudo-stationary noises. Pseudo-stationary noise has very low amplitude modulation. Environmental noises such as babble and traffic noise have higher amplitude modulation than stationary noise and lower amplitude modulation than speech.

Amplitude modulation alone does not provide reliable signal detection, because the signal with the higher modulation need not be the desired signal. Without reliable signal detection, it is not possible to enhance the speech signal while attenuating the noise. Intelligent signal detection and noise reduction have been improved by using temporal and timing information about the signal and noise in combination with amplitude modulation [7].

2.4 Feedback Cancellation in Hearing Aids

Feedback cancellation suppresses the feedback signal that arises when the hearing aid gain is larger than the attenuation of the feedback path between the hearing aid output and its microphone input. The feedback compensation approach uses a linear adaptive filter that estimates the feedback signal and subtracts it [6]. Controlling the adaptation of the adaptive filter is the challenging part of feedback cancellation. The typical correlation between the input signal and the feedback signal causes signal distortion at the hearing aid output.
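The adaptive-filter idea behind feedback compensation can be sketched as follows. The Python example identifies a feedback path with a normalized LMS (NLMS) filter and subtracts the predicted feedback from the microphone signal. It is a simplified, open-loop illustration: the three-tap feedback path, the white-noise loudspeaker signal, the step size and the function name are all hypothetical, and no external sound is present, so the correlation problem mentioned above does not arise here.

```python
import random

def nlms_feedback_canceller(loudspeaker, microphone, n_taps, mu=0.5, eps=1e-8):
    """Estimate the feedback path with NLMS and subtract the predicted feedback."""
    w = [0.0] * n_taps                 # adaptive estimate of the feedback path
    buf = [0.0] * n_taps               # most recent loudspeaker samples, newest first
    cleaned = []
    for u, m in zip(loudspeaker, microphone):
        buf = [u] + buf[:-1]
        predicted = sum(wi * bi for wi, bi in zip(w, buf))
        e = m - predicted              # microphone signal after feedback removal
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w

# Assumed feedback path (3-tap FIR) and white-noise loudspeaker signal.
true_path = [0.5, -0.3, 0.1]
rng = random.Random(1)
loudspeaker = [rng.gauss(0.0, 1.0) for _ in range(3000)]

# Microphone picks up only the fed-back loudspeaker signal in this toy case.
microphone = []
for n in range(len(loudspeaker)):
    fb = sum(true_path[k] * loudspeaker[n - k]
             for k in range(len(true_path)) if n - k >= 0)
    microphone.append(fb)

cleaned, estimate = nlms_feedback_canceller(loudspeaker, microphone, n_taps=3)
print("estimated feedback path:", estimate)
```

In a real hearing aid the loudspeaker signal is a processed version of the microphone input, so the two are correlated and the filter estimate becomes biased; that is why the adaptation control is described as the challenging part.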


Chapter 3 Microphone Array

3.1 Basics of Microphone

A microphone is a transducer, a device that converts one form of energy into another; it converts a non-electrical signal into an electrical signal. The input to the microphone is sound, which exists as a pattern of air pressure. The microphone converts this sound into a pattern of electric current. Microphones are used in applications such as hearing aids, telephones, tape recorders, radio and television, and for non-acoustic purposes such as ultrasonic checking. The electronic symbol of a microphone is shown in Fig. 3. 1.

Fig. 3. 1. Electronic symbol for microphone

3.2 Microphone Array

A microphone array consists of multiple microphones arranged in space that act as a single directional input device; their outputs are processed individually and summed to produce the desired output. A microphone array picks up distant sound better than a single directional microphone. In applications where a speech signal is monitored by a microphone, better performance can therefore be achieved by using an array of microphones. Microphone array processing can be used to reduce noise, to improve the signal-to-noise ratio of the acquired sound, to pick up the desired speech with a flat spectral response at an arbitrary speaker position, and to detect the speech periods in a noisy speech signal [8]. The outputs of the microphone array are processed further in order to achieve speech enhancement.

Microphone arrays have been used in many different fields, such as speech acquisition in hands-free communication, audio, teleconferencing and hearing aid applications [9]. Microphone array processing is a well tested and well understood way to enhance a distant, noisy target signal. The main aim of a microphone array is to improve the quality of the input signal and to reduce the effect of typical recording problems.

3.3 Microphone array structure and connections

Fig. 3. 2. Microphone constellation in an array

Fig. 3. 2. shows an 8-element, semicircular microphone array with the sound source located in the far field. The eight microphones are placed on a semicircle with a spacing between adjacent microphones. The distance between the source and the microphones is much greater than the distance between the microphones, which indicates that the source is located in the far field. The sound waves arriving from the source are therefore assumed to be parallel to each other. The sound from the source arrives at the microphones at different time instants, because the distance travelled from the source differs from microphone to microphone.

Each microphone receives the input signal with some delay that depends on its distance from the source. Let the distance between adjacent microphones be $d$ and let $\theta$ be the angle of arrival of the source signal at the array. The extra distance that the wavefront travels between adjacent microphones is determined by $d$ and $\theta$, and the time delay $\tau$ at a microphone is this extra distance divided by $c$, where $c$ is the speed of sound. The input to the microphone is given as

The phase shift of the incoming signal is given as

From equation 2.3 substituting the value of in equation 2.3 then

By considering

equation 3.4 can be written as

In a similar manner, a noise signal is passed through the microphone array [10]. The total signal received by the microphone is the combination of the source signal and the noise signal, given as,

The output of the microphone array is passed to the Elko algorithm and then to the beamformer to enhance the speech signal in the noisy speech signal.
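The far-field delays described in this section can be sketched as follows. This is only an illustration under stated assumptions: the array radius, the speed of sound of 343 m/s, the coordinate convention (angles measured from the positive x-axis) and the function names are all hypothetical and do not come from the thesis.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def semicircle_array(n_mics, radius):
    """Place n_mics evenly on a semicircle of the given radius (metres)."""
    return [(radius * math.cos(math.pi * i / (n_mics - 1)),
             radius * math.sin(math.pi * i / (n_mics - 1)))
            for i in range(n_mics)]

def far_field_delays(positions, source_angle, c=SPEED_OF_SOUND):
    """Arrival delay (s) at each microphone, relative to the first arrival.

    source_angle is the direction towards the source in radians, measured
    from the positive x-axis. A plane wavefront is assumed (far field).
    """
    ux, uy = math.cos(source_angle), math.sin(source_angle)
    # A microphone further along the source direction is reached earlier.
    arrival = [-(x * ux + y * uy) / c for x, y in positions]
    first = min(arrival)
    return [t - first for t in arrival]

mics = semicircle_array(8, 0.05)               # hypothetical 5 cm radius
delays = far_field_delays(mics, math.pi / 2)   # source straight ahead (+y)
pair = far_field_delays([(-0.005, 0.0), (0.005, 0.0)], 0.0)  # 1 cm pair, endfire
```

For the two-microphone pair with the source on the array axis, the delay equals the spacing divided by the speed of sound, which is the largest delay such a pair can see.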

3.4 Physical Preliminaries

To determine the angle of arrival of the source signal at the microphones, consider two microphones placed as shown in Fig. 3. 3.

Fig. 3. 3. Physical set of the microphones


Fig. 3. 3. shows the physical setup of the two microphones. The source signal is denoted S and is located in front of the microphones. To determine the angle of arrival of the source signal, an origin must be fixed for the microphones; the midpoint between the microphones is taken as the origin O. Consider the line OX through the origin, orthogonal to the microphone axis. The angle $\theta$ is defined as the angle between the lines OX and OS, and it gives the angle of arrival of the source signal at the microphone array [11].

From Fig. 3. 3. it is observed that the source is closer to one microphone than to the other. The sound travelling from the source therefore reaches the nearer microphone first and the farther microphone afterwards. The time delay between the two microphones is denoted $\tau$.

The source signal received by the microphone is represented as and the

source signal received by the microphone is represented as .

3.5 Trigonometric Solution

To determine the angle of arrival between the microphones and the source signal

consider a point S with coordinates x and y these are assumed to be the variables. The

coordinates of the microphones and are considered as and

respectively. The distance between the microphones is considered as cm. The midpoint

between the microphones and is taken as origin.

The target is to determine the angle of arrival of the sound signal from the source at the microphone. A signal coming from the source reaches the microphone in time t. In the same moment, the signal travels from the source to the microphone [11]. Let be the number of samples between the two signals and is expressed as


Fig. 3. 4. Diagram showing the possible situation of microphone and source

Fig. 3. 4. shows the possible arrangement of the microphone and source signal. Let

be the midpoint between the microphones and is expressed as

The slope of the line joining the midpoint of the microphones and the source signal is

expressed as

The angle made by the line joining the midpoint of the microphones and the source

signal is obtained by taking arctangent of the slope of the line joining the point and is

expressed as


The angle is the angle made by the line joining the line between the midpoint of the

microphone and the source signal to the X-axis. The angle is the angle made by the Y-axis

and the line joining the point and is given as follows

If then

and if then

In this way the angle of arrival of the source signal at the microphones is determined. The procedure is applied to all the microphones to determine the angle for each of them. The same procedure is applied when the noise signal is the input to the microphones.
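The trigonometric steps of this section (midpoint, line to the source, arctangent, angle relative to the Y-axis) can be sketched as below. The sketch assumes that the microphone axis lies along the X-axis and uses `atan2` instead of an explicit slope, so the vertical-line case needs no special handling. The second function, which converts a sample lag into a far-field angle, is an added assumption for illustration, as are the sampling rate and spacing used with it.

```python
import math

def arrival_angle_deg(mic_a, mic_b, source):
    """Angle of arrival in degrees, measured from the Y-axis through the
    midpoint of the microphone pair (positive towards +x)."""
    mx = (mic_a[0] + mic_b[0]) / 2.0   # midpoint of the pair
    my = (mic_a[1] + mic_b[1]) / 2.0
    # atan2 gives the angle between the midpoint-to-source line and the X-axis.
    alpha = math.degrees(math.atan2(source[1] - my, source[0] - mx))
    return 90.0 - alpha

def angle_from_lag_deg(lag_samples, fs, spacing, c=343.0):
    """Far-field angle from broadside for a given lag in samples."""
    ratio = c * lag_samples / (fs * spacing)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))

theta_right = arrival_angle_deg((-0.005, 0.0), (0.005, 0.0), (1.0, 1.0))
theta_front = arrival_angle_deg((-0.005, 0.0), (0.005, 0.0), (0.0, 2.0))
theta_left = arrival_angle_deg((-0.005, 0.0), (0.005, 0.0), (-1.0, 1.0))
theta_lag = angle_from_lag_deg(7, 48000, 0.1)  # hypothetical 10 cm pair at 48 kHz
```

A source straight ahead gives an angle of zero, and sources to either side give angles of opposite sign.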

Chapter-4

14

Chapter 4 Elko Algorithm

4.1 Introduction

Communication devices are widely used in many environments, and the acoustic pick-up of an electro-acoustic transducer requires a combination of a transducer and a signal processing unit. During communication the transmitted signal is affected by background noise, which degrades the quality of the signal. Background noise is a ubiquitous problem in acoustic signal transmission. To overcome it, conventional directional microphones are used to pick up the signal from a particular direction, so that the background noise is attenuated. Conventional directional microphones solve the problem only partly, because the noise does not have one particular direction of arrival. A better solution can be obtained by taking advantage of the ANC capabilities of a differential microphone array in combination with digital signal processing [12]. An adaptive microphone system is to be designed such that it adjusts its directivity pattern to maximize the SNR. An ADMA is used to suppress the background noise and to maximize the SNR. ADMAs are able to adaptively track and attenuate possibly moving noise sources that are located in the rear half plane of the differential array.

4.2 Aim of Elko Algorithm

Elko has proposed a solution for an adaptive directional microphone. The Elko algorithm covers the design and implementation of a novel adaptive first-order differential microphone that minimizes the microphone output power. By attenuating sound from one direction it can improve the SNR in an acoustic field. The adaptive differential microphone is implemented by combining two omnidirectional elements to form back-to-back cardioid directional microphones. The microphone signals and delayed versions of the microphone signals are combined such that a null is placed in one direction; in this way any first-order array can be realized. The adaptation process works under the constraint that a single null is placed in the rear half plane.


4.3 Derivation of the Adaptive First-Order Array

When a plane wave with spectrum $S(\omega)$ and wave vector $\mathbf{k}$ is incident on a two-element microphone array, as shown in Fig. 4. 1., the sound wave reaches one microphone before the other. The time difference $\tau$ depends on the distance $d$ between the microphones and the angle of incidence $\theta$ of the sound wave,

\[ \tau = \frac{d\cos\theta}{c} \]

where $c$ is the speed of sound. The output is obtained by taking the difference between the delayed signal of one microphone and the signal from the other microphone; by changing the time delay $T$ it is possible to steer the null. With $s(t)$ the signal at the first microphone, the output signal can be written as

\[ y(t) = s(t) - s\!\left(t - T - \frac{d\cos\theta}{c}\right) \]

Fig. 4. 1. First-order sensor composed of two zero-orders and a delay

Transforming this equation into the frequency domain gives

\[ Y(\omega,\theta) = S(\omega)\left[1 - e^{-j\omega\left(T + \frac{d\cos\theta}{c}\right)}\right] \]

The magnitude of this response is plotted in Fig. 4. 2. The plots show the directional response of the array for three different values of $T$. The time delay is varied between $0$ and $d/c$, so that the null is steered between $90^\circ$ and $180^\circ$.


(a) (b) (c)

Fig. 4. 2. Directional response of the array for

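The magnitude of the frequency and angle dependent response of the first-order differential microphone to a single point source located in the far field is given as

\[ |Y(\omega,\theta)| = 2\,|S(\omega)|\,\left|\sin\!\left(\frac{\omega\left[T + (d\cos\theta)/c\right]}{2}\right)\right| \]

If we assume a small element spacing ($kd \ll \pi$, where $k = \omega/c$ is the wavenumber) and a small inter-element delay ($\omega T \ll \pi$), the above equation can be written as

\[ |Y(\omega,\theta)| \approx \omega\,|S(\omega)|\left[T + \frac{d\cos\theta}{c}\right] \]

The first-order differential array has a monopole term $T$ and a first-order dipole term $(d\cos\theta)/c$. It is observed that the first-order array has a first-order differentiator frequency dependence (the factor $\omega$), which can be compensated by a first-order low-pass filter [14]. The term in the brackets of the above equation gives the directional response.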
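The dependence of the null direction on the delay can be checked numerically. The sketch below assumes the usual first-order response normalised by the source spectrum, $|1 - e^{-j\omega(T + d\cos\theta/c)}|$, together with a spacing of 1 cm, a frequency of 1 kHz and a speed of sound of 343 m/s; these values and the function names are chosen for illustration only.

```python
import cmath
import math

def first_order_response(omega, theta, T, d, c=343.0):
    """|1 - exp(-j*omega*(T + d*cos(theta)/c))|: response normalised by |S|."""
    return abs(1.0 - cmath.exp(-1j * omega * (T + d * math.cos(theta) / c)))

def null_angle_deg(T, d, c=343.0):
    """Angle where T + d*cos(theta)/c = 0, valid for 0 <= T <= d/c."""
    cos_null = max(-1.0, min(1.0, -c * T / d))  # clamp against rounding
    return math.degrees(math.acos(cos_null))

d = 0.01                       # 1 cm spacing (assumed)
c = 343.0                      # speed of sound in m/s (assumed)
omega = 2 * math.pi * 1000.0   # 1 kHz (assumed)

# Three delays between 0 and d/c move the null from 90 to 180 degrees.
nulls = [null_angle_deg(T, d, c) for T in (0.0, d / (2 * c), d / c)]
at_null = first_order_response(omega, math.radians(nulls[1]), d / (2 * c), d, c)
in_front = first_order_response(omega, 0.0, d / (2 * c), d, c)
```

The response at the computed null angle is numerically zero, while the response towards the front stays finite.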

The adaptive algorithm uses an appropriate combination of the omnidirectional and dipole sensors such that the mean square output of the array is minimized. The dipole directivity pattern can be realized by subtracting the signals of two closely spaced omnidirectional microphones. A low-pass filter is placed in the dipole path to provide the inter-channel phase shift. With this arrangement the adaptive algorithm can steer a null in the direction of the noise source.


By setting the sampling period equal to $d/c$ and using a fixed delay of one sample, we get a cardioid directional pattern. A directional microphone array with two microphones generates forward and backward cardioid signals. An adaptation factor $\beta$ is applied to the backward cardioid, and the resulting signal is subtracted from the forward cardioid signal to generate the output signal [4]. The output signal is applied to a low-pass filter, which compensates the differentiator response of the differential microphone.

Fig. 4. 3. Schematic implementation of an adaptive first-order differential microphone using

the combination of forward and backward facing cardioids

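The output of the back-to-back cardioid microphone is obtained by setting the sampling period equal to $d/c$. With sampling period $T = d/c$, the forward-facing cardioid $c_F(t)$ and the backward-facing cardioid $c_B(t)$ are formed from the front and rear microphone signals $m_1(t)$ and $m_2(t)$ as

\[ c_F(t) = m_1(t) - m_2(t - T), \qquad c_B(t) = m_2(t) - m_1(t - T) \]

The output can be given as

\[ y(t) = c_F(t) - \beta\,c_B(t) \]

Transforming the above equations into the frequency domain, with the phase reference at the array centre and $k = \omega/c$, we get,

\[ C_F(\omega,\theta) = 2j\,S(\omega)\,e^{-jkd/2}\sin\!\left(\frac{kd\,(1+\cos\theta)}{2}\right), \qquad C_B(\omega,\theta) = 2j\,S(\omega)\,e^{-jkd/2}\sin\!\left(\frac{kd\,(1-\cos\theta)}{2}\right) \]

\[ Y(\omega,\theta) = 2j\,S(\omega)\,e^{-jkd/2}\left[\sin\!\left(\frac{kd\,(1+\cos\theta)}{2}\right) - \beta\,\sin\!\left(\frac{kd\,(1-\cos\theta)}{2}\right)\right] \]

Normalizing the output signal by the input spectrum results in

\[ \frac{Y(\omega,\theta)}{S(\omega)} = 2j\,e^{-jkd/2}\left[\sin\!\left(\frac{kd\,(1+\cos\theta)}{2}\right) - \beta\,\sin\!\left(\frac{kd\,(1-\cos\theta)}{2}\right)\right] \]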

The time delay $T$ is fixed and the value of $\beta$ is instead varied between 0 and 1. The magnitude of the normalized output is plotted in Fig. 4. 4., where the directional pattern is obtained for different values of $\beta$. By changing the value of $\beta$ between 0 and 1 it is possible to steer the null between $180^\circ$ and $90^\circ$.

(a) (b) (c)

Fig. 4. 4. Directional response of the array for .

The directional response of the back-to-back cardioid microphone is shown in Fig. 4. 5.


Fig. 4. 5. Directional response of the back-to-back cardioid microphone
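The steering of the null by $\beta$ can be illustrated with the short sketch below. It assumes the usual normalised back-to-back cardioid response, $2\,|\sin(kd(1+\cos\theta)/2) - \beta\sin(kd(1-\cos\theta)/2)|$, and the small-spacing null condition $\cos\theta_{null} = (\beta-1)/(\beta+1)$; the value $kd = 0.1$ and the function names are hypothetical.

```python
import math

def back_to_back_response(kd, theta, beta):
    """Normalised magnitude 2*|sin(kd(1+cos)/2) - beta*sin(kd(1-cos)/2)|."""
    forward = math.sin(kd * (1.0 + math.cos(theta)) / 2.0)
    backward = math.sin(kd * (1.0 - math.cos(theta)) / 2.0)
    return 2.0 * abs(forward - beta * backward)

def beta_null_angle_deg(beta):
    """Small-spacing null direction, cos(theta) = (beta - 1) / (beta + 1)."""
    return math.degrees(math.acos((beta - 1.0) / (beta + 1.0)))

kd = 0.1  # small element spacing relative to the wavelength (assumed)
null_angles = [beta_null_angle_deg(b) for b in (0.0, 0.5, 1.0)]
front = back_to_back_response(kd, 0.0, 0.5)
at_null = back_to_back_response(kd, math.radians(null_angles[1]), 0.5)
```

With $\beta = 0$ the pattern is the forward cardioid (null at 180 degrees) and with $\beta = 1$ it becomes a dipole (null at 90 degrees). Because the null formula is a small-spacing approximation, the response at the predicted null is very small rather than exactly zero.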

4.4 LMS Version of Back-to-Back Cardioid Microphone

The least mean square (LMS) algorithm is an adaptive algorithm that uses the gradient method of steepest descent. LMS applies an iterative procedure that makes successive corrections to the weight vector in the direction of the negative gradient, which leads to the minimum mean square error. The LMS algorithm is widely used because of its simplicity and because it does not require the calculation of correlation functions. Here the LMS algorithm is applied to the back-to-back cardioid adaptive first-order differential array [15]. The output of the back-to-back cardioid microphone is given as

\[ y(t) = c_F(t) - \beta\,c_B(t) \]

Squaring the above equation on both sides we get,

\[ y^2(t) = c_F^2(t) - 2\beta\,c_F(t)\,c_B(t) + \beta^2 c_B^2(t) \]

The minimum error is found with the steepest descent algorithm by stepping in the direction opposite to the gradient of the error surface with respect to the weight parameter $\beta$. The steepest descent update equation is given as,

\[ \beta_{t+1} = \beta_t - \mu\,\frac{d\,y^2(t)}{d\beta} \]

where $\mu$ is the update step size and the derivative gives the gradient of the error surface with respect to $\beta$. The LMS algorithm uses the instantaneous estimate of the gradient, i.e. the derivative of $y^2(t)$, rather than that of its expectation $E[y^2(t)]$ [15]. Taking the derivative of $y^2(t)$ we get,

\[ \frac{d\,y^2(t)}{d\beta} = -2\,c_F(t)\,c_B(t) + 2\beta\,c_B^2(t) = -2\,y(t)\,c_B(t) \]

The LMS update equation is given as

\[ \beta_{t+1} = \beta_t + 2\mu\,y(t)\,c_B(t) \]

The LMS algorithm is modified by normalizing the update size. The normalized LMS version is therefore given as

\[ \beta_{t+1} = \beta_t + 2\mu\,\frac{y(t)\,c_B(t)}{\langle c_B^2(t) \rangle} \]

The brackets indicate a time average. The directional pattern of the adaptive array is shown in Fig. 4. 6.

Fig. 4. 6. Directional response of the adaptive array for
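A minimal sketch of the normalised update for the back-to-back cardioid output $y = c_F - \beta c_B$ is given below. The exponential running average used for the time average, the step size, the clamping of $\beta$ to the range 0 to 1, and the synthetic cardioid signals are assumptions made for this example, not details taken from [15].

```python
import math

def adapt_beta(c_forward, c_backward, mu=0.05, smooth=0.9, eps=1e-8):
    """Normalised LMS update of beta for y = c_F - beta * c_B.

    Returns the output samples and the final beta. The running average
    `power` stands in for the time average of c_B squared.
    """
    beta, power, output = 0.0, 0.0, []
    for cf, cb in zip(c_forward, c_backward):
        y = cf - beta * cb
        output.append(y)
        power = smooth * power + (1.0 - smooth) * cb * cb
        beta += 2.0 * mu * y * cb / (power + eps)
        beta = min(1.0, max(0.0, beta))  # keep the null in the rear half plane
    return output, beta

# Synthetic case: the forward cardioid contains 0.4 times the backward
# cardioid signal, so the output power is minimised at beta = 0.4.
c_b = [math.sin(0.3 * n) for n in range(2000)]
c_f = [0.4 * x for x in c_b]
out, beta = adapt_beta(c_f, c_b)

# If the unconstrained optimum lies above 1, the clamp holds beta at 1.
_, beta_clamped = adapt_beta([1.7 * x for x in c_b], c_b)
```

With these settings the normalised step never exceeds one, so the update cannot overshoot the optimum in this noise-free example.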


Chapter-5

21

Chapter 5 Beamforming and GUI

In speech communication systems such as hearing aids, wireless communication, radar and sonar, the recorded speech is corrupted by various background noises. The reason is that the recording microphone array is located at a certain distance from the speaker, which causes the microphones to record the background noise as well. The background noise arises from audio equipment and other speakers present. The impact of this background noise on the speech quality depends on the acoustic environment. The intelligibility of the recorded speech signal is degraded by the background noise. The signal recorded by the microphone should therefore be enhanced to improve the quality and the intelligibility of the speech signal.

Beamforming is one of the simplest methods for distinguishing signals based on their physical location. It is used in combination with an array of sensors to provide a versatile form of spatial filtering. The sensors collect spatial samples of the propagating wave, which are then processed by the beamformer. The term beamforming derives from spatial filters that are designed to form a beam in order to receive a signal from the direction of interest and attenuate signals from other directions. The objective of beamforming is to estimate the signal arriving from the desired direction in the presence of noise and interfering signals. The desired signal and the interfering signals originate from different spatial directions [27]. A beamformer performs spatial filtering to separate signals that have overlapping frequency content but originate from different spatial locations. Beamforming is applicable to either radiation or reception of energy.

Beamforming is used to extract a signal contaminated by interference on the basis of directivity. The extraction is performed by processing the signals obtained from multiple sensors, such as microphones, antennas or sonar transducers, located at different positions in space. Beamforming can be used on both the transmitter and the receiver side. During transmission, the beamformer controls the phase and amplitude of the signal at each transmitter in order to obtain a pattern of constructive and destructive interference [16]. On the receiving side, the information from the different sensors is combined to obtain the desired radiation pattern. In microphone arrays, beamforming is used for speech enhancement.


Beamforming can be considered as multidimensional signal processing in space and time. The general block diagram of a beamformer, in which the sensors are placed at different locations, is shown in Fig. 5. 1.

Fig. 5. 1. Block diagram of beamformer

The signals picked up by the sensors at a particular instant of time are considered a snapshot. The beamformer combines the sensor signals so that signals arriving from a particular direction are amplified, while signals from other directions are attenuated.

5.1 Basics structure of Beamforming

Let us assume microphone array and beamformer. Consider the desired signal is

received by the Omni-directional microphone at a time instant as shown in Fig. 5. 2.

Fig. 5. 2. Signal model for microphone array and beamforming


Let us consider the source signal as , noise signal as and time delay

between the source and microphones as . Let us assume that the

microphone output as is the attenuated and delayed

version of the source and noise given by

where the source signal and noise signal are considered as statistically independent. The

frequency domain representation of the microphone output is given as

The vector representation of the microphone array is given by

The data vector is given as:

and

The represents array steering vector and depends on microphone and source location

and is given as

where is the gain scaling of microphone and is given as

and time delay is given as

where represents the distance between the microphone and reference

microphone respectively and represents the speed of sound. The source signal is


retrieved by processing with frequency domain filter weights . The weight vector

is given as:

The output of the beamformer is the sum of the weighted microphone outputs and is given as

where $(\cdot)^H$ denotes the Hermitian transpose, and is represented in vector form as

In this way, a microphone array is used in combination with a beamformer to enhance the speech signal in a noise-contaminated speech signal [17].
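The frequency-domain signal model of this section can be sketched as follows. The sketch assumes a steering vector with elements $a_i e^{-j\omega\tau_i}$ and a beamformer output of the form $\mathbf{w}^H\mathbf{x}$; the delays, gain scalings, analysis frequency and the choice of matched weights are hypothetical values picked for illustration.

```python
import cmath
import math

def steering_vector(omega, delays, gains):
    """Elements a_i * exp(-j*omega*tau_i), one per microphone."""
    return [a * cmath.exp(-1j * omega * tau) for a, tau in zip(gains, delays)]

def beamformer_output(weights, data):
    """w^H x: sum of conjugated weights times the microphone spectra."""
    return sum(w.conjugate() * x for w, x in zip(weights, data))

omega = 2 * math.pi * 1000.0               # assumed analysis frequency
delays = [0.0, 1.0e-5, 2.0e-5, 3.0e-5]     # hypothetical delays in seconds
gains = [1.0, 0.9, 0.8, 0.7]               # hypothetical gain scalings
steer = steering_vector(omega, delays, gains)

source = 0.5 + 0.2j                        # source spectrum at this frequency
x = [source * s for s in steer]            # noise-free microphone spectra

norm = sum(abs(s) ** 2 for s in steer)
w = [s / norm for s in steer]              # weights matched to the steering vector
y = beamformer_output(w, x)
```

In the noise-free case, weights matched to the steering vector return the source spectrum unchanged; with noise present, the same weights average the noise across the microphones.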

5.2 Types of Beamformers

Beamforming techniques are divided into two types:

I. Fixed beamforming

II. Adaptive beamforming

5.2.1 Fixed Beamforming

Fixed beamforming uses a fixed set of weights and time delays to combine the signals from the sensors in an array. This type of beamforming optimizes the array response for a particular direction and does not change that direction when the direction of the incident source signal changes. The beam is optimized for the direction of the desired source while sound from other directions is suppressed as much as possible. The directional response of the array is thus fixed to a particular angle. If the target source is not stationary, the signal enhancement performance decreases as the source moves away from the steering direction.
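A fixed beamformer of the kind described above can be sketched as a time-domain delay-and-sum operation. The sketch assumes integer sample delays that are known in advance; the pulse and the delay values are made up for illustration.

```python
def delay_and_sum(signals, delays):
    """Fixed beamformer: advance each channel by its known integer delay
    (in samples) and average the aligned channels."""
    n_out = min(len(s) - d for s, d in zip(signals, delays))
    return [sum(s[n + d] for s, d in zip(signals, delays)) / len(signals)
            for n in range(n_out)]

pulse = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0]
delays = [0, 1, 2]
# Each microphone sees the same pulse, delayed by its own sample count.
channels = [[0.0] * d + pulse for d in delays]

aligned = delay_and_sum(channels, delays)        # steered at the source
missteered = delay_and_sum(channels, [0, 0, 0])  # steered elsewhere
```

When the steering delays match the source, the pulse is recovered at full amplitude. With the wrong delays the channels add out of step and the peak drops, which is the performance loss described for a source that moves away from the steering direction.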

5.2.2 Adaptive Beamforming

A beamformer that forms its directivity pattern adaptively is called an adaptive beamformer. Adaptive beamforming is a powerful technique to enhance a signal of interest while suppressing interfering signals and noise at the output of the sensor array. Adaptive beamforming alters the directivity pattern according to changes in the acoustic environment and thus provides better performance than fixed beamforming. Adaptive beamforming is, however, more sensitive than fixed beamforming to errors such as sensor mismatch, mis-steering and correlated reflections [18].

For an array of microphones, a general adaptive beamforming system is shown in Fig. 5. 3.

Fig. 5. 3. An adaptive beamforming system

Adaptive beamforming is used to create beams towards the signal of interest and to suppress interfering signals from all other directions. The input signal $\mathbf{x}(n)$ received by the microphones is multiplied by a weight vector $\mathbf{w}$ to adjust the phase and amplitude of the incoming signal. The weighted signals are summed to produce the array output $y(n)$. An adaptive algorithm is applied to minimize the error $e(n)$ between the desired signal $d(n)$ and the array output $y(n)$. The output of the beamformer at time instant $n$ is given by the equation

\[ y(n) = \mathbf{w}^H \mathbf{x}(n) \]

where $\mathbf{w} = [w_1, w_2, \ldots, w_M]^T$ and $\mathbf{x}(n) = [x_1(n), x_2(n), \ldots, x_M(n)]^T$ for $M$ microphones. The weights adjust the amplitude of the signals so that, when added together, they produce the desired beam of interest [19].

Adaptive beamformers have a higher capability for reducing noise from unknown directions than fixed beamformers and can potentially provide better performance. Adaptive beamformers are sensitive to steering errors and may suffer from signal leakage and degradation of the desired signal. For this reason conventional adaptive beamforming has not gained widespread acceptance for speech applications. Robust modifications that avoid signal leakage and signal cancellation have been an important topic in microphone array applications. The GSC is an adaptive beamforming solution that has been proposed for microphone array processing.

5.2.3 Acoustic Beamforming

Acoustic beamforming is a technique in which the microphone array is placed in the far field of the source. As a rule of thumb, the far field is defined as being further away from the source than the array dimensions or diameter. The region between the near field and the far field remains a grey zone. In the near field, sound waves behave like circular or spherical waves, whereas in the far field they become planar waves. Acoustic beamforming modifies the propagation of sound by introducing a spatially dependent delay into a wave front. This focuses incoming sound from a single source or direction into a small volume of space so that it can be detected by a single transducer. Acoustic beamforming can efficiently enhance the speech of interest while suppressing interference and background noise. It allows people to move around freely without wearing or holding a microphone. Acoustic beamforming makes it possible to enhance the signal from a specific individual while background noise (other speech, motors, movement, etc.) is present. Acoustic beamforming is sometimes called “sum and delay”, since it considers the relative delay of the sound wave reaching the different microphone positions. Acoustic beamforming requires that all data are measured simultaneously [28].

The main advantage of acoustic beamforming is its good spatial resolution, and its main disadvantage is that it does not perform well in the low frequency range. To counter this disadvantage we choose a high frequency range, above 8000 Hz [28].

5.3 Generalized Sidelobe Canceller

The generalized sidelobe canceller (GSC) is one of the most common and successful approaches in microphone array applications. The GSC is used to reduce interfering noise from non-target locations in array beamforming [5]. It can be used as an adaptive noise canceller in array processing. The structure is used with arrays that have been time-delay steered such that the desired signal of interest appears in phase at the steered output. The GSC is very susceptible to bursts of interfering noise. The block diagram of the GSC is shown in Fig. 5. 4.


Fig. 5. 4. Block diagram of Generalized Sidelobe Canceller

The structure of the GSC consists of a non-adaptive filter and an adaptive part. The non-adaptive filter is steered in the direction of the input signal and consists of a fixed beamformer such as a delay-and-sum beamformer. The adaptive part of the GSC is the cascade of a blocking matrix and an adaptive filter. The adaptive part estimates the undesired components: the blocking matrix blocks the desired input signal and lets all other signals pass. The adaptive filter is used to match the interference in the adaptive branch as closely as possible to the interference in the non-adaptive branch. The noise is reduced with a simple unconstrained NLMS algorithm [20]. Fig. 5. 5. depicts one simple realization of the GSC.

A signal flow diagram of the GSC is shown in Fig. 5. 4. The input signal is applied to an array of microphones that is steered towards the desired focal point by means of time delays. The upper part of the GSC is a delay-and-sum beamformer, which delays the signal received at each microphone and sums the delayed signals. The lower part of the GSC consists of a blocking matrix, which processes the signals from the microphone array in order to estimate the noise reference signals. A delay of a fixed number of samples is applied to the output of the delay-and-sum beamformer to match the signal processing delay introduced by the adaptive filtering in the lower part of the GSC.

Let us assume that the system consists of microphones. The output of the delay-

and-sum-beamformer is given as


Fig. 5. 5. Detailed structure of Generalized Sidelobe Canceller

In this case the blocking stage is realized by simply subtracting the signals of pairs of sensors. The output of the blocking matrix is then

where is a blocking matrix. The output of the adaptive path can be written in terms of

and adaptive filter as

The total output of the GSC beamforming is given as


where the vector of the adaptive filter is weights for each blocking matrix and

is the blocking matrix output. The filter weights of the NLMS algorithm are

updated using

where is given by

The value of is given as for the stability of the system the value of the should

be very small.
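The GSC structure described in this section can be sketched in a few lines. This is a simplified illustration under several assumptions: the channels are already time-aligned towards the target, the blocking matrix takes differences of adjacent channels, one NLMS FIR filter is used per blocked channel, and the compensating delay in the fixed path is omitted. The filter length, step size and synthetic signals are hypothetical.

```python
import math

def gsc(mics, taps=8, mu=0.02, eps=1e-6):
    """Minimal GSC on time-aligned channels.

    Fixed path: average of the channels. Blocking matrix: differences of
    adjacent channels. Adaptive path: one NLMS FIR filter per blocked channel.
    """
    n_mics, n_samples = len(mics), len(mics[0])
    weights = [[0.0] * taps for _ in range(n_mics - 1)]
    history = [[0.0] * taps for _ in range(n_mics - 1)]
    output = []
    for n in range(n_samples):
        fixed = sum(ch[n] for ch in mics) / n_mics
        for k in range(n_mics - 1):
            blocked = mics[k][n] - mics[k + 1][n]   # the target cancels here
            history[k] = [blocked] + history[k][:-1]
        estimate = sum(w * z for wk, zk in zip(weights, history)
                       for w, z in zip(wk, zk))
        y = fixed - estimate
        energy = sum(z * z for zk in history for z in zk) + eps
        for wk, zk in zip(weights, history):
            for i in range(taps):
                wk[i] += mu * y * zk[i] / energy    # normalised LMS step
        output.append(y)
    return output

def tail_mse(a, b, tail=500):
    """Mean squared difference over the last `tail` samples."""
    return sum((x - y) ** 2 for x, y in zip(a[-tail:], b[-tail:])) / tail

n_samples = 5000
target = [math.sin(0.05 * n) for n in range(n_samples)]
interf = [math.sin(0.9 * n + 1.0) for n in range(n_samples)]
leak = [1.0, 0.5, -0.3, 0.8]   # hypothetical interference gain per microphone
mics = [[s + g * v for s, v in zip(target, interf)] for g in leak]

out = gsc(mics)
fixed_only = [sum(ch[n] for ch in mics) / len(mics) for n in range(n_samples)]
```

Because the target is identical on all aligned channels, it does not appear at the blocking matrix output, so the adaptive path removes interference without cancelling the target. Any mismatch between the channels breaks this property and produces the signal leakage discussed in the text.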

The adaptive path of the GSC reduces coherent noise, but it performs poorly for non-coherent noise. For this reason the GSC is used for the rejection of unknown directional interference [21]. In real-world applications, errors in the microphone positions, errors in the assumed source position and differences between the characteristics of the microphones cause the target signal to leak into the blocking matrix output. This leads to cancellation of the target signal and reduces the SNR of the system. To decrease the signal leakage, the blocking matrix of the GSC is replaced with the Elko algorithm. The structure of the modified GSC is shown in Fig. 5. 6.

5.4 Elko Based Generalized Sidelobe Canceller

Fig. 5. 6. shows the proposed Elko based GSC. The proposed system is a combination of a microphone array, the Elko algorithm and the adaptive part of the GSC. The microphones are placed in an arc; the proposed model uses 8 microphones. A sound wave is passed through the microphone array. The input sound signal reaches the microphones with different time delays, because the source is placed at a distance from the microphones such that the array lies in the far field of the source. The angle of arrival of the speech signal at the microphones is calculated as explained in Chapter 3. The output signals of the microphones are passed to the Elko algorithm, which operates on the output signals of a pair of microphones. The eight microphones are treated as five pairs of microphones, as shown in Fig. 5. 6., and the Elko algorithm is applied to each pair. The Elko algorithm is described in Chapter 4; the version used here is shown in Fig. 4. 3. A noise signal is applied in the same way as the speech signal. The output speech and noise signals from the pairs of microphones are added together. The output signals after

adding both the noise and speech signal are named as , , , and .

The microphones output which are straight forward to the source signal are and ,

the output from these pair of microphone is and is considered as the output of fixed

path in the GSC or as a main lobe. All the other elko outputs i.e., and

are considered as the output from the blocking matrix of the GSC or as the sidelobes

and are applied to the adaptive path of the GSC. The adaptive part of the GSC consists of an

unconstrained NLMS algorithm.

Fig. 5. 6. Structure of Proposed Elko Based Generalized Sidelobe Canceller Model

Let us assume that the signal arriving the and as and and the angle of

arrival of the speech signal is considered as and . The speech signal from the is

multiplied with forward cardioid and the signal from the is multiplied with backward

cardioid is given as


where is the distance between the microphones and is given as which is

equivalent to 1cm distance, ,

and is the speed of sound.

Elko algorithm is applied on the signals which are obtained by multiplying with

forward and backward cardioid. In the elko algorithm, initially a unit delay is applied to the

signals the delayed signals is considered as and . From these delayed signals the

forward and backward cardioids are obtained as

The output from the elko algorithm for speech as the input signal is given as

where is a constant.

Similarly noise signal is passed through the microphones and the output of the elko

algorithm with noise as input is given as . The output of the elko algorithm for and

with speech and noise as input is given as

In the similar manner elko algorithm is applied to all the other microphone pairs and

the outputs are named as and . To enhance the speech signal the

output of the elko algorithm is applied to the adaptive part of the GSC which consist of an

NLMS algorithm. For the NLMS algorithm is considered as the reference signal. Let

us consider that all the other elko are kept in a vector form as


The output vector of the elko algorithm is further applied to adaptive filter. The output

of the adaptive algorithm is given as

where is the number of microphones and represents the vector of the output elko

vector. The total output of the proposed system is given as

where is the weight vector of the adaptive filter for the elko vector . These

filter weights are updated by the NLMS algorithm. Weight update equation for the NLMS

algorithm is given as

where is given by

The value of is given as for the stability of the system the value of the should

be very small. In this way an elko based GSC is implemented to enhance the speech signal

from the noisy speech signal.


5.5 Graphical User Interface

MATLAB code is normally run from the command line, which makes the program difficult to
follow during execution. Most people prefer to perform a task simply, with the unnecessary
clutter and technical detail of the program hidden. A user-friendly interface is needed to
simplify the entry point of the program and to encapsulate its functional behavior. In
MATLAB, a graphical front end such as a graphical user interface (GUI) serves this purpose.
A GUI gives a pictorial representation of the program and uses graphics and text input to
provide a familiar environment in which the user executes the program. GUI-based programs
must be prepared for mouse clicks. Each control in the GUI has a user-written routine known
as a callback, which is used to call back MATLAB and ask it to do things. The execution of a
callback is triggered by a user action such as clicking a mouse button, selecting a menu item
or pressing an on-screen button. The GUI then responds to these events; this type of
programming is called event-driven programming. In event-driven programming, callback
execution is asynchronous because it is triggered by an event external to the software [22].
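The callback mechanism can be sketched in a few lines. The example below is written in
Python rather than MATLAB and does not use any MATLAB API; the class Control, the function
run_event_loop and the handler on_start are hypothetical names used only to show how a
control stores a user-written routine that runs when an event arrives.

```python
class Control:
    """A GUI control that stores one user-written callback."""

    def __init__(self, name, callback=None):
        self.name = name
        self.callback = callback

    def trigger(self, event):
        # Called by the event loop when the user acts on this control.
        if self.callback is not None:
            return self.callback(self, event)
        return None


def run_event_loop(controls, events):
    """Dispatch (control_name, event) pairs in the order they arrive."""
    return [controls[name].trigger(event) for name, event in events]


def on_start(control, event):
    # User-written routine attached to the start button.
    return f"{control.name}: run enhancement ({event})"


controls = {
    "start": Control("start", on_start),
    "axes": Control("axes"),  # a control without a callback
}
log = run_event_loop(controls, [("start", "click"), ("axes", "click")])
```

The program does not decide when on_start runs; the order of the incoming events does,
which is the sense in which callback execution is asynchronous.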

The GUI enables the user to analyze the performance of the system using SNR values and
graphical representations of the input signal, the output signal, and the speech and noise
distortion. The layout of the GUI designed for evaluation of the proposed system is shown
in Fig. 5. 7.

Fig. 5. 7. GUI layout used for the design of the proposed model


Fig. 5. 7. represents the layout of the GUI designed for our system. The components used in
the design of the GUI are push buttons, edit boxes, popup menus and axes. A brief description
of these components is given in Table 5.1.

TABLE 5.1

BASIC COMPONENTS USED IN THE GUI

Element       Description
Push button   Created by uicontrol. Triggers a callback when clicked with the mouse.
Edit box      Created by uicontrol. Displays a string and allows the user to modify
              the information. Triggers a callback when the user presses the Enter key.
Popup menu    Created by uicontrol. Displays a series of text strings in response to
              a mouse click.
Axes          Creates a new set of axes on which data is displayed. Never triggers a
              callback.

In the layout shown in Fig. 5. 7., the start, clear and close buttons are push buttons. The
input noise signal, output signal and input SNR selectors are popup menus, each used to
select one value from a list of values. The input signal, output signal, SD and ND plots are
axes components, which display the information graphically. The Elko SNR, output SNR, SD and
ND values are displayed in edit boxes.

To run the GUI designed for our model, select the type of input noise given to the system,
the output signal from the system and the input SNR value, and enter the values of the order
and the step size. Pressing the start button executes the corresponding callback program and
displays the results. To clear the results of the previous execution, press the clear
button. To close the GUI, press the close button.


Chapter 6 Evaluation

This chapter deals with the performance evaluation of the speech enhancement system for
hearing aids proposed in the previous chapters. The quality of the processed speech
determines whether the enhancement effort is worthwhile. Evaluating speech enhancement
requires a series of objective measures to be conducted on the proposed system; these
measures determine the quality of the output signal. The objective measures used are the
SNR, the SNRI, the PESQ value under ITU-T P.862, which measures the quality of the speech
signal, and the SD and ND [23]. Objective measures are widely used in speech enhancement.
Their advantages are that the results can easily be viewed for verification and that a large
amount of test data can be evaluated using a computer. Although the overall noise is
reduced, a small amount of noise may remain in the processed signal. This chapter describes
the tests employed and the test data used.

6.1 Test Data

6.1.1 Clean Speech Data

The speech signals used for the test are sampled at 16 kHz. Each signal is a short speech
sample of 3 seconds. Two male voices and one female voice are used as the test data. The
speech files used throughout the test are listed in Table 6.1. Sentence.wav is used as the
main speech signal and the other two voice signals are used as interference signals.

TABLE 6.1

TYPE OF MALE AND FEMALE SENTENCES USED FOR EVALUATION

File Name      Type of Voice   Sentence
Sentence.wav   Male Voice      “She sells seashells by the seashore”
Man.wav        Male Voice      “Someone walking on the side walk with the rainbow”
Woman.wav      Female Voice    “A good birthday has canoes with cap cakes cargoes in rainbow color”


6.1.2 Noise Data

Various noise signals are used for the evaluation of the proposed method. All the noise
signals are taken from the Noisex-92 database [24]. The noise signals are resampled from
their original recording rate to 16 kHz, the sampling frequency of the speech signal. These
noise signals and the interfering male and female voices are added to the speech signal at
different SNR values. The input SNR is set to levels such as 0 dB, 5 dB, 10 dB, 15 dB and
20 dB using a formula based on the variance of the speech signal and the variance of the
noise signal. The noise signals used for the evaluation are briefly described in the
following subsections.
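Setting the input SNR from the two variances can be sketched as follows. This is a minimal
illustration that assumes the noise, rather than the speech, is scaled before the two
signals are added; the function names variance and mix_at_snr and the synthetic sine
signals are hypothetical stand-ins for the recorded speech and noise files.

```python
import math


def variance(x):
    """Population variance of a list of samples."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 10*log10(var(speech)/var(noise)) equals snr_db.

    Returns the noisy mixture and the scaled noise.
    """
    gain = math.sqrt(variance(speech) / (variance(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * v for v in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


# Synthetic stand-ins for the speech and noise recordings (assumption).
speech = [math.sin(0.1 * n) for n in range(1000)]
noise = [0.2 * math.sin(1.7 * n + 0.3) + 0.05 for n in range(1000)]
noisy, scaled = mix_at_snr(speech, noise, 10)
achieved = 10 * math.log10(variance(speech) / variance(scaled))
```

Calling mix_at_snr once per level (0, 5, 10, 15 and 20 dB) gives one noisy test signal for
each input SNR used in the evaluation.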

6.1.2.1 Babble Noise

Babble noise is the most challenging interference noise for a speech system. This type of
noise is highly non-stationary and is obtained by recording the voices of people speaking in
a canteen. The samples were recorded with a B&K condenser microphone onto DAT. The room
radius is over two meters; therefore, individual voices are slightly audible. The sound
level during the recording process was . The power spectrum of the babble noise is shown in
Fig. 6. 1.

Fig. 6. 1. Power Spectrum of Babble Noise

[Plot for Fig. 6. 1.: Power spectrum of Babble Noise. x-axis: Frequency in [Hz]; y-axis: Power in [dB].]

6.1.2.2 Car Interior Noise

This recording was made in a Volvo car at , in the gear, on an
asphalt road, in rainy conditions. Many speech enhancement systems perform well with car
interior noise due to the low-pass nature of this noise. The power spectrum of the car
interior noise is shown in Fig. 6. 2.

Fig. 6. 2. Power Spectrum of Car Interior Noise

6.1.2.3 Destroyer Engine Noise

This type of noise was recorded with a microphone onto DAT. Sound level during the
recording process is . The power spectrum of this type of noise is shown in Fig. 6. 3.

Fig. 6. 3. Power Spectrum of Destroyer Engine Noise

[Plot for Fig. 6. 2.: Power spectrum of Car Interior Noise. x-axis: Frequency in [Hz]; y-axis: Power in [dB].]

[Plot for Fig. 6. 3.: Power spectrum of Destroyer Engine Noise. x-axis: Frequency in [Hz]; y-axis: Power in [dB].]

6.1.2.4 Tank Noise

This type of noise was recorded from a tank using a B&K condenser microphone onto DAT. The
tank is moving at a speed of . The sound level during the recording process was . The power
spectrum of the tank noise is shown in Fig. 6. 4.

Fig. 6. 4. Power Spectrum of Tank Noise

6.1.2.5 Wind Noise

Wind noise is caused by turbulent airflow over and around an object. The wind is an
invisible force. When wind strikes the surface of the microphone, it produces an effect
called wind noise. The power spectrum of the wind noise is shown in Fig. 6. 5.

Fig. 6. 5. Power Spectrum of Wind Noise

[Plot for Fig. 6. 4.: Power spectrum of Tank Noise. x-axis: Frequency in [Hz]; y-axis: Power in [dB].]

[Plot for Fig. 6. 5.: x-axis: Frequency in [Hz]; y-axis: Power in [dB].]

6.2 Objective Measures

Various objective tests used to measure the performance of the proposed system are

described below.

6.2.1 Signal to Noise Ratio

The signal to noise ratio (SNR) is used to compare the level of the desired signal to the
level of the background noise. The conventional method of measuring the SNR is to compute
the ratio of the speech energy to the noise energy after the enhancement, and is given as

$$\mathrm{SNR} = 10 \log_{10}\left(\frac{\sigma_s^2}{\sigma_n^2}\right)$$

where $\sigma_s^2$ is the variance of the speech signal and $\sigma_n^2$ is the variance of
the noise signal.

6.2.2 SNR Improvement

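SNR improvement (SNRI) is measured by subtracting the input SNR value from the output SNR
value and is expressed as follows

$$\mathrm{SNRI} = 10 \log_{10}\left(\frac{\sigma_{s,\mathrm{out}}^2}{\sigma_{n,\mathrm{out}}^2}\right) - 10 \log_{10}\left(\frac{\sigma_{s,\mathrm{in}}^2}{\sigma_{n,\mathrm{in}}^2}\right)$$

where $\sigma_{s,\mathrm{out}}^2$ is the variance of the output speech signal,
$\sigma_{n,\mathrm{out}}^2$ is the variance of the output noise signal,
$\sigma_{s,\mathrm{in}}^2$ is the variance of the input speech signal and
$\sigma_{n,\mathrm{in}}^2$ is the variance of the input noise signal.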
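The two measures can be computed directly from the variances, as in the sketch below. It
assumes that the speech and noise components are available separately at both the input and
the output of the system, and it uses the usual 10 log10 variance ratio for the SNR. The
function names and the four-sample signals are hypothetical and serve only as a worked
example.

```python
import math


def variance(x):
    """Population variance of a list of samples."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def snr_db(speech, noise):
    """SNR in dB as the ratio of speech variance to noise variance."""
    return 10 * math.log10(variance(speech) / variance(noise))


def snri_db(speech_in, noise_in, speech_out, noise_out):
    """SNR improvement: output SNR minus input SNR, in dB."""
    return snr_db(speech_out, noise_out) - snr_db(speech_in, noise_in)


# Worked example: the noise amplitude drops by a factor of 10 while the
# speech is unchanged, so the SNR improves by 20 dB.
speech_in = [1.0, -1.0, 1.0, -1.0]
noise_in = [0.1, -0.1, 0.1, -0.1]
speech_out = speech_in
noise_out = [0.01, -0.01, 0.01, -0.01]
input_snr = snr_db(speech_in, noise_in)
improvement = snri_db(speech_in, noise_in, speech_out, noise_out)
```

Here the input SNR is 20 dB and the output SNR is 40 dB, so the SNRI is 20 dB.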

6.2.3 Perceptual Evaluation of Speech Quality

Perceptual Evaluation of Speech Quality (PESQ) is the international standard for objective
speech quality measurement and is well known as an intrusive objective speech quality
assessment method. PESQ is an objective measure, but it is based on cognitive models of the
human hearing organ to form pseudo-subjective scores, and it has a high correlation with
real subjective tests. It is standardized as ITU-T P.862. PESQ operates on a transmitted
(input) signal and a received (output) speech signal to compute the perceptual quality of
the received signal. PESQ is used in Voice over Internet Protocol (VoIP), mobile
transmission and fixed networks to measure the quality of the speech signal. The evaluation
of a system using the PESQ measure is shown in Fig. 6. 6.

Fig. 6. 6. Structure of Perceptual Evaluation of Speech Quality model

A number of objective measures have been examined in previous studies for predicting the
intelligibility of speech in noisy conditions, and PESQ is the most widely used. Among the
objective measures considered, the PESQ measure is the most complex to compute and is the
one recommended by a standardization agency, the International Telecommunication Union
(ITU-T 2000), for speech quality assessment of 3.2 kHz (narrow-band) handset telephony and
narrow-band speech codecs [25, 29].

The PESQ measure is computed as follows:

The original (clean) and degraded signals are first level-equalized to a standard listening
level and filtered by a filter with a response similar to that of a standard telephone
handset. The signals are time-aligned to correct for time delays and then processed through
an auditory transform to obtain the loudness spectra. The difference in loudness between the
original and degraded signals is computed and averaged over time and frequency to produce
the prediction of the subjective quality rating.

Finally, the output of the system determines the PESQ value of the signal. PESQ delivers an
output value in the range -0.5 to 4.5. A value near -0.5 indicates poor quality of the voice
signal, and a value near 4.5 indicates excellent quality of the voice signal [26].

[Block diagram for Fig. 6. 6.: signals s(n), v(n), x(n) and y(n); blocks System under Test and PESQ P.862; output PESQ Score.]

6.2.4 Measurement of Speech Distortion

SD is defined as the spectral deviation between the power of the clean input speech signal
and the power of the processed speech signal at the output. A reference power level of the
enhanced output signal is obtained by normalizing the target speech signal. The normalizing
factor is given as

SD is given by

where is the power of the input speech signal and is the power of the output speech signal.

6.2.5 Measure of Noise Distortion

ND is defined as the spectral deviation between the power of the input noise signal and the
power of the processed noise signal at the output. A reference power level of the enhanced
output signal is obtained by normalizing the target noise signal. The normalizing factor is
given as

ND is given by

where is the power of the input noise signal and is the power of the output noise signal.


6.3 Test Results with Various Noise Signals

In this thesis, we consider a source signal, a noise signal and eight microphones placed in
an arc. The distance between the microphones is 1 cm, as shown in Fig. 6. 7. The red dot
indicates the position of the source signal, the black dots indicate the positions of the
microphones and the blue dot indicates the position of the noise signal.

Fig. 6. 7. Graph represents the position of source, noise signal and microphones position

We use a clean male speech signal sampled at 16 kHz as the test signal for validating the
system. This speech signal is corrupted with various noise signals, namely babble noise, car
interior noise, tank noise, wind noise, a male voice as interference noise, destroyer engine
noise and a female voice as interference noise, at 0 dB, 5 dB, 10 dB, 15 dB and 20 dB for
testing the system. The graphical representation of the clean speech signal “she sells
seashells by the seashore” is shown in Fig. 6. 8.

Fig. 6. 8. Graphs represent a clean speech signal

[Plot for Fig. 6. 7.: Position of Source Signal, Microphones and Noise Signal.]

Fig. 6. 9. (a), 6. 15. (a), 6. 21. (a), 6. 27. (a), 6. 33. (a), 6. 39. (a) and 6. 45. (a)
show the speech signal corrupted with babble noise, car interior noise, tank noise, wind
noise, man voice as interference noise, destroyer engine noise and woman voice as
interference noise, respectively. Fig. 6. 9. (b), 6. 15. (b), 6. 21. (b), 6. 27. (b),
6. 33. (b), 6. 39. (b) and 6. 45. (b) show the corresponding enhanced speech signals.
Fig. 6. 10., 6. 16., 6. 22., 6. 28., 6. 34., 6. 40. and 6. 46. show the measured SNR values.
Fig. 6. 11., 6. 17., 6. 23., 6. 29., 6. 35., 6. 41. and 6. 47. show the PESQ scores of the
input and output signals. Fig. 6. 12., 6. 18., 6. 24., 6. 30., 6. 36., 6. 42. and 6. 48.
show the SD between the clean speech signal and the enhanced speech signal for the various
noises. Fig. 6. 13., 6. 19., 6. 25., 6. 31., 6. 37., 6. 43. and 6. 49. show the ND between
the input noise signal and the output noise signal. Fig. 6. 14., 6. 20., 6. 26., 6. 32.,
6. 38., 6. 44. and 6. 50. show the GUI layout designed for the proposed system with the
various noise signals and different input SNR values. Tables 6.2 to 6.8 give the SNRI, SD
and ND values for different input SNR values.

6.3.1 Evaluation of Babble Noise

Fig. 6. 9. Graphs represent (a) Corrupted speech with babble noise at 10 dB (b) Enhanced speech signal.

The enhanced speech signal produced by the proposed method is clean, with improved quality
and no audible noise.


Fig. 6. 10. Graph represents the SNR value for babble noise

Fig. 6. 11. Graph shows the PESQ value for babble noise

[Plot for Fig. 6. 10.: Graph Representing the SNR Value of Babble Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

[Plot for Fig. 6. 11.: PESQ of Babble Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

Fig. 6. 12. Graph shows the SD of babble noise at 10 dB

Fig. 6. 13. Graph shows the ND of babble noise at 10 dB

[Plot for Fig. 6. 12.: Speech Distortion Graph of Babble Noise. Legend: Input Speech, Output Speech.]

[Plot for Fig. 6. 13.: Noise Distortion Graph of Babble Noise. Legend: Input Noise, Output Noise.]

TABLE 6.2

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR BABBLE NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        17.0935   -21.6603            -30.9882
5 dB        16.9475   -28.3759            -36.8680
10 dB       17.2390   -27.7635            -41.1443
15 dB       17.3293   -27.3468            -45.7790
20 dB       17.2475   -27.1291            -50.7423

Fig. 6. 14. GUI Layout with babble as input noise at 10dB of input SNR

The proposed system performs well with babble noise, producing approximately 17 dB of SNRI.
The PESQ value indicates that the quality of the output speech signal is good compared to
that of the input signal.
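The figure of approximately 17 dB can be checked against the SNRI column of Table 6.2. The
short sketch below averages the five tabulated values and reports their spread; the
variable names are chosen only for this illustration.

```python
# SNRI values for babble noise at 0, 5, 10, 15 and 20 dB input SNR (Table 6.2).
snri_babble = [17.0935, 16.9475, 17.2390, 17.3293, 17.2475]

mean_snri = sum(snri_babble) / len(snri_babble)   # average improvement in dB
spread = max(snri_babble) - min(snri_babble)      # largest minus smallest value
```

The mean is about 17.17 dB and the values differ by less than 0.4 dB, so the improvement
is nearly independent of the input SNR for this noise type.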


6.3.2 Evaluation of Car Interior Noise

Fig. 6. 15. Graphs represent (a) Corrupted speech with car noise at 0 dB (b) Enhanced speech signal.

Fig. 6. 16. Graph represents the SNR value for car noise

[Plot for Fig. 6. 16.: Graph Representing the SNR Value of Car Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

Fig. 6. 17. Graph shows the PESQ value for car noise

Fig. 6. 18. Graph shows the SD of car interior noise at 0 dB

[Plot for Fig. 6. 17.: PESQ of Car Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

[Plot for Fig. 6. 18.: Speech Distortion Graph of Car Interior Noise. Legend: Input Speech, Output Speech.]

Fig. 6. 19. Graph shows the ND of car interior noise at 0 dB

TABLE 6.3

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR CAR INTERIOR NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        21.6896   -29.0254            -43.9777
5 dB        22.2870   -28.4140            -47.3695
10 dB       23.1764   -27.8790            -51.1349
15 dB       23.3517   -27.5479            -56.8825
20 dB       22.7619   -27.3463            -64.8594

The proposed method performs very well with car interior noise as the input noise,
producing approximately 22 dB of SNRI. On listening, the output speech signal is free from
noise and clean speech is audible at the output. The GUI layout for car interior noise at
0 dB is shown in Fig. 6. 20.

[Plot for Fig. 6. 19.: Noise Distortion Graph of Car Interior Noise. Legend: Input Noise, Output Noise.]

Fig. 6. 20. GUI Layout with car interior noise as input noise at 0dB of input SNR

6.3.3 Evaluation of Tank Noise

Fig. 6. 21. Graphs represent (a) Corrupted speech with tank noise at 5 dB (b) Enhanced speech signal.

Fig. 6. 22. Graph represents the SNR value for tank noise

Fig. 6. 23. Graph shows the PESQ value for tank noise

[Plot for Fig. 6. 22.: Graph Representing the SNR Value of Tank Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

[Plot for Fig. 6. 23.: PESQ of Tank Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

Fig. 6. 24. Graph shows the SD of tank noise at 5 dB

Fig. 6. 25. Graph shows the ND of tank noise at 5 dB

[Plot for Fig. 6. 24.: Speech Distortion Graph of Tank Noise. Legend: Input Speech, Output Speech.]

[Plot for Fig. 6. 25.: Noise Distortion Graph of Tank Noise. Legend: Input Noise, Output Noise.]

TABLE 6.4

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR TANK NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        18.0880   -29.0542            -31.7077
5 dB        18.4749   -28.4263            -36.0378
10 dB       18.8690   -27.9134            -40.3683
15 dB       18.8887   -27.5661            -45.2082
20 dB       18.5283   -27.3500            -50.5688

Fig. 6. 26. GUI Layout with tank noise as input noise at 5dB of input SNR

The proposed system performs well with tank noise, producing approximately 18 dB of SNRI.
The PESQ value indicates that the quality of the output speech signal is good compared to
that of the input signal. On listening, the output speech signal is free from noise and
clean speech is audible at the output.

6.3.4 Evaluation of Wind Noise

Fig. 6. 27. Graphs represent (a) Corrupted speech with wind noise at 0 dB (b) Enhanced speech signal.

Fig. 6. 28. Graph represents the SNR value for wind noise

[Plot for Fig. 6. 28.: Graph Representing the SNR Value of Wind Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

Fig. 6. 29. Graph shows the PESQ value for wind noise

Fig. 6. 30. Graph shows the SD of wind noise at 0 dB

[Plot for Fig. 6. 29.: PESQ of Wind Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

[Plot for Fig. 6. 30.: Speech Distortion Graph of Wind Noise. Legend: Input Speech, Output Speech.]

Fig. 6. 31. Graph shows the ND of wind noise at 0 dB

TABLE 6.5

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR WIND NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        18.3066   -29.0286            -34.3433
5 dB        18.8293   -28.3215            -38.2911
10 dB       19.3762   -27.7482            -42.4971
15 dB       19.6856   -27.3794            -47.3313
20 dB       19.6901   -27.1757            -52.5111

The proposed system performs well with wind noise, producing approximately 19 dB of SNRI.
The PESQ value indicates that the quality of the output speech signal is good compared to
that of the input signal. On listening, the output speech signal is free from noise and
clean speech is audible at the output.

[Plot for Fig. 6. 31.: Noise Distortion Graph of Wind Noise. Legend: Input Noise, Output Noise.]

Fig. 6. 32. GUI Layout with wind noise as input noise at 0 dB of input SNR

6.3.5 Evaluation of Man Voice as Interference Noise

Fig. 6. 33. Graphs represent (a) Corrupted speech with man interference noise at 5 dB (b) Enhanced speech signal.

Fig. 6. 34. Graph represents the SNR value for man voice as interference noise

Fig. 6. 35. Graph shows the PESQ value for man voice as interference noise

[Plot for Fig. 6. 34.: Graph Representing the SNR Value of Male Voice as Interference Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

[Plot for Fig. 6. 35.: PESQ of Man Voice as Interference Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

Fig. 6. 36. Graph shows the SD of man voice as interference noise at 5 dB

Fig. 6. 37. Graph shows the ND of man voice as interference noise at 5 dB

[Plot for Fig. 6. 36.: Speech Distortion Graph of Man Voice as Interference Noise. Legend: Input Speech, Output Speech.]

[Plot for Fig. 6. 37.: Noise Distortion Graph of Man Voice as Interference Noise. Legend: Input Noise, Output Noise.]

TABLE 6.6

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR MAN VOICE AS INTERFERENCE NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        14.5147   -21.6962            -30.1394
5 dB        15.3550   -27.9082            -35.6224
10 dB       15.5771   -27.5811            -40.0245
15 dB       15.7168   -27.3656            -44.7639
20 dB       15.7653   -27.2448            -49.7774

Fig. 6. 38. GUI Layout with man voice as interference noise at 5 dB of input SNR

The proposed system performs well with man voice as interference noise, producing
approximately 15 dB of SNRI. The PESQ value indicates that the quality of the output speech
signal is good compared to that of the input signal. On listening, the output speech signal
is free from noise and clean speech is audible at the output.

6.3.6 Evaluation of Destroyer Engine Noise

Fig. 6. 39. Graphs represent (a) Corrupted speech with destroyer engine noise at 0 dB (b) Enhanced speech signal.

Fig. 6. 40. Graph represents the SNR value for destroyer engine noise

[Plot for Fig. 6. 40.: Graph Representing the SNR Value of Destroyer Engine Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

Fig. 6. 41. Graph shows the PESQ value for destroyer engine noise

Fig. 6. 42. Graph shows the SD of destroyer engine noise at 0 dB

[Plot for Fig. 6. 41.: PESQ of Destroyer Engine Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

[Plot for Fig. 6. 42.: Speech Distortion Graph of Destroyer Engine Noise. Legend: Input Speech, Output Speech.]

Fig. 6. 43. Graph shows the ND of destroyer engine noise at 0 dB

TABLE 6.7

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR DESTROYER ENGINE NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        16.4261   -29.7541            -33.4305
5 dB        12.3759   -28.8133            -37.7039
10 dB       11.3883   -28.0778            -42.1343
15 dB       12.5644   -27.6350            -46.9517
20 dB       13.4878   -27.3962            -51.9464

The proposed system performs well with destroyer engine noise, producing approximately
13 dB of SNRI. The PESQ value indicates that the quality of the output speech signal is good
compared to that of the input signal. On listening, the output speech signal is free from
noise and clean speech is audible at the output.

[Plot for Fig. 6. 43.: Noise Distortion Graph of Destroyer Engine Noise. Legend: Input Noise, Output Noise.]

Fig. 6. 44. GUI Layout with destroyer engine noise at 0 dB of input SNR

6.3.7 Evaluation of Female Voice as Interference Noise

Fig. 6. 45. Graphs represent (a) Corrupted speech with female interference noise at 10 dB (b) Enhanced speech signal.

Fig. 6. 46. Graph represents the SNR value for female voice as interference noise

Fig. 6. 47. Graph shows the PESQ value for female voice as interference noise

[Plot for Fig. 6. 46.: Graph Representing the SNR Value of Female Voice as Interference Noise. x-axis: Input SNR in dB; y-axis: Elko, Output SNR; legend: Input SNR, Elko SNR, Output SNR.]

[Plot for Fig. 6. 47.: PESQ of Female Voice as Interference Noise. x-axis: SNR in dB; y-axis: PESQ Value; legend: input, output.]

Fig. 6. 48. Graph shows the SD of female voice as interference noise at 10 dB

Fig. 6. 49. Graph shows the ND of female voice as interference noise at 10 dB

[Plot for Fig. 6. 48.: Speech Distortion Graph of Woman Voice as Interference Noise. Legend: Input Speech, Output Speech.]

[Plot for Fig. 6. 49.: Noise Distortion Graph of Woman Voice as Interference Noise. Legend: Input Noise, Output Noise.]

TABLE 6.8

SNRI, SPEECH DISTORTION AND NOISE DISTORTION FOR FEMALE VOICE AS INTERFERENCE NOISE

Input SNR   SNRI      Speech Distortion   Noise Distortion
0 dB        13.5490   -21.7572            -30.3441
5 dB        15.3606   -27.9074            -35.4025
10 dB       15.4650   -27.6047            -40.0446
15 dB       15.4885   -27.4057            -44.8835
20 dB       15.4453   -27.2826            -49.8412

Fig. 6. 50. GUI Layout with female voice as interference noise at 10 dB of input SNR

The proposed system performs well with female voice as interference noise, producing
approximately 15 dB of SNRI. The PESQ value indicates that the quality of the output speech
signal is good compared to that of the input signal. On listening, the output speech signal
is free from noise and clean speech is audible at the output.

Chapter 7 Summary and Future work

7.1 Conclusion

This thesis focuses on the design, implementation and testing of an Elko-based GSC system
for enhancing the speech signal from a noisy speech signal. The proposed Elko-based GSC has
been implemented successfully, and the performance of the system has been measured in five
different noisy environments and with a male and a female voice as interference signals to
the main speech signal. The quality of the output signal is assessed by objective measures
performed on the system. The objective measures SNR, SNRI, SD, ND and PESQ have been
implemented for the evaluation of the proposed system. From the results obtained, it can be
concluded that the proposed system provides satisfactory results in the objective tests
conducted.

The system provides an acceptable improvement of SNR with car noise as the input test
noise, even at low input SNR values. The proposed system provides a maximum improvement of
approximately 22 dB for car noise. On listening, the output speech signals are clean and
free from noise. The speech quality of the output signal is measured with PESQ, which gives
values in the range 4.1 to 4.4 for car noise at different input SNR values, indicating that
the quality of the output speech signal is good. Car noise gives low speech distortion and
high noise distortion values. The proposed system performs well with babble noise,
providing an improvement of 17 dB and PESQ values in the range 3.3 to 3.9 with acceptable
speech and noise distortion. For tank and wind noise, the system provides an improvement of
18 dB and PESQ values in the range 3.5 to 4.1. For destroyer engine noise, the system
provides an improvement of 13 dB and PESQ values in the range 1.9 to 3.8. For the male and
female voices as interference signals, the system provides an improvement of 15 dB and PESQ
values in the range 2.7 to 4.1. On listening to the output, a small amount of the
interference signal is still present, but overall the interference signal has been reduced.

The proposed Elko-based GSC is less complex and computationally efficient. The proposed
system has been successfully implemented and validated. For a better view, the results are
presented in the form of tables and graphs.

7.2 Future work

Considering the results of the experiments, the proposed Elko-based GSC is efficient in
reducing noise from the noisy speech signal at the input of the system. The proposed
Elko-based GSC was implemented in offline mode; in future, the method should be implemented
in real time. Further, the performance of the system should be tested in a reverberant
environment.


Bibliography

[1] S. Arlinger, A. Leijon, “Hearing Aids for Adults Benefits and Costs,” in The Swedish

council on Technology Assessment in Health Care, May 2003.

[2] M. Brandstein, D. ward, “Microphone Array Signal Processing Techniques and

Applications,” Ed. New York: Springer, 2010.

[3] A. Wang, K. Yao, R. E. Hudson, D. Korompis, F. Lorenzelli, S. Soli and S. Gao,
“Microphone Array for Hearing Aid and Speech Enhancement Applications,” in Proc.
of Int. Conf. on Architectures and Processors, Los Angeles, CA, 19-21 Aug. 1996,
pp. 231-239.

[4] G. W. Elko, “A Simple Adaptive First-Order Differential Microphone,” in Acoust. And

speech Research Dept. Bell Labs, Lucent Technologies, Murray Hill, NJ, Aug. 1999.

[5] J. P. Townsend, K. D. Donohue, “Stability Analysis for the Generalized Sidelobe
Canceller,” in IEEE Signal Process. Lett. Lexington, KY, USA, June 2010, pp. 603-606.

[6] H. Puder, “Hearing Aids: An Overview of the State-of-the-Art, Challenges, and Future
Trends of an Interesting Audio Signal Processing Applications,” in Proc. of 6th Int.
Symp. on Image and Signal Process. and Anal., Erlangen, Germany, Sept. 2009, pp. 1-6.

[7] H. Luo, H. Arndt, “Digital Signal Processing Technology and Applications in Hearing
Aids,” in 6th Int. Conf. on Signal Process., Canada, Aug. 2002, pp. 1727-1730 vol. 2.

[8] K. Kiyohara, Y. Kaneda, S. Takahashi, H. Nomura and J. Kijima, “A Microphone
Array System for Speech Recognition,” in IEEE Int. Conf. on Acoust., Speech and
Signal Process., Musashino, Apr. 1997, pp. 215-218 vol. 1.

[9] A. Wang, K. Yao, R. E. Hudson, D. Korompis, F. Lorenzelli, S. F. Soli and S. Gao,
“A High Performance Microphone Array System for Hearing Aid Applications,” in
IEEE Int. Conf. on Acoust., Speech and Signal Process., Los Angeles, CA, May 1996,
pp. 3197-3200 vol. 6.

[10] David K. Campbell, “Adaptive Beamforming Using a Microphone Array for Hand-

Free Telephony,” M. S. Thesis, Dept. Elect. Eng., Virginia Polytechnic Institute and

State Univ., Blacksburg, Virginia, 1999.


[11] C. F. Scola, M. D. B. Ortega, “Direction of Arrival Estimation-A Two Microphone

approach,” M. S. Thesis, Dept. Elect. Eng., Blekinge Institute of Technology,

Karlskrona, Sweden, Sept. 2010.

[12] H. Teutsch, G. W. Elko, “First-and Second-Order Differential Microphone Array,” in

Acoust. And speech Research Dept. Bell Labs, Lucent Technologies, Murray Hill, NJ,

Aug. 1999.

[13] A. Acero, J. Droppo, M. Seltzer and I. Tashev. Audio Processing. [Online]. Available:

http://research.microsoft.com/en-us/projects/audioprocessing/default.aspx

[14] G. W. Elko, A. T. N. Pong, “A Simple Adaptive First-Order Differential

Microphone,” in IEEE ASSP Workshop on Applicat. of Signal Process. to Audio and

Acoust., Oct. 1995, pp. 169-172.

[15] G. W. Elko, “Noise-Reducing Directional Microphone Array,” U.S. Patent

US2009/0175466 A1, Jul. 2009.

[16] Basics of Beamforming [Online]. Available:

http://en.wikipedia.org/wiki/Beamforming

[17] I. Himawan, “Speech Recognition Using AD-HOC Microphone Arrays,” Ph.D.

dissertation, Dept. Elect. Eng., Queensland Univ. of Technology, Queensland, 2010.

[18] M. Zhang, M. H. Er, “Adaptive Beamforming by Microphone Array,” in IEEE Global

Telecomm. Conf., Nov. 1995, pp. 163-167 vol.1.

[19] A. Bouacha, F. Debbat and F. T. Bendimerad. (2008, Jan.). Modified Blind

Beamforming Algorithm For Smart Antenna System. [Online]. Available:

http://jre.cplire.ru/jre/jan08/3/text.html

[20] D. H. Johnson, D. E. Dudgeon, “Array Signal Processing,” Ed. New Jersey,

Prentice-Hall, 1993, Ch. 7, pp. 349-413.

[21] A. A. Gareta, “A Multi-Microphone Approach to Speech Processing in a Smart-room

Environment,” Ph.D. dissertation, Dept. Signal Theory and Commun., Universitat

Politècnica de Catalunya, Barcelona, 2007.

[22] “Matlab Creating Graphical User Interface.” Ed. Natick, The Math Works, 2011, Ch.

2, pp. 2.2-2.36.

[23] Y. Hu, P. C. Loizou, “Evaluation of Objective Quality Measures for Speech

Enhancement,” in IEEE Trans. on Audio, Speech and Language Process., Dallas, TX,

Jan. 2008, pp. 229-238.

[24] Noisex-92 database, taken from Signal Process. Inform. Base. [Online]. Available:

http://spib.rice.edu/spib/select_noise.html

[25] A. W. Rix, J. G. Beerends, M. P. Hollier and A. P. Hekstra, “Perceptual Evaluation of

Speech Quality-A New Method for Speech Quality Assessment of Telephone

Networks and Codecs,” in IEEE Int. Conf. on Acoust., Speech and Signal Process.,

Ipswich, 2001, pp. 749-752 vol.2.

[26] P. Stefan, T. Uhl, “Quantifying the Suitability of Reference Signals for the PESQ

Algorithm,” in Third Int. Conf. on Commun. Theory, Rel. and Quality of Service, June

2010, pp. 110-115.

[27] B. D. Van Veen, K. M. Buckley, “Beamforming: A Versatile Approach to Spatial

Filtering,” in IEEE ASSP Mag., USA, April 1988, pp. 4-24.

[28] Description of Acoustic beamforming [Online]. Available:

http://www.lmsintl.com/acoustic-beamforming

[29] Y. Hu, P. C. Loizou, “Evaluation of Objective Quality Measures for Speech

Enhancement,” in IEEE Trans. on Audio, Speech and Language Process., Dallas, TX,

Jan. 2008, pp. 229-238.
