Structure-Based Speech Classification Using Nonlinear Embedding Techniques

Uchechukwu Ofoegbu

Advisor: Dr. Robert E. Yantorno

Committee: Dr. Saroj K. Biswas, Dr. Henry M. Sendaula

Speech Processing Laboratory, Temple University

May 5, 2004

Acknowledgment

Dr. Robert Yantorno

Dr. Saroj Biswas

Dr. Henry Sendaula

Speech Lab Members

Air Force Research Laboratory, Rome, NY

Overview

Voiced and Unvoiced Speech

Usable and Unusable Speech

Nonlinearities in Speech

Non-Linear Embedding

Research Goal

Proposed Research

Voiced and Unvoiced Speech

Voiced/Unvoiced Characteristics

Voiced

Quasi-periodic excitation

Modulation by vocal tract

Production of vowels, voiced fricatives & plosives

Unvoiced

No periodic vibration of vocal cords

Noise-like nature

Production of unvoiced fricatives and plosives

Usable Speech

Portions of co-channel speech still usable for applications such as Speaker ID and Speech Recognition.

Low-energy (unvoiced/silence) segments overlap with high-energy (voiced) segments

Target-to-interferer Ratio (TIR) > 20dB
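The TIR criterion can be illustrated with a short sketch. The slide gives only the 20 dB threshold; the definition used here (ratio of frame energies, in dB) and the names `tir_db` and `is_usable` are assumptions made for illustration. In practice the separate target and interferer signals are available only in controlled experiments.

```python
import numpy as np

def tir_db(target, interferer, eps=1e-12):
    """Target-to-interferer ratio of one frame in dB (assumed: energy ratio)."""
    target = np.asarray(target, dtype=float)
    interferer = np.asarray(interferer, dtype=float)
    e_t = np.sum(target ** 2)       # frame energy of the target talker
    e_i = np.sum(interferer ** 2)   # frame energy of the interfering talker
    # eps guards against division by zero and log of zero on silent frames
    return 10.0 * np.log10((e_t + eps) / (e_i + eps))

def is_usable(target, interferer, threshold_db=20.0):
    """Label a co-channel frame usable when its TIR exceeds the threshold."""
    return tir_db(target, interferer) > threshold_db
```

A frame whose target amplitude is 100 times the interferer amplitude has a TIR of about 40 dB and would be labeled usable under this sketch.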

Nonlinearities in Speech

Glottal waveform changes

Shape varies with amplitude

Physical observations

Flow in vocal tract is non-laminar

Coupling between vocal tract and folds

When glottis is open, prominent changes are observed in formant characteristics

Nonlinear Embedding

Nonlinear Systems

Point moving along some trajectory in an abstract state space

Coordinates of the point are independent degrees of freedom of the system

State space could be reconstructed from a scalar signal

Nonlinear Embedding (cont’d)

Takens’ Method of Delays

A state space representation topologically equivalent to the original state space of a system can be reconstructed from a single observable dimension

Vectors in m-dimensional state space are formed from time-delayed values of a signal

Nonlinear Embedding (cont’d)

$x_i = [\, s_i,\ s_{i+d},\ s_{i+2d},\ \ldots,\ s_{i+(m-1)d} \,]$

m = embedding dimension

d = delay value
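A minimal sketch of the method of delays, assuming a one-dimensional sampled signal and forward delays (each vector holds the current sample and the m - 1 samples that follow it at spacing d). The function name `delay_embed` is hypothetical.

```python
import numpy as np

def delay_embed(s, m=3, d=1):
    """Return the delay vectors of signal s as rows of an (n_points, m) array.

    Row i is [s[i], s[i + d], ..., s[i + (m - 1) * d]].
    """
    s = np.asarray(s, dtype=float)
    n_points = len(s) - (m - 1) * d   # number of complete delay vectors
    if n_points <= 0:
        raise ValueError("signal too short for this m and d")
    # Column k is the signal shifted by k * d samples
    return np.column_stack([s[k * d : k * d + n_points] for k in range(m)])
```

For example, `delay_embed(np.arange(10), m=3, d=2)` yields six points, the first being `[0, 2, 4]` and the last `[5, 7, 9]`.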

Nonlinear Embedding (cont’d)

Delay value, d:

Dependent on sampling rate and signal properties

Large enough such that nonlinearities are taken into account by the reconstructed trajectory

Small enough to retain reasonable time resolution

Nonlinear Embedding (cont’d)

Dimension, m:

Generation of voiced speech constitutes a low-dimensional system

Generation of unvoiced speech constitutes a relatively high-dimensional system

Using a low dimension (such as m = 3) sufficiently reconstructs voiced but not unvoiced speech

Embedded Voiced and Unvoiced Speech

[Figure: 3-D plots of reconstructed trajectories; panels: Embedded Voiced Speech, Embedded Unvoiced Speech.]

Embedded Usable and Unusable Speech

[Figure: 3-D plots of reconstructed trajectories; panels: Embedded Co-channel Speech of 30dB TIR, Embedded Co-channel Speech of 10dB TIR.]

Research Goal

Feature Extraction

Difference-Mean Comparison (DMC) Measure

– Voiced/unvoiced classification

Nodal Density Measure

– Voiced/unvoiced classification

– Usable/unusable classification

Difference-Mean Comparison (DMC) Measure

Voiced/Unvoiced Classification

Introduction

3rd order difference computation along first non-singleton dimension

1st order difference of NxN matrix given by

$$
\begin{bmatrix}
X(2,1)-X(1,1) & X(2,2)-X(1,2) & \cdots & X(2,N)-X(1,N) \\
X(3,1)-X(2,1) & X(3,2)-X(2,2) & \cdots & X(3,N)-X(2,N) \\
\vdots & \vdots & & \vdots \\
X(N,1)-X(N-1,1) & X(N,2)-X(N-1,2) & \cdots & X(N,N)-X(N-1,N)
\end{bmatrix}
$$

Length(3rd order diff. > mean) observed
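The DMC computation can be sketched as follows. The slides do not say which matrix is differenced or which mean is used, so two assumptions are made here: the input is the embedded trajectory with one point per row, and the reference is the mean of the 3rd-order differences themselves. The name `dmc` is hypothetical.

```python
import numpy as np

def dmc(X, order=3):
    """Count the 3rd-order difference values that exceed their own mean.

    X is assumed to hold one embedded point per row, so differencing
    along axis 0 follows the first non-singleton dimension.
    """
    X = np.asarray(X, dtype=float)
    diff = np.diff(X, n=order, axis=0)   # repeated row-to-row differences
    return int(np.count_nonzero(diff > diff.mean()))
```

A smooth trajectory gives nearly constant differences and a small count; an irregular one pushes more values above the mean.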

Embedded Voiced and Unvoiced Speech

[Figure: 3-D plots of reconstructed trajectories; panels: Embedded Voiced Speech, Embedded Unvoiced Speech.]

Difference-Mean Comparison Distribution

[Figures, three slides: probability versus difference-mean comparison for voiced and unvoiced frames, for clean speech, speech + 15dB pink noise, and speech + 15dB white noise.]

DMC-Based Decisions

[Figures, six slides: speech amplitude and two decision tracks plotted against sample number (1: V; 0: Don't Care; -1: UV), shown twice for each of three conditions: clean speech, speech + 15dB pink noise, and speech + 15dB white noise.]

Results

[Figure: bar chart, Hits Minus False Alarms for Voiced Speech, for clean speech, 15dB pink noise, and 15dB white noise; series: FR/RE, E/ZC, DMC.]

Results (cont’d)

[Figure: bar chart, Hits Minus False Alarms for Unvoiced Speech (percent), for clean speech, 15dB pink noise, and 15dB white noise; series: FR/RE, E/ZC, DMC.]
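The score plotted in these charts can be sketched as below. The slides do not define the terms, so the definitions are assumptions: hits are the percentage of frames of the class of interest that are detected as that class, and false alarms are the percentage of all other frames wrongly given that label. The function name is hypothetical.

```python
import numpy as np

def hits_minus_false_alarms(truth, decision, label):
    """Percent hits minus percent false alarms for one class label."""
    truth = np.asarray(truth)
    decision = np.asarray(decision)
    pos = truth == label   # frames that really belong to the class
    neg = ~pos             # all other frames
    if not pos.any() or not neg.any():
        raise ValueError("need at least one frame inside and outside the class")
    hits = 100.0 * np.mean(decision[pos] == label)
    false_alarms = 100.0 * np.mean(decision[neg] == label)
    return hits - false_alarms
```

With four voiced and four unvoiced frames, detecting three of the voiced frames and mislabeling one unvoiced frame as voiced gives 75 - 25 = 50.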

Nodal Density Measure

Voiced/Unvoiced Classification

Usable/Unusable Classification

Introduction

Smallest cube which encloses the signal is determined

This cube is divided into N smaller cubes

Edges of the smaller cubes are defined as nodes

Number of nodes spanned by the signal is determined

Ratio of number of nodes spanned to total number of nodes is defined as nodal density
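The steps above can be sketched as follows, under one interpretation that the slides do not confirm: the enclosing cube is split into k divisions per axis (so N = k^3 small cubes), the nodes are taken to be the grid vertices, and a node counts as spanned when at least one embedded point is nearest to it. The name `nodal_density` and the parameter `k` are hypothetical.

```python
import numpy as np

def nodal_density(points, k=10):
    """Fraction of grid vertices visited by an embedded trajectory.

    points: (n, dim) array of embedded points.
    k: number of divisions per axis of the enclosing cube (assumed).
    """
    points = np.asarray(points, dtype=float)
    dim = points.shape[1]
    total_nodes = (k + 1) ** dim
    lo = points.min(axis=0)
    side = (points.max(axis=0) - lo).max()   # edge of the enclosing cube
    if side == 0:
        return 1.0 / total_nodes             # all points coincide
    # Snap every point to its nearest grid vertex
    idx = np.rint((points - lo) / side * k).astype(int)
    spanned = len(np.unique(idx, axis=0))
    return spanned / total_nodes
```

A trajectory that stays on a thin structure visits few vertices, while a noise-like trajectory that fills the cube visits many.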

Voiced/Unvoiced Classification

Embedded Voiced and Unvoiced Speech Frames with Grids

[Figure: 3-D plots of embedded frames with grid overlay; panels: Voiced, Unvoiced.]

Nodes Spanned by Embedded Voiced and Unvoiced Speech Frames

[Figure: 3-D plots of the nodes spanned; panels: Voiced, Unvoiced.]

Nodal-Density Distribution

[Figures, three slides: probability versus nodal density for voiced and unvoiced frames, for clean speech, speech + 15dB pink noise, and speech + 15dB white noise.]

Filtering

Moving Average Filter

Order, M = 10
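A minimal sketch of the smoothing step, assuming the filter is applied to the sequence of per-frame nodal density values and that "order M" means an M-tap average; neither detail is stated on the slide. The name `moving_average` is hypothetical.

```python
import numpy as np

def moving_average(x, M=10):
    """Smooth a sequence with an M-tap moving average filter.

    The output has the same length as the input; values near the two
    ends are averaged over fewer real samples (zero padding).
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(M) / M   # equal weights that sum to one
    return np.convolve(x, kernel, mode="same")
```

Smoothing narrows the spread of the feature within each class, at the cost of some time resolution at voiced/unvoiced boundaries.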

Nodal-Density Distributions after Filtering

[Figures, three slides: probability versus nodal density for voiced and unvoiced frames, two panels per slide, for clean speech, speech + 15dB pink noise, and speech + 15dB white noise.]

Results

[Figure: bar chart, Hits Minus False Alarms for Voiced Speech, for clean speech, 15dB pink noise, and 15dB white noise; series: ND, ND_Filt.]

Results (cont’d)

[Figure: bar chart, Hits Minus False Alarms for Unvoiced Speech (percent), for clean speech, 15dB pink noise, and 15dB white noise; series: ND, ND_Filt.]

Proposed Research

Usable/Unusable Classification

Embedded Usable and Unusable Speech Frames with Grids

[Figure: 3-D plots with grid overlay; panels: Embedded Co-channel Speech of 10dB TIR with Grids, Embedded Co-channel Speech of 30dB TIR with Grids.]

Nodes Spanned by Embedded Usable and Unusable Speech Frames

[Figure: 3-D plots of the nodes spanned; the surviving panel titles all read "Nodes Spanned by Embedded Co-channel Speech of 30dB TIR".]

Preliminary Results

[Figure: ROC Curve for Usable Speech Detection Using the Nodal Density Measure; x-axis: False Alarms, y-axis: Hits.]

Summary

[Diagram: Speech → Nonlinear Embedding → Difference-Mean Comparison → V/UV Classification; Nonlinear Embedding → Nodal Density → V/UV Classification and Usable/Unusable Classification.]

Future Proposed Research

Determine optimum filter for nodal density-based voiced/unvoiced classification

Develop nodal density measure for usable/unusable classification

Investigate the presence of complementary information between the two features (DMC and nodal density) for voiced/unvoiced classification

Perform decision-level fusion of both features

If you understood this presentation ... please ask QUESTIONS!!!