The Utilization of Subjective Evaluation in the Development of Vocoders

ARCON Corporation, J.D. Tardelli


The Utilization of Subjective Evaluation in the Development of Vocoders 11/2003


Evaluation Basics

• Purpose

– Research

– Vocoder Development

– Vocoder Characterization

– Selection

– Validation

• Types of Conditions of Interest

– Baseline

– Acoustic Background Noise

– Transmission Channel Impairments

– Talker Variability

– Signal Levels

– System Tandems

– Digital Circuit Multiplication Systems


Subjective Testing - Control of Variables

• Laboratory Factors

– Listening Environment; Audio & Electronics

• Source/Processed Recording Factors

– Speech Material Factors

• Linguistic and Phonetic

• Talker Factors

• Transducer Selection

– Audio and Sampled Bandwidth Factors

– Acoustic Noise Material and Speech + Noise Method

• Listener Factors

• Presentation Factors

– Blocking, Order and Balance

– Audio Level and Sidetone
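
One common way to handle the "Blocking, Order and Balance" item is to rotate the presentation order across listener groups with a Latin square, so that each condition appears once in every serial position. The sketch below is a minimal illustration, not the procedure used in the talk; the function name `cyclic_latin_square` is hypothetical, and the condition labels are borrowed from the vocoders named later in the slides.

```python
def cyclic_latin_square(conditions):
    """Return one presentation order per listener group.

    Row r is the condition list rotated left by r places, so every
    condition occupies every serial position exactly once.
    Note: a cyclic square balances position only; it does not balance
    which condition immediately precedes which (carry-over effects).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]


# Hypothetical test conditions (labels only, no data).
conditions = ["Source", "LPC10e", "CELP", "MELP"]

for group, order in enumerate(cyclic_latin_square(conditions), start=1):
    print(f"Listener group {group}: {order}")
```

If carry-over effects matter, a design that also balances first-order sequence (for example a Williams design) may be preferable; that is outside this sketch.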


Associated Issues

• User Population and Face Validity

• Context

– Range of Candidate Systems

– Reference and Calibration Systems

• Listen Only vs. Two-Way Methods

– Delay

– Asymmetric Transmission Channels

– VoIP

• Speech material

– Speech sample length relative to impairment distribution

– Uniqueness, Amount Available

– Type


Associated Issues (cont.)

• Speech material (by increasing contextual content)

– Types

• Scripted

– Sounds

– Words

» rhyming, CVC, etc.

– Sentences

» meaningful, nonsense, semantically anomalous, etc.

– Connected sentences

– Scripts

• Scenario based

– Representative of application?

– Informational or Familiar

– Information flow (balanced?, directional?)

• Task Based

• Open


Performance Characteristics & Test Methodology

• Quality

– Diagnostic Acceptability Measure - DAM

• Voiers ICASSP77

– Category Rating Tests - ACR (MOS); DCR (DMOS); CCR (CMOS)

• ITU-T P.800; P.830

• ITU Handbook on Telephonometry

• IEEE Recommended Practices for Speech Quality Measures 1969

– Paired Comparison A/B Tests

• David, H.A., “The Method of Paired Comparisons,” Oxford

– Multi Stimulus Test with Hidden Reference and Anchor - MUSHRA

• ITU-R BS.1534-1

– Speech Communication Systems with Noise Suppression Algorithms

• ITU-T P.NSA


Performance Characteristics & Test Methodology

• Speaker Recognizability

– NRL Speaker Recognition Test (speakers unknown)

• Schmidt-Nielsen SCW95, ICASSP96, JASA 1985

– TNO Speaker Recognition Test (speakers known)

• Steeneken & Leeuwen 1997

• Language Dependency

– SRT-LD

• Wijngaarden SCW02, EuroSpeech01, Ph.D. Dissertation 2003

• Conservation of Stress State Characteristics


Performance Characteristics & Test Methodology

• Communicability

– Conversation Opinion Tests

• ITU-T P.800

– Conversational & Third Party Listen Only Tests

• ITU-T P.832, P.581 (HATS)

– Continuous Quality Evaluation Method - ECQ

• ITU-T P.PAC

– Arcon Communicability Exercise - ACE

• Tardelli ICASSP96, NAS-NRC CHABA Symposium 1995

– TNO Communicability Test

• Wijngaarden EuroSpeech01


Performance Characteristics & Test Methodology

• Intelligibility

– Modified Rhyme Test - MRT

• ANSI S3.2-1989; House 1965; Kreul 1968

– Diagnostic Rhyme Test - DRT

• ANSI S3.2-1989; Voiers 1973, 1987

– Consonant-Vowel-Consonant Test - CVC (AI Basis)

• Fletcher ATT 1920s, JASA 1950; Allen 1994, ICASSP02; Steeneken 1992

– Speech Reception Threshold - SRT

• Plomp & Mimpen 1979; Wijngaarden & Steeneken EuroSpeech99

– International Civil Aviation Org. Spelling Alphabet - ICAO

• Moser & Dreher 1955; Schmidt-Nielsen NRL R9035 1987, R9174 1988

– INTELTRANS - (CVC, HATS)

• CELAR France MOD; J.C. Lafon 1958, 1964, 1968


Intelligibility Measures vs. Information

Webster, 1979

ANSI S3.5-1969


Evaluation Decisions

• Purpose

• Types of Conditions

• Performance Characteristics of Importance

• Choice of Test Methodologies

• Development of Test Plan

• Selection Criteria if Selection Test


Vocoder Development Issues

• Application

– Commercial

– Strategic

– Tactical

• Diagnostic Information

– Intelligibility

– Quality

– Communicability


Low-Rate Vocoder for Tactical Use

• Harsh Acoustic Noise Environments

• Physical and Jamming Channel Issues

• LPI / LPD

• Intelligibility

• Talker Recognizability

• Conserve Stress State of Talker

• Audio Bandwidth

• Delay

• Size - Weight - Power


Narrowband Low-Rate Vocoder Intelligibility

[Figure: three bar charts of DRT scores for the vocoders LPC10e, CELP, CVSD-16, MELP, MELPe2.4 and MELPe1.2. Panels: Vocoder Intelligibility - Benign Environments (Quiet, H250, Office, MCE); Vocoder Intelligibility - Mild Noise Environments (E3A, SC55, P3C, F15); Vocoder Intelligibility - Severe Environments (HMMWV, M2, CH47).]

Intelligibility results for current low-rate military vocoders in acoustic background noise


Effects of Current Noise Preprocessors

[Figure: two charts with 95% confidence intervals for the processes Source, NPP, MELP+NPP and MELP, each in Quiet and HMMWV noise. Panels: Processing Intelligibility (DRT score); Processing Quality (DAM score).]
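
The charts on this slide report scores with 95% confidence intervals. As a hedged sketch of how such an interval can be computed from per-listener scores, the snippet below uses the normal approximation (z = 1.96). The talk does not state which interval method was used, so this is an assumption; for small listener panels a Student's t critical value would give a wider and more appropriate interval. The function name `mean_with_ci95` and the scores are invented.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_ci95(scores, z=1.96):
    """Return (mean, lower, upper) for a 95% confidence interval.

    Uses the normal approximation: mean +/- z * s / sqrt(n), where s is
    the sample standard deviation. This is an assumption of the sketch,
    not the method documented in the slides.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    centre = mean(scores)
    half_width = z * stdev(scores) / sqrt(n)
    return centre, centre - half_width, centre + half_width


# Invented per-listener scores, for illustration only.
scores = [90.0, 92.0, 94.0, 91.0, 93.0]
centre, low, high = mean_with_ci95(scores)
print(f"mean = {centre:.1f}, 95% C.I. = [{low:.1f}, {high:.1f}]")
```

Two conditions whose intervals overlap heavily are usually not distinguishable at this panel size, which is one reason the intervals are shown alongside the means.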


Road Map to Improved DRT Intelligibility

• Inherent Distinctive Features

– Jakobson, Fant, and Halle 1952; Miller & Nicely, 1955

• DRT Attributes

– Voiers 1973, 1987

• DRT Attributes : Distinctive Features : Acoustic Correlates

– Voiers, Benchmark Papers in Acoustics, V11 1977

• Diagnostic Capabilities of the DRT

• Cook Book


Inherent Distinctive Features (Jakobson, Fant, and Halle 1952)

• Fundamental Source Features

• Vocalic / Non-Vocalic

• Consonantal / Non-Consonantal

• Secondary Consonant Features

– Envelope Features

• Continuant / Interrupted

• Checked / Unchecked

• Strident / Mellow

– Supplementary Source

• Voiced / Voiceless

• Resonant Features

• Compact / Diffuse

– Tonality Features

• Grave / Acute

• Flat / Plain

• Sharp / Plain

• Tense / Lax

– Supplementary Resonator

• Nasal / Oral


DRT Attributes

Page | Num | PRESENT | ABSENT | Feature | SubFeature | PRESENT: V/UV Place Manner | ABSENT: V/UV Place Manner
1 | 1 | GOB | BOB | EXP | - | Voiced Velar Plosive | Voiced Bilabial Plosive
1 | 2 | DAUNT | TAUNT | VOICING | NON-FRICTIONAL | Voiced Alveolar Plosive | Voiceless Alveolar Plosive
1 | 3 | MOOT | BOOT | NASALITY | GRAVE | Voiced Bilabial Nasal | Voiced Bilabial Plosive
1 | 4 | SHEET | CHEAT | SUSTENTION | UNVOICED | Voiceless Palato-Alveolar Fricative | Voiceless Palato-Alveolar Affricate
1 | 5 | JAB | GAB | SIBILATION | VOICED | Voiced Palato-Alveolar Affricate | Voiced Velar Plosive
1 | 6 | POT | TOT | GRAVENESS | UNVOICED | Voiceless Bilabial Plosive | Voiceless Alveolar Plosive
1 | 7 | GHOST | BOAST | COMPACTNESS | VOICED | Voiced Velar Plosive | Voiced Bilabial Plosive
1 | 8 | RILL | NILL | EXP | - | Voiced Palato-Alveolar Approximant | Voiced Alveolar Nasal
1 | 9 | ZED | SAID | VOICING | FRICTIONAL | Voiced Alveolar Fricative | Voiceless Alveolar Fricative
1 | 10 | GNAW | DAW | NASALITY | ACUTE | Voiced Alveolar Nasal | Voiced Alveolar Plosive
1 | 11 | SHOES | CHOOSE | SUSTENTION | UNVOICED | Voiceless Palato-Alveolar Fricative | Voiceless Palato-Alveolar Affricate
1 | 12 | CHEEP | KEEP | SIBILATION | UNVOICED | Voiceless Palato-Alveolar Affricate | Voiceless Velar Plosive
1 | 13 | BANK | DANK | GRAVENESS | VOICED | Voiced Bilabial Plosive | Voiced Alveolar Plosive
1 | 14 | GOT | DOT | COMPACTNESS | VOICED | Voiced Velar Plosive | Voiced Alveolar Plosive
1 | 15 | NOSE | ROSE | EXP | - | Voiced Alveolar Nasal | Voiced Palato-Alveolar Approximant
1 | 16 | DINT | TINT | VOICING | NON-FRICTIONAL | Voiced Alveolar Plosive | Voiceless Alveolar Plosive
1 | 17 | NECK | DECK | NASALITY | ACUTE | Voiced Alveolar Nasal | Voiced Alveolar Plosive
1 | 18 | THONG | TONG | SUSTENTION | UNVOICED | Voiceless Dental Fricative | Voiceless Alveolar Plosive
1 | 19 | CHOO | COO | SIBILATION | UNVOICED | Voiceless Palato-Alveolar Affricate | Voiceless Velar Plosive
1 | 20 | WEED | REED | GRAVENESS | VOICED | Voiced Labio-velar Approximant | Voiced Palato-Alveolar Approximant
1 | 21 | SHAG | SAG | COMPACTNESS | UNVOICED | Voiceless Palato-Alveolar Fricative | Voiceless Alveolar Fricative
1 | 22 | KNOB | ROB | EXP | - | Voiced Alveolar Nasal | Voiced Palato-Alveolar Approximant
1 | 23 | VOLE | FOAL | VOICING | FRICTIONAL | Voiced Labio-Dental Fricative | Voiceless Labio-Dental Fricative
1 | 24 | NIP | DIP | NASALITY | ACUTE | Voiced Alveolar Nasal | Voiced Alveolar Plosive
1 | 25 | FENCE | PENCE | SUSTENTION | UNVOICED | Voiceless Labio-Dental Fricative | Voiceless Bilabial Plosive
1 | 26 | SAW | THAW | SIBILATION | UNVOICED | Voiceless Alveolar Fricative | Voiceless Dental Fricative
1 | 27 | POOL | TOOL | GRAVENESS | UNVOICED | Voiceless Bilabial Plosive | Voiceless Alveolar Plosive
1 | 28 | YIELD | WIELD | COMPACTNESS | VOICED | Voiced Palatal Approximant | Voiced Labio-velar Approximant
1 | 29 | GNAT | RAT | EXP | - | Voiced Alveolar Nasal | Voiced Palato-Alveolar Approximant


DRT Attributes : Distinctive Features : Acoustic Correlates

DRT Attribute | JFH Distinctive Feature | Acoustic Correlates
Voicing | Voiced/Voiceless | harmonic content, energy concentration at LF, long duration, low peak power
Nasality | Nasal/Oral | nasal formants in regions of 200, 800 and 2400 Hz
Sustention | Continuant/Interrupted | gradual onset > 130 msec, low level noise in MF to HF
Sibilation | Strident/Mellow | sustained HF noise of relatively high intensity
Compactness | Compact/Diffuse | LF spectral shape, low loci of 2nd and 3rd formants, dynamics of formant transitions
Graveness | Grave/Acute | HF spectral shape, separation of 2nd and 3rd formants, dynamics of 2nd and 3rd formant


Diagnostic Capabilities of the DRT

• Talkers

– Male : Female

• Attribute State

– Present : Absent

• Attribute Bias

• Sub-Attribute Scores

• Characteristic Attribute Profile

• Empirical Studies
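
The attribute-level diagnostics listed above can be sketched in a few lines. The snippet assumes the usual chance-corrected score for a two-alternative test, 100 × (right − wrong) / total, and takes bias to mean the present-state score minus the absent-state score; both are assumptions of this sketch rather than definitions quoted from the slides. The function names `drt_score` and `attribute_profile` and the response data are invented.

```python
from collections import defaultdict


def drt_score(right, wrong):
    """Chance-corrected two-alternative score: 100 * (R - W) / T.

    Pure guessing (R == W) scores 0; all correct scores 100.
    """
    total = right + wrong
    if total == 0:
        raise ValueError("no responses to score")
    return 100.0 * (right - wrong) / total


def attribute_profile(responses):
    """Build a per-attribute profile from listener responses.

    responses: iterable of (attribute, state, correct), where state is
    "present" or "absent" and correct is a bool. Every attribute must
    have at least one response in each state.
    """
    counts = defaultdict(lambda: [0, 0])  # (attribute, state) -> [right, wrong]
    for attribute, state, correct in responses:
        counts[(attribute, state)][0 if correct else 1] += 1

    profile = {}
    for attribute in sorted({name for name, _ in counts}):
        p_right, p_wrong = counts[(attribute, "present")]
        a_right, a_wrong = counts[(attribute, "absent")]
        present = drt_score(p_right, p_wrong)
        absent = drt_score(a_right, a_wrong)
        profile[attribute] = {
            "present": present,
            "absent": absent,
            "total": drt_score(p_right + a_right, p_wrong + a_wrong),
            "bias": present - absent,  # assumed definition of bias
        }
    return profile


# Invented responses, for illustration only.
responses = (
    [("Voicing", "present", True)] * 9 + [("Voicing", "present", False)] * 1
    + [("Voicing", "absent", True)] * 7 + [("Voicing", "absent", False)] * 3
)

for attribute, row in attribute_profile(responses).items():
    print(attribute, row)
```

Running the same profile separately for male and female talkers, or per noise environment, gives the kind of characteristic attribute profile the slide refers to.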


Cook Book for Improved Intelligibility


Pitfalls in Subjective Evaluation

• Measured Intelligibility vs. Real World Intelligibility

– NAS-NRC CHABA 1989 Symposium: Removal of Noise From Noise-Degraded Speech Signals

– Vocoder Tuned to DRT Words

– Vocoder based on “scripted word” characteristics that are not applicable to conversational speech

• Danger of "self evaluation" by Vocoder Developers

– Tardelli, ICASSP96, DAM vs MOS Study 1996


DAM vs. MOS Study

A Systematic Investigation of the Mean Opinion Score (MOS) and the Diagnostic Acceptability Measure (DAM) for Use in the Selection of Digital Speech Compression Algorithms

ARCON Corp. 1996

Available in DRAFT form at http://www.arcon.com/dld.html


P.NSA and WHY

• ETSI/3GPP AMR-NS 1999

• Exp. 3 MMOS w/ Multi-Dimensional Question

You will hear speech samples reproduced in a telephone handset. Every sample consists of four short unconnected sentences in a noise environment. Your task is to indicate your opinion of the overall sound quality with respect to any unnatural sound in the sample. Please make your judgement of the sample considering unnatural sound during the complete sample.

• Resulted in Bimodal Decision

P.NSA: Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm

Summary

This document proposes a methodology for evaluating the subjective quality of speech in noise and particularly appropriate for the evaluation of noise suppression algorithms. The proposed methodology uses separate rating scales to independently estimate the subjective quality of the Speech Signal alone, the Background Noise alone, and Overall Quality.

ITU-T SG12/Q7 SQEG, Primarily Dynastat and FT


INTELTRANS Testbed


DRT Characteristic Attribute Profile

[Figure: four charts of 2400bps MELPe DRT results, plotting Intelligibility (50 to 100) against the Attributes V, N, Su, Si, G, C and T. Panels: Female and Male (BlackHawk UH-60, M2 Bradley Vehicle, MOUT, Mobile Command Enclosure, Office, Quiet); Combined (UH-60 and MCE, each with Present and Absent states); Combined (M2 and Quiet, each with Present and Absent states).]


Empirical Study of DRT Attributes vs. SNR

Band Limited Gaussian Noise

Voiers, JASA 1973


Scripted Material - DRT Word Lists

DINT or TINT Voicing

MOOT or BOOT Nasality

SHEET or CHEAT Sustention

JAB or GAB Sibilation

POT or TOT Graveness

GHOST or BOAST Compactness


Scripted Material - CVC Nonsense Words

MIG(RAINE), COS(T), HAYM, DIT, TOUP(EE), BACH, POD(IUM), SEM(I), LAL:PAL, REAS(ON), REET:BEET, SAYZ:DAYS, BOD(Y), KOOM, LEP(ER), PONE:BONE, HIES, DACK:BACK, TEEG:LEAGUE, MAHL


Problems with CVC Test Implementation

• CVC Corpus Balance

– Talker by Word by Environment

– Word by Distinctive Feature by Lexicon

• Regional Dialectal Differences

– New England

• Spoken “COT” = “CAUGHT”

• Perception Midwest “CART” = “COT”

• Test Design

– Uniqueness for Talker By Word by Environment by Process

– Balance Across Distinctive Feature by Process

– Balance Across Subject by Stimulus

– Sufficient Subjects for Reasonable Resolution
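
For the "Sufficient Subjects for Reasonable Resolution" item, a rough planning estimate follows from the confidence-interval half-width: with score standard deviation s and target half-width E, about n = (z·s / E)² listeners are needed. The sketch below uses the normal approximation with z = 1.96 for a 95% interval; the function name `listeners_needed` and the numbers are hypothetical and not taken from the talk.

```python
from math import ceil


def listeners_needed(score_sd, half_width, z=1.96):
    """Estimate the panel size for a target confidence-interval half-width.

    score_sd:   expected standard deviation of per-listener scores
    half_width: desired half-width of the interval, in score points
    z:          critical value (1.96 for a 95% normal-approximation interval)
    """
    if score_sd <= 0 or half_width <= 0:
        raise ValueError("score_sd and half_width must be positive")
    return ceil((z * score_sd / half_width) ** 2)


# Hypothetical planning numbers: 5-point spread, resolve +/- 2 points.
print(listeners_needed(score_sd=5.0, half_width=2.0))
```

Halving the target half-width roughly quadruples the required panel, which is why resolution targets are best fixed before the test plan is written.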


Diagnostic Capabilities of INTELTRANS

Five acoustic indices are sufficient to characterize all vowels:

sharp / low

diffuse / compact

extreme / no extreme

flatten / sharpen

nasal / no nasal

Picture 2: The French vocalic system (from L.J. Boê et al.)


Diagnostic Capabilities of INTELTRANS (cont.)

Seven acoustic indices are sufficient to determine all the consonants:

voice / unvoice

interrupted / no interrupted

vocalic / no vocalic

sharp / low

continuous / discontinuous

diffuse / compact

nasal / no nasal

Picture 3: The French consonantal system (from L.J. Boê et al.)