Long-Term Formant Distribution as a forensic-phonetic feature

ASA 2nd Pan-American/Iberian Meeting on Acoustics, Cancún, México, Nov 15-19, 2010. Michael Jessen and Timo Becker, BKA, Department of Speaker Identification and Audio Analysis (KT54)

Transcript of Long-Term Formant Distribution as a forensic-phonetic feature

Page 1: Long-Term Formant Distribution as a forensic-phonetic feature

ASA 2nd Pan-American/Iberian Meeting on Acoustics

Cancún, México, Nov 15-19, 2010

Michael Jessen and Timo Becker
BKA, Department of Speaker Identification and Audio Analysis (KT54)

3aSC4 Special Session on Forensic Voice Comparison and Forensic Acoustics @ 2nd Pan-American/Iberian Meeting on Acoustics, Cancún, México, 15–19 November, 2010 http://cancun2010.forensic-voice-comparison.net

Page 2: Structure

1. Long-Term Formant Distribution: measurement methods and background

2. LTF and body height

3. LTF measurement consistency

4. Language dependence of LTF

5. Recognition performance based on LTF and automatic speaker recognition

6. Conclusions

Nov 17, 2010: Long-Term Formant (LTF) Distribution as a forensic-phonetic feature

Page 3: Long-Term Formant (LTF) Distribution: terminology

Long-Term Formant Distribution (Nolan & Grigoras, 2005) is a global (as opposed to segment-based) representation of vowel formant frequencies over an entire recording of a speaker (or over a long stretch of speech from that speaker).

Formant frequencies are extracted with a formant tracker (LPC-based) and manually corrected. No segmentation into sounds is performed.

The resulting distribution of formant values (mainly F2 and F3) can be characterized in different ways. The simplest way is to calculate the average. More advanced ways include modeling of the LTF distribution with Gaussian Mixture Models (GMM) (Becker et al., 2008).
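The two ways of characterizing the distribution can be sketched in a few lines of Python. This is only an illustration, not the method of Becker et al. (2008): the frame values are synthetic, the model is one-dimensional (F2 only) with two components, and the function names `gaussian_pdf` and `fit_gmm_1d` are made up for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, n_components=2, n_iter=50):
    """Fit a 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    data = sorted(values)
    n = len(data)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [data[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1.0)  # small floor (assumed) against collapse
            weights[k] = nk / n
    return weights, means, variances


# Synthetic F2 frames in Hz (assumption: NOT real speech data).
rng = random.Random(0)
f2_frames = ([rng.gauss(1200, 80) for _ in range(300)]
             + [rng.gauss(1700, 80) for _ in range(300)])

# Simplest characterization: the long-term average.
ltf_mean = sum(f2_frames) / len(f2_frames)

# More detailed characterization: a two-component mixture.
weights, means, variances = fit_gmm_1d(f2_frames)
print(round(ltf_mean), [round(m) for m in means])
```

The average collapses the distribution to one number, whereas the mixture keeps information about its shape, which is the motivation for the GMM-based characterization.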

Page 4: Illustration of the method

[Figure: speech file, unedited and edited, with an Excel excerpt]

Step 1: Editing the signal in a way that only vowels with clear formant structure remain

Step 2: LPC-analysis and manual correction of the formant tracks

Page 5: Step 3: Exporting the formant tracks F1,2,3 for further processing

F1 of limited reliability in telephone speech; F4 unreliable or invisible

[Figure: formant tracks F1, F2 and F3 over frames 1 to 129; formant values every 10 ms]

Page 6: Example of the raw LTF distribution of a speaker

[Figure: raw LTF distribution of a speaker] from freeware Catalina Forensic Expert opinion v1.0 from Catalin Grigoras (U Colorado Denver)

http://www.forensicav.ro/download/CatalinaManual3h.pdf

Page 7: Correlation between LTF and body height

[Figure: two scatter plots of LTF means against body height [cm] (150 to 205): LTF F2 [Hz] (1100 to 1800) and LTF F3 [Hz] (2000 to 2800)]

Pearson's product-moment correlation, one-sided (less):
F2: rho=-0.315726857072528, p=0.00204454743894922
F3: rho=-0.339139631480740, p=0.00097693931875183

Significant negative correlations between long-term formant frequencies (F2, F3) and body height

LTF means from 81 speakers in Pool 2010 (telephone-transmitted) (thanks to Hanna Feiser for assistance)
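As a rough check on the figures above, the Pearson coefficient and the t statistic that underlies its significance test can be computed as follows. The height and F2 values in the demo are invented for illustration; only the reported rho for F2 and the sample size of 81 speakers come from the slide. The p-value itself would need a Student t distribution, which is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic with n - 2 degrees of freedom for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical demo data (NOT from the Pool 2010 corpus).
heights_cm = [165, 170, 175, 180, 185, 190]
ltf_f2_hz = [1520, 1490, 1500, 1450, 1430, 1440]
r_demo = pearson_r(heights_cm, ltf_f2_hz)

# t statistic for the F2 correlation reported on the slide (81 speakers).
t_f2 = t_statistic(-0.315726857072528, 81)
print(round(r_demo, 3), round(t_f2, 3))
```

A strongly negative t with 79 degrees of freedom is in line with the small one-sided p-value shown on the slide.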

Page 8: Measurement consistency across phoneticians: LT-F2

[Figure: LT-F2 [Hz] (1000 to 1800) for recordings of 20 different speakers; legend: JF, AK, Bay, B1, B2]

Pearson correlations (two-sided) between 0.84 and 0.95

LTF-means from 20 speakers in “Digs” dialect corpus under forensically realistic conditions

Page 9: Measurement consistency across phoneticians: LT-F3

[Figure: LT-F3 [Hz] (2000 to 2800) for recordings of 20 different speakers; legend: JF, AK, Bay, B1, B2]

Pearson correlations (two-sided) between 0.98 and 0.99

Page 10: Language influence on LTF

[Figure: scatter plot of LT-F3 [Hz] (2000 to 3000) against LT-F2 [Hz] (1000 to 2000); legend: Russian, German probe1, German probe2, German probe3, Albanian]

For these data, different languages do not differ in the LTF-space that they occupy (one-way ANOVA [F(4,55) = 0.44; p = 0.77]).

LTF-means from three German speakers in Digs dialect corpus and from Russian and Albanian speakers in case data under analogous conditions (spontaneous telephone speech)
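The one-way ANOVA reported above compares the variance between groups with the variance within groups. A minimal sketch of the F statistic follows; the group layout of five groups with twelve values each is an assumption chosen only because it reproduces the degrees of freedom (4, 55), and the values are randomly generated, not the case data.

```python
import random


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, group_means))
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical LT-F2 means in Hz: 5 groups of 12 values, all drawn from the
# same distribution (assumption for illustration only).
rng = random.Random(1)
demo_groups = [[rng.gauss(1400, 100) for _ in range(12)] for _ in range(5)]
f_demo, df_b, df_w = one_way_anova(demo_groups)
print(round(f_demo, 2), df_b, df_w)
```

An F value near or below 1, as on the slide, means the groups differ no more than would be expected from the spread within each group.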

Page 11: Speaker recognition tests

37 target trials and 803 non-target trials, involving 21 speakers from casework, comparing:

- Baseline = a standard GMM-UBM automatic system
- FGMM = GMM-modeled LTF

Page 12: New development at BKA: DiSC-Plot (Discrimination, Scatter, Correlation)

[Figure: DiSC-Plot of target trials (same speaker) and non-target trials (different speakers); axes: Long-Term Formant Distribution and Automatic speaker recognition system, both in logLR (lnLR)]

Page 13: Conclusions: LTF analysis in forensic phonetics and acoustics (1)

☺ LTF (F2 and F3) correlates negatively with body height (relevant for voice profiling).

☺ LTF measurements have high consistency across phonetic experts.

☺ Pending further tests and with some degree of caution, LTF statistics established for one language can be used across languages.

☺ LTF (F2 and F3) do not differ much between different vocal effort levels. Vocal effort differences are a common problem in forensic material.

Page 14: Conclusions: LTF analysis in forensic phonetics and acoustics (2)

Performance of LTF analysis with classical evaluation measures (DET-plots, APE-plots, Cllr) is worse than performance of automatic speaker recognition, and fusion does not increase overall performance. But:

The tests so far are based predominantly on matching conditions; under mismatched conditions, the relative performance of LTF analysis might increase.

☺ Detailed results in the DiSC plot show that LTF and automatic speaker recognition can make different errors: using both methods is a good safeguard against false conclusions.

Quite limited LR values in same-speaker comparisons (max about LR=16 in case material for the tests so far): LTF cannot give very strong support for same-speaker hypothesis.

☺ Different-speaker comparisons can yield very low LR values: LTF can give very strong support for different-speaker hypothesis.

Page 15: References

Becker, Timo, Michael Jessen and Catalin Grigoras (2008): Forensic speaker verification using formant features and Gaussian mixture models. Proceedings of Interspeech 2008, 1505-1508.

Kirchhübel, Christin (2009): The effects of Lombard speech on vowel formant measurements. MSc thesis, University of York, UK.

Moos, Anja (2008): Forensische Sprechererkennung mit der Messmethode LTF (long-term formant distribution). MA thesis, Universität des Saarlandes. www.psy.gla.ac.uk/docs/download.php?type=PUBLS&id=1286.

Moos, Anja (2010): Long-term formant distribution as a measure of speaker characteristics in read and spontaneous speech. To appear in The Phonetician.

Nolan, Francis and Catalin Grigoras (2005): A Case for formant analysis in forensic speaker identification. International Journal of Speech, Language and the Law 12: 143-173.

Wagner, Katrin (2010): Der Einfluss der Sprechlautstärke auf die ersten drei Vokalformanten in mobilfunkübertragener Sprache: Forensischer Stimmenvergleich anhand der LTF-Methode. BA thesis, Universität Frankfurt.

Page 16: (no text on this page)

Page 17: Inter-speaker variation: Mean LTF for 71 adult male speakers of German

[Figure: means of LT-F2 and LT-F3]

Moos (2008, 2010), based on GSM-transmitted speech in BKA corpus “Pool 2010”

Page 18: Influence of vocal effort (Lombard condition) on LT-F1

[Figure: LT-F1 [Hz] (200 to 800) for 31 speakers; legend: normal, Lombard]

LTF-means from 31 speakers in Pool 2010 (telephone-transmitted), based on Wagner (2010); cf. also Kirchhübel (2009) and this conference

LT-F1 consistently higher in Lombard speech. Significant difference with paired t-test, indicating substantial intra-speaker variation. But: LT-F1 is of limited forensic use anyway (due to the effect of telephone transmission on F1)
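The paired t-test used for the Lombard comparison works on the per-speaker differences between the two conditions. The sketch below computes only the t statistic and its degrees of freedom; the four speakers and their LT-F1 values are invented for illustration and do not come from Wagner (2010).

```python
import math


def paired_t(condition_a, condition_b):
    """Return (t, df) for paired samples; t > 0 means condition_b is higher."""
    if len(condition_a) != len(condition_b):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_value = mean_diff / math.sqrt(var_diff / n)
    return t_value, n - 1


# Hypothetical LT-F1 means in Hz for four speakers (NOT real data).
normal_f1 = [400, 420, 450, 480]
lombard_f1 = [450, 480, 490, 540]
t_f1, df_f1 = paired_t(normal_f1, lombard_f1)
print(round(t_f1, 2), df_f1)
```

Because every speaker serves as their own control, a consistent shift in one direction gives a large t even when the speakers differ a lot from each other.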

Page 19: Influence of vocal effort (Lombard condition) on LT-F2

[Figure: LT-F2 [Hz] (1100 to 1700) for 31 speakers; legend: normal, Lombard]

Lombard-effect on LT-F2 inconsistent across speakers. Non-significant difference with paired t-test, indicating acceptable intra-speaker variation.

Page 20: Influence of vocal effort (Lombard condition) on LT-F3

[Figure: LT-F3 [Hz] (2100 to 2700) for 31 speakers; legend: normal, Lombard]

Like with LT-F2: Lombard-effect on LT-F3 inconsistent across speakers. Non-significant difference with paired t-test, indicating acceptable intra-speaker variation.

Page 21: DET-Plot

[Figure: DET-Plot; legend: Automatic speaker recognition system, GMM-modeled Long-Term Formant Distribution]
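A DET (detection error trade-off) plot shows the miss rate against the false-alarm rate as the decision threshold is moved across the scores. The sketch below computes those operating points; the scores are invented, the function name `det_points` is hypothetical, and a real DET plot would additionally draw both axes on a normal-deviate scale.

```python
def det_points(target_scores, nontarget_scores):
    """Return (threshold, false_alarm_rate, miss_rate) for every observed score.

    A trial is accepted as same-speaker when its score is >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for th in thresholds:
        # Miss: a same-speaker trial scored below the threshold.
        miss = sum(s < th for s in target_scores) / len(target_scores)
        # False alarm: a different-speaker trial scored at or above it.
        false_alarm = sum(s >= th for s in nontarget_scores) / len(nontarget_scores)
        points.append((th, false_alarm, miss))
    return points


# Hypothetical lnLR scores (NOT the casework results).
demo_targets = [2.0, 1.0, 0.5]
demo_nontargets = [-1.0, 0.0, 0.8]
demo_det = det_points(demo_targets, demo_nontargets)
for th, fa, miss in demo_det:
    print(th, round(fa, 2), round(miss, 2))
```

The point where the two error rates are equal gives the equal error rate, a common single-number summary of such a curve.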

Page 22: APE-Plot

[Figure: APE-Plot with Cllr]
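Cllr, the log-likelihood-ratio cost, penalizes same-speaker trials with low LRs and different-speaker trials with high LRs. A minimal sketch of the standard formula follows; the lnLR values are invented and the helper name `cllr` is chosen for this example only.

```python
import math


def cllr(target_lrs, nontarget_lrs):
    """Log-likelihood-ratio cost from likelihood ratios (not log LRs)."""
    # Same-speaker trials are penalised when the LR is small.
    cost_target = sum(math.log2(1 + 1 / lr) for lr in target_lrs) / len(target_lrs)
    # Different-speaker trials are penalised when the LR is large.
    cost_nontarget = sum(math.log2(1 + lr) for lr in nontarget_lrs) / len(nontarget_lrs)
    return 0.5 * (cost_target + cost_nontarget)


# Hypothetical natural-log LRs (lnLR), converted to LRs before scoring.
demo_target_lnlr = [2.5, 1.0, -0.3]
demo_nontarget_lnlr = [-4.0, -1.5, 0.4]
demo_cllr = cllr([math.exp(v) for v in demo_target_lnlr],
                 [math.exp(v) for v in demo_nontarget_lnlr])

# A system that always reports LR = 1 carries no information and scores 1.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])
print(round(demo_cllr, 3), uninformative)
```

Values below 1 indicate that the LRs carry useful information; lower is better.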