Reference population in BatVox - exploring the use of the population optimizer
Nederlands Forensisch Instituut
April 20, 2023
D.L. van der Vloed, D. Meuwly, R. Haraksim, J.F.M. Vermeulen
Title | April 20, 2023
Agnitio BatVox [1]
The NFI is in the process of validating Agnitio's BatVox software.
BatVox is speaker comparison software; it compares a test recording (trace, disputed sample) with a suspect recording (training file, known sample).
To model the within-speaker variation, the suspect recordings are used (HSS).
To model the between-speaker variation, a reference population is used (HDS).
Agnitio BatVox
The output has the form of a Likelihood Ratio [2].
LR = P(E | HSS) / P(E | HDS)
In words: the probability that the distance between test and suspect is observed if it is part of the within variation, divided by the probability that the distance between test and suspect is observed if it is part of the between variation.
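As a rough illustration of this ratio, the sketch below assumes (this is not taken from BatVox) that the within and between variation of the score can each be summarised by a normal distribution. The function names and all numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score, within, between):
    """LR = P(E | HSS) / P(E | HDS) for an observed score E.

    `within` and `between` are (mean, std) pairs; modelling both
    distributions as normal is an assumption made for illustration only.
    """
    numerator = normal_pdf(score, *within)      # P(E | HSS)
    denominator = normal_pdf(score, *between)   # P(E | HDS)
    return numerator / denominator


# Hypothetical numbers: the observed score lies closer to the within model.
lr = likelihood_ratio(score=2.0, within=(2.5, 1.0), between=(0.0, 1.0))
print(f"LR = {lr:.2f}")
```

An LR above 1 supports HSS, an LR below 1 supports HDS, and an LR of 1 supports neither.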
Agnitio BatVox – reference population
How is the reference population used in BatVox?
First: it is used to evaluate the denominator of the LR: P(E|HDS)
Second: it is used to normalize the scores.
Agnitio BatVox – normalization
A score is calculated representing the distance between the test sample and the suspect sample.
This score is not only influenced by speaker features, but also by other features, like channel and language.
The score needs to be normalized before it can be used in an LR calculation.
Agnitio BatVox – normalization
The normalization is done using the variation in the reference population.
For this to be sensible, the non-speaker information (such as channel and language) in the reference population recordings needs to be similar, both across those recordings and to the suspect model.
If this is the case, the normalization makes sure only speaker information is used in the comparison.
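One common way to normalize a score against a reference population is to express it relative to the mean and spread of the scores obtained for the reference speakers. The sketch below shows that idea. Whether BatVox normalizes in this way is not stated here, so the method, the function name and the numbers should all be read as assumptions.

```python
import statistics


def normalize_score(raw_score, reference_scores):
    """Express a raw score relative to the reference population scores.

    Subtracting the mean and dividing by the standard deviation is an
    assumed, illustrative choice, not BatVox's documented method.
    """
    mean = statistics.mean(reference_scores)
    std = statistics.stdev(reference_scores)  # sample standard deviation
    return (raw_score - mean) / std


# Hypothetical scores of the suspect model against reference speakers.
reference_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
normalized = normalize_score(6.0, reference_scores)
print(normalized)
```

In this simplified picture, an offset that shifts the raw score and all reference scores by the same amount cancels out, which is only the case when the reference recordings share the non-speaker conditions of the comparison.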
Agnitio BatVox – population optimizer
To ensure that the reference population is suitable for this task, BatVox has a built-in population optimizer.
It selects the reference speakers closest to the suspect recording. For this selection a biometric distance measure is used, similar to the distance measures used in the actual comparisons.
One can set BatVox to choose a certain number of reference population speakers out of a larger set.
In short: out of a total set of speakers a subset is chosen, which is then used as the reference population.
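The selection step can be sketched as a nearest-neighbour choice. This is only an illustration: the speaker ids and feature vectors are hypothetical, and Euclidean distance stands in for the biometric distance measure, which is not specified here.

```python
import math


def select_reference_subset(suspect, population, subset_size):
    """Return the ids of the `subset_size` speakers closest to the suspect.

    `population` maps a speaker id to a feature vector. Euclidean distance
    is an assumed stand-in for the biometric distance measure.
    """
    ranked = sorted(population, key=lambda spk: math.dist(suspect, population[spk]))
    return ranked[:subset_size]


# Hypothetical two-dimensional feature vectors for four reference speakers.
population = {
    "spk_a": (0.0, 0.0),
    "spk_b": (1.0, 1.0),
    "spk_c": (5.0, 5.0),
    "spk_d": (0.5, 0.2),
}
subset = select_reference_subset((0.4, 0.1), population, subset_size=2)
print(subset)
```

The smaller the subset relative to the total set, the more strongly the chosen speakers are concentrated around the suspect.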
BatVox is only a tool in the hands of the user
The reference population and the population optimizer are important concepts in the system when obtaining LRs.
Therefore, the NFI devised an experiment to better understand how the population optimizer works.
Validation NFI – speech databases
Swiss-French Polyphone IPSC database (16 male speakers) [6]
GSM data and PSTN data
Used to provide ‘test’ and ‘suspect’ recordings:
• 1 GSM suspect recording per speaker
• 6 PSTN suspect recordings per speaker
• 1 GSM test recording per speaker
• 5 PSTN test recordings per speaker
Validation NFI – speech databases
Swiss-French Polyphone database (1995 male speakers) [5]
1 recording per speaker (PSTN, landline)
This database was used as reference population in this study.
Conditions
Tests were conducted in three population conditions, letting BatVox choose:
• 35 out of 45 (P1)
• 35 out of 1995 (P2)
• 1400 out of 1995 (P3)
The tests were done in two channel conditions:
• GSM
• PSTN
[Two figure slides, each citing [3]]
Cllrs[4] in the PSTN condition
Cllrs[4] in the GSM condition
Cllr[4] values
PSTN:
• P1: Cllr 0.24, minimal Cllr 0.23
• P2: Cllr 0.39, minimal Cllr 0.27
• P3: Cllr 0.25, minimal Cllr 0.23
GSM:
• P1: Cllr 0.23, minimal Cllr 0.11
• P2: Cllr 0.15, minimal Cllr 0.09
• P3: Cllr 0.29, minimal Cllr 0.15
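For readers unfamiliar with the metric, Cllr [4] is the average cost, in bits, of the LRs over same-speaker and different-speaker comparisons. A minimal sketch of that definition follows. The function name and the example LRs are hypothetical and are not data from this study.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost: lower is better, 0 is perfect."""
    # Penalty for low LRs when the speakers are in fact the same.
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    # Penalty for high LRs when the speakers are in fact different.
    ds_cost = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    return 0.5 * (ss_cost / len(same_speaker_lrs) + ds_cost / len(different_speaker_lrs))


# A system that always reports LR = 1 carries no information: Cllr is 1.
uninformative = cllr([1.0, 1.0], [1.0, 1.0, 1.0])
# Hypothetical LRs that mostly point in the right direction score below 1.
informative = cllr([20.0, 8.0, 0.5], [0.05, 0.2, 3.0])
print(uninformative, informative)
```

The minimal Cllr is the value that would remain after optimal calibration of the same scores, so the gap between Cllr and minimal Cllr reflects calibration loss.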
Observations
Condition P2 (‘35 out of 1995’) yields lower LRs throughout the data.
Conditions P1 (‘35 out of 45’) and P3 (‘1400 out of 1995’) give more or less the same results within each of the channel conditions.
The GSM condition yields higher LRs and a lower minimal Cllr than the PSTN condition, even though the reference population consists of PSTN recordings.
Observations
The ratio of the sizes of subset and total population appears to be more important than the absolute size of the subset.
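The subset-to-total ratios for the three conditions follow directly from the numbers given under Conditions. The short sketch below only does that arithmetic; the variable names are illustrative.

```python
# Subset size and total population size per condition, as listed under Conditions.
conditions = {
    "P1": (35, 45),
    "P2": (35, 1995),
    "P3": (1400, 1995),
}

ratios = {name: subset / total for name, (subset, total) in conditions.items()}

for name, ratio in ratios.items():
    subset, total = conditions[name]
    print(f"{name}: {subset} out of {total} -> ratio {ratio:.3f}")
```

P1 and P3 have ratios of roughly 0.78 and 0.70 despite very different subset sizes, whereas P1 and P2 share a subset size of 35 but differ in ratio by a factor of more than 40.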
In P2 (‘35 out of 1995’) two factors come into play:
1. The reference population will be more like the suspect
2. The reference population will be more homogeneous
The first factor will make P(E|HDS) larger, hence lowering the LR.
The second factor will make the variation within the reference population lower, so the population is ‘narrower’. It is harder to fit in a narrower population, hence P(E|HDS) will become smaller, which raises the LR.
LRs are lower in P2, so the first factor is apparently the more important one.
Conclusion
The ratio of the subset size to the total population size is the important factor, rather than the absolute number of reference population speakers used.
References
[1] www.agnitio.es
[2] Meuwly D (2006). Forensic individualization from biometric data. Science and Justice, 46, 4, 205-213.
[3] Ramos D (2007). Forensic evaluation of the evidence using automatic speaker recognition systems. Ph.D. thesis, Universidad Autonoma de Madrid, Madrid, Spain.
[4] van Leeuwen D and Brümmer N (2007). An Introduction to Application-Independent Evaluation of Speaker Recognition Systems. Speaker Classification I, 343, 330-353.
[5] http://catalog.elra.info/product_info.php?products_id=708
[6] Meuwly D, Alexander A, Drygajlo A, and Botti F (2003). Polyphone-IPSC: A shared speakers database for evaluation of forensic-automatic speaker recognition systems. In Forensic Science International, vol. 136, p. 367, Istanbul, Turkey, Elsevier.