Sample size and the multivariate kernel density likelihood ratio: How many speakers are enough?
- Citation data: Speech Communication, ISSN: 0167-6393, Vol: 94, Pages: 15-29
- Computer Science; Arts and Humanities; Mathematics; Social Sciences
The likelihood ratio (LR) is now widely accepted as the appropriate framework for evaluating expert evidence. However, an empirical issue in forensic voice comparison is the number of speakers required to generate robust LR output and adequately test system performance. In this study, Monte Carlo simulations were used to synthesise temporal midpoint F1, F2 and F3 values from the hesitation marker um from a set of raw data consisting of 86 male speakers of standard southern British English. Using the multivariate kernel density LR approach, these data were used to investigate: (1) the number of development (training) speakers required for adequate calibration, (2) the number of test speakers needed for robust validity, and (3) the effects of varying the number of reference speakers. The experiments were run over 20 replications to assess the effects of which, as well as how many, speakers are included in each set. Predictably, LR output was most imprecise using small samples. Comparison across the three experiments shows that the greatest variability in LR output was found as a function of the number of development speakers – where stable LR output was only achieved with more than 20 speakers. Thus, it is possible to achieve stable performance with small numbers of test and reference speakers, as long as the system is adequately calibrated. Importantly, however, LRs for individual comparisons may still be substantially affected by the inclusion of additional speakers in each set, even when large samples are used.
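The replication design described above (resampling speaker sets of different sizes 20 times and observing how much the output varies) can be illustrated with a minimal sketch. This is not the study's method: it does not compute multivariate kernel density LRs, and it substitutes a simple sample mean of a single synthetic formant value as the quantity of interest. The population parameters (a mean of 1400 Hz, a standard deviation of 120 Hz), the population size, and the function names are all hypothetical and are not taken from the study's data.

```python
import random
import statistics


def synthesise_speakers(n, mean, sd, rng):
    """Draw n hypothetical per-speaker formant values (Hz) from a normal distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def replication_spread(population, sample_size, n_replications, rng):
    """Standard deviation of the sample mean across replications.

    Each replication draws a different random set of speakers, so the
    result reflects the effect of *which* speakers are included at a
    given sample size.
    """
    means = []
    for _ in range(n_replications):
        sample = rng.sample(population, sample_size)  # without replacement
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical synthetic population; values are illustrative only.
population = synthesise_speakers(1000, mean=1400.0, sd=120.0, rng=rng)

# 20 replications per sample size, mirroring the replication count in the abstract.
spread_by_size = {
    n: replication_spread(population, n, 20, rng) for n in (5, 20, 80)
}

for n, spread in spread_by_size.items():
    print(f"{n:3d} speakers: SD of sample mean over 20 replications = {spread:.1f} Hz")
```

Under these assumptions the spread across replications shrinks as the number of speakers grows, which is the general pattern the abstract reports for small samples. How LR output itself behaves depends on the calibration and scoring steps that this sketch leaves out.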