Using BIG data to understand the BIG picture: Overcoming heterogeneity in speech for forensic applications
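Basic information
- Grant number: ES/N003268/1
- Principal investigator:
- Amount: $357,700
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2016
- Funding country: United Kingdom
- Duration: 2016 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract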
Forensic speech science (FSS), an applied sub-discipline of phonetics, has come to play a critical role in criminal cases involving voice evidence. Within FSS, forensic speaker comparison (FSC) involves comparing a criminal recording (e.g. a threatening phone call) with a known suspect sample (e.g. a police interview). It is the role of an expert forensic phonetician to advise the trier of fact (e.g. judge or jury) on the likelihood of the two samples coming from the same speaker. There are two important elements involved in making such a comparison. First, the expert will carry out an assessment of the similarity of the speech characteristics in the criminal recording and the suspect sample. Second, the expert will assess the degree to which the speech features observed in the criminal sample can be considered typical for a given speaker group. The speaker group will typically be defined by age, sex and geographical region (or accent). This second element is critical in providing context for the first: the suspect could have speech very similar to that in the criminal recording, but this could be purely coincidental if they exhibit speech characteristics that are common to their speaker group. In contrast, if the criminal and suspect are observed to have speech features that are atypical for their speaker group, this would provide strong evidence that they are the same speaker.
One complication associated with FSC is that the data needed to estimate whether a speech feature is typical or atypical for the given speaker group, commonly known as population data, are scarce. Population data are typically obtained by collecting a set of recordings containing the voices of a homogeneous group of speakers similar in age, sex, and geographical region (or accent). Unfortunately, the time and expense involved in collecting population data mean that forensic phoneticians face a huge challenge in obtaining such data for casework. This problem is further complicated by the high degree of variation that exists in speech across different speaker groups. Methodological research in the field of FSS has demonstrated that identifying the correct population for an FSC is vital in accurately representing the strength of evidence. It is largely for these reasons that experts argue that the biggest problem facing the field is the limited availability of population data.
The primary aim of this research is to explore a novel set of proposed methods that seek to remedy the aforementioned problems. The current lack of a platform on which to exchange data means that population data for a specific speaker group might have already been collected, unbeknown to experts in need of such data. This project intends to bring an end to this type of scenario by developing an international platform on which to share data, and by encouraging fellow researchers and experts to participate in data sharing. In addition, the project will explore the extent to which population data are generalizable; specifically, this will entail identifying the geographical (or regional accent) level at which speaker groups can be defined. For example, an expert might define a population group as having a Leeds accent when, in actuality, a population defined more generally as West Yorkshire would suffice. This would clearly have implications for the way in which population data would be collected.
In order to explore the issue of defining the population data, a West Yorkshire (WY) database of 200 male speakers will be collected (including 50 speakers from each of the four urban areas: Huddersfield, Leeds, Bradford, and Wakefield). The database will be used to test the sensitivity of the strength of evidence when FSC cases are simulated using varying definitions of accent for the population data. In addition to serving a methodological purpose, the WY database will also serve as a practical resource for casework and research in its own right.
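The interplay of similarity and typicality described in the abstract can be illustrated numerically. The sketch below is not the project's method: the abstract does not specify a statistical model, so this assumes a single scalar speech feature, normal distributions, and a likelihood ratio as the measure of strength of evidence. Every number and name (`likelihood_ratio`, `within_sd`, the two population definitions) is hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(questioned, suspect_mean, within_sd, pop_mean, pop_sd):
    """Toy likelihood ratio for one scalar feature (illustrative assumption).

    Numerator (similarity): how probable the questioned value is if it
    came from the suspect, given within-speaker variation.
    Denominator (typicality): how probable it is if it came from a random
    member of the population, whose own mean varies between speakers.
    """
    similarity = normal_pdf(questioned, suspect_mean, within_sd)
    # A random speaker's observation spreads by both between-speaker
    # and within-speaker variation, so the variances add.
    combined_sd = math.sqrt(pop_sd ** 2 + within_sd ** 2)
    typicality = normal_pdf(questioned, pop_mean, combined_sd)
    return similarity / typicality


# Hypothetical feature values: the questioned and suspect samples are close.
questioned = 101.0
suspect_mean = 100.0
within_sd = 5.0

# Same pair of samples, two hypothetical population definitions.
# Under definition A the value is typical; under B it is atypical.
lr_typical = likelihood_ratio(questioned, suspect_mean, within_sd,
                              pop_mean=100.0, pop_sd=15.0)
lr_atypical = likelihood_ratio(questioned, suspect_mean, within_sd,
                               pop_mean=120.0, pop_sd=15.0)

print(f"LR when the feature is typical of the population:  {lr_typical:.2f}")
print(f"LR when the feature is atypical of the population: {lr_atypical:.2f}")
```

With identical similarity between the two samples, the likelihood ratio is larger when the shared feature value is atypical of the chosen population. Changing only the population definition changes the reported strength of evidence, which is the kind of sensitivity the project proposes to test with the WY database.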
Project outputs
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Speaker identification using laughter in a close social network
- DOI: 10.1558/ijsll.34552
- Published: 2017
- Journal:
- Impact factor: 0.4
- Authors: Land E
- Corresponding author: Land E
Variation in the FACE Vowel across West Yorkshire: Implications for Forensic Speaker Comparisons
- DOI: 10.21437/interspeech.2018-1944
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Earnshaw K
- Corresponding author: Earnshaw K
"We don't pronounce our t's around here": Realisations of /t/ in West Yorkshire English
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Earnshaw K
- Corresponding author: Earnshaw K
An introduction to the West Yorkshire Regional English Database (WYRED)
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Gold, E.
- Corresponding author: Gold, E.
A Cautionary Tale For Phonetic Analysis: The Variability of Speech Between and Within Recording Sessions
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Ross S
- Corresponding author: Ross S
Other publications by Erica Gold
Evidence evaluation for discrete data
- DOI: 10.1016/j.forsciint.2013.02.042
- Published: 2013-07-10
- Journal:
- Impact factor:
- Authors: Colin Aitken; Erica Gold
- Corresponding author: Erica Gold
Uncertainty in Illness and Optimism in Couples With Multiple Sclerosis
- DOI:
- Published: 2000
- Journal:
- Impact factor: 0
- Authors: Erica Gold; T. Sher; V. Theodos
- Corresponding author: V. Theodos
An International Investigation of Forensic Speaker Comparison Practices
- DOI:
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Erica Gold; Peter French
- Corresponding author: Peter French
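Similar National Natural Science Foundation of China (NSFC) grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Year approved: 2024
- Funding amount:
- Project type: Collaborative Innovation Research Team
Role and mechanism of ACSL4-dependent ferroptosis mediated by the ARF guanine nucleotide exchange factor BIG1 in non-alcoholic steatohepatitis
- Grant number:
- Year approved: 2022
- Funding amount: 300,000 yuan
- Project type: Young Scientists Fund
Deobfuscation of Android application code based on Big Code deep context augmentation
- Grant number: 61972290
- Year approved: 2019
- Funding amount: 600,000 yuan
- Project type: General Program
Role and molecular mechanism of BIG1-mediated STING vesicular trafficking in the anti-lung-cancer immune response
- Grant number: 81903639
- Year approved: 2019
- Funding amount: 210,000 yuan
- Project type: Young Scientists Fund
Rice Big Grain3 regulates grain size by modulating cytokinin transport
- Grant number: 2019JJ50243
- Year approved: 2019
- Funding amount: 0 yuan
- Project type: Provincial/municipal project
Role and mechanism of macrophage reprogramming regulated by the ARF guanine nucleotide exchange factor BIG1 in the development of sepsis-induced immunosuppression
- Grant number: 81971488
- Year approved: 2019
- Funding amount: 560,000 yuan
- Project type: General Program
Function and application of BIG SEEDS1, a key gene controlling organ size in legume crops
- Grant number: 31771345
- Year approved: 2017
- Funding amount: 650,000 yuan
- Project type: General Program
Molecular mechanism of stomatal closure under elevated CO2 mediated by the auxin transport regulatory gene BIG
- Grant number: 31171356
- Year approved: 2011
- Funding amount: 650,000 yuan
- Project type: General Program
Molecular mechanism by which the ARF guanine nucleotide exchange factor BIG1 directionally regulates ABCA1 function
- Grant number: 81173056
- Year approved: 2011
- Funding amount: 690,000 yuan
- Project type: General Program
BIG2-mediated trafficking patterns and signalling regulation of GABA-A receptors
- Grant number: 31070924
- Year approved: 2010
- Funding amount: 350,000 yuan
- Project type: General Program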
Similar overseas grants
FightAMR: Novel global One Health surveillance approach to fight AMR using Artificial Intelligence and big data mining
- Grant number: MR/Y034422/1
- Fiscal year: 2024
- Funding amount: $357,700
- Project type: Research Grant
Big Data-based Distributed Control using a Behavioural Systems Framework
- Grant number: DP240100300
- Fiscal year: 2024
- Funding amount: $357,700
- Project type: Discovery Projects
Understanding of Consumption Context Using User Generated Big Data
- Grant number: 23H00859
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Grant-in-Aid for Scientific Research (B)
Development and clinical application of AI model to support diagnosis of dementia using medical big data from electronic medical records
- Grant number: 22KJ3211
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Grant-in-Aid for JSPS Fellows
International research using big data for lifestyle disease, brain aging and genetic background
- Grant number: 23K06843
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Grant-in-Aid for Scientific Research (C)
Developing and exploring methods to understand human-nature interactions in urban areas using new forms of big data
- Grant number: ES/W012979/1
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Research Grant
Understanding hearing loss phenotypes, their progression and associations with otological and non-otological disease using hearing health big data
- Grant number: MR/X019217/1
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Fellowship
Using AI and big data to identify a set of biologically validated drug targets for hard-to-treat cancers
- Grant number: 2886797
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Studentship
Development of a long-term water resources forecasting model using global climate big data
- Grant number: 23KJ0924
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Grant-in-Aid for JSPS Fellows
Agroecosystems in a Climate Crisis: Using Big Data to understand the Out of the Ordinary
- Grant number: 2882391
- Fiscal year: 2023
- Funding amount: $357,700
- Project type: Studentship