Towards linguistically-informed automatic speaker recognition

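Basic information

  • Grant number:
    2279775
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Studentship
  • Fiscal year:
    2019
  • Funding country:
    United Kingdom
  • Period:
    2019 to no data
  • Project status:
    Completed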

Project abstract

This project will investigate how Automatic Speaker Recognition (ASR) systems work and how they can be improved. ASR systems recognise speakers from their voice alone and are commonly used by banks and government institutions such as HMRC. These systems have improved in recent years thanks to continual refinement of methods and the availability of large databases of recordings on which to test them. State-of-the-art systems now produce few errors even with short, poor-quality recordings. However, ASR systems are a 'black box': we know that they analyse a speaker's voice, but we do not know what linguistic information they rely on to make their decisions.

This project is exciting because it builds on a small but growing body of research at the intersection of linguistics and speech technology. It is also the first systematic study of the inner workings of such systems, and its outcomes will benefit society by improving the reliability of security systems and of the speaker recognition systems used by banks and courtrooms. I am interested in this project because I have experience working with similar systems which recognise individuals from written data. Those systems, however, are not 'black boxes' in the way that ASR systems using spoken data are. I am therefore driven to understand the 'black box' of ASR systems to ensure that systems using written and spoken data are equally reliable. Understanding this 'black box' is crucial because it will allow us to improve ASR systems further, particularly by showing what 'types' of voices are difficult to recognise. The systems must also be explainable to lay people, e.g. jury members, in legal cases where their output is used as evidence.

The project will ask three research questions devoted to understanding and improving ASR systems:

RQ.1: To what extent do ASR systems capture tangible linguistic properties of a voice?
Firstly, we will investigate which linguistic properties of voice map onto the abstract properties that ASR systems already detect. I hypothesise that many properties will be pertinent, e.g. vowel formants: the regular, consistent resonant frequencies of different vowel sounds, which are uniquely shaped by each speaker's vocal tract and accent.

RQ.2: Can we predict which speakers will be problematic for the system?
Secondly, we will identify groups who may be problematic for ASR systems so that we can improve the systems based on why these groups pose issues. Some accents have less vowel variation than others; as a result, their speakers could be at greater risk of being misrecognised as someone else with the same accent, because there are fewer variables with which to identify each speaker uniquely.

RQ.3: Can linguistic information be used to improve the performance of ASR?
Finally, we will use linguistic speech analysis to improve ASR systems. By identifying the linguistic features that ASR systems use, we can tailor the systems to focus on these features and so improve their reliability.

This project uses a state-of-the-art speaker recognition system (VoiSentry) developed by the commercial partner, Aculab. My methodology will involve testing the VoiSentry software on voices that have been manipulated in controlled ways, e.g. by changing the acoustic properties of the vowel sounds, and observing how each manipulation affects the end score. If a manipulation does change the score, we will know that ASR systems capture tangible linguistic properties of voice, and we can therefore tailor these systems to focus on detecting these features. Aculab's involvement is central to this study because the company will allow us to examine the underlying computer code, which no other ASR system will permit. We can therefore make specific manipulations and test how they change the result. Overall, this research will have societal value because it will help ensure that speaker recognition systems used by banks and government institutions are as reliable as possible.
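
The manipulation experiment described in the methodology can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `toy_score` is a hypothetical stand-in scorer and not the VoiSentry software or its API, the formant values are made-up placeholders rather than measurements, and a real study would manipulate audio recordings rather than a table of numbers. The sketch only shows the shape of the test: change one acoustic property in a controlled way and record how far the score moves.

```python
from typing import Callable, Dict, List, Tuple

# A "voice" here is just vowel -> (F1 in Hz, F2 in Hz). Illustrative values only.
Formants = Dict[str, Tuple[float, float]]
Scorer = Callable[[Formants, Formants], float]


def shift_formant(voice: Formants, index: int, factor: float) -> Formants:
    """Return a copy of the voice with one formant (0 = F1, 1 = F2) scaled by factor."""
    shifted: Formants = {}
    for vowel, formants in voice.items():
        values = list(formants)
        values[index] *= factor
        shifted[vowel] = (values[0], values[1])
    return shifted


def toy_score(enrolled: Formants, probe: Formants) -> float:
    """Hypothetical scorer: 1.0 for identical formants, lower as they diverge."""
    diffs = [
        abs(p - e) / e
        for vowel in enrolled
        for e, p in zip(enrolled[vowel], probe[vowel])
    ]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))


def sensitivity(
    scorer: Scorer, enrolled: Formants, index: int, factors: List[float]
) -> List[Tuple[float, float]]:
    """For each scaling factor, report how far the score drops below the baseline."""
    baseline = scorer(enrolled, enrolled)
    return [
        (factor, baseline - scorer(enrolled, shift_formant(enrolled, index, factor)))
        for factor in factors
    ]


# Placeholder speaker; scale F2 by 0%, 5% and 10% and watch the score drop.
speaker: Formants = {"i": (280.0, 2250.0), "a": (710.0, 1100.0), "u": (310.0, 870.0)}
drops = sensitivity(toy_score, speaker, 1, [1.0, 1.05, 1.10])
for factor, drop in drops:
    print(f"F2 x {factor:.2f}: score drop {drop:.4f}")
```

With the stand-in scorer, an unmodified voice gives no drop and larger formant shifts give larger drops. Against a real system, a flat response to a manipulation would suggest the system does not rely on that property, while a steep response would suggest that it does.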

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Similar overseas grants

Developing Mathematics Teachers Responsive Pedagogies for Linguistically Marginalized Students
  • Grant number:
    2247128
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
    Standard Grant
The impact of bilingualism on cognitive reserve/resilience using socio-demographically and linguistically diverse populations
  • Grant number:
    10584245
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
Early childhood education and community network to empower children and their families with linguistically and culturally diverse backgrounds
  • Grant number:
    23K02316
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Toward a speech rehabilitation model for linguistically and culturally minor groups: Foreign born immigrants
  • Grant number:
    10660443
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
Building Capacity and Collaborations to Prepare Elementary Teachers for Cultivating Linguistically Just and Integrated STEM Education
  • Grant number:
    2243317
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
    Standard Grant
RAPID: Collaborative Research: Providing useable COVID-19 health information to linguistically underserved people
  • Grant number:
    2331607
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
    Standard Grant
A Team-Based Model for Co-Adapting Existing Middle-School Science Curricula for Culturally and Linguistically Diverse Learners
  • Grant number:
    2247435
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project type:
    Continuing Grant
Realising inclusive communicative practices in support services for refugees and other migrants: the role of translanguaging in a linguistically dive
  • Grant number:
    2763635
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project type:
    Studentship
Collaborative Research: HCC: Medium: Linguistically-Driven Sign Recognition from Continuous Signing for American Sign Language (ASL)
  • Grant number:
    2212301
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project type:
    Standard Grant
CAREER: Designing Linguistically Responsive Environments for STEM Learning
  • Grant number:
    2143432
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project type:
    Continuing Grant