RI: Medium: Deep Neural Networks for Robust Speech Recognition through Integrated Acoustic Modeling and Separation

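Basic information

  • Award number:
    1409431
  • Principal investigator:
  • Amount:
    $798,100
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Period:
    2014-06-01 to 2019-05-31
  • Project status:
    Completed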

Project abstract

Over the last decade, speech recognition technology has become steadily more present in everyday life, as seen in the proliferation of applications including mobile personal agents and transcription of voicemail messages. Performance of these systems, however, degrades significantly in the presence of background noise; for example, using speech recognition technology in a noisy restaurant or on a windy street can be difficult because speech recognizers confuse the background noise with linguistic content. Compensation for noise typically involves preprocessing the acoustic signal to emphasize the speech signal (i.e., speech separation), and then feeding this processed input into the recognizer.
The innovative approach in this project is to train the recognition and separation systems in an integrated manner so that the linguistic content of the signal can inform the separation, and vice versa. Given the impact of the recent resurgence of Deep Neural Networks (DNNs) in speech processing, this project seeks to make DNNs more resistant to noise by integrating speech separation and speech recognition, exploring three related areas.
The first research area seeks to stabilize input to DNNs by combining DNN-based suppression and acoustic modeling, integrating masking estimates across time and frequency, and using this information to improve reconstruction of speech from noisy input. The second area seeks to examine a richer DNN structure, using multi-task learning techniques to guide the construction of DNNs that perform better on all tasks and whose layers have meaningful structure. The final research area examines ways to adapt the spurious output that DNN acoustic models produce under acoustic noise.
With its focus on integrating speech separation and recognition, the project will be evaluated both by speech recognition performance and by metrics that are more closely related to human speech perception. This will ensure a broader impact of this research, not only providing insights for speech technology but also facilitating the design of next-generation hearing technology in the long run.
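The abstract mentions masking estimates across time and frequency without saying how a mask is defined. As a rough illustration only, and not the project's method, the sketch below computes an ideal ratio mask, a common training target in mask-based speech separation, on toy power spectrograms. All function names and numeric values are hypothetical, the mask is an oracle computed from known speech and noise (a DNN would have to estimate it from the noisy input alone), and the mixture is assumed to be the sum of speech and noise power.

```python
def ideal_ratio_mask(speech_power, noise_power, eps=1e-12):
    """Return a per-bin mask in [0, 1]: the share of each
    time-frequency bin's power that belongs to speech."""
    return [
        [s / (s + n + eps) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def apply_mask(mixture_power, mask):
    """Scale every time-frequency bin of the mixture by its mask value."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, mixture_power)
    ]


# Toy power spectrograms: 2 time frames x 3 frequency bins (made-up values).
speech = [[4.0, 0.0, 1.0], [9.0, 1.0, 0.0]]
noise = [[1.0, 2.0, 1.0], [1.0, 1.0, 3.0]]

# Assumption: speech and noise power add in each bin.
mixture = [
    [s + n for s, n in zip(s_row, n_row)]
    for s_row, n_row in zip(speech, noise)
]

mask = ideal_ratio_mask(speech, noise)
enhanced = apply_mask(mixture, mask)
```

Under these assumptions the masked mixture recovers the speech power in each bin, so bins dominated by noise are attenuated before the signal reaches a recognizer.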

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Eric Fosler-Lussier

Deep Learning Based Complex Spectral Mapping for Multi-Channel Speaker Separation and Speech Enhancement
  • Award number:
    2125074
  • Fiscal year:
    2021
  • Amount:
    $798,100
  • Project type:
    Standard Grant
RI: Small: Early Elementary Reading Verification in Challenging Acoustic Environments
  • Award number:
    2008043
  • Fiscal year:
    2020
  • Amount:
    $798,100
  • Project type:
    Standard Grant
CI-ADDO-NEW: Collaborative Research: The Speech Recognition Virtual Kitchen
  • Award number:
    1305319
  • Fiscal year:
    2013
  • Amount:
    $798,100
  • Project type:
    Standard Grant
CI-P: Collaborative Research: The Speech Recognition Virtual Kitchen
  • Award number:
    1205424
  • Fiscal year:
    2012
  • Amount:
    $798,100
  • Project type:
    Standard Grant
RI: Medium: Collaborative Research: Explicit Articulatory Models of Spoken Language, with Application to Automatic Speech Recognition
  • Award number:
    0905420
  • Fiscal year:
    2009
  • Amount:
    $798,100
  • Project type:
    Standard Grant
CAREER: Breaking the phonetic code: novel acoustic-lexical modeling techniques for robust automatic speech recognition
  • Award number:
    0643901
  • Fiscal year:
    2006
  • Amount:
    $798,100
  • Project type:
    Continuing Grant
Workshop: Student Research in Computational Linguistics, at the HLT/NAACL 2004 Conference
  • Award number:
    0422841
  • Fiscal year:
    2004
  • Amount:
    $798,100
  • Project type:
    Standard Grant

Similar overseas grants

Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
  • Award number:
    2312841
  • Fiscal year:
    2023
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
  • Award number:
    2312842
  • Fiscal year:
    2023
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
  • Award number:
    2312840
  • Fiscal year:
    2023
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: MoDL: Occams Razor in Deep and Physical Learning
  • Award number:
    2212519
  • Fiscal year:
    2022
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: MoDL: Occams Razor in Deep and Physical Learning
  • Award number:
    2212520
  • Fiscal year:
    2022
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
  • Award number:
    2106928
  • Fiscal year:
    2021
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
  • Award number:
    2106930
  • Fiscal year:
    2021
  • Amount:
    $798,100
  • Project type:
    Standard Grant
Collaborative Research: RI: Medium: Flexible Deep Speech Synthesis through Gestural Modeling
  • Award number:
    2106929
  • Fiscal year:
    2021
  • Amount:
    $798,100
  • Project type:
    Standard Grant
AF: RI: Medium: Collaborative Research: Understanding and Improving Optimization in Deep and Recurrent Networks
  • Award number:
    1764032
  • Fiscal year:
    2018
  • Amount:
    $798,100
  • Project type:
    Standard Grant
AF: RI: Medium: Collaborative Research: Understanding and Improving Optimization in Deep and Recurrent Networks
  • Award number:
    1763562
  • Fiscal year:
    2018
  • Amount:
    $798,100
  • Project type:
    Standard Grant