ITR: Dynamics-based Speech Segregation

Project abstract

A typical auditory scene contains multiple simultaneous events, and a remarkable feat of the auditory nervous system is its ability to disentangle the acoustic mixture and group the acoustic energy that comes from the same event. This fundamental process of auditory perception is called auditory scene analysis. Of particular importance in auditory scene analysis is the separation of speech from interfering sounds, known as speech segregation, which remains a largely unsolved problem in auditory engineering and speech technology. In this project, the PI seeks to develop a dynamics-based system for speech segregation using perceptual and neural principles. Auditory grouping will be based on oscillatory correlation, whereby the phases of neural oscillators encode the binding of auditory features. The investigation will proceed through successive stages of computation, starting from a simulated auditory periphery composed of cochlear filtering and hair cell transduction. A mid-level representation will be formed by computing the auto- and cross-correlation of filter channels. A segment formation stage then creates the individual elements of the represented auditory scene, each of which is a dynamically evolving, connected time-frequency structure that may overlap with other elements. Operating on the auditory segments produced by this stage, the system will incorporate both simultaneous and sequential organization. For simultaneous organization, grouping will be based on periodicity, location, onset, and offset analyses, while for sequential organization grouping will be based on pitch, spectral, and location continuities. In particular, two pitch maps, one for each ear, and one location map will be computed for auditory organization. All of the employed grouping cues are consistent with perceptual principles of auditory scene analysis. These cues guide the connectivity of neural oscillator networks, which perform grouping and segregation of auditory segments.
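The mid-level representation mentioned above (autocorrelation within each filter channel, cross-correlation between channels) can be illustrated with a minimal sketch. This is not the project's implementation: the function names are hypothetical, the cochlear filterbank and hair cell model are skipped, and a toy array of channel signals stands in for the peripheral output.

```python
import numpy as np


def normalized_autocorrelation(frame, max_lag):
    """Autocorrelation of one channel frame for lags 0..max_lag-1, scaled so lag 0 equals 1."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return np.zeros(max_lag)
    n = len(frame)
    acf = np.array([np.dot(frame[: n - lag], frame[lag:]) for lag in range(max_lag)])
    return acf / energy


def correlogram(channels, max_lag):
    """Stack per-channel autocorrelations into a (channels x lags) array."""
    return np.stack([normalized_autocorrelation(ch, max_lag) for ch in channels])


def adjacent_channel_correlation(corr):
    """Pearson correlation between the autocorrelations of neighbouring channels."""
    centered = corr - corr.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    norms[norms == 0.0] = 1.0  # avoid dividing by zero for silent channels
    unit = centered / norms[:, None]
    return np.sum(unit[:-1] * unit[1:], axis=1)


# Toy input standing in for filterbank output (an assumption of this sketch):
# two channels carrying the same 100 Hz tone (period 80 samples at 8 kHz)
# and one channel of white noise.
fs = 8000
t = np.arange(800) / fs
tone = np.sin(2 * np.pi * 100 * t)
noise = np.random.default_rng(0).standard_normal(t.size)
channels = np.stack([tone, tone, noise])

corr = correlogram(channels, max_lag=200)
adjacent = adjacent_channel_correlation(corr)
peak_lag = 40 + int(np.argmax(corr[0, 40:120]))
print(peak_lag, adjacent)
```

In this toy case the autocorrelation peak of the tone channels falls at the 80-sample period, which is the kind of periodicity cue the abstract lists for simultaneous organization, and the two tone channels are almost perfectly correlated with each other, which is one plausible basis for merging neighbouring channels into a segment.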
The proposed system will be evaluated using real recordings of speech and interfering sounds, where speech can be both voiced and unvoiced. The success of the system will be quantitatively assessed using two measures: changes in signal-to-noise ratio and speech recognition rate. This project is expected to make significant contributions to automatic speech recognition in unconstrained environments.
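The first evaluation measure, change in signal-to-noise ratio, can be sketched as follows. The abstract does not say how the SNR is computed, so this assumes the clean target speech is available as a reference and treats everything else in a signal as noise; the function names and the toy signals are illustrative only.

```python
import numpy as np


def snr_db(target, signal):
    """SNR in dB of `signal` relative to the clean `target` (residual counted as noise)."""
    target = np.asarray(target, dtype=float)
    residual = np.asarray(signal, dtype=float) - target
    return 10.0 * np.log10(np.sum(target**2) / np.sum(residual**2))


def snr_gain_db(target, mixture, segregated):
    """Change in SNR from the input mixture to the segregated output."""
    return snr_db(target, segregated) - snr_db(target, mixture)


# Toy example: the hypothetical segregation output keeps the target and
# attenuates the interference amplitude by a factor of 10.
rng = np.random.default_rng(0)
target = np.sin(2 * np.pi * 100 * np.arange(8000) / 8000)
interference = rng.standard_normal(target.size)
mixture = target + interference
segregated = target + 0.1 * interference

gain = snr_gain_db(target, mixture, segregated)
print(round(gain, 3))
```

A tenfold reduction in interference amplitude is a hundredfold reduction in noise power, so the gain here is 20 dB. The second measure, speech recognition rate, depends on a recognizer and is not sketched.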

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by DeLiang Wang

Multi-Channel Conversational Speaker Separation via Neural Diarization
Leveraging Laryngograph Data for Robust Voicing Detection in Speech
  • DOI:
    10.48550/arxiv.2312.03129
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yixuan Zhang;Heming Wang;DeLiang Wang
  • Corresponding author:
    DeLiang Wang
Time-frequency masking for speech separation and its potential for hearing aid design.
  • DOI:
    10.1177/1084713808326455
  • Publication date:
    2008-12-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    DeLiang Wang
  • Corresponding author:
    DeLiang Wang
A Neural Model of Synaptic Plasticity Underlying Short-term and Long-term Habituation
  • DOI:
    10.1177/105971239300200201
  • Publication date:
    1993-09
  • Journal:
  • Impact factor:
    1.6
  • Authors:
    DeLiang Wang
  • Corresponding author:
    DeLiang Wang
Leveraging Sound Localization to Improve Continuous Speaker Separation

Other grants by DeLiang Wang

Deep neural networks for multi-channel speaker localization and speech separation
  • Award number:
    1808932
  • Fiscal year:
    2018
  • Award amount:
    $450,000
  • Project type:
    Standard Grant
Collaborative Research: Separating Speech from Speech Noise to Improve Speech Intelligibility
  • Award number:
    0534707
  • Fiscal year:
    2006
  • Award amount:
    $450,000
  • Project type:
    Standard Grant
Automated Auditory Scene Analysis Based on Oscillatory Correlation
  • Award number:
    9423312
  • Fiscal year:
    1995
  • Award amount:
    $450,000
  • Project type:
    Continuing Grant
Segmentation and Recognition of Complex Temporal Patterns
  • Award number:
    9211419
  • Fiscal year:
    1992
  • Award amount:
    $450,000
  • Project type:
    Continuing Grant

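Similar NSFC grants

Effects and mechanisms of the β-arrestin2-MFN2-mitochondrial dynamics axis in regulating astrocyte function on the progression of depression
  • Award number:
    n/a
  • Approval year:
    2023
  • Award amount:
    0.0 (unit: CNY 10,000)
  • Project type:
    Provincial/municipal-level project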

Similar overseas grants

Adapting Position-Based Dynamics as a Biophysically Accurate and Efficient Modeling Framework for Dynamic Cell Shapes
  • Award number:
    24K16962
  • Fiscal year:
    2024
  • Award amount:
    $450,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Moving away from aeration – utilising computational fluid dynamics modelling of mechanical mixing within an industrial scale nature-based wastewater treatment system
  • Award number:
    10092420
  • Fiscal year:
    2024
  • Award amount:
    $450,000
  • Project type:
    Collaborative R&D
Collaborative Research: Nonlinear Dynamics and Wave Propagation through Phononic Tunneling Junctions based on Classical and Quantum Mechanical Bistable Structures
  • Award number:
    2423960
  • Fiscal year:
    2024
  • Award amount:
    $450,000
  • Project type:
    Standard Grant
Strategies for predicting functionality of polymer electrolyte membranes based on dynamics and hierarchical structures
  • Award number:
    24K08091
  • Fiscal year:
    2024
  • Award amount:
    $450,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
CAREER: An Integrated Framework for Resilience Analytics: From Physics-based Modeling of Building Components to Dynamics of Community Level Recovery
  • Award number:
    2347722
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Standard Grant
Structure and dynamics of the subcontinental lithospheric mantle over the Central and Eastern North American continent, constrained by numerical modeling based on tomography models
  • Award number:
    2240943
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Standard Grant
An autonomous machine learning-based molecular dynamics method that utilizes first-principles atomic energy calculation
  • Award number:
    23H03415
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Interaction Design for Circular Economy Based on the Dynamics of Subjective Value for Objects
  • Award number:
    23H03685
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Tyrosinase-based sequential proximity labeling for tracking proteome dynamics
  • Award number:
    23K13855
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Quantum digital twins for quantum dynamics based on hardware-tailored tensor networks
  • Award number:
    EP/Y005090/1
  • Fiscal year:
    2023
  • Award amount:
    $450,000
  • Project type:
    Research Grant