RI: Extension of the APP detector for multipitch tracking and speaker separation
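Basic information
- Award number: 0812509
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2008
- Funding country: United States
- Period: 2008-09-01 to 2010-02-28
- Project status: Completed
- Source:
- Keywords:
Project abstract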
In many real-world scenarios, speech recognition and speaker identification systems must deal with simultaneous speech from several talkers, i.e., speech mixtures representing conversations in natural environments. Users of cochlear implants also have trouble separating speakers in multi-speaker environments because of the loss of fine temporal structure. Thus, a crucial preprocessing step for such systems is the segregation of speech according to its constituent sources. This project is the first part of that process: recognizing the number of speakers and separating their pitch tracks based on the periodic portions of the speech signal (i.e., voiced regions). Since different speakers have characteristic pitch ranges as a consequence of vocal cord physiology, pitch tracks can be used to help separate the combined signal into different speech streams. Current popular multi-pitch tracking approaches are susceptible to artifacts caused by the interaction between the periodic regions of the different speech signals; as a result of this interaction, the periodicity of the combined signal can differ from that of the individual components. The major new idea is the extension of an existing periodicity and pitch estimation process to higher dimensions, arriving at a multi-dimensional periodicity function that is not susceptible to the harmonic interaction artifacts. Preliminary results show that the multiple pitch tracks obtained are accurate even when one speaker is considerably more dominant than the other. The approach generalizes easily to non-speech audio and should be robust in noisy channels. The outcome of this project will be used in a future project in which the actual speech streams will be separated from each other based on the multi-pitch information.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Carol Espy-Wilson
Computationally Scalable and Clinically Sound: Laying the Groundwork to Use Machine Learning Techniques for Social Media and Language Data in Predicting Psychiatric Symptoms
- DOI: 10.1016/j.biopsych.2022.02.146
- Publication date: 2022-05-01
- Journal:
- Impact factor:
- Authors: Deanna Kelly; Glen Coppersmith; John Dickerson; Carol Espy-Wilson; Hanna Michel; Philip Resnik
- Corresponding author: Philip Resnik
Other grants by Carol Espy-Wilson
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Award number: 2141413
- Fiscal year: 2022
- Funding amount: --
- Project type: Standard Grant
SCH: INT: Collaborative Research: Using Multi-Stage Learning to Prioritize Mental Health
- Award number: 2124270
- Fiscal year: 2021
- Funding amount: --
- Project type: Standard Grant
Collaborative Research: Effects of production variability on the acoustic consequences of coordinated articulatory gestures
- Award number: 1436600
- Fiscal year: 2014
- Funding amount: --
- Project type: Standard Grant
RI: Medium: Collaborative Research: Multilingual Gestural Models for Robust Language-Independent Speech Recognition
- Award number: 1162525
- Fiscal year: 2012
- Funding amount: --
- Project type: Standard Grant
CIF: Small: Nonintrusive Digital Speech Forensics: Source Identification and Content authentication
- Award number: 0917104
- Fiscal year: 2009
- Funding amount: --
- Project type: Standard Grant
RI: Collaborative Research: Landmark-based Robust Speech Recognition Using Prosody-Guided Models of Speech Variability
- Award number: 0703859
- Fiscal year: 2007
- Funding amount: --
- Project type: Continuing Grant
The Development of Low-Level Speaker-Specific Information for Speaker Recognition
- Award number: 0519256
- Fiscal year: 2005
- Funding amount: --
- Project type: Continuing Grant
Acoustic-Phonetic Knowledge and Speech Recognition
- Award number: 0236707
- Fiscal year: 2003
- Funding amount: --
- Project type: Continuing Grant
SGER: Exploration of a Neurological Model to Improve the Extraction of Linguistic Features in Speech
- Award number: 0233482
- Fiscal year: 2002
- Funding amount: --
- Project type: Standard Grant
Similar overseas grants
Large Amplitude Oscillatory Extension (LAOE)
- Award number: 24K07332
- Fiscal year: 2024
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)
Greener cleaner composites [extension]
- Award number: MR/Y020057/1
- Fiscal year: 2024
- Funding amount: --
- Project type: Fellowship
Development of sustainable bionanocomposite materials for perishable foods & drinks shelf life extension: delivering better food for all
- Award number: 10075083
- Fiscal year: 2023
- Funding amount: --
- Project type: Collaborative R&D
CReDo+ Climate Resilience Demonstrator (extension to new climate risks)
- Award number: 10061340
- Fiscal year: 2023
- Funding amount: --
- Project type: Feasibility Studies
Investigation of compounds that mimic the effects of calorie restriction on healthy life extension and their mechanisms of action
- Award number: 23H03331
- Fiscal year: 2023
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (B)
Precision Collider Phenomenology: OpenLoops BSM Extension
- Award number: 2888853
- Fiscal year: 2023
- Funding amount: --
- Project type: Studentship
Lifespan extension, Somatotropic signaling and Tauopathy
- Award number: 10661340
- Fiscal year: 2023
- Funding amount: --
- Project type:
Impact of Medicaid Postpartum Coverage Extension and Mandated Postpartum Depression Screening on Care for Gestational Diabetes and Pregnancy-Induced Hypertension
- Award number: 10749378
- Fiscal year: 2023
- Funding amount: --
- Project type:
Collaborative Research: The interplay of nitrogen loading and ecosystem sustainability in threatened wetlands: an extension of the WETFEET project
- Award number: 2225001
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
PFI (MCA): Self-Contained Soft Robotic Rehabilitation Device for Finger Extension Exercises
- Award number: 2321454
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant