RI: Collaborative Research: Landmark-based Robust Speech Recognition Using Prosody-Guided Models of Speech Variability
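Basic information
- Award number: 0703859
- Principal investigator:
- Amount: $514,300
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2007
- Funding country: United States
- Project period: 2007-06-01 to 2013-11-30
- Project status: Completed
- Source:
- Keywords:
Project abstract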
Despite great strides in the development of automatic speech recognition technology, we do not yet have a system whose performance is comparable to that of humans in automatically transcribing unrestricted conversational speech that represents many speakers and dialects and is embedded in adverse acoustic environments. The proposed approach applies new high-dimensional machine learning techniques, constrained by empirical and theoretical studies of speech production and perception, to learn from data the information structures that human listeners extract from speech. To do this, we will develop large-vocabulary, psychologically realistic models of speech acoustics, pronunciation variability, prosody, and syntax by deriving knowledge representations that reflect those proposed for human speech production and speech perception, using machine learning techniques to adjust the parameters of all knowledge representations simultaneously in order to minimize the structural risk of the recognizer. The team will develop nonlinear acoustic landmark detectors and pattern classifiers that integrate auditory-based signal processing and acoustic phonetic processing, are invariant to noise, changes in speaker characteristics, and reverberation, and can be learned in a semi-supervised fashion from labeled and unlabeled data. In addition, the team will use variable frame rate analysis, which allows for multi-resolution analysis, and will implement gesture-based lexical access using a variety of training data. The work will improve communication and collaboration between people and machines and will also improve understanding of how humans produce and perceive speech. The work brings together a team of experts in speech processing, acoustic phonetics, prosody, gestural phonology, statistical pattern matching, language modeling, and speech perception, with faculty across engineering, computer science, and linguistics.
The project will support and engage students and postdoctoral fellows, who will take part in speech modeling and algorithm development. Finally, the proposed work will result in a set of databases and tools that will be disseminated to serve the research and education community at large.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Carol Espy-Wilson
Computationally Scalable and Clinically Sound: Laying the Groundwork to Use Machine Learning Techniques for Social Media and Language Data in Predicting Psychiatric Symptoms
- DOI: 10.1016/j.biopsych.2022.02.146
- Published: 2022-05-01
- Journal:
- Impact factor:
- Authors: Deanna Kelly; Glen Coppersmith; John Dickerson; Carol Espy-Wilson; Hanna Michel; Philip Resnik
- Corresponding author: Philip Resnik
Other grants by Carol Espy-Wilson
Collaborative Research: Estimating Articulatory Constriction Place and Timing from Speech Acoustics
- Award number: 2141413
- Fiscal year: 2022
- Amount: $514,300
- Project type: Standard Grant
SCH: INT: Collaborative Research: Using Multi-Stage Learning to Prioritize Mental Health
- Award number: 2124270
- Fiscal year: 2021
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: Effects of production variability on the acoustic consequences of coordinated articulatory gestures
- Award number: 1436600
- Fiscal year: 2014
- Amount: $514,300
- Project type: Standard Grant
RI: Medium: Collaborative Research: Multilingual Gestural Models for Robust Language-Independent Speech Recognition
- Award number: 1162525
- Fiscal year: 2012
- Amount: $514,300
- Project type: Standard Grant
CIF: Small: Nonintrusive Digital Speech Forensics: Source Identification and Content authentication
- Award number: 0917104
- Fiscal year: 2009
- Amount: $514,300
- Project type: Standard Grant
RI: Extension of the APP detector for multipitch tracking and speaker separation
- Award number: 0812509
- Fiscal year: 2008
- Amount: $514,300
- Project type: Standard Grant
The Development of Low-Level Speaker-Specific Information for Speaker Recognition
- Award number: 0519256
- Fiscal year: 2005
- Amount: $514,300
- Project type: Continuing Grant
Acoustic-Phonetic Knowledge and Speech Recognition
- Award number: 0236707
- Fiscal year: 2003
- Amount: $514,300
- Project type: Continuing Grant
SGER: Exploration of a Neurological Model to Improve the Extraction of Linguistic Features in Speech
- Award number: 0233482
- Fiscal year: 2002
- Amount: $514,300
- Project type: Standard Grant
Similar overseas grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $514,300
- Project type: Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Small: Deep Constrained Learning for Power Systems
- Award number: 2345528
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award number: 2232298
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award number: 2232055
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313149
- Fiscal year: 2023
- Amount: $514,300
- Project type: Continuing Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312374
- Fiscal year: 2023
- Amount: $514,300
- Project type: Standard Grant