RI: Collaborative Research: Landmark-based Robust Speech Recognition Using Prosody-Guided Models of Speech Variability

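Basic Information

Award number:
0703859
Principal investigator:
Carol Espy-Wilson
Amount:
About $514,300
Institution:
University of Maryland, College Park
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2007
Funding country:
United States
Start and end dates:
2007-06-01 to 2013-11-30
Project status:
Completed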

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0703859&HistoricalAwards=false
Keywords:
RI Collaborative Research Landmark based

Project Abstract

Despite great strides in the development of automatic speech recognition technology, we do not yet have a system with performance comparable to humans in automatically transcribing unrestricted conversational speech, representing many speakers and dialects, and embedded in adverse acoustic environments. This approach applies new high-dimensional machine learning techniques, constrained by empirical and theoretical studies of speech production and perception, to learn from data the information structures that human listeners extract from speech. To do this, we will develop large-vocabulary psychologically realistic models of speech acoustics, pronunciation variability, prosody, and syntax by deriving knowledge representations that reflect those proposed for human speech production and speech perception, using machine learning techniques to adjust the parameters of all knowledge representations simultaneously in order to minimize the structural risk of the recognizer. The team will develop nonlinear acoustic landmark detectors and pattern classifiers that integrate auditory-based signal processing and acoustic phonetic processing, are invariant to noise, changes in speaker characteristics, and reverberation, and can be learned in a semi-supervised fashion from labeled and unlabeled data. In addition, they will use variable frame rate analysis, which will allow for multi-resolution analysis, as well as implement lexical access based on gesture, using a variety of training data. The work will improve communication and collaboration between people and machines and also improve understanding of how humans produce and perceive speech. The work brings together a team of experts in speech processing, acoustic phonetics, prosody, gestural phonology, statistical pattern matching, language modeling, and speech perception, with faculty across engineering, computer science, and linguistics.
Support and engagement of students and postdoctoral fellows are part of the project, engaging in speech modeling and algorithm development. Finally, the proposed work will result in a set of databases and tools that will be disseminated to serve the research and education community at large.

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Carol Espy-Wilson

Computationally Scalable and Clinically Sound: Laying the Groundwork to Use Machine Learning Techniques for Social Media and Language Data in Predicting Psychiatric Symptoms

DOI:
10.1016/j.biopsych.2022.02.146
Published:
2022-05-01
Journal:
Conference abstract
Impact factor:
Authors:
Deanna Kelly; Glen Coppersmith; John Dickerson; Carol Espy-Wilson; Hanna Michel; Philip Resnik
Corresponding author:
Philip Resnik