SGER: Pronunciation Modeling for Conversational Speech Recognition

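Basic information

Award number:
9714169
Principal investigator:
Sanjeev Khudanpur
Amount:
$50,000
Host institution:
Johns Hopkins University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
1997
Funding country:
United States
Start and end dates:
1997-10-01 to 1998-09-30
Project status:
Completed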

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=9714169&HistoricalAwards=false
Keywords:
SGER Pronunciation Modeling Conversational Speech

Project abstract

Technological advances in the last few years have brought automatic speech recognition where it can now be deployed in limited, real-world applications. However, serious obstacles still remain in satisfactory recognition of spontaneous speech. Casual dialog exhibits a large amount of variability in the pronunciation of words. Humans have little difficulty in transcribing such speech, but automatic transcription suffers badly because of the inability of recognizers to properly consider nonstandard pronunciations. This must be addressed to make systems robust to a conversational speaking style. The investigators propose to examine alternate pronunciations of words in Switchboard, a large corpus of conversational speech, and develop statistical models for predicting the variations based on cues in the acoustic, phonemic and linguistic context of a word. Alternate pronunciations are viewed as perturbations of the canonical ones and, to predict likely variations, the researchers propose to use phonemic environment, lexical stress, syllabic structure, part-of-speech, phrase boundary locations, the realization or deletion of neighboring phones, etc. They shall apply complementary nonparametric and parametric techniques for building predictive models with the goal of improving accuracy in a speech recognition system. This effort addresses a significant obstacle in automatic recognition of spontaneous speech and, in the process, gains fundamental understanding of the causes, mechanisms and extent of pronunciation variability.
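
The abstract treats alternate pronunciations as perturbations of canonical ones, predicted from contextual cues. As a rough illustration only, the sketch below estimates how a canonical phone is realized using relative frequencies over a context of neighboring phones and lexical stress, backing off to the phone alone when the context is unseen. It is not the investigators' method: the function names, the `-` symbol for deletion, and the toy alignment data are all hypothetical, and a real system would draw on many more of the cues listed above.

```python
from collections import Counter, defaultdict

# Hypothetical toy alignments, NOT Switchboard data. Each tuple is
# (canonical phone, left neighbor, right neighbor, stressed?, surface phone),
# where "-" is an assumed marker meaning the phone was deleted.
TOY_ALIGNMENTS = [
    ("t", "n", "er", False, "-"),
    ("t", "n", "er", False, "-"),
    ("t", "n", "er", False, "t"),
    ("t", "s", "aa", True, "t"),
]

def train(aligned):
    """Count surface realizations per full context and per canonical phone."""
    ctx_counts = defaultdict(Counter)    # keyed by (canonical, left, right, stressed)
    phone_counts = defaultdict(Counter)  # backoff table keyed by canonical phone only
    for canonical, left, right, stressed, surface in aligned:
        ctx_counts[(canonical, left, right, stressed)][surface] += 1
        phone_counts[canonical][surface] += 1
    return ctx_counts, phone_counts

def predict(model, canonical, left, right, stressed):
    """Return relative-frequency estimates of surface realizations."""
    ctx_counts, phone_counts = model
    # Use the full context if it was seen; otherwise back off to the phone alone.
    counts = ctx_counts.get((canonical, left, right, stressed)) or phone_counts.get(canonical)
    if not counts:
        # Phone never seen in training: assume the canonical form is produced.
        return {canonical: 1.0}
    total = sum(counts.values())
    return {surface: n / total for surface, n in counts.items()}

model = train(TOY_ALIGNMENTS)
seen = predict(model, "t", "n", "er", False)      # context present in the toy data
backoff = predict(model, "t", "ax", "iy", False)  # unseen context, phone-level backoff
```

This count-based estimator is a simple example of the nonparametric family mentioned in the abstract; a parametric alternative would fit a model over the same features instead of tabulating counts.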

Project outcomes

Journal papers (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Sanjeev Khudanpur

DOI:
10.1016/j.csl.2015.08.007
Publication date:
2016-03-01
Journal:
Research article
Impact factor:
Authors:
Scott Novotney; Richard Schwartz; Sanjeev Khudanpur
Corresponding author:
Sanjeev Khudanpur

A dilemma of ground truth in noisy speech separation and an approach to lessen the impact of imperfect training data

DOI:
10.1016/j.csl.2022.101410
Publication date:
2023-01-01
Journal:
Research article
Impact factor:
Authors:
Matthew Maciejewski; Jing Shi; Shinji Watanabe; Sanjeev Khudanpur
Corresponding author:
Sanjeev Khudanpur

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial workshop

DOI:
Publication date:
2015
Journal:
Proc. ICASSP2015
Impact factor:
0
Authors:
Hynek Hermansky; Lukas Burget; Jordan Cohen; Emmanuel Dupoux; Naomi Feldman; John Godfrey; Sanjeev Khudanpur; Matthew Maciejewski; Sri Harish Mallidi; Anjali Menon; Tetsuji Ogawa; Vijayaditya Peddinti; Richard Rose; Richard Stern; Matthew Wiesner; Karel Ve
Corresponding author:
Karel Ve