SGER: Pronunciation Modeling for Conversational Speech Recognition

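Basic information

  • Award number:
    9714169
  • Principal investigator:
  • Amount:
    $50,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    1997
  • Funding country:
    United States
  • Period:
    1997-10-01 to 1998-09-30
  • Project status:
    Completed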

Project abstract

Technological advances in the last few years have brought automatic speech recognition to the point where it can be deployed in limited, real-world applications. However, serious obstacles remain to satisfactory recognition of spontaneous speech. Casual dialog exhibits a large amount of variability in the pronunciation of words. Humans have little difficulty transcribing such speech, but automatic transcription suffers badly because recognizers cannot properly account for nonstandard pronunciations. This must be addressed to make systems robust to a conversational speaking style. The investigators propose to examine alternate pronunciations of words in Switchboard, a large corpus of conversational speech, and to develop statistical models that predict the variations from cues in the acoustic, phonemic and linguistic context of a word. Alternate pronunciations are viewed as perturbations of the canonical ones. To predict likely variations, the researchers propose to use phonemic environment, lexical stress, syllabic structure, part of speech, phrase boundary locations, the realization or deletion of neighboring phones, etc. They will apply complementary nonparametric and parametric techniques to build predictive models, with the goal of improving accuracy in a speech recognition system. This effort addresses a significant obstacle in automatic recognition of spontaneous speech and, in the process, aims to gain a fundamental understanding of the causes, mechanisms and extent of pronunciation variability.
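The abstract does not say which models were built, so the following is only a minimal sketch of the general idea it describes: predicting how a canonical phone is realized from its context. It assumes a simple relative-frequency table keyed on the neighboring phones, as a stand-in for the nonparametric techniques mentioned. The function names, the `<del>` deletion symbol, and the toy data are all hypothetical and are not drawn from Switchboard.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count surface realizations for each (canonical, left, right) context.

    examples: iterable of (canonical_phone, left_phone, right_phone, surface_phone).
    """
    counts = defaultdict(Counter)
    for canonical, left, right, surface in examples:
        counts[(canonical, left, right)][surface] += 1
    return counts


def predict(counts, canonical, left, right):
    """Return a relative-frequency distribution over surface realizations.

    Falls back to the canonical phone with probability 1.0 when the
    context was never observed in training.
    """
    seen = counts.get((canonical, left, right))
    if not seen:
        return {canonical: 1.0}
    total = sum(seen.values())
    return {surface: n / total for surface, n in seen.items()}


# Made-up toy observations: /t/ between /n/ and /er/ is deleted twice
# and realized once; /t/ between /s/ and /aa/ is realized once.
examples = [
    ("t", "n", "er", "<del>"),
    ("t", "n", "er", "<del>"),
    ("t", "n", "er", "t"),
    ("t", "s", "aa", "t"),
]
model = train(examples)
dist = predict(model, "t", "n", "er")
print(dist)
```

A fuller model along the lines of the abstract would add features such as lexical stress, syllable position and part of speech to the context key, and would need smoothing or a tree-structured back-off to cope with sparse contexts.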

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Sanjeev Khudanpur

Getting more from automatic transcripts for semi-supervised language modeling
  • DOI:
    10.1016/j.csl.2015.08.007
  • Publication date:
    2016-03-01
  • Journal:
  • Impact factor:
  • Authors:
    Scott Novotney;Richard Schwartz;Sanjeev Khudanpur
  • Corresponding author:
    Sanjeev Khudanpur
A dilemma of ground truth in noisy speech separation and an approach to lessen the impact of imperfect training data
  • DOI:
    10.1016/j.csl.2022.101410
  • Publication date:
    2023-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Matthew Maciejewski;Jing Shi;Shinji Watanabe;Sanjeev Khudanpur
  • Corresponding author:
    Sanjeev Khudanpur
Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial workshop
  • DOI:
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hynek Hermansky;Lukas Burget;Jordan Cohen;Emmanuel Dupoux;Naomi Feldman;John Godfrey;Sanjeev Khudanpur;Matthew Maciejewski;Sri Harish Mallidi;Anjali Menon;Tetsuji Ogawa;Vijayaditya Peddinti;Richard Rose;Richard Stern;Matthew Wiesner;Karel Ve
  • Corresponding author:
    Karel Ve

Other grants by Sanjeev Khudanpur

CCRI: ENS: Next Generation Tools for Spoken Language Science & Technology
  • Award number:
    2120435
  • Fiscal year:
    2021
  • Funding amount:
    $50,000
  • Project type:
    Standard Grant
RI: Medium: Collaborative Research: Semi-Supervised Discriminative Training of Language Models
  • Award number:
    0963898
  • Fiscal year:
    2010
  • Funding amount:
    $50,000
  • Project type:
    Continuing Grant
Cross-Cutting Research Workshops on Intelligent Information Systems
  • Award number:
    1005411
  • Fiscal year:
    2010
  • Funding amount:
    $50,000
  • Project type:
    Continuing Grant
SGER: Self-Supervised Discriminative Training of Statistical Language Models
  • Award number:
    0840112
  • Fiscal year:
    2008
  • Funding amount:
    $50,000
  • Project type:
    Standard Grant
PIRE: Investigation of Meaning Representations in Language Understanding for Machine Translation Systems
  • Award number:
    0530118
  • Fiscal year:
    2005
  • Funding amount:
    $50,000
  • Project type:
    Continuing Grant

Similar overseas grants

Native, non-native or artificial phonetic content for pronunciation education: representations and perception in the case of L2 French
  • Award number:
    24K00093
  • Fiscal year:
    2024
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Does the functional load principle predict how non-native English speakers assess the pronunciation intelligibility of Japanese non-native English speakers?
  • Award number:
    24K04051
  • Fiscal year:
    2024
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Investigation of Pronunciation Learning through Auditory Priming and Its Application to Pronunciation Materials: A Study Considering Individual Differences
  • Award number:
    23K00689
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Analysis of the paralinguistic production mechanism by Japanese learners and applications to pronunciation teaching
  • Award number:
    22KF0429
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for JSPS Fellows
Measurement of L2 pronunciation deviation and L2 listening disfluency and its application to prosody training for smooth international communication
  • Award number:
    23K17459
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Challenging Research (Pioneering)
EduSay™ - developing a digital, audio-visual and kinesthetic English pronunciation training programme for international students and professionals; upskilling communications for education, employability, UK productivity and integration
  • Award number:
    10063001
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Collaborative R&D
Automatic pronunciation and prosody evaluation based on longitudinal analysis of English speech produced by Japanese children
  • Award number:
    23H00648
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Study on Dialectal Differences and Developmental Processes of Japanese Language Rhythms through Observation of Pronunciation Dynamics
  • Award number:
    23K00544
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
The effect of loanword knowledge on L2 pronunciation
  • Award number:
    23K12157
  • Fiscal year:
    2023
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Development of second language pronunciation learning system with a portable ultrasound device
  • Award number:
    22K00796
  • Fiscal year:
    2022
  • Funding amount:
    $50,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)