Modeling Pronunciation Variation for Universal Access to Speech Understanding
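Basic Information
- Award number: 9978025
- Principal investigator:
- Amount: $504,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 1999
- Funding country: United States
- Duration: 1999-09-15 to 2004-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract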
In order for speech-based interfaces to be universally accessible - that is, available to all citizens in all situations - they must deal successfully with massive variability in pronunciation, as studies have shown that this is by far the single largest source of error in machine speech recognition. This project will address that issue by building dynamic pronunciation models which use context to predict the most likely pronunciations of a word in a particular utterance. The PIs' previous work has established that speakers are more likely to delete certain sounds (e.g., the final "t" in "about") in a variety of situations: if the next word starts with certain consonants; if they are speaking quickly; if the word is unsurprising (has a high trigram probability); if they are young and male; if the word is not followed by a pause or a word-repetition; or if they are speakers of certain dialects. They have further shown how confidence metrics can be used to identify inaccurate dictionary pronunciations based on actual pronunciations in speech. This project will extend and combine both of these lines of research, by using phonetic-confidence models to identify words with incorrect pronunciations that contribute to word-error, and then building statistical (decision-tree) models of pronunciation variation for these words. The PIs' model provides a general engine which they will apply to several specific classes of pronunciation variation, including dialect and speaker type, embedded in the Sphinx-II speech recognizer. The application of computational models to the study of pronunciation variation is an important new research area in linguistics and cognitive modeling, with implications for and potential benefits to many areas of science and pedagogy, including universal access.
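The decision-tree approach named in the abstract can be illustrated with a minimal sketch. This is not the PIs' model or the Sphinx-II integration: the observations below are invented, the feature names (`next_sound`, `rate`, `predictable`) are hypothetical stand-ins for the contextual factors the abstract lists, and the ID3-style entropy split is an assumed choice of tree learner.

```python
from collections import Counter
from math import log2

# Invented toy observations: (context features, was the final /t/ deleted?).
# Feature names and values are hypothetical, chosen only for illustration.
FEATURES = ["next_sound", "rate", "predictable"]
DATA = [
    ({"next_sound": "consonant", "rate": "fast", "predictable": True}, True),
    ({"next_sound": "consonant", "rate": "fast", "predictable": False}, True),
    ({"next_sound": "consonant", "rate": "slow", "predictable": True}, True),
    ({"next_sound": "consonant", "rate": "slow", "predictable": False}, False),
    ({"next_sound": "vowel", "rate": "fast", "predictable": True}, True),
    ({"next_sound": "vowel", "rate": "fast", "predictable": False}, False),
    ({"next_sound": "vowel", "rate": "slow", "predictable": True}, False),
    ({"next_sound": "pause", "rate": "slow", "predictable": False}, False),
    ({"next_sound": "pause", "rate": "fast", "predictable": True}, False),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_feature(rows, features):
    """Return the feature whose split gives the lowest weighted entropy."""
    def split_entropy(feature):
        groups = {}
        for x, y in rows:
            groups.setdefault(x[feature], []).append(y)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return min(features, key=split_entropy)

def build_tree(rows, features):
    """Recursively grow a tree; leaves are labels, inner nodes are dicts."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for x, y in rows:
        branches.setdefault(x[feature], []).append((x, y))
    return {
        "feature": feature,
        "default": majority,  # used for feature values unseen in training
        "branches": {v: build_tree(sub, remaining) for v, sub in branches.items()},
    }

def predict(tree, x):
    """Walk the tree for one context; fall back to the node's majority label."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(x[tree["feature"]], tree["default"])
    return tree

tree = build_tree(DATA, FEATURES)
context = {"next_sound": "consonant", "rate": "fast", "predictable": False}
print("Predicted /t/ deletion:", predict(tree, context))
```

In a recognizer, a prediction like this would select among pronunciation variants of a word for a given utterance; how that selection is wired into decoding is outside what the abstract specifies.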
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Daniel Jurafsky
ReFT: Representation Finetuning for Language Models
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Zhengxuan Wu; Aryaman Arora; Zheng Wang; Atticus Geiger; Daniel Jurafsky; Christopher D. Manning; Christopher Potts
- Corresponding author: Christopher Potts
Other grants by Daniel Jurafsky
RI: Small: New tools for studying structural and inductive bias in NLP models
- Award number: 2128145
- Fiscal year: 2021
- Funding amount: $504,000
- Project type: Continuing Grant
RI: Medium: Deep Understanding: Integrating Neural and Symbolic Models of Meaning
- Award number: 1514268
- Fiscal year: 2015
- Funding amount: $504,000
- Project type: Continuing Grant
RI: Small: Learning Meaning and Grammar from Interaction, Context, and the World
- Award number: 1216875
- Fiscal year: 2012
- Funding amount: $504,000
- Project type: Standard Grant
RI-Small: Unsupervised Learning of Meaning
- Award number: 0811974
- Fiscal year: 2008
- Funding amount: $504,000
- Project type: Standard Grant
CAREER: Spoken Lexical Processing in Humans and Machines
- Award number: 9733067
- Fiscal year: 1998
- Funding amount: $504,000
- Project type: Continuing Grant
SGER: Using Text Coherence and Verbal Valence in Long- Distance N-grams
- Award number: 9704046
- Fiscal year: 1997
- Funding amount: $504,000
- Project type: Standard Grant
Similar overseas grants
Native, non-native or artificial phonetic content for pronunciation education: representations and perception in the case of L2 French
- Award number: 24K00093
- Fiscal year: 2024
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (B)
Does the functional load principle predict to how non-native English speakers assess the pronunciation intelligibility of Japanese non-native English speakers?
- Award number: 24K04051
- Fiscal year: 2024
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (C)
Investigation of Pronunciation Learning through Auditory Priming and Its Application to Pronunciation Materials: A Study Considering Individual Differences
- Award number: 23K00689
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (C)
Measurement of L2 pronunciation deviation and L2 listening disfluency and its application to prosody training for smooth international communication
- Award number: 23K17459
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for Challenging Research (Pioneering)
Analysis of the paralinguistic production mechanism by Japanese learners and applications to pronunciation teaching
- Award number: 22KF0429
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for JSPS Fellows
EduSay™ - developing a digital, audio-visual and kinesthetic English pronunciation training programme for international students and professionals; upskilling communications for education, employability, UK productivity and integration
- Award number: 10063001
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Collaborative R&D
Automatic pronunciation and prosody evaluation based on longitudinal analysis of English speech produced by Japanese children
- Award number: 23H00648
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (B)
Study on Dialectal Differences and Developmental Processes of Japanese Language Rhythms through Observation of Pronunciation Dynamics
- Award number: 23K00544
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (C)
The effect of loanword knowledge on L2 pronunciation
- Award number: 23K12157
- Fiscal year: 2023
- Funding amount: $504,000
- Project type: Grant-in-Aid for Early-Career Scientists
Development of second language pronunciation learning system with a portable ultrasound device
- Award number: 22K00796
- Fiscal year: 2022
- Funding amount: $504,000
- Project type: Grant-in-Aid for Scientific Research (C)