RI: Medium: Collaborative Research: Explicit Articulatory Models of Spoken Language, with Application to Automatic Speech Recognition
Basic Information
- Award number: 0905633
- Principal investigator:
- Amount: $438,800
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2009
- Funding country: United States
- Project period: 2009-07-01 to 2013-06-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). One of the main challenges in automatic speech recognition is variability in speaking style, including speaking rate changes and coarticulation. Models of the articulators (such as the lips and tongue) can succinctly represent much of this variability. Most previous work on articulatory models has focused on the relationship between acoustics and articulation, but more significant improvements require models of the hidden articulatory state structure. This work has both a technological goal of improving recognition and a scientific goal of better understanding articulatory phenomena.

The project considers larger model classes than previously studied. In particular, the project develops graphical models, including dynamic Bayesian networks and conditional random fields, designed to take advantage of articulatory knowledge. A new framework for hybrid directed and undirected graphical models is being developed, in recognition of the benefits of both directed and undirected models, and of both generative and discriminative training. The project activities include major extension of earlier articulatory models with context modeling, asynchrony structures, and specialized training; development of factored conditional random field models of articulatory variables; and discriminative training to alleviate word confusability.

The scientific goal addresses questions about the ways in which articulatory trajectories vary in different contexts. Existing databases are used, and initial work in manual articulatory annotation is being extended. In addition, the project uses articulatory models to perform forced transcription of larger data sets, providing an additional resource for the research community. Other broad impacts include new models and techniques with applicability to other time-series modeling problems. Extending the applicability of speech recognition will help it fulfill its promise of enabling more efficient storage of and access to spoken information, and equalizing the technological playing field for those with hearing or motor disabilities.
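The abstract's idea of multiple coupled articulatory feature streams can be made concrete with a small, purely illustrative sketch. The Python/NumPy code below is not the project's actual model; it is a minimal factorial-HMM-style forward pass in which two hypothetical articulatory streams (lip aperture and tongue position, with invented state inventories, transition matrices, and observation scores) evolve with their own dynamics while jointly scoring each acoustic frame.

```python
# Illustrative sketch only (invented names and numbers, not the project's models):
# a toy factorial-HMM-style articulatory model in which two hidden streams evolve
# with independent transition dynamics but jointly explain each acoustic frame.
import numpy as np

LIP_STATES = ["closed", "open"]        # toy inventory for stream 1
TONGUE_STATES = ["front", "back"]      # toy inventory for stream 2


def forward_log_prob(obs_loglik, trans1, trans2, init1, init2):
    """Log-likelihood of an observation sequence under the toy model.

    obs_loglik : array (T, n1, n2), log p(frame_t | stream1 = i, stream2 = j)
    trans1     : array (n1, n1), log transition probabilities for stream 1
    trans2     : array (n2, n2), log transition probabilities for stream 2
    init1/2    : arrays (n1,) and (n2,), log initial probabilities per stream
    """
    T, n1, n2 = obs_loglik.shape
    # alpha[i, j] = log p(frames 0..t, stream1_t = i, stream2_t = j)
    alpha = init1[:, None] + init2[None, :] + obs_loglik[0]
    for t in range(1, T):
        # temp[i, j, i2, j2] = alpha[i, j] + log a1[i -> i2] + log a2[j -> j2]
        temp = (alpha[:, :, None, None]
                + trans1[:, None, :, None]
                + trans2[None, :, None, :])
        # Marginalize (in log space) over the previous joint state (i, j).
        alpha = np.logaddexp.reduce(temp.reshape(n1 * n2, n1, n2), axis=0)
        alpha = alpha + obs_loglik[t]
    return np.logaddexp.reduce(alpha.ravel())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n1, n2, T = len(LIP_STATES), len(TONGUE_STATES), 5
    obs = np.log(rng.uniform(0.1, 1.0, size=(T, n1, n2)))  # fake frame scores
    uni = lambda n: np.full(n, -np.log(n))                  # uniform initial dist.
    trans = lambda n: np.full((n, n), -np.log(n))           # uniform transitions
    print(forward_log_prob(obs, trans(n1), trans(n2), uni(n1), uni(n2)))
```

Enriching such a structure with per-stream asynchrony constraints, context-dependent distributions, and conditional-random-field scoring is the general direction the abstract describes; the project's actual DBN and CRF models and training procedures are substantially richer than this sketch.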
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Karen Livescu
Discriminatively Structured Graphical Models for Speech Recognition (The Graphical Models Team, JHU 2001 Summer Workshop)
- DOI:
- Publication year: 2001
- Journal:
- Impact factor: 0
- Authors: J. Bilmes; G. Zweig; Karen Livescu
- Corresponding author: Karen Livescu

Eating Activity Monitoring in Home Environments Using Smartphone-Based Video Recordings
- DOI: 10.1109/dicta56598.2022.10034636
- Publication year: 2022
- Journal:
- Impact factor: 0
- Authors: Bowen Shi; D. Brentari; G. Shakhnarovich; Karen Livescu
- Corresponding author: Karen Livescu

A comparison of training approaches for discriminative segmental models
- DOI: 10.21437/interspeech.2014-307
- Publication year: 2014
- Journal:
- Impact factor: 7.5
- Authors: Hao Tang; Kevin Gimpel; Karen Livescu
- Corresponding author: Karen Livescu

Feature-based pronunciation modeling for automatic speech recognition
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: Karen Livescu
- Corresponding author: Karen Livescu

DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Suwon Shon; Kwangyoun Kim; Yi; Prashant Sridhar; Shinji Watanabe; Karen Livescu
- Corresponding author: Karen Livescu
Other Grants by Karen Livescu
RI: Small: From acoustics to semantics: Embedding speech for a hierarchy of tasks
- Award number: 1816627
- Fiscal year: 2018
- Amount: $438,800
- Award type: Continuing Grant

EAGER: Discovery of Segmental Sub-Word Structure in Speech
- Award number: 1433485
- Fiscal year: 2014
- Amount: $438,800
- Award type: Standard Grant

RI: Medium: Collaborative Research: Models of Handshape Articulatory Phonology for Recognition and Analysis of American Sign Language
- Award number: 1409837
- Fiscal year: 2014
- Amount: $438,800
- Award type: Standard Grant

RI: Small: Multi-View Learning of Acoustic Features for Speech Recognition Using Articulatory Measurements
- Award number: 1321015
- Fiscal year: 2013
- Amount: $438,800
- Award type: Continuing Grant
Similar Overseas Grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $438,800
- Award type: Continuing Grant

Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313149
- Fiscal year: 2023
- Amount: $438,800
- Award type: Continuing Grant

Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312374
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312373
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
- Award number: 2312955
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Informed, Fair, Efficient, and Incentive-Aware Group Decision Making
- Award number: 2313137
- Fiscal year: 2023
- Amount: $438,800
- Award type: Standard Grant

Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313150
- Fiscal year: 2023
- Amount: $438,800
- Award type: Continuing Grant