RI: Medium: Collaborative Research: Semi-Supervised Discriminative Training of Language Models

RI:媒介:协作研究:语言模型的半监督判别训练

基本信息

  • 批准号:
    0964102
  • 负责人:
  • 金额:
    $ 50万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Continuing Grant
  • 财政年份:
    2010
  • 资助国家:
    美国
  • 起止时间:
    2010-06-01 至 2015-05-31
  • 项目状态:
    已结题

项目摘要

This project is conducting fundamental research in statistical language modeling to improve human language technologies, including automatic speech recognition (ASR) and machine translation (MT).A language model (LM) is conventionally optimized, using text in the target language, to assign high probability to well-formed sentences. This method has a fundamental shortcoming: the optimization does not explicitly target the kinds of distinctions necessary to accomplish the task at hand, such as discriminating (for ASR) between different words that are acoustically confusable or (for MT) between different target-language words that express the multiple meanings of a polysemous source-language word.Discriminative optimization of the LM, which would overcome this shortcoming, requires large quantities of paired input-output sequences: speech and its reference transcription for ASR or source-language (e.g. Chinese) sentences and their translations into the target language (say, English) for MT. Such resources are expensive, and limit the efficacy of discriminative training methods.In a radical departure from convention, this project is investigating discriminative training using easily available, *unpaired* input and output sequences: un-transcribed speech or monolingual source-language text and unpaired target-language text. Two key ideas are being pursued: (i) unlabeled input sequences (e.g. speech or Chinese text) are processed to learn likely confusions encountered by the ASR or MT system; (ii) unpaired output sequences (English text) are leveraged to discriminate between these well-formed sentences from the (supposed) ill-formed sentences the system could potentially confuse them with.This self-supervised discriminative training, if successful, will advance machine intelligence in fundamental ways that impact many other applications.
本研究课题为提高自动语音识别(ASR)和机器翻译(MT)等人类语言技术,进行统计语言建模的基础研究。使用目标语言的文本,对语言模型(LM)进行常规优化,使正确的句子具有较高的概率。 这种方法有一个根本的缺点:优化没有明确地针对完成手头任务所需的各种区别,例如区分(对于ASR)在听觉上易混淆的不同单词之间或者(对于MT)在表达多义源语言单词的多种含义的不同目标语言单词之间。LM的区分性优化,其将克服这个缺点,需要大量成对的输入-输出序列:语音及其用于ASR的参考转录或源语言(例如中文)句子及其到用于MT的目标语言(例如英语)的翻译。 这种资源是昂贵的,并限制了判别训练方法的有效性。在一个彻底的背离惯例,这个项目正在研究判别训练使用容易获得的,* 不成对的 * 输入和输出序列:未转录的语音或单语源语言文本和不成对的目标语言文本。 两个关键的想法正在追求:(i)未标记的输入序列(例如语音或中文文本)被处理以学习ASR或MT系统可能遇到的混淆;(ii)不成对输出序列(英文文本)被用来区分这些格式良好的句子,这种自我监督的区别训练,如果成功,将以影响许多其他应用的根本方式推进机器智能。

项目成果

期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Alexander Kain其他文献

Alexander Kain的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Alexander Kain', 18)}}的其他基金

Collaborative Research: CDI-Type I: Computational Models for the Automatic Recognition of Non-Human Primate Social Behaviors
合作研究:CDI-Type I:自动识别非人类灵长类动物社会行为的计算模型
  • 批准号:
    1027834
  • 财政年份:
    2010
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
HCC: Medium: Synthesis and Perception of Speaker Identity
HCC:媒介:说话者身份的综合和感知
  • 批准号:
    0964468
  • 财政年份:
    2010
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
RI: Small: Modeling Coarticulation for Automatic Speech Recognition
RI:小型:自动语音识别的协同发音建模
  • 批准号:
    0915754
  • 财政年份:
    2009
  • 资助金额:
    $ 50万
  • 项目类别:
    Continuing Grant
HCC: High-Quality Compression, Enhancement, and Personalization of Text-to-Speech Voices
HCC:文本转语音的高质量压缩、增强和个性化
  • 批准号:
    0713617
  • 财政年份:
    2007
  • 资助金额:
    $ 50万
  • 项目类别:
    Continuing Grant
STTR Phase I: Small Footprint Speech Synthesis
STTR 第一阶段:小规模语音合成
  • 批准号:
    0441125
  • 财政年份:
    2005
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant

相似海外基金

Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
合作研究:RI:中:通过深度神经崩溃实现优化、泛化和可迁移性的原理
  • 批准号:
    2312841
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
合作研究:RI:中:通过深度神经崩溃实现优化、泛化和可迁移性的原理
  • 批准号:
    2312842
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
协作研究:RI:中:视觉的李群表示学习
  • 批准号:
    2313151
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
合作研究:RI:中:通过深度神经崩溃实现优化、泛化和可迁移性的原理
  • 批准号:
    2312840
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
协作研究:RI:中:视觉的李群表示学习
  • 批准号:
    2313149
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Continuing Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
合作研究:CompCog:RI:中:通过人工智能辅助分析海量国际象棋数据集了解人类规划
  • 批准号:
    2312374
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
合作研究:CompCog:RI:中:通过人工智能辅助分析海量国际象棋数据集了解人类规划
  • 批准号:
    2312373
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
合作研究:RI:媒介:异质演示中的超人模仿学习
  • 批准号:
    2312955
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Informed, Fair, Efficient, and Incentive-Aware Group Decision Making
协作研究:RI:媒介:知情、公平、高效和具有激励意识的群体决策
  • 批准号:
    2313137
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
协作研究:RI:中:视觉的李群表示学习
  • 批准号:
    2313150
  • 财政年份:
    2023
  • 资助金额:
    $ 50万
  • 项目类别:
    Continuing Grant
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了