ITR: Applying Translation Technology to Language Modeling

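Basic Information

  • Award number:
    0326276
  • Principal investigator:
  • Amount:
    $3 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2003
  • Funding country:
    United States
  • Project period:
    2003-09-15 to 2008-08-31
  • Project status:
    Completed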

Project Abstract

Virtually all systems that produce text, from speech recognition to natural language generation, use a language model as a core component to rank word strings by their well-formedness and appropriateness for a given context. These models are difficult to develop because of both algorithmic challenges specific to integrating multiple knowledge sources and the lack of robust language processing tools. The goal of this project is to develop models via new techniques for exploiting the information available in parallel multilingual corpora, i.e., translations of the same source in multiple languages. Such corpora implicitly encode a hidden, common core that can be uncovered using state-of-the-art estimation techniques. The project involves: i) automatic learning of structure within and across languages at multiple levels of abstraction: semantics, morphology, phonology, and paraphrasing, and ii) integration of the results into novel language model frameworks to address the problem of limited domain- and language-specific training data. The hypothesis is that, by sharing data and structure across languages and across genres within a language, the resulting models will be richer and more robust. Such ideas were impossible to envision until recently; the availability of multilingual corpora and increases in computing power now make them feasible.

This project marries machine translation and speech recognition language modeling techniques, anticipating that the combination will lead to more powerful and general models. The research will facilitate rapid development of tools for less well-studied languages and will immediately impact applications in mainstream languages ranging from information management to international collaboration to bilingual education. The results will also have implications for statistical modeling problems beyond language processing.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Mari Ostendorf

Design of a speech recognition system based on acoustically derived segmental units
Automatic recognition of prosodic phrases
The challenge of spoken language systems: research directions for the nineties
  • DOI:
    10.1109/89.365385
  • Year published:
    1995
  • Journal:
  • Impact factor:
    0
  • Authors:
    R. Cole; L. Hirschman; L. Atlas; M. Beckman; A. Biermann; M. Bush; M. Clements; Jordan Cohen; Oscar Garcia; B. Hanson; H. Hermansky; S. Levinson; K. McKeown; N. Morgan; D. Novick; Mari Ostendorf; S. Oviatt; P. Price; H. Silverman; J. Spitz; A. Waibel; C. Weinstein; S. Zahorian; V. Zue
  • Corresponding author:
    V. Zue
Representations for Question Answering from Documents with Tables and Text
The stochastic segment model for continuous speech recognition

Other grants by Mari Ostendorf

Collaborative Research: Improving Speech Technology for Better Learning Outcomes: The Case of AAE Child Speakers
  • Award number:
    2202049
  • Fiscal year:
    2022
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
RI: Small: Modeling Idiosyncrasies of Speech for Automatic Spoken Language Processing
  • Award number:
    1617176
  • Fiscal year:
    2016
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
RI: Small: Simplifying Text for Individual Reading Needs
  • Award number:
    0916951
  • Fiscal year:
    2009
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
U.S.-Germany Dissertation Enhancement: Predicting Hidden Structure and Punctuation in Speech for Machine Translation
  • Award number:
    0552492
  • Fiscal year:
    2006
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
A Computing Lab for Integrated Teaching of Systems Courses in Electrical Engineering
  • Award number:
    0511635
  • Fiscal year:
    2005
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
Speech Generation for Human-Computer Interaction
  • Award number:
    9996440
  • Fiscal year:
    1999
  • Award amount:
    $3 million
  • Project type:
    Continuing Grant
STIMULATE: Modeling Structure in Speech above the Segment for Spontaneous Speech Recognition
  • Award number:
    9996450
  • Fiscal year:
    1999
  • Award amount:
    $3 million
  • Project type:
    Continuing Grant
Workshop for Discussing Research Priorities and Evaluation Strategies in Speech Synthesis
  • Award number:
    9872796
  • Fiscal year:
    1998
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
STIMULATE: Modeling Structure in Speech above the Segment for Spontaneous Speech Recognition
  • Award number:
    9618926
  • Fiscal year:
    1997
  • Award amount:
    $3 million
  • Project type:
    Continuing Grant
Speech Generation for Human-Computer Interaction
  • Award number:
    9528990
  • Fiscal year:
    1996
  • Award amount:
    $3 million
  • Project type:
    Continuing Grant

Similar overseas grants

Applying a Program Science approach for strengthening partnerships and advancing embedded research to optimize public health programming for HIV and sexually transmitted and blood-borne infections among criminalized populations in the Global South
  • Award number:
    502554
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
Applying synthetic biology to the development of in vivo technologies for the monitoring and control of vector-borne diseases.
  • Award number:
    BB/Y008340/1
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
    Research Grant
Applying advanced understanding of CTLA-4 function to optimise therapies for autoimmunity
  • Award number:
    MR/Y001273/1
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
    Research Grant
Applying a complex systems perspective to investigate the relationship between choreography and agent-based modeling as tools for scientific sense-making
  • Award number:
    2418539
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
    Continuing Grant
HSI Pilot Project: Applying a Research-Based Learning Approach to Enhance Biomanufacturing Skills
  • Award number:
    2345033
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
    Standard Grant
Applying digital archeology to rock art placement
  • Award number:
    DE240100030
  • Fiscal year:
    2024
  • Award amount:
    $3 million
  • Project type:
    Discovery Early Career Researcher Award
ARISTOTELES - Applying ARtificial Intelligence to Define clinical trajectorieS for personalized predicTiOn and early deTEctiOn of comorbidiTy and muLtimorbidiTy pattErnS
  • Award number:
    10103153
  • Fiscal year:
    2023
  • Award amount:
    $3 million
  • Project type:
    EU-Funded
Applying co-production to enhance Ontario's clinical trial landscape
  • Award number:
    484616
  • Fiscal year:
    2023
  • Award amount:
    $3 million
  • Project type:
    Fellowship Programs
Applying an equity and diversity lens to understand the care experiences and healthcare outcomes of low income and linguistic minority groups in Ontario retirement homes: A mixed methods study
  • Award number:
    484613
  • Fiscal year:
    2023
  • Award amount:
    $3 million
  • Project type:
    Fellowship Programs
Establishment of therapeutic strategy for corneal stromal scaring treatment by applying the ZFP521 gene.
  • Award number:
    23K09045
  • Fiscal year:
    2023
  • Award amount:
    $3 million
  • Project type:
    Grant-in-Aid for Scientific Research (C)