ITR: Applying Translation Technology to Language Modeling

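Basic Information

Award number:
0326276
Principal investigator:
Mari Ostendorf
Amount:
$3 million
Institution:
University of Washington
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2003
Funding country:
United States
Project period:
2003-09-15 to 2008-08-31
Project status:
Completed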

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0326276&HistoricalAwards=false
Keywords:
ITR Applying Translation Technology Language

Abstract

Virtually all systems that produce text, from speech recognition to natural language generation, use a language model as a core component in order to rank word strings by their well-formedness and appropriateness for a given context. These models are difficult to develop both because of algorithmic challenges specific to integration of multiple knowledge sources and the lack of robust language processing tools. The goal of this project is to develop models via new techniques for exploiting the information available in parallel multilingual corpora, i.e., translations of the same source in multiple languages. Such corpora implicitly encode a hidden, common core that can be uncovered using state-of-the-art estimation techniques. The project involves: i) automatic learning of structure within and across languages at multiple levels of abstraction: semantics, morphology, phonology, and paraphrasing, and ii) integration of the results into novel language model frameworks to address the problem of limited domain- and language-specific training data. The hypothesis is that, by sharing data and structure across languages and genres within a language, the resulting models will be richer and more robust. Such ideas were impossible to envision until recently; availability of multilingual corpora and increases in computing power make them now feasible.

This project marries machine translation and speech recognition language modeling techniques, anticipating that the combination will lead to more powerful and general models. The research will facilitate rapid development of tools for less well studied languages and will immediately impact applications in mainstream languages ranging from information management to international collaboration to bilingual education. The results will also have implications for statistical modeling problems beyond language processing.
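
As a rough illustration of what "ranking word strings by their well-formedness" means, the following is a minimal sketch of a bigram language model with add-one smoothing. It is not the project's method; the toy corpus, the candidate strings, and the function names `train_bigram` and `log_prob` are all assumptions made for this example.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a word string."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        # Counter returns 0 for unseen events, so smoothing keeps the log finite.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total

# Hypothetical toy training data (not from the project).
corpus = [s.split() for s in [
    "recognize speech with a language model",
    "a language model ranks word strings",
    "the model ranks speech hypotheses",
]]
unigrams, bigrams = train_bigram(corpus)

# Two candidate word strings, e.g. competing recognizer hypotheses.
candidates = ["a language model ranks speech", "speech a ranks model language"]
ranked = sorted(
    candidates,
    key=lambda c: log_prob(c.split(), unigrams, bigrams),
    reverse=True,
)
print(ranked[0])
```

The candidate whose word order matches patterns seen in the training text scores higher. With so little data most bigrams are unseen, which is the limited-training-data problem the abstract proposes to address by sharing structure across languages.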

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Mari Ostendorf

Design of a speech recognition system based on acoustically derived segmental units

DOI:
10.1109/icassp.1996.541128
Year:
1996
Venue:
1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings
Impact factor:
0
Authors:
M. Bacchiani; Mari Ostendorf; Y. Sagisaka; K. Paliwal
Corresponding author:
K. Paliwal

Automatic recognition of prosodic phrases

DOI:
Year:
1991
Venue:
IEEE International Conference on Acoustics, Speech, and Signal Processing
Impact factor:
0
Authors:
Colin W. Wightman; Mari Ostendorf
Corresponding author:
Mari Ostendorf

The challenge of spoken language systems: research directions for the nineties

DOI:
10.1109/89.365385
Year:
1995
Venue:
IEEE Trans. Speech Audio Process.
Impact factor:
0
Authors:
R. Cole; L. Hirschman; L. Atlas; M. Beckman; A. Biermann; M. Bush; M. Clements; Jordan Cohen; Oscar Garcia; B. Hanson; H. Hermansky; S. Levinson; K. McKeown; N. Morgan; D. Novick; Mari Ostendorf; S. Oviatt; P. Price; H. Silverman; J. Spitz; A. Waibel; C. Weinstein; S. Zahorian; V. Zue
Corresponding author:
V. Zue

Representations for Question Answering from Documents with Tables and Text

DOI:
10.18653/v1/2021.eacl-main.253
Year:
2021
Venue:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Impact factor:
0
Authors:
V. Zayats; Kristina Toutanova; Mari Ostendorf
Corresponding author:
Mari Ostendorf