
EAGER: Building Language Technologies by Machine Reading Grammars

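Basic information

Award number:
2327143
Principal investigator:
Antonios Anastasopoulos
Amount:
$99,300 (approx.)
Host institution:
George Mason University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Project period:
2023-06-15 to 2024-05-31
Project status:
Completed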

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2327143&HistoricalAwards=false
Keywords:
EAGER Building Language Technologies Machine

Project abstract

Recent years have seen incredible advances in natural language processing (NLP) technologies, which now make it possible to perform numerous tasks through, with, or on language data. However, this progress has been limited to the handful of languages for which abundant data are available, because the neural models that facilitate the recent improvements are particularly data hungry. This work suggests that we should move away from the current data-inefficient learning paradigm, and instead attempt to also model languages by relying on the human mode of describing them: the grammar of each language. Put simply, we will aim to incorporate the grammars of languages, as written by linguists and treated as symbolic knowledge bases, in the process of training neural language models. Specifically, this work will focus on the first step towards this goal, namely extracting the necessary information from grammar descriptions and other linguistic documents. We will explore several alternative modeling approaches, first by relying on retrieval-based models. We will additionally attack the problem through a machine-reading and question-answering framework. Ultimately, the success of these methods will enable the creation of linguistically-informed models, which will in turn facilitate the creation of technologies especially for under-served language communities. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Antonios Anastasopoulos

PROBER: A System for Real-time Propaganda Behavior Analytics on Social Media and Web Data Streams

DOI:
Published:
2022
Journal:
2022 IEEE International Conference on Big Data (Big Data)
Impact factor:
0
Authors:
Yasas Senarath; Antonios Anastasopoulos; Tonya Thornton; Hemant Purohit
Corresponding author:
Hemant Purohit

Noisy Parallel Data Alignment

DOI:
10.48550/arxiv.2301.09685
Published:
2023
Journal:
Impact factor:
0
Authors:
Ruoyu Xie; Antonios Anastasopoulos
Corresponding author:
Antonios Anastasopoulos

Flagging Comprehensibility Issues in Hindi Text with Question Answering

DOI:
Published:
2021
Journal:
Impact factor:
0
Authors:
Antonios Anastasopoulos; A. Cattelan; Yi Dou; Marcello Federico; Christian Federman; Dmitriy Genzel; Francisco Guzmán; Junjie Hu; Sheila Castilho; Stephen Doherty; F. Gaspari; J. Devlin; Ming; Kenton Lee; Natasha Dhawan; I. Subbiah; Benjamin Thompson; Zachary Hildner; Areeba; Eric Prommer; Christian T Sinclair
Corresponding author:
Christian T Sinclair