RI: Medium: Tree-Structured Self-Supervised Modeling for Natural Language
Basic Information
- Award Number: 1955567
- Principal Investigator: Mohit Iyyer
- Amount: $1,130,900
- Host Institution:
- Host Institution Country: United States
- Project Type: Continuing Grant
- Fiscal Year: 2020
- Funding Country: United States
- Project Period: 2020-09-01 to 2024-08-31
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
This project promotes the development of energy-efficient and linguistically-motivated computational methods for understanding human language. Recent advances in text-based tasks such as machine translation and question answering have been fueled by training huge-scale neural network models on billions of words. While this brute-force approach has no doubt been successful, it also has many downsides. As the computational requirements for training and using these models grow larger and larger, their carbon footprints have been steadily increasing, and their accessibility has become limited to those at a few well-funded companies and institutions. These models do not explicitly consider the hierarchical nature of language, a well-studied phenomenon in linguistics, which the investigators believe contributes to their overall inefficiency and also reduces their interpretability to end users. The technologies developed in this project aim not only to make computational models of language more accessible and efficient, but also to improve the state of the art in text generation tasks such as translation and summarization. The project integrates the newly-developed methods into academic settings to provide significant outreach to undergraduates outside of computer science as well as in underrepresented communities.

To develop this new methodology, the project introduces neural architectures that induce syntactic and semantically-relevant tree structures from raw text while simultaneously learning powerful vector-based representations that improve downstream tasks. These models combine insights from self-supervised learning, which allows for powerful representation learning without expensive manual effort, with a tree-shaped structural bias. The resulting methods are evaluated with respect to three major goals: (1) enabling representation learning of the entire linguistic hierarchy (i.e., words, phrases, sentences, and discourse-level units) within a single architecture; (2) improving computational and energy efficiency of training and inference; and (3) improving long-form text generation tasks including document-level translation and text summarization. This research effort aims to spur research into sustainable and scalable language representation learning, and as such its outputs include publicly-released pretrained models and open-sourced code.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
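To make the tree-induction idea in the abstract concrete, below is a minimal, hypothetical sketch in PyTorch of one way to induce a binary tree over token vectors without treebank supervision: adjacent spans are scored, the highest-scoring pair is composed into a parent vector, and the process repeats until a single root remains. This is not the award's actual architecture; the class name, dimensions, and scoring/composition choices are illustrative assumptions.

```python
import torch
import torch.nn as nn


class GreedyTreeEncoder(nn.Module):
    """Hypothetical sketch: greedily builds a binary tree over token vectors."""

    def __init__(self, dim: int = 64):
        super().__init__()
        # Scores how "mergeable" two adjacent spans are.
        self.score = nn.Linear(2 * dim, 1)
        # Composes two child vectors into a single parent vector.
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

    def forward(self, tokens: torch.Tensor):
        """tokens: (seq_len, dim) word vectors -> (root vector, bracketing)."""
        spans = [t for t in tokens]          # current frontier of span vectors
        tree = list(range(len(spans)))       # parallel bracketing over indices
        while len(spans) > 1:
            pairs = torch.stack([torch.cat([spans[i], spans[i + 1]])
                                 for i in range(len(spans) - 1)])
            best = int(self.score(pairs).squeeze(-1).argmax())
            parent = self.compose(pairs[best])
            spans[best:best + 2] = [parent]
            tree[best:best + 2] = [(tree[best], tree[best + 1])]
        return spans[0], tree[0]


# Usage: the induced root vector could feed a self-supervised objective
# (e.g., masked-word prediction), so no syntactic annotation is required.
encoder = GreedyTreeEncoder(dim=64)
root, bracketing = encoder(torch.randn(5, 64))
print(bracketing)  # e.g. ((0, 1), ((2, 3), 4))
```

The greedy argmax makes the structural decisions non-differentiable in this sketch; published tree-induction work typically relaxes or marginalizes over these choices, but the bottom-up score-and-compose loop conveys the core structural bias described above.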
Project Outcomes
Journal Articles (7)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)
Revisiting Simple Neural Probabilistic Language Models
- DOI: 10.18653/v1/2021.naacl-main.407
- Publication Date: 2021-04
- Journal:
- Impact Factor: 0
- Authors: Simeng Sun;Mohit Iyyer
- Corresponding Author: Simeng Sun;Mohit Iyyer
DEMETR: Diagnosing Evaluation Metrics for Translation
- DOI: 10.18653/v1/2022.emnlp-main.649
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Karpinska, Marzena;Raj, Nishant;Thai, Katherine;Song, Yixiao;Gupta, Ankita;Iyyer, Mohit
- Corresponding Author: Iyyer, Mohit
TABBIE: Pretrained Representations of Tabular Data
- DOI: 10.18653/v1/2021.naacl-main.270
- Publication Date: 2021-05
- Journal:
- Impact Factor: 0
- Authors: H. Iida;Dung Ngoc Thai;Varun Manjunatha;Mohit Iyyer
- Corresponding Author: H. Iida;Dung Ngoc Thai;Varun Manjunatha;Mohit Iyyer
Hurdles to Progress in Long-form Question Answering
- DOI: 10.18653/v1/2021.naacl-main.393
- Publication Date: 2021-03
- Journal:
- Impact Factor: 0
- Authors: Kalpesh Krishna;Aurko Roy;Mohit Iyyer
- Corresponding Author: Kalpesh Krishna;Aurko Roy;Mohit Iyyer
Other Publications by Mohit Iyyer
Casting Light on Invisible Cities: Computationally Engaging with Literary Criticism
- DOI: 10.18653/v1/n19-1130
- Publication Date: 2019
- Journal:
- Impact Factor: 0
- Authors: Shufan Wang;Mohit Iyyer
- Corresponding Author: Mohit Iyyer
One Thousand and One Pairs: A "novel" challenge for long-context language models
- DOI:
- Publication Date: 2024
- Journal:
- Impact Factor: 0
- Authors: Marzena Karpinska;Katherine Thai;Kyle Lo;Tanya Goyal;Mohit Iyyer
- Corresponding Author: Mohit Iyyer
PaRaDe: Passage Ranking using Demonstrations with Large Language Models
- DOI: 10.48550/arxiv.2310.14408
- Publication Date: 2023
- Journal:
- Impact Factor: 0
- Authors: Andrew Drozdov;Honglei Zhuang;Zhuyun Dai;Zhen Qin;Razieh Rahimi;Xuanhui Wang;Dana Alon;Mohit Iyyer;Andrew McCallum;Donald Metzler;Kai Hui
- Corresponding Author: Kai Hui
KNN-LM Does Not Improve Open-ended Text Generation
- DOI: 10.48550/arxiv.2305.14625
- Publication Date: 2023
- Journal:
- Impact Factor: 0
- Authors: Shufan Wang;Yixiao Song;Andrew Drozdov;Aparna Garimella;Varun Manjunatha;Mohit Iyyer
- Corresponding Author: Mohit Iyyer
Suri: Multi-constraint Instruction Following for Long-form Text Generation
- DOI:
- Publication Date: 2024
- Journal:
- Impact Factor: 0
- Authors: Chau Minh Pham;Simeng Sun;Mohit Iyyer
- Corresponding Author: Mohit Iyyer
Other Grants by Mohit Iyyer
Collaborative Research: RI: Medium: Multilingual Long-form QA with Retrieval-Augmented Language Models
- Award Number: 2312949
- Fiscal Year: 2023
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: STEM Learning Embedded in a Machine-in-the-Loop Collaborative Story Writing Game
- Award Number: 2202506
- Fiscal Year: 2022
- Funding Amount: $1,130,900
- Project Type: Standard Grant
CAREER: Building Creative Writing Assistants for Machine-in-the-Loop Storytelling
- Award Number: 2046248
- Fiscal Year: 2021
- Funding Amount: $1,130,900
- Project Type: Continuing Grant
Similar Overseas Grants
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Award Number: 2321102
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
- Award Number: 2327438
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
- Award Number: 2344489
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
- Award Number: 2402836
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
- Award Number: 2402851
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Continuing Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award Number: 2403122
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
- Award Number: 2403134
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award Number: 2402804
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award Number: 2402815
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
- Award Number: 2403408
- Fiscal Year: 2024
- Funding Amount: $1,130,900
- Project Type: Standard Grant