Collaborative Research: RI: Medium: Multilingual Long-form QA with Retrieval-Augmented Language Models

Basic Information

Award number:
2312949
Principal investigator:
Mohit Iyyer
Amount:
$554,200
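Host institution:
University of Massachusetts Amherst
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Project period:
2023-08-01 to 2027-07-31
Project status:
Ongoing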

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2312949&HistoricalAwards=false
Keywords:
Collaborative Research RI Medium Multilingual

Project Abstract

This project aims to enable automatic question answering systems to produce paragraph-level answers. Prior work on question answering has focused on simpler questions that can be answered with short phrases. Building systems to produce paragraph-level answers opens up exciting opportunities to answer complicated questions, and to offer more nuanced and comprehensive answers to simpler questions. This project will create comprehensive and reliable evaluation protocols for long-form question answering (LFQA), pioneer multilingual studies to broaden information access to a wider population, and develop new algorithms that integrate web search with LFQA systems to provide verifiable long-form answers paired with human-written evidence documents. This project focuses on three core dimensions of LFQA: datasets, evaluation, and modeling. Expanding the scope of prior English-centric LFQA, this research will investigate the multilingual capabilities of large language models by constructing multilingual LFQA datasets and studying knowledge transfer across languages. In terms of modeling, it will propose a new framework that iteratively and transparently weaves together knowledge retrieved from documents and knowledge memorized by a language model. Finally, for evaluation, the project will engage domain experts who are familiar with the question topic to provide rationales for their evaluation of model-generated answers. Such feedback will be used to derive a fine-grained annotation framework that localizes errors and unpacks the weaknesses of generated answers. Together, the proposed work will bring significant progress to LFQA, an emerging topic for natural language processing and artificial intelligence research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Mohit Iyyer

Casting Light on Invisible Cities: Computationally Engaging with Literary Criticism

DOI:
10.18653/v1/n19-1130
Year published:
2019
Journal:
Impact factor:
0
Authors:
Shufan Wang; Mohit Iyyer
Corresponding author:
Mohit Iyyer

One Thousand and One Pairs: A "novel" challenge for long-context language models

DOI:
Year published:
2024
Journal:
Impact factor:
0
Authors:
Marzena Karpinska; Katherine Thai; Kyle Lo; Tanya Goyal; Mohit Iyyer
Corresponding author:
Mohit Iyyer

PaRaDe: Passage Ranking using Demonstrations with Large Language Models

DOI:
10.48550/arxiv.2310.14408
Year published:
2023
Journal:
ArXiv
Impact factor:
0
Authors:
Andrew Drozdov; Honglei Zhuang; Zhuyun Dai; Zhen Qin; Razieh Rahimi; Xuanhui Wang; Dana Alon; Mohit Iyyer; Andrew McCallum; Donald Metzler; Kai Hui
Corresponding author:
Kai Hui

KNN-LM Does Not Improve Open-ended Text Generation

DOI:
10.48550/arxiv.2305.14625
Year published:
2023
Journal:
ArXiv
Impact factor:
0
Authors:
Shufan Wang; Yixiao Song; Andrew Drozdov; Aparna Garimella; Varun Manjunatha; Mohit Iyyer
Corresponding author:
Mohit Iyyer