BSF: 2016257: Building Models for Reading Comprehension in Specialized Domains from Scratch

Basic information

Award number:
1737230
Principal investigator:
Vivek Srikumar
Amount:
$35,000
Host institution:
University of Utah
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2017
Funding country:
United States
Period:
2017-09-01 to 2020-08-31
Status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1737230&HistoricalAwards=false
Keywords:
BSF 2016257 Building Models Reading

Project abstract

Machine learning algorithms increasingly allow people to search, structure, and access the textual information created daily in every possible domain. In areas with abundant annotated data, where supervised learning algorithms can be applied, algorithms for text understanding have had success in structuring text and providing natural language interfaces. When building a system in a new domain for which there is little to no data, however, data collection and annotation can be prohibitively expensive. This project explores a protocol for developing text understanding systems that read text and provide a natural language interface in a particular domain (such as biology or history); this can give specialized communities digital access to data that is otherwise locked in text. The project also trains students as part of an international collaboration: this award supports travel of the US-based researchers to collaborate in a project funded by the US-Israel Binational Science Foundation.

The project encompasses both data collection and model training, and considers the interaction between the two. To replace expert annotations, it uses crowdsourcing workers in an iterative procedure that starts training a structured predictor from almost no data. It creates an interactive framework in which users ask questions and verify candidate answers that are later used to retrain the system. It aims to train jointly over multiple domains and to use domain adaptation methods to transfer knowledge from one domain to another.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
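
One way to picture the ask, verify, and retrain loop described in the abstract is the toy sketch below. It is not the project's method: `ToyAnswerer`, `interactive_rounds`, the word-count scoring, and the sample biology questions are all hypothetical, chosen only to show how verified answers could feed back into training when starting from no data.

```python
from collections import Counter, defaultdict


class ToyAnswerer:
    """Hypothetical stand-in for a learned predictor: question words vote for answers."""

    def __init__(self):
        self.word_answer_counts = defaultdict(Counter)

    def train(self, examples):
        # Rebuild the word-to-answer counts from all verified (question, answer) pairs.
        self.word_answer_counts = defaultdict(Counter)
        for question, answer in examples:
            for word in question.lower().split():
                self.word_answer_counts[word][answer] += 1

    def candidates(self, question, fallback, k=2):
        # Score each known answer by how many question words voted for it.
        scores = Counter()
        for word in question.lower().split():
            scores.update(self.word_answer_counts[word])
        ranked = [answer for answer, _ in scores.most_common(k)]
        # With no usable training data yet, fall back to unranked candidates.
        return ranked or list(fallback)[:k]


def interactive_rounds(model, questions, verify, answer_pool, rounds=2):
    """Propose candidates, keep the ones a user verifies, then retrain on them."""
    verified = []
    for _ in range(rounds):
        for question in questions:
            for answer in model.candidates(question, answer_pool):
                if verify(question, answer) and (question, answer) not in verified:
                    verified.append((question, answer))
        model.train(verified)
    return verified


# Simulated users: a gold table stands in for human verification (an assumption).
gold = {
    "what organelle makes atp": "mitochondrion",
    "what molecule stores genes": "dna",
}
answer_pool = ["dna", "mitochondrion"]

model = ToyAnswerer()
verified = interactive_rounds(
    model, list(gold), lambda q, a: gold[q] == a, answer_pool
)
top = model.candidates("which organelle makes atp", answer_pool, k=1)
```

In this sketch the first round relies entirely on the fallback candidates, and later rounds rank answers using whatever the simulated users have confirmed so far.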

Project outcomes

Journal papers (1)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

On the Limits of Learning to Actively Learn Semantic Representations

DOI:
10.18653/v1/k19-1042
Published:
2019-10
Journal:
ArXiv
Impact factor:
0
Authors:
Omri Koshorek; Gabriel Stanovsky; Yichu Zhou; Vivek Srikumar; Jonathan Berant
Corresponding authors:
Omri Koshorek; Gabriel Stanovsky; Yichu Zhou; Vivek Srikumar; Jonathan Berant

Other publications by Vivek Srikumar

Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions

DOI:
10.18653/v1/s17-1022
Published:
2017
Journal:
Georgetown University-Graduate School of Arts & Sciences
Impact factor:
0
Authors:
Jena D. Hwang; Archna Bhatia; Na; Timothy J. O'Gorman; Vivek Srikumar; Nathan Schneider
Corresponding author:
Nathan Schneider

EDISON: Feature Extraction for NLP, Simplified

DOI:
Published:
2016
Journal:
International Conference on Language Resources and Evaluation
Impact factor:
0
Authors:
Mark Sammons; Christos Christodoulopoulos; Parisa Kordjamshidi; Daniel Khashabi; Vivek Srikumar; D. Roth
Corresponding author:
D. Roth

X-Fact: A New Benchmark Dataset for Multilingual Fact Checking

DOI:
10.18653/v1/2021.acl-short.86
Published:
2021
Journal:
The Association for Computational Linguistics
Impact factor:
0
Authors:
Ashim Gupta; Vivek Srikumar
Corresponding author:
Vivek Srikumar

An Algebra for Feature Extraction

DOI:
10.18653/v1/p17-1173
Published:
2017
Journal:
Impact factor:
0
Authors:
Vivek Srikumar
Corresponding author:
Vivek Srikumar

Recursive Neural Networks for Coding Therapist and Patient Behavior in Motivational Interviewing

DOI:
Published:
2015
Journal:
CLPsych@HLT-NAACL
Impact factor:
0
Authors:
Michael J. Tanana; Kevin A. Hallgren; Zac E. Imel; David C. Atkins; Padhraic Smyth; Vivek Srikumar
Corresponding author:
Vivek Srikumar