III: Medium: Learning Multimodal Knowledge about Entities and Events

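Basic Information

Award number:
1703166
Principal investigator:
Hanna Hajishirzi
Amount:
$700,000
Awardee institution:
University of Washington
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2017
Funding country:
United States
Period:
2017-08-01 to 2022-07-31
Project status:
Completed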

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1703166&HistoricalAwards=false
Keywords:
III Medium Learning Multimodal Knowledge

Project Abstract

Everyday knowledge about the world is a necessary condition for intelligent information processing and reasoning. People can read between the lines in text and see beyond what is visible in images because of everyday functional knowledge about how the world works. The primary goal of this research is to develop learning algorithms that can automatically acquire such knowledge, centered around entities and events, from large-scale multimodal web data. Entity knowledge includes a broad range of physical and conceptual knowledge about objects and people, including their attributes, their relative differences, and the logical relations among them. Event knowledge focuses on structural knowledge about everyday events in people's lives, organized through hierarchical and temporal relations among sub-events and event participants. Together, the resulting knowledge will be a critical step toward robust AI systems at the intersection of natural language processing and computer vision that can understand and reason about unstructured multimodal information. The potential impact of this research includes interactive assistive systems for the visually impaired and multimodal educational interfaces.

This project investigates multimodal knowledge extraction as a new research paradigm that draws connections between relevant methods in natural language processing, such as information extraction, textual entailment, and frame semantics, and recent advances in computer vision. One of the critical challenges in commonsense knowledge acquisition is overcoming reporting bias, i.e., people do not state the obvious. Therefore, this project develops new learning algorithms based on graph-based collective inference that can reason about the unspoken knowledge that systematically influences the way people describe the world in language, images, and videos. In addition, this project develops new models for visual semantic parsing and event recognition, which generalize existing studies on activity recognition by specifying various structural components of events such as actors, objects, locations, tools, intents, and goals. The learned knowledge and representations will be validated through several applications, including multimodal question answering and grounded language understanding.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Hanna Hajishirzi

OLMES: A Standard for Language Model Evaluations

DOI:
Publication year:
2024
Journal:
Impact factor:
0
Authors:
Yuling Gu; Oyvind Tafjord; Bailey Kuehl; Dany Haddad; Jesse Dodge; Hanna Hajishirzi
Corresponding author:
Hanna Hajishirzi

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

DOI:
Publication year:
2024
Journal:
arXiv.org
Impact factor:
0
Authors:
David Wadden; Kejian Shi; Jacob Daniel Morrison; Aakanksha Naik; Shruti Singh; Nitzan Barzilay; Kyle Lo; Tom Hope; Luca Soldaini; Shannon Zejiang Shen; Doug Downey; Hanna Hajishirzi; Arman Cohan
Corresponding author:
Arman Cohan