RI: Small: Visual Reasoning and Self-questioning for Explainable Visual Question Answering
Basic Information
- Award Number: 2007613
- Principal Investigator:
- Amount: $469,200
- Host Institution:
- Host Institution Country: United States
- Project Type: Standard Grant
- Fiscal Year: 2020
- Funding Country: United States
- Project Period: 2020-10-01 to 2024-09-30
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
Visual question answering (VQA), which aims to answer natural-language questions about a given image, is still in its infancy. Current approaches lack the flexibility and generalizability to handle diverse questions without training. It is therefore desirable to explore explainable VQA (or X-VQA), which can provide natural-language explanations of its reasoning in addition to answers. This requires integrating computer vision, natural language, and knowledge representation, and it is an extremely challenging task. By exploring X-VQA, this project advances and enriches fundamental computer vision, image understanding, visual semantic analysis, machine learning, and knowledge representation. It also greatly facilitates a wide range of applications, including visual chatbots, visual retrieval and recommendation, and human-computer interaction. This research also contributes to education through curriculum development, student training, and knowledge dissemination, and it includes interactions with K-12 students for participation and research opportunities.

The major goal of this research is to develop a novel computational model, with a solid theoretical foundation and effective methods, to facilitate X-VQA that provides explanations of its visual reasoning. This challenging task involves many fundamental aspects and needs to integrate vision, language, learning, and knowledge. This project focuses on: (1) A unified computational model of X-VQA and its theoretical foundation. This model integrates domain knowledge and visual observations for reasoning: what hidden facts can be inferred from incomplete and inaccurate visual observations, and how; how visual observations, hidden facts, and domain knowledge can be represented for efficient question answering; and how the question answering can be made scalable. The study of these critical issues creates the foundation for X-VQA. (2) A new model for question-driven, task-oriented visual observation. It is inefficient to collect all visual observations before answering a question; vision needs to be question-driven and task-oriented. This project pursues a new model for the interaction of questions, visual reasoning, and visual observation, so as to automatically steer attention to the question-related aspects of an image. (3) An innovative approach to self-questioning for training X-VQA agents. Training simply on question-answer data is not viable for X-VQA, as it cannot provide explanations for and insights into the answer. This project pursues a novel self-questioning approach in which the VQA agents can also generate and ask questions. It investigates how self-questioning can be combined with reinforcement learning, and how it can handle versatile questions to improve the scalability of X-VQA. (4) A solid case study on X-VQA.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
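To give a rough flavor of the self-questioning idea in thrust (3) above, the following is a minimal toy sketch, not the project's actual method: a hypothetical agent looks up one attribute of a symbolically encoded scene per question (question-driven observation), poses its own questions, and receives a simple novelty reward when an answer reveals a fact it did not already know, choosing questions epsilon-greedily in the spirit of reinforcement learning. The scene encoding, reward, and all names are illustrative assumptions.

```python
import random

# Toy sketch (illustrative only): a self-questioning agent queries a
# symbolically encoded scene, is rewarded for questions whose answers
# reveal new facts, and picks questions epsilon-greedily.
SCENE = {
    "dog": {"color": "brown", "action": "running"},
    "ball": {"color": "red", "action": "rolling"},
}


def observe(scene, obj, attribute):
    """Question-driven observation: look up only the queried attribute."""
    return scene.get(obj, {}).get(attribute)


def self_questioning_step(scene, known_facts, epsilon=0.2):
    """Pose one self-generated question, answer it, and compute a novelty reward."""
    candidates = [(obj, attr) for obj in scene for attr in ("color", "action")]
    unknown = [c for c in candidates if c not in known_facts]
    if unknown and random.random() >= epsilon:
        obj, attr = random.choice(unknown)      # exploit: ask about an unknown fact
    else:
        obj, attr = random.choice(candidates)   # explore: ask any question
    answer = observe(scene, obj, attr)
    reward = 1.0 if (obj, attr) not in known_facts else 0.0  # novelty reward
    known_facts[(obj, attr)] = answer
    return f"What is the {attr} of the {obj}?", answer, reward


if __name__ == "__main__":
    known = {}
    for _ in range(6):
        question, answer, reward = self_questioning_step(SCENE, known)
        print(question, "->", answer, f"(reward={reward})")
```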
Project Outcomes
Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Unsupervised Depth Completion and Denoising for RGB-D Sensors
- DOI: 10.1109/icra46639.2022.9812392
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Fan, Lei; Li, Yunxuan; Jiang, Chen; Wu, Ying
- Corresponding Author: Wu, Ying
Morphable Detector for Object Detection on Demand
- DOI: 10.1109/iccv48922.2021.00473
- Publication Date: 2021-10
- Journal:
- Impact Factor: 0
- Authors: Xiangyun Zhao; Xu Zou; Ying Wu
- Corresponding Author: Xiangyun Zhao; Xu Zou; Ying Wu
Avoiding Lingering in Learning Active Recognition by Adversarial Disturbance
- DOI: 10.1109/wacv56688.2023.00459
- Publication Date: 2023-01
- Journal:
- Impact Factor: 0
- Authors: Lei Fan; Ying Wu
- Corresponding Author: Lei Fan; Ying Wu
Contrastive Learning for Label Efficient Semantic Segmentation
- DOI: 10.1109/iccv48922.2021.01045
- Publication Date: 2020-12
- Journal:
- Impact Factor: 0
- Authors: Xiangyu Zhao; Raviteja Vemulapalli; P. A. Mansfield; Boqing Gong; Bradley Green; Lior Shapira; Ying Wu
- Corresponding Author: Xiangyu Zhao; Raviteja Vemulapalli; P. A. Mansfield; Boqing Gong; Bradley Green; Lior Shapira; Ying Wu
Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization
- DOI: 10.1109/wacv56688.2023.00597
- Publication Date: 2023-01
- Journal:
- Impact Factor: 0
- Authors: Jianxiong Zhou; Ying Wu
- Corresponding Author: Jianxiong Zhou; Ying Wu
Other Publications by Ying Wu
Evaluation of Hepatocellular Carcinoma by Contrast‐Enhanced Sonography
- DOI:
- Publication Date: 2011
- Journal:
- Impact Factor: 2.3
- Authors: Jin Feng Xu; Hui Yu Liu; Yang Shi; Zhang Hong Wei; Ying Wu
- Corresponding Author: Ying Wu
Effects of APC De-Targeting and GAr Modification on the Duration of Luciferase Expression from Plasmid DNA Delivered to Skeletal Muscle
- DOI:
- Publication Date: 2014
- Journal:
- Impact Factor: 3.6
- Authors: M. C. Subang; Rewas Fatah; Ying Wu; D. Hannaman; J. Rice; C. Evans; Y. Chernajovsky; D. Gould
- Corresponding Author: D. Gould
A novel thiazolidinedione derivative TD118 showing selective algicidal effects for red tide control
- DOI:
- Publication Date: 2014
- Journal:
- Impact Factor: 4.1
- Authors: Ying Wu; Yew Lee; Seul; Minju Kim; C. Eom; S. Kim; Hoon Cho; E. Jin
- Corresponding Author: E. Jin
Qualitative Research on Dementia in Ethnically Diverse Communities
- DOI:
- Publication Date: 2013
- Journal:
- Impact Factor: 3.4
- Authors: C. Shanley; Desiree Leone; Y. Santalucia; Jon Adams; Jorge Enrique Ferrerosa; Fatima Kourouche; Silvana Gava; Ying Wu
- Corresponding Author: Ying Wu
The Orally Active Glutamate Carboxypeptidase II Inhibitor E2072 Exhibits Sustained Nerve Exposure and Attenuates Peripheral Neuropathy
- DOI:
- Publication Date: 2012
- Journal:
- Impact Factor: 3.5
- Authors: K. Wozniak; Ying Wu; J. Vornov; R. Lapidus; R. Rais; C. Rojas; T. Tsukamoto; B. Slusher
- Corresponding Author: B. Slusher
Other Grants Held by Ying Wu
RI: Small: A Unified Compositional Model for Explainable Video-based Human Activity Parsing
- Award Number: 1815561
- Fiscal Year: 2018
- Funding Amount: $469,200
- Project Type: Standard Grant
RI: Small: Modeling and Learning Visual Similarities Under Adverse Visual Conditions
- Award Number: 1619078
- Fiscal Year: 2016
- Funding Amount: $469,200
- Project Type: Standard Grant
RI: Small: Mining and Learning Visual Contexts for Video Scene Understanding
- Award Number: 1217302
- Fiscal Year: 2012
- Funding Amount: $469,200
- Project Type: Continuing Grant
Collaborative Research: Sino-USA Summer School in Vision, Learning, Pattern Recognition VLPR 2010
- Award Number: 1037944
- Fiscal Year: 2010
- Funding Amount: $469,200
- Project Type: Standard Grant
RI: Small: Computational Models of Context-awareness and Selective Attention for Persistent Visual Target Tracking
- Award Number: 0916607
- Fiscal Year: 2009
- Funding Amount: $469,200
- Project Type: Standard Grant
CAREER: Visual Analysis of High-Dimensional Motion: A Distributed/Collaborative Approach
- Award Number: 0347877
- Fiscal Year: 2004
- Funding Amount: $469,200
- Project Type: Continuing Grant
Transductive Learning for Retrieving and Mining Visual Contents
- Award Number: 0308222
- Fiscal Year: 2003
- Funding Amount: $469,200
- Project Type: Continuing Grant
Similar NSFC Grants (China)
Screening of Small-Molecule Drugs That Regulate Endoplasmic Reticulum Protein Homeostasis to Protect Against Glaucomatous Visual Impairment, and Their Mechanisms of Action
- Award Number: 82373849
- Year Approved: 2023
- Funding Amount: CNY 490,000
- Project Type: General Program
Optical Perception and Recognition of Small Underwater Targets in Nearshore Shallow Waters Based on Bionic Vision
- Award Number:
- Year Approved: 2021
- Funding Amount: CNY 590,000
- Project Type: General Program
Theory and Techniques for High-Precision, Rapid In-Line Inspection of Small-Module Powder-Metallurgy Gears Combining Optical and Vision Principles
- Award Number:
- Year Approved: 2021
- Funding Amount: CNY 580,000
- Project Type: General Program
Theory and Techniques for High-Precision, Rapid In-Line Inspection of Small-Module Powder-Metallurgy Gears Combining Optical and Vision Principles
- Award Number: 52175036
- Year Approved: 2021
- Funding Amount: CNY 580,000
- Project Type: General Program
Weak-Prior Small-Object Visual Detection and Tracking for Edge Deployment
- Award Number: U21B2037
- Year Approved: 2021
- Funding Amount: CNY 2,550,000
- Project Type: Joint Fund Project
Similar Overseas Grants
RI:Small: Modeling and Relating Visual Tasks
- Award Number: 2329927
- Fiscal Year: 2023
- Funding Amount: $469,200
- Project Type: Continuing Grant
RI: Small: Toward Efficient and Robust Dynamic Scene Understanding Based on Visual Correspondences
- Award Number: 2310254
- Fiscal Year: 2023
- Funding Amount: $469,200
- Project Type: Standard Grant
RI: Small: Visual How: Task Understanding and Description in the Real World
- Award Number: 2143197
- Fiscal Year: 2022
- Funding Amount: $469,200
- Project Type: Standard Grant
RI: Small: Learning 3D Equivariant Visual Representation for Animals
- Award Number: 2202024
- Fiscal Year: 2022
- Funding Amount: $469,200
- Project Type: Standard Grant
RI:Small: Improve Visual Tracking by Large Scale Learning, Diagnosis, and Evaluation
- Award Number: 2006665
- Fiscal Year: 2020
- Funding Amount: $469,200
- Project Type: Standard Grant