CAREER: Learning to Perceive the Interactive 3D World from an Image

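Basic Information

Award number:
2142529
Principal investigator:
David Fouhey
Amount:
About $584,400 (USD)
Institution:
Regents of the University of Michigan - Ann Arbor
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-03-01 to 2027-02-28
Project status:
Ongoing (not yet concluded)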

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2142529&HistoricalAwards=false
Keywords:
CAREER Learning Perceive Interactive 3D

Abstract

This award is funded in part under the American Rescue Plan Act of 2021 (Public Law 117-2).

The human-built world is filled with interactive objects that have parts that can be manipulated by humans, ranging from cabinets with doors to dressers with drawers. In order for intelligent machines to be able to understand and assist humans in realistic settings, they must be able to understand these objects from vision, and especially in unconstrained realistic settings. This understanding must include understanding the interactions as they occur, as well as recognizing the opportunity for interaction (i.e., that a cabinet could be interacted with even when it is untouched). These abilities are beyond the capabilities of current AI systems since these largely deal with interactive objects in restricted settings such as simulation engines. This project aims to build AI systems that can learn these properties by combining knowledge from large-scale first-person-view video demonstrations of interactions by humans as well as from 3D simulators that do not include interaction. The project has the potential to enhance efforts in many other disciplines, for instance robotics or assistive technology for people, due to the ubiquity and importance of these interactive objects. Integrated with the research is a plan to support and engage the next generation of researchers in computer vision at multiple levels via research opportunities and enhanced course materials.

This project aims to achieve this goal via four directions that advance the visual understanding of interactive objects. The first direction aims to build detailed 3D models of articulating objects in unconstrained first-person video. Building on this physical understanding of articulation, the second direction plans to enhance this physical understanding with information about how a human would achieve the interaction and what it might accomplish or reveal about the scene. The third effort aims to enable understanding of articulations before they occur by building associations in 3D across frames of a video, letting a system associate and learn from examples of ongoing interactions. The fourth direction connects this understanding of interactive objects with the goal of producing a 3D understanding of the full scene, by endowing 3D reconstructions of the world with beliefs about objects that may be just out of view or temporarily occluded.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Understanding 3D Object Interaction from a Single Image

DOI:
10.1109/iccv51070.2023.01988
Publication date:
2023-05
Venue:
2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Impact factor:
0
Authors:
Shengyi Qian; D. Fouhey
Corresponding authors:
Shengyi Qian; D. Fouhey

EPIC Fields: Marrying 3D Geometry and Video Understanding

DOI:
10.48550/arxiv.2306.08731
Publication date:
2023-06
Venue:
ArXiv
Impact factor:
0
Authors:
Vadim Tschernezki; Ahmad Darkhalil; Zhifan Zhu; D. Fouhey; Iro Laina; Diane Larlus; D. Damen; A. Vedaldi
Corresponding authors:
Vadim Tschernezki; Ahmad Darkhalil; Zhifan Zhu; D. Fouhey; Iro Laina; Diane Larlus; D. Damen; A. Vedaldi