CAREER: Geometry, Physics and Semantics from Motion: Learning Expressive and Space-Aware Video Representations
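Basic Information
- Award number: 1942736
- Principal investigator:
- Amount: $546,500
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2020
- Funding country: United States
- Duration: 2020-03-01 to 2025-02-28
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract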
This project develops view-invariant 3D visual representations for visual recognition, robot control and language grounding that support scene understanding. The project minimizes the human annotation effort required for effective 3D visual recognition. The project will inject common sense and affordability reasoning in vision, language and control. It will also introduce learning paradigms for visuomotor representations supervised by embodiment, interaction, and human demonstrations and narrations, just as humans learn. The project will be instrumental in controlling vision-enabled mobile agents, such as ground vehicles and drones, and in bringing AI systems closer to human-level performance in visual reasoning. It will further establish connections between AI research and computational neuroscience and cognitive psychology by suggesting learning paradigms similar to those of humans, powered by embodiment and prediction, and by exploring inductive biases, such as motion/appearance disentanglement, that need to be integrated into current computational models to enable the type of reasoning humans are capable of, with an appropriate amount of training. The research of this project will be integrated with the educational program of the investigator, and the results will be disseminated to research communities.
This research introduces visual feature representations that decompose RGB and RGB-D streams into scene appearance and motion for the camera and the objects. Appearance encodes properties that persist over time, such as semantics, material properties and shape; motion encodes properties that vary quickly over time, such as camera motion, object locations and poses, and non-rigid object deformations.
The project envisions embodied agents, equipped with cameras to observe the world and end-effectors to interact with it, that learn to distill their visuomotor experiences into 3D feature representations of the scene appearance and their temporal action-conditioned dynamics. The new video representations learn to encode object properties and spatial common sense, such as world object size, 3D extent, shape, semantics, material properties and object permanence, by optimizing self-supervised objectives of view prediction, time frame prediction, and action-conditioned prediction. The representations enable processing a video stream in terms of objects and their temporal pose and deformation trajectories in 3D, without cross-object interference during occlusions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Katerina Fragkiadaki
Embodied Language Grounding with Implicit 3D Visual Feature Representations
- DOI:
- Year: 2019
- Journal:
- Impact factor: 0
- Authors: Mihir Prabhudesai; H. Tung; S. Javed; Maximilian Sieb; Adam W. Harley; Katerina Fragkiadaki
- Corresponding author: Katerina Fragkiadaki
ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Gabriel H. Sarch; Lawrence Jang; Michael J. Tarr; William W. Cohen; Kenneth Marino; Katerina Fragkiadaki
- Corresponding author: Katerina Fragkiadaki
Track, Check, Repeat: An EM Approach to Unsupervised Tracking
- DOI: 10.1109/cvpr46437.2021.01631
- Year: 2021
- Journal:
- Impact factor: 0
- Authors: Adam W. Harley; Yiming Zuo; Jing Wen; Ayush Mangal; Shubhankar Potdar; Ritwick Chaudhry; Katerina Fragkiadaki
- Corresponding author: Katerina Fragkiadaki
TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors
- DOI: 10.48550/arxiv.2207.10761
- Year: 2022
- Journal:
- Impact factor: 0
- Authors: Gabriel H. Sarch; Zhaoyuan Fang; Adam W. Harley; Paul Schydlo; M. Tarr; Saurabh Gupta; Katerina Fragkiadaki
- Corresponding author: Katerina Fragkiadaki
Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions
- DOI:
- Year: 2018
- Journal:
- Impact factor: 0
- Authors: Ricson Cheng; Arpit Agarwal; Katerina Fragkiadaki
- Corresponding author: Katerina Fragkiadaki
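Similar NSFC (National Natural Science Foundation of China) grants
2019 International Centre for Theoretical Physics: ICTP School on Geometry and Gravity (smr 3311)
- Award number: 11981240404
- Approval year: 2019
- Funding amount: 15,000 CNY
- Project type: International (Regional) Cooperation and Exchange Program
Synthesis and reactivity of novel chiral CGC (Constrained-Geometry Complexes) organometallic compounds of Group IIIB and IVB elements
- Award number: 20602003
- Approval year: 2006
- Funding amount: 260,000 CNY
- Project type: Young Scientists Fund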
Similar overseas grants
Statistical Physics Methods in Combinatorics, Algorithms, and Geometry
- Award number: MR/W007320/2
- Fiscal year: 2023
- Funding amount: $546,500
- Project type: Fellowship
Applications of homotopy theory to algebraic geometry and physics
- Award number: 2305373
- Fiscal year: 2023
- Funding amount: $546,500
- Project type: Standard Grant
Physics of interband effects: Viewpoint of quantum geometry and topology
- Award number: 23K03243
- Fiscal year: 2023
- Funding amount: $546,500
- Project type: Grant-in-Aid for Scientific Research (C)
Conference: Low-Dimensional Manifolds, their Geometry and Topology, Representations and Actions of their Fundamental Groups and Connections with Physics
- Award number: 2247008
- Fiscal year: 2023
- Funding amount: $546,500
- Project type: Standard Grant
CAREER: Cluster Algebras in Representation Theory, Geometry, and Physics
- Award number: 2143922
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Continuing Grant
Conference: On the Crossroads of Algebra, Geometry, and Physics
- Award number: 2200713
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Standard Grant
Vertex Algebras in Geometry and Physics
- Award number: SAPIN-2020-00039
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Subatomic Physics Envelope - Individual
Representation theoretic methods in geometry and mathematical physics
- Award number: RGPIN-2019-03961
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Discovery Grants Program - Individual
Optimal shapes in geometry and physics: Isoperimetry in modern analysis
- Award number: DP220100067
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Discovery Projects
Conference: Geometry and Physics---Deformations, Homotopy Algebras, and Higher Structures
- Award number: 2201270
- Fiscal year: 2022
- Funding amount: $546,500
- Project type: Standard Grant













