Collaborative Research: RI:Medium:Understanding Events from Streaming Video - Joint Deep and Graph Representations, Commonsense Priors, and Predictive Learning
Basic Information
- Award Number: 1956050
- Principal Investigator:
- Amount: $421,200
- Host Institution:
- Host Institution Country: United States
- Project Type: Continuing Grant
- Fiscal Year: 2020
- Funding Country: United States
- Project Period: 2020-10-01 to 2024-09-30
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
While it is easy for humans to process video data and extract meaning from it, it is extremely hard to design algorithms that do so. Once developed, this technology has many applications, such as building assistive robots, constructing smart spaces for independent living, or monitoring wildlife. Video data capture events, which are central to the content of human experience. Events consist of objects/people (who), location (where), time (when), actions (what), activities (how), and intent (why). This project develops a computer-vision-based event understanding algorithm that operates in a self-supervised, streaming fashion. The algorithm will predict and detect old and new events and learn to build hierarchical event representations, all in the context of a prior knowledge base that is updated over time. The intent is to generate interpretations of an event that go beyond what is seen, rather than mere recognition. This research pushes the frontier of computer vision by coupling the self-supervised learning process with prior knowledge, moving the field towards open-world algorithms that need little or no supervision. Furthermore, the project will focus on recruitment and retention of undergraduate women students through their freshman and sophomore years, with attention to underrepresented minority students at the three sites: University of South Florida, Florida State University, and Oklahoma State University.

At the core of the approach is a hybrid representational hierarchy that includes both continuous representations and symbolic, graph-based representations. The continuous-valued representation is the standard vector-valued deep learning stack that ends in an embedding vector for an object or action concept in the knowledge base. The next level of the representation consists of elementary symbolic compositions of these verbs and nouns. When associated with concepts from a knowledge base, these elementary compositions make up an event interpretation containing descriptions that go beyond what is observed in the image. The symbolic levels are built using Grenander's canonical representations from pattern theory. These representations, which have flexible graph-structured backbones, are more expressive than other well-known graphical models.

The specific technical aims of the project are four-fold. First, it seeks to integrate function-based continuous representations with energy-based symbolic representations from Grenander's pattern theory into one formulation based on equilibrium propagation. Second, it will research and develop ways to use and modify commonsense knowledge bases; this will help move beyond the closed-world assumption implicit in current annotated-data-based deep learning practice. Third, it will develop dynamical models on graph manifolds, enabling generative modeling of graph structures for prediction and discovery of new concepts. Fourth, inspired by findings from human perception experiments and neuroscience, it will design predictive self-supervised learning over both continuous and symbolic representations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
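The hybrid representation described in the abstract pairs deep embeddings with symbolic, energy-based graphs from Grenander's pattern theory, in which an event interpretation is a graph of concept "generators" connected by weighted bonds and scored by an energy. Below is a minimal, self-contained sketch of that idea in Python; the class names, the bond weights, and the additive energy form are illustrative assumptions for exposition, not the project's actual formulation.

```python
# Minimal sketch of a pattern-theory-style interpretation graph.
# Illustrative assumptions only: names, weights, and the energy form
# are chosen for clarity, not taken from the project.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Generator:
    """A concept node ("generator"), e.g. an action or an object."""
    label: str
    kind: str  # e.g. 'action', 'object'

@dataclass
class Interpretation:
    """A graph of generators; bonds between them carry compatibility weights."""
    generators: list
    bonds: dict = field(default_factory=dict)  # (i, j) -> weight

    def add_bond(self, i: int, j: int, weight: float) -> None:
        self.bonds[(min(i, j), max(i, j))] = weight

    def energy(self) -> float:
        # Lower energy = more plausible interpretation; here the energy is
        # simply the negative sum of bond weights (an illustrative choice).
        return -sum(self.bonds.values())

def most_plausible(candidates):
    """Return the candidate interpretation with the lowest energy."""
    return min(candidates, key=lambda g: g.energy())

if __name__ == "__main__":
    # Observed evidence ('cutting', 'onion') plus an inferred concept
    # ('cooking') supplied by a commonsense knowledge base.
    g = Interpretation(generators=[
        Generator("cutting", "action"),
        Generator("onion", "object"),
        Generator("cooking", "action"),   # not observed, inferred
    ])
    g.add_bond(0, 1, 2.0)   # cutting -- onion (observed support)
    g.add_bond(0, 2, 1.5)   # cutting -- cooking (commonsense prior)
    g.add_bond(1, 2, 1.0)   # onion -- cooking (commonsense prior)
    print(most_plausible([g]).energy())   # -4.5
```

In this sketch, the unobserved "cooking" generator illustrates how an interpretation can go beyond what is seen: bonds contributed by a commonsense prior lower the energy of graphs that include plausible but unobserved concepts.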
Project Outcomes
Journal Articles (8)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)
Spatio-Temporal Event Segmentation for Wildlife Extended Videos
- DOI: 10.1007/978-3-031-11349-9
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Mounir, R.; Gula, R.; J., Sarkar
- Corresponding Author: J., Sarkar
A Quotient Space Formulation for Generative Statistical Analysis of Graphical Data
- DOI: 10.1007/s10851-021-01027-1
- Publication Date: 2021-03
- Journal:
- Impact Factor: 2
- Authors: Xiaoyang Guo; A. Srivastava; S. Sarkar
- Corresponding Author: Xiaoyang Guo; A. Srivastava; S. Sarkar
Actor-Centered Representations for Action Localization in Streaming Videos
- DOI: 10.1007/978-3-031-19839-7_5
- Publication Date: 2021-04
- Journal:
- Impact Factor: 0
- Authors: Sathyanarayanan N. Aakur; Sudeep Sarkar
- Corresponding Author: Sathyanarayanan N. Aakur; Sudeep Sarkar
Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration
- DOI: 10.1007/978-3-031-19833-5_26
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: A. Bal; R. Mounir; Sathyanarayanan N. Aakur; Sudeep Sarkar; Anuj Srivastava
- Corresponding Author: A. Bal; R. Mounir; Sathyanarayanan N. Aakur; Sudeep Sarkar; Anuj Srivastava
Leveraging Symbolic Knowledge Bases for Commonsense Natural Language Inference Using Pattern Theory
- DOI: 10.1109/tpami.2023.3287837
- Publication Date: 2023-06
- Journal:
- Impact Factor: 23.6
- Authors: Sathyanarayanan N. Aakur; Sudeep Sarkar
- Corresponding Author: Sathyanarayanan N. Aakur; Sudeep Sarkar
Other Publications by Sudeep Sarkar
Mixing Properties of Stable Random Fields Indexed by Amenable and Hyperbolic Groups
- DOI:
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Mahan Mj; Parthanil Roy; Sudeep Sarkar
- Corresponding Author: Sudeep Sarkar
A modeling approach for burn scar assessment using natural features and elastic property
- DOI: 10.1109/tmi.2004.834625
- Publication Date: 2004
- Journal:
- Impact Factor: 10.6
- Authors: Yong Zhang; Dmitry Goldgof; Sudeep Sarkar; L. Tsap
- Corresponding Author: L. Tsap
A sensitivity analysis method and its application in physics-based nonrigid motion modeling
- DOI: 10.1016/j.imavis.2005.08.007
- Publication Date: 2007
- Journal:
- Impact Factor: 0
- Authors: Yong Zhang; Dmitry Goldgof; Sudeep Sarkar; L. Tsap
- Corresponding Author: L. Tsap
Efficient Generation of Large Amounts of Training Data for Sign Language Recognition: A Semi-automatic Tool
- DOI: 10.1007/11788713_94
- Publication Date: 2006
- Journal:
- Impact Factor: 0
- Authors: Ruiduo Yang; Sudeep Sarkar; B. Loeding; A. Karshmer
- Corresponding Author: A. Karshmer
Different atom trapping geometries with time averaged adiabatic potentials
- DOI: 10.1140/epjd/s10053-021-00290-6
- Publication Date: 2021
- Journal:
- Impact Factor: 0
- Authors: Sudeep Sarkar; S. P. Ram; V. B. Tiwari; S. Mishra
- Corresponding Author: S. Mishra
Other Grants by Sudeep Sarkar
I-Corps Sites: Type II - I-Corps Site at University of South Florida Tampa
- Award Number: 1829217
- Fiscal Year: 2018
- Funding Amount: $421,200
- Project Type: Continuing Grant
I-Corps: Semantic Video - from Video to Descriptions
- Award Number: 1647887
- Fiscal Year: 2016
- Funding Amount: $421,200
- Project Type: Standard Grant
I-Corps Sites: University of South Florida: Catalyzing Research Translation
- Award Number: 1449137
- Fiscal Year: 2015
- Funding Amount: $421,200
- Project Type: Continuing Grant
RI: Small: Collaborative Research: Ontology based Perceptual Organization of Audio-Video Events using Pattern Theory
- Award Number: 1217676
- Fiscal Year: 2012
- Funding Amount: $421,200
- Project Type: Standard Grant
EMT/Nano: Energy Minimization Computing using Field Coupled Nanomagnets--Modeling and Fabrication
- Award Number: 0829838
- Fiscal Year: 2008
- Funding Amount: $421,200
- Project Type: Standard Grant
ITR: Fundamental Issues in Automated American Sign Language Recognition
- Award Number: 0312993
- Fiscal Year: 2003
- Funding Amount: $421,200
- Project Type: Continuing Grant
CISE Research Resources: A Compute-Intensive Sensor-Based Environment for Research in Computer Vision and Artificial Intelligence
- Award Number: 0130768
- Fiscal Year: 2001
- Funding Amount: $421,200
- Project Type: Standard Grant
Enhancing Undergraduate Computer Science Curriculum through Image Computations: Proof-of-Concept
- Award Number: 9980832
- Fiscal Year: 2000
- Funding Amount: $421,200
- Project Type: Standard Grant
The Role of Learning in Perceptual Organization of Complex Images
- Award Number: 9907141
- Fiscal Year: 1999
- Funding Amount: $421,200
- Project Type: Continuing Grant
Major Research Instrumentation: Acquisition of a Cyberware 3D Scanner to Facilitate State of Art Research in Computer Vision and Graphics
- Award Number: 9724422
- Fiscal Year: 1997
- Funding Amount: $421,200
- Project Type: Standard Grant
Similar NSFC Grants
Research on Quantum Field Theory without a Lagrangian Description
- Award Number: 24ZR1403900
- Year Approved: 2024
- Funding Amount: ¥0
- Project Type: Provincial/Municipal Project
Cell Research
- Award Number: 31224802
- Year Approved: 2012
- Funding Amount: ¥240,000
- Project Type: Special Fund Project
Cell Research
- Award Number: 31024804
- Year Approved: 2010
- Funding Amount: ¥240,000
- Project Type: Special Fund Project
Cell Research
- Award Number: 30824808
- Year Approved: 2008
- Funding Amount: ¥240,000
- Project Type: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award Number: 10774081
- Year Approved: 2007
- Funding Amount: ¥450,000
- Project Type: General Program
Similar Overseas Grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award Number: 2312841
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award Number: 2312842
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award Number: 2313131
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award Number: 2313151
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award Number: 2312840
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Small: Deep Constrained Learning for Power Systems
- Award Number: 2345528
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award Number: 2232298
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award Number: 2232055
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award Number: 2313149
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Continuing Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award Number: 2312374
- Fiscal Year: 2023
- Funding Amount: $421,200
- Project Type: Standard Grant