EAGER: Spatiotemporal Transformer for Activity Recognition
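Basic Information
- Award number: 2322993
- Principal investigator:
- Amount: $280,800
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-07-01 to 2025-06-30
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract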
Understanding human activity from video is important to several applications in security, defense, medicine, robotics, manufacturing, and education. The field of computer vision explores the use of cameras and computers to automate tasks such as object recognition and activity recognition. Traditionally, researchers have developed computer vision systems by extracting the constituent features in an image or video and matching those features to models of more complex objects. More recently, machine learning methods have been applied that train a computer to perform such a recognition task from data rather than a physical model. This project explores a learning-based object recognition approach based on learning semantic relationships between objects and people observed in video. Specifically, the project attempts to design computing methods that will automatically derive relationships between people and objects in digital video and then exploit those correlative relationships in classifying a human action (e.g., kicking a ball or shaking hands). Unlike machine learning methods developed for understanding language, the proposed solution will use elements of the video specific to understanding human action, such as detection of imaged objects, the motion of objects, and the spatial and temporal position in the video. Successful implementation of the computer vision solution will allow human activities in video to be automatically analyzed. The analysis will benefit critical tasks such as learning to perform a surgery or understanding the actions taken in an effective classroom.
Transformers are a type of neural network that uses attention to compute relationships between words in a sentence or series of sentences. The advantages of the transformer model include the ability to assess these relationships over long sequences of words and the ability to automatically process all words simultaneously via a positional encoding. Instead of taking the transformer developed for natural language and fitting it to a video problem, this project seeks to develop a video transformer from first principles. The realization of this system involves three distinct advances in the machine learning design. First, the proposed approach brings the concept of motion as a feature to the transformer by way of optical flow information encoded with time. Second, the proposed method allows interactions between geometric and motion features of action semantics to be exploited in a transformer framework. Last, the distribution-based attention model goes beyond the traditional correlative notion of attention. The proposed attention model captures significant correlations in action sequences. Together, the three theoretical contributions have the potential to significantly advance video understanding.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
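For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of the conventional language-style mechanism: sinusoidal positional encoding followed by scaled dot-product self-attention. It is not the project's distribution-based attention or its optical-flow encoding, neither of which is specified here. The function names, the toy feature values, and the use of identity projections (queries, keys, and values all equal to the input tokens) are assumptions made purely for illustration.

```python
import math

def positional_encoding(num_tokens, dim):
    """Sinusoidal positional encoding: one dim-length vector per position."""
    pe = []
    for pos in range(num_tokens):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            # Even indices use sine, odd indices use cosine
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens (assumption)."""
    dim = len(tokens[0])
    weights = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all token vectors
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical toy features: three tokens with four dimensions each
features = [[1.0, 0.0, 0.5, 0.2], [0.9, 0.1, 0.4, 0.3], [0.0, 1.0, 0.1, 0.8]]
pe = positional_encoding(len(features), 4)
tokens = [[f + p for f, p in zip(frow, prow)] for frow, prow in zip(features, pe)]
outputs, weights = self_attention(tokens)
```

Each row of `weights` sums to 1 and states how strongly one token attends to every other token. All tokens are processed together, with order carried only by the added positional vectors; this is the property the abstract describes as processing all words simultaneously.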
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Scott Acton
ST Wavefront Sensing and Control
- DOI:
- Published: 2007
- Journal:
- Impact factor: 0
- Authors: L. Feinberg; B. Dean; D. Aronstein; C. Bowers; Bill Hayden; R. Lyon; R. Shiri; Scott Smith; Scott Acton; Larkin Carey; A. Contos; E. Sabatke; J. Schwenker; D. Shields; Timothy W. Towel
- Corresponding author: Timothy W. Towel
Editorial Introduction to multimedia system technologies for educational tools
- DOI: 10.1007/s00530-005-0001-1
- Published: 2006-02-08
- Journal:
- Impact factor: 3.100
- Authors: Scott Acton; Fumio Kishino; Ryohei Nakatsu; Jinshan Tang; Matthias Rauterberg
- Corresponding author: Matthias Rauterberg
Other grants by Scott Acton
Intergovernmental Personnel Act Assignment
- Award number: 1950730
- Fiscal year: 2019
- Amount: $280,800
- Award type: Intergovernmental Personnel Award
ABI Innovation: Towards the Neurome -- Automated Image Analysis for Neuroinformatics
- Award number: 1062433
- Fiscal year: 2011
- Amount: $280,800
- Award type: Standard Grant
Decentralized Image Retrieval for Education (DIRECT)
- Award number: 0121596
- Fiscal year: 2002
- Amount: $280,800
- Award type: Standard Grant
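Similar NSFC grants
Spatiotemporal model of asphalt/aggregate interface behavior based on molecular dynamics
- Award number: 51378073
- Approval year: 2013
- Amount: CNY 720,000
- Award type: General Program
Similar overseas grants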
Collaborative Research: Planning: FIRE-PLAN: High-Spatiotemporal-Resolution Sensing and Digital Twin to Advance Wildland Fire Science
- Award number: 2335568
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
Collaborative Research: Planning: FIRE-PLAN: High-Spatiotemporal-Resolution Sensing and Digital Twin to Advance Wildland Fire Science
- Award number: 2335569
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
Collaborative Research: Understanding the Influence of Turbulent Processes on the Spatiotemporal Variability of Downslope Winds in Coastal Environments
- Award number: 2331729
- Fiscal year: 2024
- Amount: $280,800
- Award type: Continuing Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403312
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
Spatiotemporal dynamics of acetylcholine activity in adaptive behaviors and response patterns
- Award number: 24K10485
- Fiscal year: 2024
- Amount: $280,800
- Award type: Grant-in-Aid for Scientific Research (C)
Collaborative Research: Planning: FIRE-PLAN: High-Spatiotemporal-Resolution Sensing and Digital Twin to Advance Wildland Fire Science
- Award number: 2335570
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
Collaborative Research: OAC Core: Learning AI Surrogate of Large-Scale Spatiotemporal Simulations for Coastal Circulation
- Award number: 2402947
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403313
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
EAGER: Enhancement of Ammonia combustion by spatiotemporal control of plasma kinetics
- Award number: 2337461
- Fiscal year: 2024
- Amount: $280,800
- Award type: Standard Grant
LTREB: Population persistence in a variable world: spatiotemporal variation in climate and demography across the range of scarlet monkeyflower
- Award number: 2311414
- Fiscal year: 2024
- Amount: $280,800
- Award type: Continuing Grant