BIGDATA: F: Collaborative Research: From Visual Data to Visual Understanding


Basic Information

Project Abstract

The field of visual recognition, which focuses on creating computer algorithms for automatically understanding photographs and videos, has made tremendous gains in the past few years. Algorithms can now recognize and localize thousands of objects with reasonable accuracy, as well as identify other visual content such as scenes and activities. For instance, there are now smartphone apps that can automatically sift through a user's photos and find all party pictures, all pictures of cars, or all sunset photos. However, the type of "visual understanding" performed by these methods is still rather superficial, exhibiting mostly rote memorization rather than true reasoning. For example, current algorithms have a hard time telling whether an image is typical (e.g., a car on a road) or unusual (e.g., a car in the sky), or answering simple questions about a photograph, e.g., "What are the people looking at?", "What just happened?", "What might happen next?" A central problem is that current methods lack data about the world outside of the photograph. To achieve true human-like visual understanding, computers will have to reason about the broader spatial, temporal, perceptual, and social context suggested by a given visual input. This project uses big visual data to gather large-scale, deep semantic knowledge about events, physical and social interactions, and how people perceive the world and each other. The research focuses on developing methods to capture and represent this knowledge in a way that makes it broadly applicable to a range of visual understanding tasks. This will enable novel computer algorithms that have a deeper, more human-like understanding of the visual world and can function effectively in complex, real-world situations and environments. For example, if a robot can predict what a person might do next in a given situation, then the robot can better aid the person in their task. Broader impacts will include new publicly available software tools and data that can be used for various visual reasoning tasks. Additionally, the project will have a multi-pronged educational component, including incorporating aspects of the research into the graduate teaching curriculum, undergraduate and K-12 outreach, and special mentoring and focused events for the advancement of women in computer science.

The main technical focus of this project is to advance computational recognition efforts toward producing a general, human-like visual understanding of images and video that can function on previously unseen data, tasks, and settings. The aim of this project is to develop a new large-scale knowledge base, called the visual Memex, that extracts and stores a vast set of visual relationships between data items in a multi-graph representation, with nodes corresponding to data items and edges indicating different types of relationships. This large knowledge base will be used in a lambda-calculus-powered reasoning engine to make inferences about visual data on a global scale.
Additionally, the project will test computational recognition algorithms on several visual understanding tasks designed to evaluate progress on a variety of aspects of visual understanding, ranging from linguistic (evaluating our understanding of imagery through language tasks such as visual question-answering), to purely visual (evaluating our understanding of spatial context through visual fill-in-the-blanks), to temporal (evaluating our temporal understanding by predicting future states), to physical (evaluating our understanding of human-object and human-scene interactions by predicting affordances). Datasets, the knowledge base, and evaluation tools will be hosted on the project web site (http://www.tamaraberg.com/grants/bigdata.html).
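To make the multi-graph representation described in the abstract concrete, the sketch below shows the general idea: nodes are data items (image regions, categories, scene labels) and each edge carries a relation type. This is a minimal illustration in Python using hypothetical node identifiers and relation names; it is not the project's actual visual Memex schema, knowledge base, or reasoning engine.

```python
# Minimal sketch of a multi-graph knowledge base: nodes are data items and
# edges are grouped by relation type. All identifiers below are illustrative.
from collections import defaultdict


class MultiGraphKB:
    def __init__(self):
        self.nodes = set()
        # edges[relation] holds a set of (source, target) pairs
        self.edges = defaultdict(set)

    def add_edge(self, source, target, relation):
        """Record a directed, typed relationship between two data items."""
        self.nodes.update((source, target))
        self.edges[relation].add((source, target))

    def related(self, item, relation):
        """Return all items linked from `item` by the given relation type."""
        return {t for (s, t) in self.edges[relation] if s == item}


# Toy usage with made-up nodes and relation types.
kb = MultiGraphKB()
kb.add_edge("img_001:car", "img_001:road", relation="spatially_on")
kb.add_edge("img_001:car", "category:car", relation="instance_of")
kb.add_edge("category:car", "category:road", relation="commonly_co_occurs")

# A simple contextual query: what does the car category typically co-occur with?
print(kb.related("category:car", "commonly_co_occurs"))  # {'category:road'}
```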

Project Outcomes

Journal Articles (11)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)
From image to language and back again
  • DOI:
    10.1017/s1351324918000086
  • Publication date:
    2018-04
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Anya Belz;Tamara L. Berg;Licheng Yu
  • Corresponding author:
    Anya Belz;Tamara L. Berg;Licheng Yu
Multi-Target Embodied Question Answering
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
TVQA: Localized, Compositional Video Question Answering
  • DOI:
    10.18653/v1/d18-1167
  • Publication date:
    2018-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jie Lei;Licheng Yu;Mohit Bansal;Tamara L. Berg
  • Corresponding author:
    Jie Lei;Licheng Yu;Mohit Bansal;Tamara L. Berg
Combining Multiple Cues for Visual Madlibs Question Answering
  • DOI:
    10.1007/s11263-018-1096-0
  • Publication date:
    2016-11
  • Journal:
  • Impact factor:
    19.5
  • Authors:
    T. Tommasi;Arun Mallya;Bryan A. Plummer;Svetlana Lazebnik;A. Berg;Tamara L. Berg
  • Corresponding author:
    T. Tommasi;Arun Mallya;Bryan A. Plummer;Svetlana Lazebnik;A. Berg;Tamara L. Berg

Other Publications by Ashok Krishnamurthy

The spinal stenosis pedometer and nutrition lifestyle intervention (SSPANLI) randomized controlled trial protocol
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    2.3
  • Authors:
    C. Tomkins;L. Lafave;J. Parnell;Ashok Krishnamurthy;Jocelyn Rempel;L. Macedo;Stephanie Moriartey;K. Stuber;P. Wilson;Richard W. Hu;Yvette M Andreas
  • Corresponding author:
    Yvette M Andreas
P015 Nature Teaches Us to Grieve: The Place of Parks and Nature at End of Life
  • DOI:
    10.1016/j.jpainsymman.2016.10.136
  • Publication date:
    2016-12-01
  • Journal:
  • Impact factor:
  • Authors:
    Sonya L. Jakubec;Don Carruthers Den Hoed;Ashok Krishnamurthy;Heather Ray;Michael Quinn
  • Corresponding author:
    Michael Quinn
Correction: FHIR PIT: a geospatial and spatiotemporal data integration pipeline to support subject-level clinical research
  • DOI:
    10.1186/s12911-025-02940-w
  • Publication date:
    2025-02-25
  • Journal:
  • Impact factor:
    3.800
  • Authors:
    Karamarie Fecho;Juan J. Garcia;Hong Yi;Griffin Roupe;Ashok Krishnamurthy
  • Corresponding author:
    Ashok Krishnamurthy
A Computational Science IDE for HPC Systems: Design and Applications
  • DOI:
    10.1007/s10766-008-0084-3
  • Publication date:
    2008-09-24
  • Journal:
  • Impact factor:
    0.900
  • Authors:
    David E. Hudak;Neil Ludban;Ashok Krishnamurthy;Vijay Gadepally;Siddharth Samsi;John Nehrbass
  • Corresponding author:
    John Nehrbass
Towards a Cybernetic Model of Human Movement
  • DOI:
    10.5539/mer.v6n1p29
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    H. Hemami;Eric W. Tarr;Boren Li;Ashok Krishnamurthy;B. Clymer;B. Dariush
  • Corresponding author:
    B. Dariush


Other Grants by Ashok Krishnamurthy

EAGER: Distilling a Process for a National CI Roadmap from NSF Collaboratories
  • Award number:
    1153775
  • Fiscal year:
    2011
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
NSF Workshop High Performance Computing Center Sustainability
  • Award number:
    0944039
  • Fiscal year:
    2009
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant

Similar Overseas Grants

BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
  • Award number:
    2348159
  • Fiscal year:
    2023
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
  • Award number:
    2308649
  • Fiscal year:
    2022
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
  • Award number:
    2027516
  • Fiscal year:
    2020
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
  • Award number:
    1934319
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: IA: Collaborative Research: Protecting Yourself from Wildfire Smoke: Big Data-Driven Adaptive Air Quality Prediction Methodologies
  • Award number:
    1838022
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: F: Collaborative Research: Foundations of Responsible Data Management
  • Award number:
    1926250
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
  • Award number:
    1947584
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
  • Award number:
    1837964
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
  • Award number:
    1838222
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant
BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
  • Award number:
    1838248
  • Fiscal year:
    2019
  • Funding amount:
    $350,000
  • Project category:
    Standard Grant