INT2-Medium: Understanding the meaning of images


Basic Information

  • Award Number:
    0803603
  • Principal Investigator:
    David Forsyth
  • Amount:
    $550,000
  • Host Institution:
  • Host Institution Country:
    United States
  • Project Type:
    Standard Grant
  • Fiscal Year:
    2008
  • Funding Country:
    United States
  • Project Period:
    2008-08-15 to 2012-07-31
  • Project Status:
    Completed

Project Summary

The ability to recognize objects in images is a core problem in computer vision. The last decade has seen astonishing advances in our methods for building object detectors. However, images convey richer information about the objects depicted in them: objects may form a scene ("A view of mountains and meadows"); objects stand in relations with one another ("The cat sits on the mat"); different instances may look different ("The tabby cat sits on the blue mat"); objects may act on others ("The cat is chasing the mouse"). The task of identifying the entities depicted in images, together with their attributes and relations, is image understanding. This poses a number of new research questions: What objects should one remark on? What attributes of, and relations between, the objects depicted in the image are important? That is, what is the visually salient information conveyed in an image?

Many images (e.g. a large fraction of those on the web) are accompanied by text that describes or gives additional information about the entities depicted in them. The entities referred to in this text are typically visually salient ones. This correspondence between the information conveyed in the text and the image can be used in the creation of image understanding systems. Much current work uses image annotations that consist of individual words. The richer representations of meaning required to train image understanding systems can be obtained if the annotating text is treated as sentences (rather than just bags of words). Sentences provide cues to: what is salient in an image; what salient objects likely look like (e.g. color, texture and form); and what relations might hold between them. Exposing this information will provide a rich body of training data for the next generation of computer vision systems.

Research in natural language processing has created statistical wide-coverage parsers that can recover the semantic interpretation of sentences. These parsers differ from purely syntactic parsers in that they are based on linguistically expressive grammars that allow such interpretations to be built directly from the syntactic analysis. However, linking sentences with accompanying images requires a level of representation that goes beyond lists of the entities, states and events mentioned in a sentence. The writer of an image caption will typically assume that the reader sees the image, and can therefore refer to the entities depicted in it as known to the reader. There is a need for parsers that are able to uncover the information structure of sentences -- what information is assumed to be shared knowledge between speaker and hearer, and what is new information asserted by the sentence. How information structure is encoded in natural language is well understood, and it can be modeled with the same kinds of grammars that are used by those parsers that return semantic interpretations. Although there are currently no large corpora annotated with information structure, we will exploit the correspondence between images and their captions to develop novel, partially supervised training regimes for parsers. These training regimes could also enable the bootstrapping of parsers for languages with little or no annotated training data.

This project will build a novel parser that recovers richer linguistic representations, including information structure. It will build a novel image understanding system that recovers the salient entities depicted in an image together with their attributes and relations.
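To make the kind of representation just described concrete, here is a minimal, hypothetical sketch in Python (illustrative only, not the project's actual parser output or data structures) of what a caption-level semantic parse of "The tabby cat sits on the blue mat" might contain: entities with attributes, a relation between them, and an information-structure flag separating what the caption writer treats as given (already visible in the image) from what the sentence newly asserts.

# Hypothetical sketch: a hand-built semantic parse with information structure
# for the caption "The tabby cat sits on the blue mat". All names are
# illustrative assumptions, not part of the project described above.
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str                                        # e.g. "cat"
    attributes: list = field(default_factory=list)   # e.g. ["tabby"]
    given: bool = True                                # information structure: presupposed (depicted) vs. new

@dataclass
class Relation:
    predicate: str   # e.g. "sits_on"
    args: tuple      # names of the participating entities

# "The cat" and "the mat" are definite: the writer assumes the reader can see them.
cat = Entity("cat", ["tabby"], given=True)
mat = Entity("mat", ["blue"], given=True)

parse = {
    "entities": [cat, mat],
    # The relation itself is the new information asserted by the sentence.
    "relations": [Relation("sits_on", ("cat", "mat"))],
}

if __name__ == "__main__":
    for e in parse["entities"]:
        print(e.name, e.attributes, "given" if e.given else "new")
    for r in parse["relations"]:
        print(r.predicate, r.args)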
The project will train these systems both separately, on datasets consisting of sentences marked up with correct parses and images marked up with labels attached to objects, and jointly, on a dataset of captioned images.

Intellectual merit: The project goals are ambitious, but within reach, because both object recognition and parsing technology have advanced significantly. The project presents the vision and parsing communities with new goals, which are practically important and technically demanding. The aim of integrating natural language processing and computer vision creates a novel impetus to develop parsers that return richer linguistic representations, which will in turn have a deep impact on research within the natural language processing community itself. It will open up key directions in computer vision and natural language processing by demanding and enabling the recovery of richer representations of linguistic and visual information, and by studying how linguistic descriptions are grounded in the visual world.

Broader impact: The project has significant practical implications in areas such as image search and natural language interfaces for robotics, and will ultimately pave the way for new applications such as automatic captioning systems. The resulting advances in object recognition offer possibilities for safer autonomous vehicles, safer homes through better home care, and more efficient management of surveillance data.

URL: http://luthuli.cs.uiuc.edu/~daf/meaningofimages.html
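As a rough illustration of the two kinds of supervision mentioned above, the sketch below (hypothetical file names, boxes, labels, and field names; not the project's actual datasets) shows sentences paired with gold parses, images paired with labelled object regions, and captioned images in which the correspondence between words and regions is left to be inferred during joint training.

# Hypothetical sketch of the kinds of training data described above.
# File names, boxes, and labels are invented for illustration.

# Separate supervision for the parser: sentences with gold-standard parses.
parsed_sentences = [
    {"sentence": "The cat sits on the mat",
     "gold_parse": {"entities": ["cat", "mat"],
                    "relations": [("sits_on", "cat", "mat")]}},
]

# Separate supervision for the vision system: images with labelled object regions.
labelled_images = [
    {"image": "img_0001.jpg",
     "regions": [{"box": (34, 50, 120, 160), "label": "cat"},
                 {"box": (10, 140, 300, 220), "label": "mat"}]},
]

# Joint supervision: a caption is only weakly aligned with the image, so the
# word-to-region correspondence must be inferred while training both systems together.
captioned_images = [
    {"image": "img_0002.jpg",
     "caption": "The tabby cat sits on the blue mat",
     "regions": [{"box": (40, 60, 130, 170), "label": None},
                 {"box": (5, 150, 310, 230), "label": None}]},
]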

Project Outcomes

Journal Articles (0)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)

Other Publications by David Forsyth

Supplement - Convex Decomposition of Indoor Scenes
  • DOI:
  • Publication Date:
  • Journal:
  • Impact Factor:
    0
  • Author:
    David Forsyth
  • Corresponding Author:
    David Forsyth
Hidden Markov Models
  • DOI:
    10.1007/978-3-030-18114-7_13
  • Publication Date:
    2019
  • Journal:
  • Impact Factor:
    0
  • Author:
    David Forsyth
  • Corresponding Author:
    David Forsyth
Preserving Image Properties Through Initializations in Diffusion Models
Fully spectrum-sliced four-wave mixing wavelength conversion in a Semiconductor Optical Amplifier
Scientific report on Modeling and Prediction of Human Intent for Primitive Activation
  • DOI:
  • Publication Date:
    2014
  • Journal:
  • Impact Factor:
    0
  • Author:
    David Forsyth
  • Corresponding Author:
    David Forsyth

Other Grants by David Forsyth

RI: Medium: Creating Knowledge with All-Novel-Class Computer Vision
  • Award Number:
    2106825
  • Fiscal Year:
    2021
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
Collaborative Research: Computational Behavioral Science: Modeling, Analysis, and Visualization of Social and Communicative Behavior
  • Award Number:
    1029035
  • Fiscal Year:
    2010
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
RI: Small: Exploiting Geometric and Illumination Context in Indoor Scenes
  • Award Number:
    0916014
  • Fiscal Year:
    2009
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Interpreting Human Behaviour in Video using FSA's and Object Context
  • Award Number:
    0534837
  • Fiscal Year:
    2006
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
Finding and Tracking People from the Bottom Up
  • Award Number:
    0098682
  • Fiscal Year:
    2001
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
Purchase of a Molecular Modeling System
  • Award Number:
    9974642
  • Fiscal Year:
    1999
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
SGER: MCMC Algorithms for Object Recognition
  • Award Number:
    9979201
  • Fiscal Year:
    1999
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
A Spiral Approach to Chemical Concepts Using GC/MS
  • Award Number:
    9850580
  • Fiscal Year:
    1998
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Workshop on Shape, Contour and Grouping
  • Award Number:
    9712426
  • Fiscal Year:
    1997
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Recognising curved surfaces from their outlines
  • Award Number:
    9596025
  • Fiscal Year:
    1994
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant

Similar Overseas Grants

Collaborative Research: SaTC: CORE: Medium: Understanding the Impact of Privacy Interventions on the Online Publishing Ecosystem
  • Award Number:
    2237329
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Understanding and Combatting Impersonation Attacks and Data Leakage in Online Advertising
  • Award Number:
    2247516
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
Postdoctoral Fellowship: AAPF: All Shook Up: Understanding the Chemistry, Dynamics, and Kinematics of the Diffuse Interstellar Medium
  • Award Number:
    2303902
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Fellowship Award
Collaborative Research: SaTC: TTP: Medium: iDRAMA.cloud: A Platform for Measuring and Understanding Information Manipulation
  • Award Number:
    2247867
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant
Collaborative Research: SaTC: CORE: Medium: Understanding the Impact of Privacy Interventions on the Online Publishing Ecosystem
  • Award Number:
    2237328
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
  • Award Number:
    2312374
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
  • Award Number:
    2312373
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Understanding the Impact of Privacy Interventions on the Online Publishing Ecosystem
  • Award Number:
    2237327
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: IIS: III: MEDIUM: Learning Protein-ish: Foundational Insight on Protein Language Models for Better Understanding, Democratized Access, and Discovery
  • Award Number:
    2310113
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Standard Grant
Collaborative Research: SaTC: TTP: Medium: iDRAMA.cloud: A Platform for Measuring and Understanding Information Manipulation
  • Award Number:
    2247868
  • Fiscal Year:
    2023
  • Funding Amount:
    $550,000
  • Project Type:
    Continuing Grant