Visual Sense. Tagging visual data with semantic descriptions

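Basic Information

  • Grant number:
    EP/K01904X/1
  • Principal investigator:
  • Amount:
    $422,200
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2013
  • Funding country:
    United Kingdom
  • Duration:
    2013 to (no data)
  • Project status:
    Completed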

Project Abstract

Recent years have witnessed unprecedented growth in the number of image and video collections, partly due to the increased popularity of photo and video sharing websites. One such website alone (Flickr) stores billions of images. And this is not the only way in which visual content is present on the Web: in fact, most web pages contain some form of visual content. However, while most traditional tools for search and retrieval can successfully handle textual content, they are not prepared to handle heterogeneous documents. This new type of content demands the development of new, efficient tools for search and retrieval.

The large number of readily accessible multimedia data collections poses both an opportunity and a challenge. The opportunity lies in the potential to mine this data to automatically discover mappings between visual and textual content. The challenge is to develop tools to classify, filter, browse and search such heterogeneous data. In brief, the data is available, but the tools to make sense of it are missing.

The Visual Sense project aims to automatically mine the semantic content of visual data to enable "machine reading" of images. In recent years, we have witnessed significant advances in the automatic recognition of visual concepts. These advances allowed for the creation of systems that can automatically generate keyword-based image annotations. However, these annotations, e.g. "man" and "pot", fall far short of the more meaningful descriptive captions necessary for indexing and retrieval of images, for example, "Man cooking in kitchen". The goal of this project is to move a step forward and predict semantic image representations that can be used to generate more informative sentence-based image annotations, thus facilitating search and browsing of large multi-modal collections. It will address the following key open research challenges:

1) Develop methods that can derive a semantic representation of visual content. Such representations must go beyond the detection of objects and scenes and also include a wide range of object relations.
2) Extend state-of-the-art natural language techniques to the tasks of mining large collections of multi-modal documents and generating image captions using both semantic representations of visual content and object/scene type models derived from semantic representations of the textual component of multi-modal documents.
3) Develop learning algorithms that can exploit available multi-modal data to discover mappings between visual and textual content. These algorithms should be able to leverage 'weakly' annotated data and be robust to large amounts of noise.

Thus, the main focus of the Visual Sense project is the development of machine learning methods for knowledge and information extraction from large collections of visual and textual content and for the fusion of this information across modalities. The tools and techniques developed in this project will have a variety of applications. To demonstrate them, we will address three case studies: 1) evaluation of generated descriptive image captions in established international image annotation benchmarks, 2) re-ranking for improved image search and 3) automatic illustration of articles with images.

To address these broad challenges, the project will build on expertise from multiple disciplines, including computer vision, machine learning and natural language processing (NLP). It brings together four research groups from the University of Surrey (Surrey, UK), Institut de Robotica i Informatica Industrial (IRI, Spain), Ecole Centrale de Lyon (ECL, France), and the University of Sheffield (Sheffield, UK), each with well-established and complementary expertise in its respective area of research.

Project Outputs

Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robust Registration and Filtering for Moving Object Detection in Aerial Videos
Ranking Images Based on Aesthetic Qualities
Full ranking as local descriptor for visual recognition: A comparison of distance metrics on sn
  • DOI:
    10.1016/j.patcog.2014.10.010
  • Published:
    2015-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Chi-Ho Chan;F. Yan;J. Kittler;K. Mikolajczyk
  • Corresponding authors:
    Chi-Ho Chan;F. Yan;J. Kittler;K. Mikolajczyk
Online Learning and Detection with Part-Based, Circulant Structure
Improving Object Tracking with Voting from False Positive Detections

Other grants by Krystian Mikolajczyk

Interactive Perception-Action-Learning for Modelling Objects
  • Grant number:
    EP/S032398/1
  • Fiscal year:
    2019
  • Funding amount:
    $422,200
  • Project type:
    Research Grant
Visual Sense. Tagging visual data with semantic descriptions
  • Grant number:
    EP/K01904X/2
  • Fiscal year:
    2015
  • Funding amount:
    $422,200
  • Project type:
    Research Grant
Recognition of Object Categories and Scenes
  • Grant number:
    EP/F003420/1
  • Fiscal year:
    2008
  • Funding amount:
    $422,200
  • Project type:
    Research Grant

Similar Overseas Grants

Using Virtual Reality to investigate the sense of self
  • Grant number:
    ES/Y008316/1
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Fellowship
REGULATING THE FLOW: Uncovering How Roots Sense and Respond to Water Availability
  • Grant number:
    BB/Z514482/1
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Fellowship
Creating a Path to Achieving Success and Sense of Belonging in Computer Science
  • Grant number:
    2322665
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Standard Grant
Motion Sense-free Cabin: Controlling passengers' sense of motion to improve comfort during automated driving
  • Grant number:
    24K02978
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Grant-in-Aid for Scientific Research (B)
World Crime Fiction: Making Sense of a Global Genre
  • Grant number:
    DP240102250
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Discovery Projects
Discovering How Root Sense Hard Soils
  • Grant number:
    EP/Y036697/1
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Research Grant
Applying a complex systems perspective to investigate the relationship between choreography and agent-based modeling as tools for scientific sense-making
  • Grant number:
    2418539
  • Fiscal year:
    2024
  • Funding amount:
    $422,200
  • Project type:
    Continuing Grant
Augmented Social Play (ASP): smartphone-enabled group psychotherapeutic interventions that boost adolescent mental health by supporting real-world connection and sense of belonging
  • Grant number:
    10077933
  • Fiscal year:
    2023
  • Funding amount:
    $422,200
  • Project type:
    EU-Funded
University of Essex (The) and Soil Moisture Sense Limited KTP 22_23 R1
  • Grant number:
    10032462
  • Fiscal year:
    2023
  • Funding amount:
    $422,200
  • Project type:
    Knowledge Transfer Partnership
Operation support for personal mobility considering the sense of agency based on the estimation of user's intention
  • Grant number:
    23K11269
  • Fiscal year:
    2023
  • Funding amount:
    $422,200
  • Project type:
    Grant-in-Aid for Scientific Research (C)