Developing Foundation Model Capabilities for Video Understanding in the Open World

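Basic information

  • Grant number:
    2711268
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Start and end dates:
    2022 to (no data)
  • Project status:
    Not concluded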

Project abstract

Description

The goal of this project is to develop open-world deep learning models for video understanding that allow users to ask queries about video content using natural language descriptions. Rather than training deep learning models from scratch, the methods developed will leverage pre-trained foundation models. A foundation model is a machine learning model trained on large quantities of data that can be adapted to solve a wide variety of downstream tasks. Extending a foundation model to solve a specific problem usually requires less data than training it from scratch and improves the generalizability of the specialist model.

Methods will be developed to solve open-world image-level problems requiring natural language input with pre-trained foundation models. Insights from these developments will inspire the construction of models that solve analogous problems in videos. Examples of problems include counting text-specified objects in images and repetitions in videos and answering queries about the area, shape, and structure of objects in a scene. Rather than solving a problem for a particular class, the models developed will allow users to solve the problem for any arbitrary class by providing text input about the class of interest at inference time. Importantly, adapting such open-world models to new classes would require no additional training or data, even if the class were unseen during training. Hence, this work will result in AI systems that are more accessible to the general public, who may not have access to the large quantities of labelled data and compute typically necessary to train class-specific models.

Aims & Objectives

1. Develop models for image understanding that allow users to ask questions on the image content using natural language.
2. Leverage insights and methods from step (1) to develop models with similar capabilities for video understanding that allow users to ask questions on video content using natural language. For instance, a model developed to count objects in images using text in step (1) could inspire a model to count objects in videos using text in step (2).
3. Iterate on steps (1) and (2), adding more capabilities.

Novelty of the Research Methodology

While leveraging pre-trained vision-language foundation models for tasks such as image retrieval, object detection, and instance segmentation has been significantly explored for images, similar developments have been less explored for videos. This is because learning from videos is more complex due to an additional temporal dimension. Furthermore, methods developed in this project will include novel deep learning architectures that are more general and perform better at existing tasks or that solve new problems such as repetition counting in videos using natural language descriptions and answering arbitrary natural language queries about the size, shape, and structure of objects.

Alignment to the EPSRC's Strategies & Research Areas

This project relates to the "Artificial Intelligence Technologies" research area.

Any Companies or Collaborators Involved?

No.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project category:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project category:
    Studentship

Similar overseas grants

SBIR Phase I: An Artificial Intelligence System to Accelerate Semiconductor Production using Physics-embedded Lithographic Foundation Model
  • Grant number:
    2336079
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Microscopic foundation of the shell model based on the scattering theory and the many-body perturbation theory
  • Grant number:
    23K03420
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Clinical foundation model for structured clinical data
  • Grant number:
    10639397
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Building Foundation for CPS Security Risk Evaluation based on Continuous State-Space Model
  • Grant number:
    22K21272
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Research Activity Start-up
Setting the foundation for the development of a patient- and provider-informed cataract surgery care model
  • Grant number:
    457435
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Operating Grants
Research on the input trade model with Keynesian unemployment: A theoretical foundation of international input-output analysis
  • Grant number:
    21K01437
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Building a foundation for elucidating speciation mechanisms mediate genome polyploidization using non-model plants
  • Grant number:
    21K20580
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Research Activity Start-up
Development of seismic response model for piles and foundation girders considering strong ground motion and ground displacement
  • Grant number:
    21H01477
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Conducting an energy study and model of the Ross-Island Energy Grid and McMurdo Power Plant, in support of the National Science Foundation and the United States Antarctic Program.
  • Grant number:
    2034195
  • Fiscal year:
    2020
  • Funding amount:
    --
  • Project category:
    Contract Interagency Agreement
Development of analysis model for backward erosion piping in foundation under levee based on new concept of seepage flow
  • Grant number:
    20K14841
  • Fiscal year:
    2020
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Early-Career Scientists