MIMIc: Multimodal Imitation Learning in MultI-Agent Environments
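Basic information
- Grant number: EP/T000783/1
- Principal investigator:
- Amount: $329,900
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2019
- Funding country: United Kingdom
- Duration: 2019 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract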
In the UK, we are not allowed to drive a vehicle until we are 17. This is because driving is a complex and safety-critical activity that requires many advanced cognitive skills, such as recognising possible threats, anticipating the behaviour of other road users, and reacting quickly to emerging situations. Think about a football player making decisions on the field. A good player can sense opportunities by anticipating what other players will do, and select an action that increases the odds of scoring. It takes humans a long time to develop these advanced cognitive skills and become expert at such complex real-world tasks. Artificial Intelligence has made significant progress during the last decade, demonstrated by breakthroughs in cancer detection, computers beating Go masters, and intelligent robotics. However, if AI is to live up to its science-fiction promise of assisting humanity or even superseding human intelligence, it should at least be equipped with cognitive skills such as those possessed by humans. This project aims to develop groundbreaking algorithms that equip autonomous systems with the human-like cognitive skills required to thrive in real-world environments.
We focus on applications that require an autonomous agent (e.g. a robot or driverless car) to interact with multiple intelligent agents in the environment to accomplish a task; such settings are known as Multi-Agent Environments (MAEs). These applications require an agent to anticipate the behaviour of other agents and to select the most appropriate course of action. Equipping agents with this autonomous decision-making capability is known as policy learning. Compared to policy learning in single-agent domains (teaching a robot to walk or a computer to play a video game), recent progress in policy learning in MAEs has been quite modest.
This is due to multiple reasons: 1) agent actions make the environment dynamic; 2) multi-agent policy learning suffers from a theoretical limitation known as the curse of dimensionality (CoD); 3) utility functions that capture agent objectives are difficult to define; 4) there is a significant lack of adequate multi-agent datasets that allow meaningful research. This project proposes to undertake research into policy learning in MAEs by addressing the above limitations. Our unique approach to policy learning in MAEs is motivated by how humans thrive in similar settings. Firstly, we perceive the world through multiple senses (i.e. vision, audition, touch), enabling a rich perception of the world. Secondly, when acting in an MAE, humans do not pay attention to all stimuli but only to key stimuli; e.g. when a football player is attacking the ball, the player pays attention only to the teammates capable of effecting a goal and to the key defenders. Finally, the learning paradigm we employ, known as imitation learning, is an emerging methodology for learning by observing experts, which is a productive approach that we use to learn new skills. Accordingly, we propose to learn realistic policies in MAEs through imitation learning by leveraging multimodal data fusion and selective-attention modelling. Multimodal data fusion makes it possible to capture the high-dimensional context of the real world, and selective-attention modelling helps allay the issue of CoD. An elite football club has provided us with a unique multimodal multi-agent dataset and access to state-of-the-art data-capture facilities, facilitating this ambitious research project.
The project outputs will be subjectively validated as a tool to answer "what-if" questions related to game play in football, assisting coaching staff to visualise speculative game strategies, and as a computational benchmark to quantify the cognitive skills of football players.
The planned impact activities will ensure that the project leaves a legacy in AI development, benefiting UK PLC through significant contributions in multiple high-growth areas such as driverless vehicles, video gaming, and assistive robots.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learning data-driven decision-making policies in multi-agent environments for autonomous systems
- DOI: 10.1016/j.cogsys.2020.09.006
- Published: 2021
- Journal:
- Impact factor: 3.9
- Authors: Hook J
- Corresponding author: Hook J
Learning Control Policies of Driverless Vehicles from UAV Video Streams in Complex Urban Environments
- DOI: 10.3390/rs11232723
- Published: 2019-11
- Journal:
- Impact factor: 0
- Authors: Katie Inder; V. D. Silva; Xiyu Shi
- Corresponding authors: Katie Inder; V. D. Silva; Xiyu Shi
A machine learning framework for quantifying in-game space-control efficiency in football
- DOI: 10.1016/j.knosys.2023.111123
- Published: 2024
- Journal:
- Impact factor: 8.8
- Authors: Gu C
- Corresponding author: Gu C
Intelligent Systems and Pattern Recognition - Third International Conference, ISPR 2023, Hammamet, Tunisia, May 11-13, 2023, Revised Selected Papers, Part II
- DOI: 10.1007/978-3-031-46338-9_12
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Artaud C
- Corresponding author: Artaud C
Learning Independently from Causality in Multi-Agent Environments
- DOI: 10.5220/0011747900003411
- Published: 2023-11
- Journal:
- Impact factor: 0
- Authors: Rafael Pina; V. D. Silva; Corentin Artaud
- Corresponding authors: Rafael Pina; V. D. Silva; Corentin Artaud
Other publications by Varuna De Silva
A Coupled Human and Natural Systems (CHANS) framework integrated with reinforcement learning for urban flood mitigation
- DOI: 10.1016/j.jhydrol.2024.131918
- Published: 2024-11-01
- Journal:
- Impact factor: 6.300
- Authors: Haoyang Qin; Qiuhua Liang; Huili Chen; Varuna De Silva
- Corresponding author: Varuna De Silva
Use of Machine Learning to Automate the Identification of Basketball Strategies Using Whole Team Player Tracking Data
- DOI: 10.3390/app10010024
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Changjia Tian; Varuna De Silva; M. Caine; Steve Swanson
- Corresponding author: Steve Swanson
Measuring Public Policy Effectiveness in the Age of Data and AI: Insights from COVID-19
- DOI: 10.1109/globconet56651.2023.10150134
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Haileleol Tibebu; Eden Mekonnen; I. Kakadiaris; Varuna De Silva
- Corresponding author: Varuna De Silva
A High-Performance Coupled Human And Natural Systems (CHANS) Model for Flood Risk Assessment and Reduction
- DOI:
- Published: 2024
- Journal:
- Impact factor: 5.4
- Authors: Haoyang Qin; Qiuhua Liang; Huili Chen; Varuna De Silva
- Corresponding author: Varuna De Silva
A two-way coupled CHANS model for flood emergency management, with a focus on temporary flood defences
- DOI: 10.1016/j.envsoft.2024.106166
- Published: 2024-10-01
- Journal:
- Impact factor:
- Authors: Haoyang Qin; Qiuhua Liang; Huili Chen; Varuna De Silva
- Corresponding author: Varuna De Silva
Similar overseas grants
HoloSurge: Multimodal 3D Holographic tool and real-time Guidance System with point-of-care diagnostics for surgical planning and interventions on liver and pancreatic cancers
- Grant number: 10103131
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: EU-Funded
Where Gesture Meets Grammar: Crosslinguistic Multimodal Communication
- Grant number: DP240102369
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Discovery Projects
Exploring the Mechanisms of Multimodal Metaphor Creation in Japanese Children
- Grant number: 24K16041
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Grant-in-Aid for Early-Career Scientists
ZooCELL: Tracing the evolution of sensory cell types in animal diversity: multidisciplinary training in 3D cellular reconstruction, multimodal data ..
- Grant number: EP/Y037049/1
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Research Grant
Tracing the evolution of sensory cell types in animal diversity: multidisciplinary training in 3D cellular reconstruction, multimodal data analysis
- Grant number: EP/Y037081/1
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Research Grant
mLMT: Multimodal Large Machine Translation Model
- Grant number: 24K20841
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Grant-in-Aid for Early-Career Scientists
Next Generation Tools For Genome-Centric Multimodal Data Integration In Personalised Cardiovascular Medicine
- Grant number: 10104323
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: EU-Funded
Integrated multimodal microscopy facility for single molecule analysis
- Grant number: LE240100086
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Linkage Infrastructure, Equipment and Facilities
Towards Evolvable and Sustainable Multimodal Machine Learning
- Grant number: DE240100105
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Discovery Early Career Researcher Award
Class-Balanced Contrastive Learning for Multimodal Recognition
- Grant number: 24K20831
- Fiscal year: 2024
- Funding amount: $329,900
- Project type: Grant-in-Aid for Early-Career Scientists