Towards Open-ended Reinforcement Learning using Synthetic Environment Generation
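Basic information
- Grant number: 2711309
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United Kingdom
- Project category: Studentship
- Fiscal year: 2022
- Funding country: United Kingdom
- Duration: 2022 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract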
Sequential decision-making problems are ubiquitous in engineering and science. Reinforcement Learning (RL), an AI paradigm where agents learn decision-making skills via trial-and-error interactions with their environment, has achieved significant success in handling complex decision tasks. However, these agents often struggle to generalize, exhibiting suboptimal performance on previously unseen tasks. Furthermore, once deployed, an AI that underperforms in a new task often lacks opportunities for improvement, as its learning process ceases after the initial training phase. These challenges restrict the practical use of such systems in real-world settings, such as sim2real. Open-ended Learning (OEL) seeks to overcome these limitations with the goal of producing learning systems that are robust to situations not explicitly considered during design and training.
Aims and Objectives
While there are many possible directions towards achieving OEL agents, this proposal specifically focuses on the development of automated curricula methods through novel techniques for synthetic environment generation. As such, the goals of this proposal are decomposed as follows:
- Develop new methods for generating synthetic tasks/environments.
- Use synthetic environment generation to develop novel Unsupervised Environment Design (UED) methods for automatic environment curricula generation.
- Show that applying such curricula to agent training produces agents with strong out-of-distribution generalization.
Novelty of the Research Methodology
The proposed research primarily looks to extend work in the developing subfield of Unsupervised Environment Design (UED), which seeks to generate environments tailored to the current learning agent to facilitate continued learning. However, UED is currently limited to generating levels, which are configurations of a specific task, e.g., the layout of a maze in a navigation task. We seek to improve state-of-the-art UED methods to generate entire environments, not just levels, which are novel tasks for a more general agent to train on and solve. Doing so will require more sophisticated generative AI techniques, where we look to leverage recent advances such as diffusion methods. Ultimately, this work will operate at the intersection of multiple AI subfields, at a high level namely RL and generative AI.
Alignment to EPSRC's strategies and research areas
The proposed work falls under the EPSRC's Artificial intelligence technologies remit. It aligns with EPSRC's goals in a number of ways, including developing new AI techniques that are deployable in real-world situations. We will also work at the intersection of multiple AI subfields, in line with EPSRC's goal of supporting interdisciplinary research methods.
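The abstract describes UED as generating levels (configurations of one task, such as a maze layout) tailored to the current agent. The following is a minimal illustrative sketch of that idea, not the project's method: it assumes a toy grid maze, a hypothetical stand-in for an RL rollout (`toy_agent_return`), and a simple regret proxy that treats 1.0 as the best achievable return. All names and parameters here are assumptions introduced for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """One configuration of a fixed task: the blocked cells of a grid maze."""
    walls: frozenset


def random_level(rng, size=5, wall_prob=0.2):
    """Sample a wall layout, keeping the start and goal corners free."""
    keep_free = {(0, 0), (size - 1, size - 1)}
    walls = frozenset(
        (r, c)
        for r in range(size)
        for c in range(size)
        if (r, c) not in keep_free and rng.random() < wall_prob
    )
    return Level(walls)


def toy_agent_return(level, skill):
    """Hypothetical stand-in for an RL rollout (assumption, not a real agent):
    the return falls as the number of walls exceeds the agent's skill."""
    excess = max(0, len(level.walls) - skill)
    return max(0.0, 1.0 - excess / 10.0)


def regret_proxy(level, skill):
    """Gap between an assumed best return of 1.0 and the agent's return."""
    return 1.0 - toy_agent_return(level, skill)


def propose_curriculum(rng, skill, n_candidates=50, batch=5):
    """Generate candidate levels and keep those with the highest regret proxy."""
    candidates = [random_level(rng) for _ in range(n_candidates)]
    candidates.sort(key=lambda lvl: regret_proxy(lvl, skill), reverse=True)
    return candidates[:batch]


if __name__ == "__main__":
    rng = random.Random(0)
    for lvl in propose_curriculum(rng, skill=3):
        print(len(lvl.walls), regret_proxy(lvl, skill=3))
```

Because this proxy assumes every level is solvable, it will also favour impossible layouts; a practical regret estimate would need to account for that. The sketch only varies the layout of a single task, which is the limitation the proposal aims to move beyond by generating entire environments.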
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants
An implantable biosensor microsystem for real-time measurement of circulating biomarkers
- Grant number: 2901954
- Fiscal year: 2028
- Funding amount: --
- Project category: Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
- Grant number: 2896097
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
A Robot that Swims Through Granular Materials
- Grant number: 2780268
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
- Grant number: 2908918
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
- Grant number: 2908693
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
- Grant number: 2908917
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
- Grant number: 2879438
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
- Grant number: 2876993
- Fiscal year: 2027
- Funding amount: --
- Project category: Studentship
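Similar grants from the National Natural Science Foundation of China (NSFC)
Functional study of mRNA downstream open reading frames (dORFs) in spermatogenesis
- Grant number:
- Approval year: 2022
- Funding amount: 540,000 CNY
- Project category: General Program
Automatic high-order mesh generation based on the hierarchical method and Open CASCADE
- Grant number: 11972004
- Approval year: 2019
- Funding amount: 620,000 CNY
- Project category: General Program
Key technologies for semantic interoperability of Web services based on Linked Open Data
- Grant number: 61373035
- Approval year: 2013
- Funding amount: 770,000 CNY
- Project category: General Program
Variational and topological methods and open problems in Schrödinger equations
- Grant number: 10871109
- Approval year: 2008
- Funding amount: 230,000 CNY
- Project category: General Program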
Similar overseas grants
EAGER: Co-Designing a Cognitive Teaching Assistant to Support Evidence-Based Instruction in Open-Ended Learning Environments
- Grant number: 2327708
- Fiscal year: 2023
- Funding amount: --
- Project category: Standard Grant
Hybrid Human Artificial Collective Intelligence in Open-Ended Decision Making
- Grant number: 10037991
- Fiscal year: 2022
- Funding amount: --
- Project category: EU-Funded
The theory and practice of 'trans-imperial history': towards an open-ended framework of research
- Grant number: 22H00690
- Fiscal year: 2022
- Funding amount: --
- Project category: Grant-in-Aid for Scientific Research (B)
Enabling rich, open-ended human-robot interaction through robust, advanced multimodal perceptual capabilities for high-level reasoning
- Grant number: RGPIN-2019-06047
- Fiscal year: 2022
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Supporting open-ended play for wellbeing in adulthood through interactive installations using generative systems
- Grant number: 2598279
- Fiscal year: 2021
- Funding amount: --
- Project category: Studentship
Enabling rich, open-ended human-robot interaction through robust, advanced multimodal perceptual capabilities for high-level reasoning
- Grant number: RGPIN-2019-06047
- Fiscal year: 2021
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Enabling rich, open-ended human-robot interaction through robust, advanced multimodal perceptual capabilities for high-level reasoning
- Grant number: RGPIN-2019-06047
- Fiscal year: 2020
- Funding amount: --
- Project category: Discovery Grants Program - Individual
Experimental Investigation of Open-Ended Pipe Piles Subjected to Axial and Lateral Loads
- Grant number: 2028672
- Fiscal year: 2020
- Funding amount: --
- Project category: Standard Grant
Open-Ended Discovery of Skill Hierarchies in Artificial Intelligence
- Grant number: 2278914
- Fiscal year: 2019
- Funding amount: --
- Project category: Studentship
Enabling rich, open-ended human-robot interaction through robust, advanced multimodal perceptual capabilities for high-level reasoning
- Grant number: RGPIN-2019-06047
- Fiscal year: 2019
- Funding amount: --
- Project category: Discovery Grants Program - Individual