Scaling Genetic Programming to Complex Reinforcement Learning Tasks
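Basic information
- Grant number: RGPIN-2020-04438
- Principal investigator:
- Amount: $21,100
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Status: Completed
- Source:
- Keywords:
Project abstract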
Reinforcement learning (RL) describes a class of tasks in which an agent interacts with an environment to maximize its long-term reward. Deep learning has recently made considerable progress on RL under high-dimensional state and action spaces. As a result, raw sensor data such as video can be used directly, rather than first developing a suite of appropriate input features. Many applications have benefited from this development, from algorithms that play Go and Chess better than humans to new levels of human-competitive performance in robot control tasks. However, one drawback of this approach is that the resulting solutions are generally complex black boxes that require hardware support to deploy, even after training. We recently proposed an alternative approach for scaling RL to high-dimensional state spaces using genetic programming, in which teams of programs self-organize into graphs called Tangled Program Graphs (TPG). Our initial benchmarking on high-dimensional RL tasks demonstrates that solutions of equivalent quality can be discovered, but with complexity that is multiple orders of magnitude lower. The proposed research program will greatly expand the TPG approach to efficiently discover solutions to non-reactive RL tasks requiring multiple simultaneous actions per time step. The long-term research program is organized around three objectives:
1) Support for the automatic identification of behavioural subgraphs: this provides the basis for task transfer, accelerated training and increased transparency of machine learning solutions.
2) Develop multiple concurrent memory models: this is the basis for scaling TPG to a wide cross section of non-reactive RL tasks. Without it, TPG could not scale to partially observable problems, a class of tasks with widespread impact.
3) Support for describing actions as multi-dimensional spaces: this means that decisions involving multiple real-valued and discrete actions per state can be made simultaneously, a capability that potentially appears in many applications.
Successful completion of this research program will result in a TPG framework whose solution quality complements that of deep learning. However, TPG constructs solutions by explicitly discovering mechanisms for decomposing the decision-making task. This means that solutions are lightweight, executing in real time without any form of hardware support. The simplicity of solutions will also support insights into attribute support and solution transparency, which is particularly important when attempting to gain knowledge from solutions post training. Success in the proposed research program would demonstrate new models for addressing open-ended questions regarding the application and deployment of RL agents to navigation, motor control and strategic decision making in real-time, partially observable environments.
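The idea of teams of programs organized into a graph can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: each program is modelled as a function that maps the state to a bid, the highest bidder in a team either names an atomic action or hands the decision to another team, and the names `Learner`, `Team` and `act` are hypothetical. Evolution of the graph is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Set, Union


@dataclass
class Learner:
    """A program paired with what it proposes (hypothetical structure)."""
    bid: Callable[[Sequence[float]], float]  # assumed program: state -> bid
    target: Union[str, "Team"]               # atomic action label, or another team


@dataclass
class Team:
    name: str
    learners: List[Learner] = field(default_factory=list)


def act(root: Team, state: Sequence[float]) -> str:
    """Follow the winning bids from the root team until an atomic action is reached."""
    visited: Set[str] = set()
    team = root
    while True:
        visited.add(team.name)
        # Ignore learners pointing to an already visited team, so the walk terminates.
        candidates = [
            ln for ln in team.learners
            if isinstance(ln.target, str) or ln.target.name not in visited
        ]
        if not candidates:
            raise ValueError(f"team {team.name!r} has no usable learner")
        winner = max(candidates, key=lambda ln: ln.bid(state))
        if isinstance(winner.target, str):
            return winner.target
        team = winner.target


# Illustrative two-team graph: the root either idles or defers to a steering team.
steer = Team("steer", [
    Learner(lambda s: s[0], "left"),
    Learner(lambda s: -s[0], "right"),
])
root = Team("root", [
    Learner(lambda s: abs(s[0]), steer),
    Learner(lambda s: 0.5, "idle"),
])

print(act(root, [0.9]))   # left
print(act(root, [-0.9]))  # right
print(act(root, [0.1]))   # idle
```

Only the teams on the path from the root are evaluated for a given state, which is one plausible reading of why such solutions could stay lightweight at run time.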
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Heywood, Malcolm
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2021
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2020
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2019
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2018
- Amount: $21,100
- Program: Collaborative Research and Development Grants
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2018
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2017
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2017
- Amount: $21,100
- Program: Collaborative Research and Development Grants
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2016
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2016
- Amount: $21,100
- Program: Collaborative Research and Development Grants
Constructing risk predictors for mobile device behaviour analytics
- Grant number: 485070-2015
- Fiscal year: 2015
- Amount: $21,100
- Program: Engage Grants Program
Similar overseas grants
Generation of human inner ear organoids via genetic programming
- Grant number: 10425588
- Fiscal year: 2022
- Amount: $21,100
- Program:
Evolution and optimization of synthetic <READ/WRITE> function from and into cells using genetic programming
- Grant number: 10668511
- Fiscal year: 2021
- Amount: $21,100
- Program:
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2021
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Genetic Programming for Big Data Analytics
- Grant number: DE210101808
- Fiscal year: 2021
- Amount: $21,100
- Program: Discovery Early Career Researcher Award
A Semantic Genetic Programming Approach to Evolving Convolutional Neural Networks
- Grant number: 564963-2021
- Fiscal year: 2021
- Amount: $21,100
- Program: University Undergraduate Student Research Awards
Genetic Programming Techniques for Modelling and Design
- Grant number: RGPIN-2016-03653
- Fiscal year: 2021
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Benchmarking linear Genetic Programming in multi-class classification tasks
- Grant number: 564645-2021
- Fiscal year: 2021
- Amount: $21,100
- Program: University Undergraduate Student Research Awards
Evolution and optimization of synthetic <READ/WRITE> function from and into cells using genetic programming
- Grant number: 10488161
- Fiscal year: 2021
- Amount: $21,100
- Program:
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2020
- Amount: $21,100
- Program: Discovery Grants Program - Individual
Uniting Emergent Modularity and Temporal Memory in Genetic Programming
- Grant number: 532961-2019
- Fiscal year: 2020
- Amount: $21,100
- Program: Postdoctoral Fellowships