Scaling Genetic Programming to Complex Reinforcement Learning Tasks
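Basic information
- Grant number: RGPIN-2020-04438
- Principal investigator:
- Amount: $21,100
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Start and end dates: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract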
Reinforcement learning (RL) is a type of task in which an agent interacts with an environment to maximize its long-term reward. Deep learning has recently made considerable progress on RL under high-dimensional state and action spaces. This means that, rather than first having to develop a suite of appropriate input features, raw sensor data such as video can be used directly. An enormous number of applications have benefited from this development, from algorithms that play Go and Chess better than humans to new levels of human-competitive performance in robot control tasks. However, one drawback of such approaches is that they generally produce complex black-box solutions that require hardware support to deploy, even after training.
We recently proposed an alternative approach for scaling RL to high-dimensional state spaces using genetic programming. In this approach, teams of programs self-organize into graphs, called Tangled Program Graphs (TPG). Our initial benchmarking on high-dimensional RL tasks demonstrates that solutions of equivalent quality can be discovered, but with multiple orders of magnitude lower complexity.
The proposed research program will greatly expand on the TPG approach to efficiently discover solutions to non-reactive RL tasks requiring multiple simultaneous actions per time step. The long-term research program is organized around three objectives:
1) Support the automatic identification of behavioural subgraphs: this provides the basis for task transfer, accelerated training and increased transparency of machine learning solutions.
2) Develop multiple concurrent memory models: this is the basis for scaling TPG to a wide cross-section of non-reactive RL tasks. Without it, TPG could not scale to partially observable problems, a class of tasks of widespread impact.
3) Support the description of actions as multi-dimensional spaces: this means that decisions involving multiple real-valued and discrete actions per state can be made simultaneously, a requirement that also potentially appears in many applications.
Successful completion of this research program will result in a TPG framework whose solution quality complements that of deep learning. Unlike deep learning, however, TPG constructs solutions by explicitly discovering mechanisms for decomposing the decision-making task. This means that solutions are lightweight, executing in real time without any form of hardware support. The simplicity of solutions will also support insights into attribute support and solution transparency, which is particularly important when attempting to gain knowledge from solutions after training. Success in the proposed research program would demonstrate new models for addressing open-ended questions regarding the application and deployment of RL agents to navigation, motor control and strategic decision-making in real-time, partially observable environments.
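As an illustration only, the idea of teams of programs organized into a graph can be sketched in a few lines of Python. The abstract does not specify how a graph is evaluated, so this sketch assumes a bidding scheme: each program in a team bids on the current state, and the highest bidder either returns an atomic action or hands the decision to another team. The names `Program`, `Team`, `act` and `bid` are hypothetical, and the hand-written lambdas stand in for programs that would in practice be evolved by genetic programming.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Union

State = Sequence[float]


@dataclass
class Program:
    """One program: a bid function plus what happens if its bid wins."""
    bid: Callable[[State], float]
    # Either an atomic action label or a pointer to another team (assumption).
    action: Union[str, "Team"]


@dataclass
class Team:
    """A graph node: a group of programs that compete by bidding."""
    programs: List[Program] = field(default_factory=list)


def act(root: Team, state: State) -> str:
    """Follow winning bids from the root team until an atomic action is reached."""
    team = root
    visited = set()
    while True:
        visited.add(id(team))
        # Skip programs pointing back to a team already visited, to avoid cycles.
        candidates = [
            p for p in team.programs
            if isinstance(p.action, str) or id(p.action) not in visited
        ]
        if not candidates:
            raise ValueError("team has no program leading to an atomic action")
        winner = max(candidates, key=lambda p: p.bid(state))
        if isinstance(winner.action, str):
            return winner.action
        team = winner.action


# A two-team graph: the root either brakes or defers to a steering team.
steer = Team([
    Program(bid=lambda s: s[1], action="left"),
    Program(bid=lambda s: -s[1], action="right"),
])
root = Team([
    Program(bid=lambda s: s[0], action="brake"),
    Program(bid=lambda s: 1.0 - s[0], action=steer),
])

print(act(root, [0.9, 0.5]))   # root's "brake" program wins the bid
print(act(root, [0.2, 0.5]))   # decision is delegated to the steering team
```

Only the teams on the path from the root to an atomic action are evaluated for a given state, which is one plausible reading of why such solutions can stay lightweight at execution time.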
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants held by Heywood, Malcolm
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2022
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2019
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2018
- Funding amount: $21,100
- Project category: Collaborative Research and Development Grants
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2018
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2017
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2017
- Funding amount: $21,100
- Project category: Collaborative Research and Development Grants
Permutation based task transfer for genetic programming
- Grant number: RGPIN-2015-06117
- Fiscal year: 2016
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Coevolutionary automatic game content generation of physics and flighting style games
- Grant number: 499792-2016
- Fiscal year: 2016
- Funding amount: $21,100
- Project category: Collaborative Research and Development Grants
Constructing risk predictors for mobile device behaviour analytics
- Grant number: 485070-2015
- Fiscal year: 2015
- Funding amount: $21,100
- Project category: Engage Grants Program
Similar overseas grants
Generation of human inner ear organoids via genetic programming
- Grant number: 10425588
- Fiscal year: 2022
- Funding amount: $21,100
- Project category:
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2022
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Evolution and optimization of synthetic <READ/WRITE> function from and into cells using genetic programming
- Grant number: 10668511
- Fiscal year: 2021
- Funding amount: $21,100
- Project category:
Scaling Genetic Programming to Complex Reinforcement Learning Tasks
- Grant number: RGPIN-2020-04438
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Genetic Programming for Big Data Analytics
- Grant number: DE210101808
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: Discovery Early Career Researcher Award
Evolution and optimization of synthetic <READ/WRITE> function from and into cells using genetic programming
- Grant number: 10488161
- Fiscal year: 2021
- Funding amount: $21,100
- Project category:
Benchmarking linear Genetic Programming in multi-class classification tasks
- Grant number: 564645-2021
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: University Undergraduate Student Research Awards
A Semantic Genetic Programming Approach to Evolving Convolutional Neural Networks
- Grant number: 564963-2021
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: University Undergraduate Student Research Awards
Genetic Programming Techniques for Modelling and Design
- Grant number: RGPIN-2016-03653
- Fiscal year: 2021
- Funding amount: $21,100
- Project category: Discovery Grants Program - Individual
Uniting Emergent Modularity and Temporal Memory in Genetic Programming
- Grant number: 532961-2019
- Fiscal year: 2020
- Funding amount: $21,100
- Project category: Postdoctoral Fellowships
