Dynamic Abstraction in Reinforcement Learning
Basic Information
- Award Number: 0218125
- Principal Investigator:
- Amount: $199,600
- Host Institution:
- Host Institution Country: United States
- Project Type: Continuing Grant
- Fiscal Year: 2002
- Funding Country: United States
- Project Period: 2002-09-01 to 2005-08-31
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
This project investigates reinforcement learning algorithms that use dynamic abstraction to exploit the spatial and temporal structure of complex environments to facilitate learning. The use of abstraction is one of the features of human intelligence that allows us to operate as effectively as we do in complex environments. We systematically ignore details that are not relevant to the task at hand, and we rapidly switch between abstractions as we focus on a succession of subtasks. For example, in planning everyday activities, such as driving to work, we abstract away irrelevant details such as the layout of objects inside the car; but when we actually drive, many of these details become relevant, such as the locations of the steering wheel and the accelerator. Different abstractions are appropriate for different tasks or subtasks, and the agent has to shift abstractions as it shifts to new tasks or subtasks.
This project combines the theory of options with factored state and action representations to give precise meaning to the concept of dynamic abstraction and to study methods for creating and exploiting such abstractions. It will develop formalisms for representing option models in terms of factored state and action representations by extending existing formalisms for single-step dynamic Bayes network models to the multi-time case. It will investigate how the multi-time formulation can facilitate creating and using dynamic abstractions. An algebraic theory of abstraction will be developed by extending relevant concepts from classical automata theory to multi-time factored models. Methods will be developed for learning compact multi-step option models by extending an existing mixture-model algorithm for learning transition models from the single-step to the multi-step case.
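The options framework the abstract builds on treats an option as a temporally extended action defined by an initiation set, an internal policy, and a termination condition; executing an option yields a multi-step (SMDP) outcome rather than a single transition. The sketch below is a minimal illustration of that idea only, not code from the project; all class and function names are invented for the example.

```python
# Minimal sketch of an option (initiation set I, policy pi, termination beta)
# and its multi-step execution in a toy environment.
import random

class Option:
    def __init__(self, initiation_set, policy, termination_prob):
        self.initiation_set = initiation_set      # states where the option may start
        self.policy = policy                      # state -> primitive action
        self.termination_prob = termination_prob  # state -> probability of stopping

    def available(self, state):
        return state in self.initiation_set

def run_option(option, state, step, rng, gamma=0.9):
    """Execute an option in environment `step` and return its multi-step
    outcome: (discounted reward, final state, duration in primitive steps)."""
    total_reward, discount, k = 0.0, 1.0, 0
    while True:
        action = option.policy(state)
        state, reward = step(state, action)
        total_reward += discount * reward
        discount *= gamma
        k += 1
        if rng.random() < option.termination_prob(state):
            return total_reward, state, k

# Toy 1-D corridor: states 0..4, actions +1/-1, reward -1 per step.
def corridor_step(state, action):
    return max(0, min(4, state + action)), -1.0

go_right = Option(
    initiation_set={0, 1, 2, 3},
    policy=lambda s: +1,
    termination_prob=lambda s: 1.0 if s == 4 else 0.0,
)

rng = random.Random(0)
reward, final, duration = run_option(go_right, 0, corridor_step, rng)
# From state 0 the option runs 4 primitive steps and terminates at state 4.
```

The triple returned by `run_option` is exactly the kind of multi-step outcome that a multi-time option model would summarize, which is what distinguishes option models from the single-step dynamic Bayes network models the project proposes to extend.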
In general, the notion of dynamic abstraction will be a valuable tool for many difficult optimization problems in large-scale manufacturing (e.g., factory process control), robotics (e.g., navigation), multi-agent coordination, and other state-of-the-art applications of reinforcement learning. Since this research combines ideas from decision theory, operations research, control theory, cognitive science, and AI, it may provide a useful bridge that fosters contributions in all of these fields.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Andrew Barto
Other Grants by Andrew Barto
CRCNS: Collaborative Research: Neural Correlates of Hierarchical Reinforcement Learning
- Award Number: 1208051
- Fiscal Year: 2012
- Amount: $199,600
- Project Type: Continuing Grant
NRI-Small: Collaborative Research: Multiple Task Learning from Unstructured Demonstrations
- Award Number: 1208497
- Fiscal Year: 2012
- Amount: $199,600
- Project Type: Standard Grant
SGER: Building Blocks for Creative Search
- Award Number: 0733581
- Fiscal Year: 2007
- Amount: $199,600
- Project Type: Standard Grant
Collaborative Research: Intrinsically Motivated Learning in Artificial Agents
- Award Number: 0432143
- Fiscal Year: 2004
- Amount: $199,600
- Project Type: Continuing Grant
Lyapunov Methods for Reinforcement Learning
- Award Number: 0070102
- Fiscal Year: 2000
- Amount: $199,600
- Project Type: Standard Grant
KDI: Temporal Abstraction in Reinforcement Learning
- Award Number: 9980062
- Fiscal Year: 1999
- Amount: $199,600
- Project Type: Standard Grant
Multiple Time Scale Reinforcement Learning
- Award Number: 9511805
- Fiscal Year: 1995
- Amount: $199,600
- Project Type: Continuing Grant
Reinforcement Learning Algorithms Based on Dynamic Programming
- Award Number: 9214866
- Fiscal Year: 1992
- Amount: $199,600
- Project Type: Continuing Grant
Neural Networks for Adaptive Control
- Award Number: 8912623
- Fiscal Year: 1989
- Amount: $199,600
- Project Type: Continuing Grant
Conference on the Neurone as a Computational Unit, June 28--July 1, 1988, King's College, Cambridge, England
- Award Number: 8808758
- Fiscal Year: 1988
- Amount: $199,600
- Project Type: Standard Grant
Similar International Grants
An Abstraction-based Technique for Safe Reinforcement Learning
- Award Number: EP/X015823/1
- Fiscal Year: 2023
- Amount: $199,600
- Project Type: Research Grant
Multi-agent Reinforcement Learning for Cooperative Policy with Different Abstraction
- Award Number: 20K23326
- Fiscal Year: 2020
- Amount: $199,600
- Project Type: Grant-in-Aid for Research Activity Start-up
Abstraction and Generalisation in Reinforcement Learning
- Award Number: 2281998
- Fiscal Year: 2019
- Amount: $199,600
- Project Type: Studentship
Improving reinforcement learning by combining imitation with abstraction
- Award Number: 389047-2010
- Fiscal Year: 2012
- Amount: $199,600
- Project Type: Postgraduate Scholarships - Doctoral
Improving reinforcement learning by combining imitation with abstraction
- Award Number: 389047-2010
- Fiscal Year: 2011
- Amount: $199,600
- Project Type: Postgraduate Scholarships - Doctoral
Improving reinforcement learning by combining imitation with abstraction
- Award Number: 389047-2010
- Fiscal Year: 2010
- Amount: $199,600
- Project Type: Postgraduate Scholarships - Doctoral
Abstraction in reinforcement learning
- Award Number: 238988-2001
- Fiscal Year: 2003
- Amount: $199,600
- Project Type: Discovery Grants Program - Individual
Abstraction in reinforcement learning
- Award Number: 238988-2001
- Fiscal Year: 2002
- Amount: $199,600
- Project Type: Discovery Grants Program - Individual
Abstraction in reinforcement learning
- Award Number: 238988-2001
- Fiscal Year: 2001
- Amount: $199,600
- Project Type: Discovery Grants Program - Individual
Abstraction in reinforcement learning
- Award Number: 238988-2001
- Fiscal Year: 2000
- Amount: $199,600
- Project Type: Discovery Grants Program - Individual