RI: Medium: Learning Task-Specific Representations for Broadly Capable Reinforcement Learning Agents

Basic Information

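Award number:
1955361
Principal investigator:
George Konidaris
Amount:
$1.1997 million
Host institution:
Brown University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Period:
2020-10-01 to 2024-09-30
Project status:
Completed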

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1955361&HistoricalAwards=false
Keywords:
RI Medium Learning Task Specific

Project Abstract

While artificially intelligent agents have achieved expert-level performance on some specialized tasks, progress on designing agents that are broadly capable (able to reach adequate performance on a wide range of tasks) remains elusive. One major obstacle is that the sensors and actuators required by a general-purpose agent must be very complex, to support all the different tasks it may be required to solve. The resulting complexity makes decision-making much harder and drastically hinders the effectiveness of such agents. By contrast, agents that do only one thing can be given much simpler inputs and outputs that are carefully designed to be low-dimensional, highly informative, and task-relevant; such agents often demonstrate satisfactory performance. This project posits that a key requirement for generally intelligent agents is the ability to autonomously formulate such representations for themselves, as abstractions over their complex sensor and actuator spaces, and plans to design new algorithms to do so. AI systems with this ability could be re-tasked to solve many different problems without modification, rather than requiring substantial (and often prohibitive) engineering effort for each new application.

This project aims to develop new algorithms that enable agents to learn compact, task-specific abstractions of new problems, by combining and extending techniques for discovering high-level actions, discovering perceptual abstractions that support planning with high-level actions, and formally characterizing the complexity and value loss of using those abstractions. The project will: 1) design new algorithms for reward-driven (and therefore task-specific) perceptual- and action-abstraction discovery; 2) enable inter-task abstraction transfer (which avoids having to re-learn abstractions from scratch each time) through new algorithms for learning generalized skills and constructing modular action-perception-abstraction packages, and new theory characterizing the value loss of using such generalized abstractions; and 3) create principled methods for incrementally constructing a library of modular action-perception abstractions and for adaptively recruiting existing action-state abstractions to solve new tasks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal papers (17)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Robustly Learning Composable Options in Deep Reinforcement Learning

DOI:
10.24963/ijcai.2021/298
Publication date:
2021-08
Journal:
Impact factor:
0
Authors:
Akhil Bagaria; J. Senthil; Matthew Slivinski; G. Konidaris
Corresponding authors:
Akhil Bagaria; J. Senthil; Matthew Slivinski; G. Konidaris

Model-based Lifelong Reinforcement Learning with Bayesian Exploration

DOI:
10.48550/arxiv.2210.11579
Publication date:
2022-10
Journal:
ArXiv
Impact factor:
0
Authors:
Haotian Fu; Shangqun Yu; Michael S. Littman; G. Konidaris
Corresponding authors:
Haotian Fu; Shangqun Yu; Michael S. Littman; G. Konidaris

Skill Discovery for Exploration and Planning using Deep Skill Graphs

DOI:
Publication date:
2021
Journal:
Proceedings of the Thirty-Eighth International Conference on Machine Learning
Impact factor:
0
Authors:
Bagaria, A.; Senthil, J.; Konidaris, G.D.
Corresponding author:
Konidaris, G.D.

Autonomous Learning of Object-Centric Abstractions for High-Level Planning

DOI:
Publication date:
2022
Journal:
Proceedings of the Tenth International Conference on Learning Representations
Impact factor:
0
Authors:
James, S.; Rosman, B.; Konidaris, G.D.
Corresponding author:
Konidaris, G.D.

Coarse-Grained Smoothness for Reinforcement Learning in Metric Spaces

DOI:
Publication date:
2023
Journal:
Proceedings of the 26th International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Gottesman, O.; Asadi, K.; Allen, C.; Lobel, S.; Konidaris, G.D.; Littman, M.L.
Corresponding author:
Littman, M.L.