Scalable Autonomous Reinforcement Learning - From scratch to less and less structure

Basic Information

Project Summary

Over the course of the last decade, the framework of reinforcement learning (RL) has developed into a promising tool for learning a large variety of different tasks in robotics. During this timeframe, a lot of progress has been made towards scaling reinforcement learning to high-dimensional systems and solving tasks of increasing complexity. Unfortunately, this scalability has been achieved by using expert knowledge to pre-structure the learning problem in several dimensions. As a consequence, the state-of-the-art methods in robot reinforcement learning generally depend on hand-crafted state representations, pre-structured parametrized policies, well-shaped reward functions and demonstrations by a human expert to aid scaling of the learning algorithm.

In this proposal, we want to advance the field by starting with a 'classical' reinforcement learning setting for a challenging robotic task (i.e., tetherball). Solving this task by RL methods will already be a valuable contribution. From there on, we will start to identify the components for which the learning task design still needs engineering experience. In the course of this proposal, we show how we aim to drive each of these components towards more autonomy while developing highly scalable approaches.

To this end, we will develop systematic methods to increase the autonomy of the learning system by going beyond traditional approaches: (1) proposing methods for learning state representations for reinforcement learning automatically; (2) developing generic policy classes capable of representing the large variety of control policies that are necessary for truly autonomous behavior; (3) discovering informative reward functions autonomously. Progress in each of these aspects will lift the learning algorithm to a higher level of autonomy. The advances will be grounded in the well-established theoretical framework of policy search and enabled through improvements to state-of-the-art reinforcement learning algorithms.

Ultimately, the resulting system should learn how to map raw sensory inputs to raw control signals from simple, generic principles, discovering structure within its environment automatically and solving difficult control tasks without expert knowledge. If successful, both the complete methodology developed within this project and its sub-parts will help to establish a new, substantially more powerful generation of reinforcement learning algorithms that are capable of solving complicated robot control problems autonomously.

Project Outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Manifold-based multi-objective policy search with sample reuse
  • DOI:
    10.1016/j.neucom.2016.11.094
  • Publication date:
    2017-11
  • Journal:
  • Impact factor:
    6
  • Authors:
    Simone Parisi;Matteo Pirotta;Jan Peters
  • Corresponding authors:
    Simone Parisi;Matteo Pirotta;Jan Peters
Reinforcement learning vs human programming in tetherball robot games
Goal-driven dimensionality reduction for reinforcement learning
Local-utopia policy selection for multi-objective reinforcement learning

Other publications by Professor Dr. Joschka Bödecker, since 4/2015

Similar Overseas Grants

CAREER: Temporal Causal Reinforcement Learning and Control for Autonomous and Swarm Cyber-Physical Systems
  • Grant number:
    2339774
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Cross-Layer Uncertainty-Aware Reinforcement Learning for Safe Autonomous Driving
  • Grant number:
    EP/Y002644/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Research Grant
CPS: Small: NSF-DST: Safety-Aware Behaviour-Driven Reinforcement Learning Based Autonomous Driving Solution for Urban Areas
  • Grant number:
    2343167
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Development of Collision Avoidance System for Maritime Autonomous Surface Ship: Imitating and Surpassing Human Experts by Deep Inverse Reinforcement Learning
  • Grant number:
    22KJ2623
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for JSPS Fellows
Human-Drones Teaming for Autonomous Rescue with Fast Reinforcement Learning
  • Grant number:
    2788094
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Studentship
Intelligent and Integrated Control of V2X-Enabled Autonomous Vehicles using Deep Reinforcement Learning
  • Grant number:
    RGPIN-2021-02839
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Discovery Grants Program - Individual
Reinforcement-Learning Based Routing for Public/Private Partnership in Autonomous Transportation Systems
  • Grant number:
    546706-2020
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Postgraduate Scholarships - Doctoral
Autonomous Control of Heavy Machinery by Deep Reinforcement Learning for Automation of Skilled Work
  • Grant number:
    22K04273
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Development of inverse reinforcement learning focusing on the multiobjective nature of humans and autonomous systems: towards zero risk and comfort maximization.
  • Grant number:
    22H03665
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Intelligent and Integrated Control of V2X-Enabled Autonomous Vehicles using Deep Reinforcement Learning
  • Grant number:
    RGPIN-2021-02839
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Discovery Grants Program - Individual