Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots

Basic information

  • Grant number:
    RGPIN-2021-02690
  • Principal investigator:
    Mahmood, Ashique
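  • Amount:
    $17,500
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2021
  • Funding country:
    Canada
  • Start and end dates:
    2021-01-01 to 2022-12-31
  • Project status:
    Completed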

Project abstract

Reinforcement learning brings the promise of continually adaptive systems for numerous tasks that humans do well but that are physically laborious, such as housekeeping, warehouse fulfillment, and delivery services. Such tasks require the agent to have a common-sense understanding of a dynamically changing physical environment, which is difficult to enumerate and build into a system through hand-engineering. The proposed program aims to develop real-time learning robotic systems that interact with the physical world and adapt in real time. Some of the most promising approaches in reinforcement learning for robotics are based on learning from human-provided demonstration data and simulators. However, approaches reliant on human intervention are neither scalable nor sufficient for developing robotic systems that can adapt their performance in real time in new or changing environments. Our proposed program complements the existing approaches by developing scalable and automatic mechanisms for continually learning robotic systems. All advanced deep reinforcement learning methods for control use expensive learning mechanisms such as those based on experience replay buffers. While such expensive learning mechanisms are more appropriate for training offline or in the cloud, we propose a lightweight onboard learning system that adapts and reacts to changes quickly in real time. Our proposed onboard learning system will be composed of computationally inexpensive and stable policy and representation learning algorithms. We consider the policy to be only the last semi-linear layer of the network, for which gradient updates can be made more stably without using replay buffers. In addition, the onboard system will perform representation learning only through random perturbation of a small portion of the hidden nodes.
We investigate whether such a lightweight learning system, used in conjunction with a more expensive replay-based learning system, performs better than replay-based learning alone. The proposed program also aims to develop efficient and stable policy and representation learning methods. We develop a theoretical framework that enriches our understanding of how to create new and efficient policy learning methods in a directed way. For representation learning, we extend generate-and-test, an existing strategy for representation search, to reinforcement learning. We develop a general generate-and-test mechanism in which the utility of features is defined solely in terms of the loss function, making it applicable to any loss function and neural architecture. Computationally inexpensive learning mechanisms are essential for making reinforcement learning systems more accessible and applicable to robotics. The lightweight onboard system of the proposed program will allow graduate students, entrepreneurs, and enthusiasts around the world to build continually learning robots more easily, relieving humans of numerous laborious tasks.
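
The abstract names two ideas without giving algorithms: training only the last layer with per-sample gradient steps (no replay buffer), and searching for representations by randomly regenerating a small portion of hidden nodes whose utility is judged from the loss. The following is a minimal sketch of how these could fit together, not the project's actual method. It assumes a supervised squared-loss stand-in rather than a reinforcement learning policy objective, and the class `GenerateAndTestNet`, its parameters, the particular utility formula, and the median reset are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


class GenerateAndTestNet:
    """One tanh hidden layer; only the linear output layer is trained by SGD."""

    def __init__(self, n_in, n_hidden, step_size=0.01, decay=0.99, seed=0):
        self.rng = random.Random(seed)
        self.n_in = n_in
        self.step_size = step_size
        self.decay = decay
        # Hidden-unit input weights are random and never trained by gradient.
        self.w_in = [self._new_unit() for _ in range(n_hidden)]
        self.w_out = [0.0] * n_hidden
        self.utility = [0.0] * n_hidden

    def _new_unit(self):
        return [self.rng.gauss(0.0, 1.0) for _ in range(self.n_in)]

    def features(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(unit, x)))
                for unit in self.w_in]

    def update(self, x, y):
        """One online step on squared loss; returns the loss before the update."""
        h = self.features(x)
        err = y - sum(w * hi for w, hi in zip(self.w_out, h))
        for i, hi in enumerate(h):
            # Assumed loss-based utility: how much the squared loss would
            # rise if unit i were removed, tracked as a running average.
            loss_increase = (err + self.w_out[i] * hi) ** 2 - err ** 2
            self.utility[i] = (self.decay * self.utility[i]
                               + (1.0 - self.decay) * loss_increase)
            # Gradient step on the output weight only, from this one sample.
            self.w_out[i] += self.step_size * err * hi
        return err ** 2

    def regenerate(self, fraction=0.1):
        """Replace the lowest-utility units with freshly drawn random ones."""
        n_replace = max(1, int(fraction * len(self.w_in)))
        order = sorted(range(len(self.w_in)), key=lambda i: self.utility[i])
        worst = order[:n_replace]
        median_utility = statistics.median(self.utility)
        for i in worst:
            self.w_in[i] = self._new_unit()
            self.w_out[i] = 0.0  # a new unit starts with no effect on the output
            # Median reset gives the new unit time before it is tested again.
            self.utility[i] = median_utility
        return worst
```

In use, `update` would be called once per incoming sample and `regenerate` only occasionally (for example every few hundred samples), so the per-step cost stays linear in the number of hidden units and no past samples need to be stored.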

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Mahmood, Ashique

Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots
  • Grant number:
    RGPIN-2021-02690
  • Fiscal year:
    2022
  • Amount:
    $17,500
  • Project category:
    Discovery Grants Program - Individual
Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots
  • Grant number:
    DGECR-2021-00133
  • Fiscal year:
    2021
  • Amount:
    $17,500
  • Project category:
    Discovery Launch Supplement

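Similar NSFC grants

Testing for reinforcement in hybrid zones of the genus Sonneratia and its genetic basis
  • Grant number:
    30800060
  • Approval year:
    2008
  • Amount:
    CNY 230,000
  • Project category:
    Young Scientists Fund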

Similar overseas grants

Learning to Reason in Reinforcement Learning
  • Grant number:
    DP240103278
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Discovery Projects
Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
  • Grant number:
    2347423
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Standard Grant
CAREER: Stochasticity and Resilience in Reinforcement Learning: From Single to Multiple Agents
  • Grant number:
    2339794
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Continuing Grant
CAREER: Towards Real-world Reinforcement Learning
  • Grant number:
    2339395
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Continuing Grant
CAREER: Robust Reinforcement Learning Under Model Uncertainty: Algorithms and Fundamental Limits
  • Grant number:
    2337375
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Continuing Grant
Optimizing Intelligent Vehicular Routing with Edge Computing through Multi-Agent Reinforcement Learning
  • Grant number:
    24K14913
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
CAREER: Temporal Causal Reinforcement Learning and Control for Autonomous and Swarm Cyber-Physical Systems
  • Grant number:
    2339774
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Continuing Grant
Federated Reinforcement Learning Empowered Point Cloud Video Streaming
  • Grant number:
    24K14927
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
  • Grant number:
    2347422
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Standard Grant
CAREER: Structure Exploiting Multi-Agent Reinforcement Learning for Large Scale Networked Systems: Locality and Beyond
  • Grant number:
    2339112
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Project category:
    Continuing Grant