Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots

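Basic information

  • Grant number:
    RGPIN-2021-02690
  • Principal investigator:
  • Amount:
    $17,500
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Start and end dates:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed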

Project abstract

Reinforcement learning brings the promise of continually adaptive systems for numerous tasks that humans do well but that are physically laborious, such as housekeeping, warehouse fulfillment, and delivery services. Such tasks require, on the agent's part, a common-sense understanding of a dynamically changing physical environment, which is difficult to enumerate and include in a system through hand-engineering. The proposed program aims at developing real-time learning robotic systems that interact with the physical world and adapt in real time. Some of the most promising approaches in reinforcement learning for robotics are based on learning from human-provided demonstration data and simulators. However, approaches reliant on human interventions are not scalable or sufficient for developing robotic systems that can adapt their performance in real time under new or changing environments. Our proposed program complements the existing approaches by developing scalable and automatic mechanisms for continually learning robotic systems. All advanced deep reinforcement learning methods for control use expensive learning mechanisms such as those based on experience replay buffers. While such expensive learning mechanisms are more appropriate for training offline or in the cloud, we propose a lightweight onboard learning system to adapt and react to changes quickly in real time. Our proposed onboard learning system will be composed of computationally inexpensive and stable policy and representation learning algorithms. We consider the policy to be only the last semi-linear layer of the network, for which gradient updates can be made more stably without using replay buffers. In addition, the onboard system will perform representation learning only through random perturbation to a small portion of the hidden nodes.
We investigate whether such a lightweight learning system in conjunction with a more expensive replay-based learning system performs better than replay-based learning alone. The proposed program also aims at developing efficient and stable policy and representation learning methods. We develop a theoretical framework that enriches our understanding of how to create new and efficient policy learning methods in a directed way. For representation learning, we extend an existing strategy for representation search called generate-and-test to reinforcement learning. We develop a general mechanism of generate-and-test where the utility of features is defined solely based on the loss function, allowing applicability to any loss function and neural architecture. Computationally inexpensive learning mechanisms are essential for making reinforcement learning systems more accessible and applicable to robotics. The lightweight onboard system of the proposed program will allow graduate students, entrepreneurs, and enthusiasts around the world to build continually learning robots more easily, relieving humans from numerous laborious tasks.
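
As a rough illustration of the generate-and-test idea described in the abstract, the following Python sketch may help. It is not the investigators' algorithm: it assumes a toy supervised setting with a squared-error loss, a single tanh hidden layer, and an ablation-style utility (how much the loss grows when one hidden feature is zeroed). All function names and parameters (`feature_utilities`, `replace_worst_features`, `online_step`, `run_demo`, `replace_frac`, `decay`) are hypothetical and chosen only for this example. Only the last layer is updated by gradient steps, one sample at a time and without a replay buffer, while a small fraction of low-utility hidden features is periodically re-drawn at random.

```python
import math
import random


def forward(x, W_in, w_out):
    """One tanh hidden layer followed by a linear output unit."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_in]
    pred = sum(w * hi for w, hi in zip(w_out, h))
    return h, pred


def feature_utilities(h, w_out, target):
    """Assumed loss-based utility: rise in squared error when one feature is zeroed."""
    pred = sum(w * hi for w, hi in zip(w_out, h))
    base_loss = (pred - target) ** 2
    return [(pred - w * hi - target) ** 2 - base_loss for w, hi in zip(w_out, h)]


def replace_worst_features(W_in, w_out, util, rng, replace_frac=0.1):
    """Generate step: re-draw the input weights of the lowest-utility features."""
    n_replace = max(1, int(replace_frac * len(util)))
    worst = sorted(range(len(util)), key=lambda i: util[i])[:n_replace]
    # Give fresh features a mid-range utility so they are not replaced at once.
    fresh_util = sorted(util)[len(util) // 2]
    for i in worst:
        W_in[i] = [rng.gauss(0.0, 1.0) for _ in W_in[i]]
        w_out[i] = 0.0  # a new feature starts with no effect on the output
        util[i] = fresh_util
    return worst


def online_step(x, target, W_in, w_out, util, lr=0.01, decay=0.99):
    """Test step plus one SGD update of the last layer only (no replay buffer)."""
    h, pred = forward(x, W_in, w_out)
    for i, u in enumerate(feature_utilities(h, w_out, target)):
        util[i] = decay * util[i] + (1 - decay) * u  # running utility estimate
    for i, hi in enumerate(h):
        w_out[i] -= lr * 2 * (pred - target) * hi  # gradient of squared error
    return (pred - target) ** 2


def run_demo(steps=2000, n_hidden=16, seed=0):
    """Toy regression loop; the target function is an arbitrary example."""
    rng = random.Random(seed)
    W_in = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [0.0] * n_hidden
    util = [0.0] * n_hidden
    loss = 0.0
    for t in range(1, steps + 1):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        loss = online_step(x, math.sin(x[0]) + x[1] * x[2], W_in, w_out, util)
        if t % 200 == 0:
            replace_worst_features(W_in, w_out, util, rng)
    return loss
```

Because the utility here depends only on the loss, the same test step could in principle be paired with a different loss function by changing `feature_utilities` and the gradient line in `online_step`; a reinforcement learning loss would replace the squared error used in this toy setting.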

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants held by Mahmood, Ashique

Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots
  • Grant number:
    RGPIN-2021-02690
  • Fiscal year:
    2021
  • Amount:
    $17,500
  • Program type:
    Discovery Grants Program - Individual
Scalable Reinforcement Learning Methods for Learning in Real-Time with Robots
  • Grant number:
    DGECR-2021-00133
  • Fiscal year:
    2021
  • Amount:
    $17,500
  • Program type:
    Discovery Launch Supplement

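Similar NSFC grants

Testing and genetic basis of reinforcement in hybrid zones of the genus Sonneratia
  • Grant number:
    30800060
  • Approval year:
    2008
  • Amount:
    CNY 230,000
  • Program type:
    Young Scientists Fund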

Similar overseas grants

Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
  • Grant number:
    2347423
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Standard Grant
CAREER: Stochasticity and Resilience in Reinforcement Learning: From Single to Multiple Agents
  • Grant number:
    2339794
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Continuing Grant
Learning to Reason in Reinforcement Learning
  • Grant number:
    DP240103278
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Discovery Projects
Optimizing Intelligent Vehicular Routing with Edge Computing through Multi-Agent Reinforcement Learning
  • Grant number:
    24K14913
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Grant-in-Aid for Scientific Research (C)
CAREER: Towards Real-world Reinforcement Learning
  • Grant number:
    2339395
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Continuing Grant
CAREER: Robust Reinforcement Learning Under Model Uncertainty: Algorithms and Fundamental Limits
  • Grant number:
    2337375
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Continuing Grant
CAREER: Temporal Causal Reinforcement Learning and Control for Autonomous and Swarm Cyber-Physical Systems
  • Grant number:
    2339774
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Continuing Grant
Federated Reinforcement Learning Empowered Point Cloud Video Streaming
  • Grant number:
    24K14927
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Grant-in-Aid for Scientific Research (C)
Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
  • Grant number:
    2347422
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Standard Grant
CAREER: Structure Exploiting Multi-Agent Reinforcement Learning for Large Scale Networked Systems: Locality and Beyond
  • Grant number:
    2339112
  • Fiscal year:
    2024
  • Amount:
    $17,500
  • Program type:
    Continuing Grant