RI: Medium: Learning MDP abstractions for Autonomous Systems using Variational Methods and Symmetry Groups

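Basic information

  • Award number:
    2107256
  • Principal investigator:
  • Amount:
    $1.1999 million
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Start and end dates:
    2021-10-01 to 2025-09-30
  • Project status:
    Ongoing

Project abstract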

Autonomous systems such as self-driving vehicles, hospital platforms, and household robots have great potential social and economic benefits, with the ability to transform the future of work, healthcare, and our daily routines. However, successful autonomy requires the robot to be able to make its own decisions and learn from its own experiences. This can be challenging because the real world is rich and complex and autonomous robotic systems can become confused by the details. It is sometimes the case that an autonomous system will not generalize properly: it will perceive two very similar situations to be fundamentally different. This project aims to develop new methods for learning task- and domain-appropriate abstractions that will help autonomous systems generalize to new situations more effectively. Better abstraction will allow autonomous systems to make decisions more efficiently, leading to improved learning and effective control.

This project will study the problem of abstraction within the decision-theoretic framework of Markov decision processes and reinforcement learning, which have been widely used as a framework for automated decision making. Recent advances in reinforcement learning have enabled autonomous agents and robots to accomplish challenging tasks, sometimes even surpassing human experts. However, this comes at an extremely high cost, both in sample and computational complexity; millions of training steps and days of training time are typical, even in game-like environments. This project will develop approaches for making this process much more efficient, by explicitly encoding objectives for learning good abstractions into the agent's cost function. Specifically, the PIs will study and develop approaches for compressing large continuous decision-making problems into small discrete ones, as well as approaches that incorporate explicit symmetry constraints that encode irrelevances in the problem. These methods will be evaluated on a variety of domains of varying complexity, including tasks on autonomous systems involving mobile navigation and robot manipulation. The overall objective is to develop approaches that improve learning efficiency, abstraction quality, and generalization to new tasks and situations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (27)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
SEIL: Simulation-augmented Equivariant Imitation Learning
Approximately Equivariant Networks for Imperfectly Symmetric Dynamics
  • DOI:
  • Published:
    2022-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rui Wang;R. Walters;Rose Yu
  • Corresponding authors:
    Rui Wang;R. Walters;Rose Yu
On-Robot Learning With Equivariant Models
  • DOI:
  • Published:
    2022-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Dian Wang;Ming Jia;Xu Zhu;R. Walters;Robert W. Platt
  • Corresponding authors:
    Dian Wang;Ming Jia;Xu Zhu;R. Walters;Robert W. Platt
Symmetry Teleportation for Accelerated Optimization
Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation
  • DOI:
    10.48550/arxiv.2210.13542
  • Published:
    2022-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Linfeng Zhao;Huazhe Xu;Lawson L. S. Wong
  • Corresponding authors:
    Linfeng Zhao;Huazhe Xu;Lawson L. S. Wong

Similar overseas grants

Collaborative Research: RI: Medium: Lie group representation learning for vision
  • Award number:
    2313151
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Continuing Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
  • Award number:
    2313149
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Continuing Grant
Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
  • Award number:
    2312955
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
  • Award number:
    2313150
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Continuing Grant
RI: Medium: Foundations of Recourse Verification in Machine Learning
  • Award number:
    2313105
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
  • Award number:
    2312956
  • Fiscal year:
    2023
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
RI: Medium: Foundations of Self-Supervised Learning Through the Lens of Probabilistic Generative Models
  • Award number:
    2211907
  • Fiscal year:
    2022
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
Collaborative Research: RI: Medium: Bootstrapping natural feedback for reinforcement learning
  • Award number:
    2212310
  • Fiscal year:
    2022
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
Collaborative Research: RI: Medium: MoDL: Occams Razor in Deep and Physical Learning
  • Award number:
    2212519
  • Fiscal year:
    2022
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant
Collaborative Research: RI: Medium: Learning Compositional Implicit Representations for 3D Scene Understanding
  • Award number:
    2211258
  • Fiscal year:
    2022
  • Award amount:
    $1.1999 million
  • Award type:
    Standard Grant