RI: Medium: Learning MDP abstractions for Autonomous Systems using Variational Methods and Symmetry Groups

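Basic information

Award number:
2107256
Principal investigator:
Lawson Wong
Amount:
$1.1999 million
Host institution:
Northeastern University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-10-01 to 2025-09-30
Project status:
Not yet concluded

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2107256&HistoricalAwards=false
Keywords: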
RI Medium Learning MDP abstractions

Project abstract

Autonomous systems such as self-driving vehicles, hospital platforms, and household robots have great potential social and economic benefits, with the ability to transform the future of work, healthcare, and our daily routines. However, successful autonomy requires the robot to be able to make its own decisions and learn from its own experiences. This can be challenging because the real world is rich and complex and autonomous robotic systems can become confused by the details. It is sometimes the case that an autonomous system will not generalize properly: it will perceive two very similar situations to be fundamentally different. This project aims to develop new methods for learning task- and domain-appropriate abstractions that will help autonomous systems generalize to new situations more effectively. Better abstraction will allow autonomous systems to make decisions more efficiently leading to improved learning and effective control.

This project will study the problem of abstraction within the decision-theoretic framework of Markov decision processes and reinforcement learning, which have been widely used as a framework for automated decision making. Recent advances in reinforcement learning have enabled autonomous agents and robots to accomplish challenging tasks, sometimes even surpassing human experts. However, this comes at an extremely high cost, both in sample and computational complexity; millions of training steps and days of training time are typical, even in game-like environments. This project will develop approaches for making this process much more efficient, by explicitly encoding objectives for learning good abstractions into the agent's cost function. Specifically, the PIs will study and develop approaches for compressing large continuous decision-making problems into small discrete ones, as well as approaches that incorporate explicit symmetry constraints that encode irrelevances in the problem. These methods will be evaluated on a variety of domains of varying complexity, including tasks on autonomous systems involving mobile navigation and robot manipulation. The overall objective is to develop approaches that improve learning efficiency, abstraction quality, and generalization to new tasks and situations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal papers (27)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

SEIL: Simulation-augmented Equivariant Imitation Learning

DOI:
10.1109/icra48891.2023.10161252
Publication date:
2022-10
Journal:
2023 IEEE International Conference on Robotics and Automation (ICRA)
Impact factor:
0
Authors:
Ming Jia;Dian Wang;Guanang Su;David Klee;Xu Zhu;R. Walters;Robert W. Platt
Corresponding authors:
Ming Jia;Dian Wang;Guanang Su;David Klee;Xu Zhu;R. Walters;Robert W. Platt

Approximately Equivariant Networks for Imperfectly Symmetric Dynamics