FRR: Symmetric Policy Learning for Robotic Manipulation

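Basic information

Award number:
2314182
Principal investigator:
Robert Platt
Amount:
approx. $866,700
Awardee institution:
Northeastern University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Project period:
2023-09-01 to 2027-08-31
Project status:
Ongoing

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2314182&HistoricalAwards=false
Keywords:
FRR Symmetric Policy Learning Robotic

Project abstract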

Machine learning has recently had a major impact on robotics, enabling robots to learn to solve problems in ways that would have been difficult to program manually. Unfortunately, most of this learning currently happens in simulation, and not in the real world. This approach is necessary because today’s machine learning algorithms generally require large amounts of data to learn anything meaningful and simulation is the most direct way of bringing this data to bear. However, there are challenges that come from mismatches between simulation and real experience and, ideally, robots would be able to learn from trial-and-error experience in the physical world as humans and animals do. By simplifying and generalizing real experiences, these systems can get more out of the limited amount of real-world data. This project explores an approach to achieve needed simplifications by incorporating problem symmetries into machine learning. Preliminary work suggests that this is a promising approach and we expect to be able to significantly improve the efficacy of robotic learning in general. This work has significant potential to impact applications in a variety of fields including defense applications, space applications, warehousing and logistics applications, healthcare applications, and applications in the home. The results of this work will be disseminated widely both in the research community and to the public at large.

Nearly all planning and learning methods used in robotics today depend on models of the world – models that are sometimes wrong. Ideally, robotic systems would have the ability to adapt online to the nuances of the real world as they are encountered. The dominant paradigms for this type of adaptation are reinforcement learning (RL) and imitation learning (IL). However, today’s RL and IL algorithms are not nearly sample efficient enough to learn directly in the real world. Sample efficiency means that the algorithm can learn thoroughly from a small number of experiences. If sample efficiency were improved to the point that one could meaningfully do RL on physical robotic systems, it could dramatically improve the reliability of robotic control policies, especially for hard-to-model problems like contact-rich manipulation. This is the focus of this project – to improve sample efficiency so that robots can learn and adapt online directly in the real world via RL and learn from a small number of demonstrations via IL. The project will achieve this goal by leveraging a new class of symmetric neural models that encode problem symmetries present in many robotics domains. Preliminary work suggests these models can speed up learning by orders of magnitude in some cases. The project has the following main aims: 1) to expand current symmetric learning methods to handle domains with imperfect symmetries; 2) to explore object factored symmetric models; 3) to explore symmetric learning in visual force domains; 4) to explore policy learning directly on physical robotic systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
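To make the idea of a model that "encodes problem symmetries" concrete, here is a minimal sketch in Python with NumPy. It does not reproduce the project's symmetric neural models. Instead it uses a simpler stand-in technique, group averaging over the four 90-degree rotations of a top-down image, to turn an arbitrary function into one whose output rotates when its input rotates. The functions `base_q_map` and `symmetrized_q_map`, the 8x8 image size, and the per-pixel "score map" interpretation are all hypothetical and chosen only for illustration.

```python
import numpy as np


def base_q_map(image, weights):
    """Hypothetical non-symmetric model: maps an HxH image to HxH scores."""
    # The fixed weights and the directional shift make this function
    # depend on the absolute orientation of the image.
    return np.tanh(image * weights + np.roll(image, 1, axis=0))


def symmetrized_q_map(image, weights):
    """Average the base model over the four 90-degree rotations.

    For each rotation, rotate the input, apply the model, and rotate the
    output back. The average is rotation-equivariant by construction.
    """
    total = np.zeros(image.shape, dtype=float)
    for k in range(4):
        rotated_input = np.rot90(image, k)
        total += np.rot90(base_q_map(rotated_input, weights), -k)
    return total / 4.0


rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))     # stand-in for a top-down observation
weights = rng.normal(size=(8, 8))   # stand-in for learned parameters

# Equivariance check: rotating the input should rotate the output.
sym_equivariant = np.allclose(
    symmetrized_q_map(np.rot90(image), weights),
    np.rot90(symmetrized_q_map(image, weights)),
)
base_equivariant = np.allclose(
    base_q_map(np.rot90(image), weights),
    np.rot90(base_q_map(image, weights)),
)
print("symmetrized model equivariant:", sym_equivariant)
print("base model equivariant:", base_equivariant)
```

Because the symmetrized function gives consistent answers for all four rotated versions of a scene, one observed experience effectively constrains the model in four orientations at once, which is one intuition for why symmetry can help sample efficiency.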

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Robert Platt

The nature of essential hypertension.

DOI:
Publication year:
1959
Journal:
The Lancet
Impact factor:
0
Authors:
Robert Platt
Corresponding author:
Robert Platt

Coarticulation in Markov Decision Processes

DOI:
Publication year:
2004
Journal:
Neural Information Processing Systems
Impact factor:
0
Authors:
Khashayar Rohanimanesh;Robert Platt;S. Mahadevan;R. Grupen
Corresponding author:
R. Grupen

LQR-RRT*: Optimal sampling-based motion planning with automatically derived extension heuristics

DOI:
Publication year:
Journal:
Impact factor:
0
Authors:
Alejandro Perez;Robert Platt;G. Konidaris;L. Kaelbling;Tomás Lozano
Corresponding author:
Tomás Lozano

Improving Grasp Skills Using Schema Structured Learning

DOI:
Publication year:
2006
Journal:
Impact factor:
0
Authors:
Robert Platt;R. Grupen;A. Fagg
Corresponding author:
A. Fagg

TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods

DOI:
Publication year:
2024
Journal:
British Medical Journal
Impact factor:
0
Authors:
Gary S. Collins;K. Moons;Paula Dhiman;Richard D. Riley;A. L. Beam;B. Calster;Marzyeh Ghassemi;Xiaoxuan Liu;Johannes B Reitsma;M. Smeden;A. Boulesteix;Jennifer Catherine Camaradou;L. Celi;S. Denaxas;A. Denniston;Ben Glocker;Robert M Golub;Hugh Harvey;Georg Heinze;Michael M Hoffman;A. Kengne;Emily Lam;Naomi Lee;Elizabeth W Loder;Lena Maier;B. Mateen;M. Mccradden;Lauren Oakden;Johan Ordish;Richard Parnell;Sherri Rose;Karandeep Singh;L. Wynants;P. Logullo;Abhishek Gupta;Adrian Barnett;Adrian Jonas;Agathe Truchot;Aiden Doherty;Alan Fraser;Alex Fowler;Alex Garaiman;Alistair Denniston;Amin Adibi;André Carrington;Andre Esteva;Andrew Althouse;Andrew Soltan;A. Appelt;Ari Ercole;Armando Bedoya;B. Vasey;B. Desiraju;Barbara Seeliger;B. Geerts;Beatrice Panico;Benjamin Fine;Benjamin Goldstein;B. Gravesteijn;Benjamin Wissel;B. Holzhauer;Boris Janssen;Boyi Guo;Brooke Levis;Catey Bunce;Charles Kahn;Chris Tomlinson;Christopher Kelly;Christopher Lovejoy;Clare McGenity;Conrad Harrison Constanza;Andaur Navarro;D. Nieboer;Dan Adler;Danial Bahudin;Daniel Stahl;Daniel Yoo;Danilo Bzdok;Darren Dahly;D. Treanor;David Higgins;David McClernon;David Pasquier;David Taylor;Declan O’Regan;Emily Bebbington;Erik Ranschaert;E. Kanoulas;Facundo Diaz;Felipe Kitamura;Flavio Clesio;Floor van Leeuwen;Frank Harrell;Frank Rademakers;G. Varoquaux;Garrett S Bullock;Gary Weissman;George Fowler;George Kostopoulos;Georgios Lyratzaopoulos;Gianluca Di;Gianluca Pellino;Girish Kulkarni;G. Zoccai;Glen Martin;Gregg Gascon;Harlan Krumholz;H. Sufriyana;Hongqiu Gu;H. Bogunović;Hui Jin;Ian Scott;Ijeoma Uchegbu;Indra Joshi;Irene M. Stratton;James Glasbey;Jamie Miles;Jamie Sergeant;Jan Roth;Jared Wohlgemut;Javier Carmona Sanz;J. Bibault;Jeremy Cohen;Ji Eun Park;Jie Ma;Joel Amoussou;John Pickering;J. Ensor;J. Flores;Joseph LeMoine;Joshua Bridge;Josip Car;Junfeng Wang;Keegan Korthauer;Kelly Reeve;L. Ación;Laura J. Bonnett;Lief Pagalan;L. Buturovic;L. Hooft;Maarten Luke Farrow;Van Smeden;Marianne Aznar;Mario Doria;Mark Gilthorpe;M. Sendak;M. Fabregate;M. Sperrin;Matthew Strother;Mattia Prosperi;Menelaos Konstantinidis;Merel Huisman;Michael O. Harhay;Miguel Angel Luque;M. Mansournia;Munya Dimairo;Musa Abdulkareem;M. Nagendran;Niels Peek;Nigam Shah;Nikolas Pontikos;N. Noor;Oilivier Groot;Páll Jónsson;Patrick Bossuyt;Patrick Lyons;Patrick Omoumi;Paul Tiffin;Peter Austin;Q. Noirhomme;Rachel Kuo;Ram Bajpal;Ravi Aggarwal;Richiardi Jonas;Robert Platt;Rohit Singla;Roi Anteby;Rupa Sakar;Safoora Masoumi;Sara Khalid;Saskia Haitjema;Seong Park;Shravya Shetty;Stacey Fisher;Stephanie Hicks;Susan Shelmerdine;Tammy Clifford;Tatyana Shamliyan;Teus Kappen;Tim Leiner;Tim Liu;Tim Ramsay;Toni Martinez;Uri Shalit;Valentijn de Jong;Valentyn Bezshapkin;V. Cheplygina;Victor Castro;V. Sounderajah;Vineet Kamal;V. Harish;Wim Weber;W. Amsterdam;Xioaxuan Liu;Zachary Cohen;Zakia Salod;Zane Perkins
Corresponding author:
Zane Perkins