CISE-ANR: RI: Small: Numerically efficient reinforcement learning for constrained systems with super-linear convergence (NERL)

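Basic information

Award number:
2315396
Principal investigator:
Ludovic Righetti
Amount:
approx. $544,100
Host institution:
New York University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Period of performance:
2023-10-01 to 2026-09-30
Project status:
Active (not yet concluded)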

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2315396&HistoricalAwards=false
Keywords:
CISE ANR RI Small Numerically

Project abstract

Reinforcement learning is a learning technique used to teach robots new skills and to train them to behave in certain ways, enabling important applications from game playing to package handling. Yet these results seldom lead to real-world, large-scale applications. The main challenges stem from the limited ability of reinforcement learning techniques to efficiently create behaviors that can be used in realistic environments, across objects, obstacles and tasks, while still ensuring operational safety. Optimal control is another approach to controlling systems. Such methods can be numerically very efficient but are generally limited to rather narrow behaviors. The project aims to cast new light on reinforcement learning and optimal control, which share common foundations but have so far failed to produce a single method combining the advantages of both approaches. The research will include the development of new methods to improve learning efficacy and guarantee safety for real physical systems. To demonstrate the broad applicability of the approach, the project will evaluate the methods in four realistic application domains: towing kites for energy supply, robots with arms and legs, avatars, and the microscopic movement of proteins. The project contributes to the advancement of national health, prosperity and welfare by improving the capabilities, reliability and safety of robots in a wide range of applications with high industrial potential. It will be conducted by a French-US team of researchers, which will help train the next generation of the workforce by providing a unique international research experience.

The project is articulated around two main research goals. The first goal is to produce a new reinforcement learning algorithm that better exploits prior model knowledge, in particular model derivatives, to accelerate convergence, guarantee a convergence rate and enforce hard constraints. The second goal is to efficiently solve a particular class of hard problems, namely problems with hybrid (discrete/continuous) dynamics. Combining both objectives will result in a common theoretical framework that merges optimal control and reinforcement learning approaches, as well as numerically efficient algorithms capable of generically solving complex high-dimensional problems. Finally, side outcomes of the project will include several demonstrations that have value beyond the science.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Ludovic Righetti

$\mathcal{N}$IPM-HLSP: an efficient interior-point method for hierarchical least-squares programs

DOI:
10.1007/s11081-023-09823-x
Published:
2023-08-03
Journal:
Optimization and Engineering
Impact factor:
1.700
Authors:
Kai Pfeiffer; Adrien Escande; Ludovic Righetti
Corresponding author:
Ludovic Righetti

iDb-RRT: Sampling-based Kinodynamic Motion Planning with Motion Primitives and Trajectory Optimization

DOI:
10.48550/arxiv.2403.10745
Published:
2024
Journal:
arXiv
Impact factor:
0
Authors:
Joaquim Ortiz de Haro; Wolfgang Hönig; Valentin N. Hartmann; Marc Toussaint; Ludovic Righetti
Corresponding author:
Ludovic Righetti