NSF-AoF: RI: Small: Safe Reinforcement Learning in Non-Stationary Environments With Fast Adaptation and Disturbance Prediction
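Basic Information
- Award number: 2133656
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Duration: 2021-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract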
Reinforcement learning (RL) has shown impressive performance in controlling complex robotic systems on tasks such as locomotion, manipulation, and playing sports, e.g., table tennis. RL enables a robot to autonomously discover an optimal behavior through trial-and-error interactions with its environment. However, environmental perturbations can easily cause a behavior policy trained in the original environment to fail in the perturbed one. Such failure is unacceptable for safety-critical robotic systems such as self-driving cars, drones, flying taxis, and construction machines. Existing robust methods try to account for all scenarios during the training phase and seek a single fixed policy, which leads to conservative behaviors. Existing adaptive methods update their behavior policies in the perturbed environment, but only after the robot has “felt a difference” through its interaction with the environment. In contrast, a human can use perception to predict conditions in a new environment and adjust behavior accordingly, even before interacting with it. In light of these limitations, this project envisions a new framework for safe and efficient RL in the presence of environmental changes, leveraging fast adaptation and perception-based prediction. The framework will enable robotic and autonomous systems to operate, learn, and adapt robustly and safely in the real world. The project rests on three thrusts: i) hybrid RL for safe and efficient policy updates; ii) robust adaptive control with safety guarantees; and iii) vision-based disturbance prediction. More specifically, the project will develop robust adaptive control algorithms that ensure that the executed trajectory of a robot remains safe in the presence of disturbances induced by environmental changes.
It will spur the development of hybrid model-free/model-based RL algorithms capable of efficiently and safely updating behavior policies with the help of these control algorithms. The project will advance novel methodologies for predicting the key parameters of the disturbances (e.g., the weight of a package) directly from image observations, leading to new scalable methods for efficiently learning a mathematical model of the disturbances with quantified error bounds. All these components will be integrated into a framework that enables robots to operate and adapt safely, robustly, and efficiently in real-world environments. Aerial and ground vehicles will be used for experimental validation. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
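To make the idea of keeping a trajectory safe under bounded disturbances concrete, the following is a minimal sketch of a safety filter for a toy scalar system. It is not the project's algorithm. Everything in it is an assumption chosen for illustration: the dynamics x' = u + d, the known disturbance bound `d_max`, the safe set x <= `x_max`, the gain `alpha`, and the function names `safe_input` and `simulate`. The filter clips the nominal input so that the barrier value h(x) = x_max - x cannot become negative for any disturbance within the bound.

```python
import math


def safe_input(u_nominal, x, x_max, alpha, d_max):
    """Clip a nominal input so that h(x) = x_max - x stays nonnegative.

    Assumed toy dynamics: x' = u + d with |d| <= d_max.
    Worst-case barrier condition: -(u + d_max) >= -alpha * h,
    which gives the upper bound u <= alpha * h - d_max.
    """
    h = x_max - x
    u_bound = alpha * h - d_max
    return min(u_nominal, u_bound)


def simulate(steps=500, dt=0.01, x0=0.0, x_max=1.0, alpha=2.0, d_max=0.3):
    """Euler simulation with a nominal input that pushes toward the boundary."""
    x = x0
    trajectory = [x]
    for k in range(steps):
        # Hypothetical bounded disturbance, unknown to the filter except for d_max.
        d = d_max * math.sin(0.05 * k)
        u = safe_input(1.0, x, x_max, alpha, d_max)
        x = x + dt * (u + d)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print("largest state reached:", max(traj))
```

In this sketch the filter is conservative because it always plans for the worst-case disturbance. A disturbance estimate or prediction, as the abstract describes, would allow a tighter bound than `d_max` and therefore less conservative behavior. The discrete-time guarantee here also relies on `dt * alpha <= 1`.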
Project Outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Convex Synthesis of Control Barrier Functions Under Input Constraints
- DOI: 10.1109/lcsys.2023.3293765
- Published: 2023
- Journal:
- Impact factor: 3
- Authors: Pan Zhao; R. Ghabcheloo; Yikun Cheng; Hossein Abdi; N. Hovakimyan
- Corresponding authors: Pan Zhao; R. Ghabcheloo; Yikun Cheng; Hossein Abdi; N. Hovakimyan
DiffTune+: Hyperparameter-Free Auto-Tuning using Auto-Differentiation
- DOI: 10.48550/arxiv.2212.03194
- Published: 2022-12
- Journal:
- Impact factor: 0
- Authors: Sheng Cheng; Lin Song; Minkyung Kim; Shenlong Wang; N. Hovakimyan
- Corresponding authors: Sheng Cheng; Lin Song; Minkyung Kim; Shenlong Wang; N. Hovakimyan
Tube-Certified Trajectory Tracking for Nonlinear Systems With Robust Control Contraction Metrics
- DOI: 10.1109/lra.2022.3153712
- Published: 2021-09
- Journal:
- Impact factor: 5.2
- Authors: Pan Zhao; Arun Lakshmanan; K. Ackerman; Aditya Gahlawat; M. Pavone; N. Hovakimyan
- Corresponding authors: Pan Zhao; Arun Lakshmanan; K. Ackerman; Aditya Gahlawat; M. Pavone; N. Hovakimyan
Safe and Efficient Reinforcement Learning using Disturbance-Observer-Based Control Barrier Functions
- DOI:
- Published: 2022-11
- Journal:
- Impact factor: 0
- Authors: Yikun Cheng; Pan Zhao; N. Hovakimyan
- Corresponding authors: Yikun Cheng; Pan Zhao; N. Hovakimyan
Improving the Robustness of Reinforcement Learning Policies With $\mathcal{L}_1$ Adaptive Control
- DOI: 10.1109/lra.2022.3169309
- Published: 2021-12
- Journal:
- Impact factor: 5.2
- Authors: Y. Cheng; Penghui Zhao; F. Wang; D. Block; N. Hovakimyan
- Corresponding authors: Y. Cheng; Penghui Zhao; F. Wang; D. Block; N. Hovakimyan
Other publications by Naira Hovakimyan
Three-dimensional coordinated path-following control for second-order multi-agent networks
- DOI: 10.1016/j.jfranklin.2015.01.020
- Published: 2015-09
- Journal:
- Impact factor: 0
- Authors: Zongyu Zuo; Venanzio Cichella; Ming Xu; Naira Hovakimyan
- Corresponding author: Naira Hovakimyan
FlipDyn in Graphs: Resource Takeover Games in Graphs
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Sandeep Banik; Shaunak D. Bopardikar; Naira Hovakimyan
- Corresponding author: Naira Hovakimyan
Other grants held by Naira Hovakimyan
Collaborative Research: SLES: Guaranteed Tubes for Safe Learning across Autonomy Architectures
- Award number: 2331878
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
Distributionally Robust Adaptive Control: Enabling Safe and Robust Reinforcement Learning
- Award number: 2135925
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant
NRI: INT: COLLAB: Synergetic Drone Delivery Network in Metropolis
- Award number: 1830639
- Fiscal year: 2018
- Amount: $500,000
- Project type: Standard Grant
CPS: Medium: Collaborative Research: Against Coordinated Cyber and Physical Attacks: Unified Theory and Technologies
- Award number: 1739732
- Fiscal year: 2017
- Amount: $500,000
- Project type: Standard Grant
NRI: Collaborative Research: ASPIRE: Automation Supporting Prolonged Independent Residence for the Elderly
- Award number: 1528036
- Fiscal year: 2015
- Amount: $500,000
- Project type: Standard Grant
EAGER: Human centered robotic system design
- Award number: 1548409
- Fiscal year: 2015
- Amount: $500,000
- Project type: Standard Grant
Similar Overseas Grants
NSF-AoF: NeTS: Small: Local 6G Connectivity: Controlled, Resilient, and Secure (6G-ConCoRSe)
- Award number: 2326599
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CNS Core: Small: Towards Scalable and AI-based Solutions for Beyond-5G Radio Access Networks
- Award number: 2225578
- Fiscal year: 2023
- Amount: $500,000
- Project type: Standard Grant
NSF-AoF: CIF: Small: Distributed AI for enhanced security in satellite-aided wireless navigation (RESILIENT)
- Award number: 2326559
- Fiscal year: 2023
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CNS Core: Small: Towards Scalable and AI-based Solutions for Beyond-5G Radio Access Networks
- Award number: 2225577
- Fiscal year: 2023
- Amount: $500,000
- Project type: Standard Grant
NSF-AoF: SOLID: System-wide Operation via Learning In-device Dissimilarities
- Award number: 2225555
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: AF: Small: Energy-Efficient THz Communications Across Massive Dimensions
- Award number: 2225576
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant
NSF-AoF: Collaborative Research: CIF: Small: 6G Wireless Communications via Enhanced Channel Modeling and Estimation, Channel Morphing and Machine Learning for mmWave Bands
- Award number: 2225617
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant
NSF-AoF: CNS Core: Small: CRUISE: A Cross-system Architecture Design for Autonomous Wireless Networks based on Lifelong Machine Learning
- Award number: 2225427
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant