Data-efficient Safe Control with Recovery-to-Optimality Guarantees
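Basic information
- Award number: 2227311
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Duration: 2023-08-15 to 2026-07-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract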
As learning-enabled systems have rapidly advanced the autonomy capabilities of engineered systems in recent years, their safety certification has become exceedingly important. While recent progress on safe reinforcement learning (RL) algorithms for autonomous control design has been promising, these algorithms are accountable only in stable environments and when comprehensive, high-quality data sets are available. However, many systems must operate in unpredictable environments in which dangerous divergence can arise between safety and performance. In these environments, safety and performance specifications must be adapted to the context. In addition, an RL agent must learn under realistic data quantity and quality. Current RL practice assumes the availability of rich, high-quality data with full observability of the system’s states. These assumptions can be violated in many practical systems. This award supports research to create low-complexity, safe learning-enabled algorithms for partially observable systems that are equipped with highly efficient conflict management mechanisms to deliver as much performance as possible safely. Advances will have broad implications in applications of autonomous systems, robots, manufacturing, smart grids, and more.
This research project aims to develop low-complexity, safe learning-enabled algorithms for partially observable systems equipped with highly efficient conflict management mechanisms. The objectives of this project are two-fold:
1) Proposing direct data-driven learning approaches for backup safe control policies in partially observable nonlinear systems with uncertain dynamics. Concepts such as L-extra sample dynamics, probabilistic contractivity, and convex lifting will enable the learning of safe control policies for nonlinear systems with nonconvex safe sets using only measured noisy input-output data.
2) Introducing novel merging approaches to proactively manage conflicts by merging learned backup safe control policies with learning-enabled control policies. Instead of providing reactive quick fixes to conflicts as they arise, these approaches will enable proactive conflict management to avoid destructive future conflicts. For conflict management, the level sets of the RL agent will be adapted to the situation so that the agent aligns with the safety constraint. That is, safety-shaped value functions will be learned to effectively resolve conflicts by considering safety and optimality concerns across the relevant domains.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Bahare Kiumarsi
Data-Driven Safety-Certified Predictive Control for Linear Systems
- DOI:
- Published: 2023
- Journal:
- Impact factor: 3
- Authors: Marjan Khaledi; P. Tooranjipour; Bahare Kiumarsi
- Corresponding author: Bahare Kiumarsi
Optimal Output Regulation of Linear Discrete-Time Systems With Unknown Dynamics Using Reinforcement Learning
- DOI: 10.1109/tcyb.2018.2890046
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Yi Jiang; Bahare Kiumarsi; Jialu Fan; Tianyou Chai; Jinna Li; Frank L. Lewis
- Corresponding author: Frank L. Lewis
Risk-Aware Safe Optimal Control of Uncertain Linear Systems
- DOI: 10.1109/allerton49937.2022.9929371
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: P. Tooranjipour; Bahare Kiumarsi; H. Modares
- Corresponding author: H. Modares
Heterogeneous formation control of multiple rotorcrafts with unknown dynamics by reinforcement learning
- DOI: 10.1016/j.ins.2021.01.011
- Published: 2021
- Journal:
- Impact factor: 8.1
- Authors: Hao Liu; Fachun Peng; Hamidreza Modares; Bahare Kiumarsi
- Corresponding author: Bahare Kiumarsi
Safety Planning Using Control Barrier Function: A Model Predictive Control Scheme
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Z. Marvi; Bahare Kiumarsi
- Corresponding author: Bahare Kiumarsi
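Similar NSFC (National Natural Science Foundation of China) grants
Applications of fixed-parameter tractable algorithms to planar graph problems and their relationship to integer linear programming
- Award number: 60973026
- Approval year: 2009
- Funding amount: CNY 320,000
- Project type: General Program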
Similar overseas grants
A real-time traffic signal system for safe and efficient intersections
- Award number: LP220100226
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Linkage Projects
CAREER: Intelligent Battery Management with Safe, Efficient, Fast-Adaption Reinforcement Learning and Physics-Inspired Machine Learning: From Cells to Packs
- Award number: 2340194
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Continuing Grant
Safe and efficient eco-driving using connected and automated vehicles
- Award number: DP240102189
- Fiscal year: 2024
- Funding amount: $400,000
- Project type: Discovery Projects
Transferring Pharmacists' Safe and Efficient Dispensing Know-How: Identifying Reasons for Success of Proficient Pharmacists Based on Eye Gaze Measurement.
- Award number: 23K04309
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Safe and Efficient Robot Learning from Demonstration in the Real World
- Award number: 2323384
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Continuing Grant
CPS: SMALL: Formal Methods for Safe, Efficient, and Transferable Learning-enabled Autonomy
- Award number: 2231257
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Standard Grant
Safe and Efficient Marine Transportation of Liquid Hydrogen
- Award number: 10070575
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: EU-Funded
SAFeCRAFT: Safe and Efficient Use of Sustainable Fuels in Maritime Transport Applications
- Award number: 10110519
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: EU-Funded
A Safe and Efficient Framework to Continuous Integration through Repaying Self-Admitted Technical Debt in Software Development
- Award number: 23KJ1589
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Grant-in-Aid for JSPS Fellows
Safe, Ethical and Efficient Autonomous Vehicle Navigation Algorithms
- Award number: 2885906
- Fiscal year: 2023
- Funding amount: $400,000
- Project type: Studentship