
Data-efficient Safe Control with Recovery-to-Optimality Guarantees

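Basic Information

Award number:
2227311
Principal investigator:
Bahare Kiumarsi
Amount:
$400,000
Host institution:
Michigan State University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Duration:
2023-08-15 to 2026-07-31
Project status:
Ongoing (not yet concluded)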

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2227311&HistoricalAwards=false
Keywords:
Data efficient Safe Control Recovery

Project Abstract

As rapid developments in learning-enabled systems in recent years have advanced the autonomy capabilities of systems, their safety certification has become exceedingly important. While recent progress on safe reinforcement learning (RL) algorithms for autonomous control design has been promising, these algorithms are accountable only in stable environments and when comprehensive, high-quality data sets are available. However, many systems must operate in unpredictable environments in which dangerous divergence might arise between safety and performance. In these environments, safety and performance specifications must be adapted to the context. In addition, the RL agent must learn under realistic data quantity and quality. Current RL practice assumes the availability of rich, high-quality data with full observability of the entire system’s states. These assumptions can be violated in many practical systems. This award supports research to create low-complexity safe learning-enabled algorithms for partially observable systems that are equipped with highly efficient conflict management mechanisms to deliver as much performance as possible safely. Advances will have broad implications in applications of autonomous systems, robots, manufacturing, smart grids, and more.

This research project aims to develop low-complexity, safe learning-enabled algorithms for partially observable systems equipped with highly efficient conflict management mechanisms. The objectives of this project are two-fold:

1) Proposing direct data-driven learning approaches for backup safe control policies in partially observable nonlinear systems with uncertain dynamics. The utilization of concepts such as L-extra sample dynamics, probabilistic contractivity, and convex lifting will enable the learning of safe control policies for nonlinear systems with nonconvex safe sets using only measured noisy input-output data.

2) Introducing novel merging approaches to proactively manage conflicts by merging learned backup safe control policies with learning-enabled control policies. Instead of providing reactive quick fixes to conflicts as they arise, these approaches will enable proactive conflict management to avoid destructive future conflicts. Towards conflict management, the level sets of the RL agent will be adapted to the situation to make the agent align with the safety constraint. That is, safety-shaped value functions will be learned to effectively resolve conflicts by considering safety and optimality concerns across the relevant domains.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Bahare Kiumarsi

Data-Driven Safety-Certified Predictive Control for Linear Systems

DOI:
Publication date:
2023
Journal:
IEEE Control Systems Letters
Impact factor:
3
Authors:
Marjan Khaledi; P. Tooranjipour; Bahare Kiumarsi
Corresponding author:
Bahare Kiumarsi

Optimal Output Regulation of Linear Discrete-Time Systems With Unknown Dynamics Using Reinforcement Learning

DOI:
10.1109/tcyb.2018.2890046
Publication date:
2020-07
Journal:
IEEE Transactions on Cybernetics
Impact factor:
0
Authors:
Yi Jiang; Bahare Kiumarsi; Jialu Fan; Tianyou Chai; Jinna Li; Frank L. Lewis
Corresponding author:
Frank L. Lewis
Risk-Aware Safe Optimal Control of Uncertain Linear Systems

DOI:
10.1109/allerton49937.2022.9929371
Publication date:
2022
Journal:
2022 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Impact factor:
0
Authors:
P. Tooranjipour; Bahare Kiumarsi; H. Modares
Corresponding author:
H. Modares