Distributionally Robust Adaptive Control: Enabling Safe and Robust Reinforcement Learning

Basic Information

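Award number:
2135925
Principal investigator:
Naira Hovakimyan
Award amount:
$375K
Host institution:
University of Illinois at Urbana-Champaign
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-07-01 to 2025-06-30
Project status:
Not yet concluded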

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2135925&HistoricalAwards=false
Keywords:
Distributionally Robust Adaptive Control Enabling

Project Abstract

Data-driven algorithms can autonomously control complex systems like autonomous cars and drones. However, the use of such powerful algorithms remains relegated primarily to controlled laboratory environments. The main reason for the minimal adoption of data-driven methods for safety-critical systems is the difficulty one encounters when attempting to establish safety and predictability guarantees as one would do with well-established control theoretical methods. This award supports fundamental research to identify the best methodologies to consolidate data-driven and control-theoretic tools so that the overall methodology is safe, robust, and high-performing. The new approach lifts control tools to speak the same language as the data-driven methods. In doing so, the performance of the data-driven methods is not compromised, and yet, the safety guarantees of control-theoretic tools can be constructed. Safe and predictable autonomous operation of complex systems can bring immense socio-economic benefits through its application in medical robotics, autonomous logistics, transportation, and extra-terrestrial exploration, to name a few. This research involves multiple disciplines, including robotics, control theory, statistical learning, and mathematics. The cross-disciplinary nature will assist underrepresented groups' broader participation in STEM and impact engineering education.

To adopt data-driven methods that rely on reinforcement learning (RL) algorithms in safety-critical systems, we need guarantees on safety and robustness. Robust and adaptive control methodologies developed for classical systems with parametric uncertainties cannot be used directly in conjunction with RL because the latter operates on data-driven models for which identifying parametric and deterministic uncertainties is difficult, if not impossible. This research will construct a new class of robust adaptive controllers that are robust to errors in the learned distributions, thus allowing RL algorithms to directly interact with these controllers without further restrictions. Due to robustness at the level of distributions, notions of risk-aware safety can be included in a straightforward manner. This research will first aim to construct controllers that track temporally evolving state distributions with uniform bounds. Then, the epistemic uncertainties will be introduced with a novel adaptive control scheme to quantifiably control the effect of the uncertainties in the space of distributions. The results produced through this effort will bring the two distinct worlds of data-driven control and classical control together at a natural intersection point where trajectories of distributions, not of sample paths, are considered.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
