CPS: Medium: Collaborative Research: Certifiable reinforcement learning for cyber-physical systems

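Basic Information

Award number:
1836819
Principal investigator:
Sam Burden
Amount:
$666,300 (rounded)
Host institution:
University of Washington
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2018
Funding country:
United States
Project period:
2018-09-15 to 2022-08-31
Project status:
Completed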

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1836819&HistoricalAwards=false
Keywords:
CPS Medium Collaborative Research Certifiable

Project Abstract

We propose to generalize and certify the performance of reinforcement learning algorithms for control of cyber-physical systems (CPS). Broadly speaking, reinforcement learning applied to physical systems is concerned with making predictions from data to control the system to extremize a performance criterion. The project will particularly focus on developing theory and algorithms applicable to hybrid and multi-agent control systems, that is, systems with continuous and discrete elements and systems with multiple decision-making agents, which are ubiquitous in CPS across spatiotemporal scales and application domains. Reinforcement learning algorithms are not yet mature enough to guarantee performance when applied to control of CPS. In light of these limitations, this project aims to lay the theoretical and computational foundation to certify reinforcement learning algorithms so that they may be deployed in society with high confidence.

This project will certify reinforcement learning algorithms that compute optimal control policies in systems with non-classical dynamics and non-classical costs. To achieve this goal, we will generalize convergent algorithms originally designed for purely continuous systems to apply in hybrid control systems whose states undergo a mixture of discrete and continuous transitions. Moreover, we specifically aim to ensure this approach is applicable to societal-scale CPS in which multiple agents, some of which may be humans, interact directly with the CPS. These algorithms will be experimentally validated on three testbeds that represent a range of hybrid and multi-agent phenomena that arise in CPS. The first testbed will test the performance of our algorithms on societal-scale traffic flow networks via simulation. The second testbed will consider heterogeneous teams of aerial and terrestrial mobile robots collaborating with human partners to perform construction, inspection, and maintenance tasks on scale facsimiles of infrastructure like bridges and tunnels. The third testbed will study the closed-loop interaction between individual humans and remote, teleoperated robots that perform dynamic locomotion and manipulation behaviors. This project will also co-organize an interdisciplinary workshop with technology policy experts, the results of which will form the basis for an interdisciplinary multi-campus graduate-level seminar run by the PIs.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Convergence Analysis of Gradient-Based Learning in Continuous Games

DOI:
Published:
2019
Journal:
Impact factor:
0
Authors:
Benjamin J. Chasnov; L. Ratliff; Eric V. Mazumdar; Samuel A. Burden
Corresponding authors:
Benjamin J. Chasnov; L. Ratliff; Eric V. Mazumdar; Samuel A. Burden

Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study

DOI:
Published:
2020-07
Journal:
Impact factor:
0
Authors:
Tanner Fiez; Benjamin J. Chasnov; L. Ratliff
Corresponding authors:
Tanner Fiez; Benjamin J. Chasnov; L. Ratliff

Stackelberg Actor-Critic: A Game-Theoretic Perspective

DOI:
Published:
2021
Journal:
AAAI Workshop on Reinforcement Learning and Games
Impact factor:
0
Authors:
Zheng, Liyuan; Fiez, Tanner; Alumbaugh, Zane; Chasnov, Benjamin; Ratliff, Lillian J.
Corresponding author:
Ratliff, Lillian J.

Experiments with sensorimotor games in dynamic human/machine interaction

DOI:
10.1117/12.2519258
Published:
2019
Journal:
and Applications XI
Impact factor:
0
Authors:
Chasnov, Benjamin; Yamagami, Momona; Parsa, Behnoosh; Ratliff, Lillian J.; Burden, Samuel A.
Corresponding author:
Burden, Samuel A.

Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms