CPS: Small: Distributed Learning for Control of Cyber-Physical Systems
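Basic information
- Award number: 1932011
- Principal investigator:
- Amount: $407,500
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-10-01 to 2023-09-30
- Project status: Completed
- Source:
- Keywords:
Project abstract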
In state-of-the-art Cyber-Physical Systems (CPS), supervised or unsupervised learning is typically used to analyze data. Nevertheless, in many such systems rules cannot be determined in advance, and these data mining techniques are not directly applicable because of the dynamic nature of the data, their large volume, which makes labeling impractical, and the fact that the data are added to the system piece by piece rather than all at once in advance. On the other hand, control of CPS is usually model-based: a desired control policy is computed from a high-fidelity system model that is derived at design time and may be updated at runtime. However, this approach is not suitable for highly dynamic CPS, which may represent systems of systems whose spatial and temporal configurations can change rapidly. In fact, with such a high number of configuration levels, it is almost impossible to derive suitable control policies using standard model-driven techniques. Consequently, it is critical to facilitate the design of data-based controllers, with strong performance guarantees, in a way that allows for natural runtime control adaptation. Reinforcement Learning (RL) provides such a framework. In RL, agents interact with the environment in a feedback loop and learn an optimal policy by taking appropriate sequences of actions to optimize long-term payoff. As such, RL can be much more efficient than supervised and unsupervised learning in analyzing streaming data and especially in controlling a system. The goal of this project is to develop a distributed off-policy RL framework for the control of CPS. Distributed RL methods avoid the fragility, communication overhead, and privacy concerns of collecting all information at a central processing unit. Moreover, off-policy learning methods significantly improve sampling efficiency and ensure safer operation.
The distributed RL framework developed under this project will have a profound impact on the control of CPS in areas as diverse as transportation, manufacturing, healthcare, smart cities, and urban planning, all of which rely on multiple sensors for data collection and control. This project also involves an educational agenda focusing on K-12, undergraduate, and graduate-level education. The outreach component of this project focuses on improving pre-college students' awareness of the potential and attractiveness of a research and engineering career.
The technical aims of this project are divided into four thrusts. The first thrust develops distributed off-policy RL methods using linear function approximation of the action-value function. Distributed RL algorithms using linear function approximation have been proposed for policy evaluation only. This thrust develops new RL algorithms that can also improve the policy until an optimal policy is found, which is necessary for control. Since defining appropriate feature vectors for RL problems is generally difficult, and since linear mappings might not be able to capture possibly nonlinear interactions between these features, the second thrust develops distributed off-policy RL methods using nonlinear function approximation, specifically Neural Networks. The third thrust develops distributed off-policy Actor-Critic methods. When the action space is large or continuous, Actor-Critic methods are much more effective, since they parameterize the target policy function using either linear or nonlinear function approximation and learn the optimal parameter so that the resulting policy maps to the optimal action for every state. Finally, the fourth thrust develops distributed RL methods for asynchronous, heterogeneous, and non-stationary data, which are common in modern CPS, where sensors neither observe identically distributed data nor sample data at the same time. Moreover, the distributions from which data are sampled can change with time. This project focuses on the development of algorithms and supporting theoretical results. The developed algorithms are evaluated in simulation on resource allocation problems in CPS, specifically on the control of distributed shared vehicle dispatch systems.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
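As a rough illustration of the setting the first thrust describes (several agents learning a linearly approximated action-value function off-policy and using it to improve the policy), the sketch below runs Q-learning with parameter averaging on a toy problem. It is not the project's algorithm. The three-state chain, the one-hot features, the two local reward functions, the fully connected equal-weight averaging, and every function name are assumptions made purely for illustration.

```python
import random

N_STATES, N_ACTIONS, N_AGENTS = 3, 2, 2
GAMMA, ALPHA = 0.9, 0.1
DIM = N_STATES * N_ACTIONS


def features(s, a):
    """One-hot feature vector phi(s, a) (hypothetical choice of features)."""
    phi = [0.0] * DIM
    phi[s * N_ACTIONS + a] = 1.0
    return phi


def q_value(theta, s, a):
    """Linear action-value estimate Q(s, a) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features(s, a)))


def step(s, a):
    """Toy chain: action 0 stays, action 1 moves right and wraps around."""
    return (s + a) % N_STATES


def local_reward(agent, s, a):
    """Each agent observes only its own reward; the team reward is the average."""
    if agent == 0:
        return 1.0 if s == N_STATES - 1 else 0.0  # agent 0 likes the last state
    return -0.1 * a                               # agent 1 penalizes moving


def consensus(thetas):
    """Equal-weight average of all agents' parameters (fully connected network)."""
    return [sum(th[k] for th in thetas) / len(thetas) for k in range(DIM)]


def greedy_policy(theta):
    """Greedy action in every state under the given parameters."""
    return [max(range(N_ACTIONS), key=lambda a: q_value(theta, s, a))
            for s in range(N_STATES)]


def train(steps=20000, seed=0):
    rng = random.Random(seed)
    thetas = [[0.0] * DIM for _ in range(N_AGENTS)]
    s = 0
    for _ in range(steps):
        a = rng.randrange(N_ACTIONS)  # behavior policy: uniform random (off-policy)
        s_next = step(s, a)
        phi = features(s, a)
        mean = consensus(thetas)      # combine: average the agents' parameters
        best_next = max(q_value(mean, s_next, b) for b in range(N_ACTIONS))
        # adapt: each agent applies a Q-learning step driven by its local reward
        thetas = [
            [m + ALPHA * (local_reward(i, s, a) + GAMMA * best_next
                          - q_value(mean, s, a)) * f
             for m, f in zip(mean, phi)]
            for i in range(N_AGENTS)
        ]
        s = s_next
    return consensus(thetas)


if __name__ == "__main__":
    theta = train()
    print("greedy policy per state:", greedy_policy(theta))
```

Because the averaging here uses all agents with equal weights, the averaged parameters follow ordinary Q-learning on the mean of the local rewards, so in this toy problem the greedy policy moves right in states 0 and 1 and stays in state 2. No agent ever sees the other agent's reward directly; only parameters are exchanged.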
Project outputs
Journal articles (18)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Augmented Lagrangian optimization under fixed-point arithmetic
- DOI: 10.1016/j.automatica.2020.109218
- Published: 2020
- Journal:
- Impact factor: 6.4
- Authors: Zhang, Yan; Zavlanos, Michael M.
- Corresponding author: Zavlanos, Michael M.
Risk-Averse No-Regret Learning in Online Convex Games
- DOI: 10.48550/arxiv.2203.08957
- Published: 2022-03
- Journal:
- Impact factor: 0
- Authors: Zifan Wang; Yi Shen; M. Zavlanos
- Corresponding authors: Zifan Wang; Yi Shen; M. Zavlanos
Deep Learning for Robotic Mass Transport Cloaking
- DOI: 10.1109/tro.2020.2980176
- Published: 2018-12
- Journal:
- Impact factor: 7.8
- Authors: Reza Khodayi-mehr; M. Zavlanos
- Corresponding authors: Reza Khodayi-mehr; M. Zavlanos
Transfer Reinforcement Learning under Unobserved Contextual Information
- DOI: 10.1109/iccps48487.2020.00015
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Zhang, Yan; Zavlanos, Michael M.
- Corresponding author: Zavlanos, Michael M.
Policy Evaluation in Distributional LQR
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Wang, Zifan; Gao, Yulong; Wang, Siyi; Zavlanos, Michael M.; Abate, Alessandro; Johansson, Karl Henrik
- Corresponding author: Johansson, Karl Henrik
Other grants by Michael Zavlanos
CPS: Medium: Collaborative Research: Human-on-the-Loop Control for Smart Ultrasound Imaging
- Award number: 1837499
- Fiscal year: 2018
- Amount: $407,500
- Project type: Standard Grant
NeTS: Medium: Collaborative Research: Optimal Communication for Faster Sensor Network Coordination
- Award number: 1302284
- Fiscal year: 2013
- Amount: $407,500
- Project type: Standard Grant
NeTS: Synergy: Collaborative Research: Controlling Teams of Autonomous Mobile Beamformers
- Award number: 1239339
- Fiscal year: 2013
- Amount: $407,500
- Project type: Standard Grant
RI: Medium: Collaborative Research: Mobile Microrobot Platform for Advanced Manufacturing Applications
- Award number: 1302283
- Fiscal year: 2013
- Amount: $407,500
- Project type: Continuing Grant
CAREER: Control of Mobile Robot Networks: Integrating the Communication and Physical Domains
- Award number: 1261828
- Fiscal year: 2012
- Amount: $407,500
- Project type: Continuing Grant
CAREER: Control of Mobile Robot Networks: Integrating the Communication and Physical Domains
- Award number: 1054604
- Fiscal year: 2011
- Amount: $407,500
- Project type: Continuing Grant
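Similar NSFC (National Natural Science Foundation of China) grants
Forensic application of circadian-rhythmic small RNAs in estimating the time of bloodstain formation
- Award number:
- Award year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Award year: 2022
- Amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Award year: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Award year: 2019
- Amount: CNY 580,000
- Project type: General Program
Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Award year: 2019
- Amount: CNY 210,000
- Project type: Young Scientists Fund
Molecular mechanism of pigeon milk secretion, analyzed by small RNA sequencing
- Award number: 31802058
- Award year: 2018
- Amount: CNY 260,000
- Project type: Young Scientists Fund
Functions and mechanisms of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Award year: 2018
- Amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Award year: 2017
- Amount: CNY 600,000
- Project type: General Program
Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
- Award number: 81704176
- Award year: 2017
- Amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Award year: 2016
- Amount: CNY 850,000
- Project type: Major Research Plan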
Similar overseas grants
Collaborative Research: SHF: Small: Technical Debt Management in Dynamic and Distributed Systems
- Award number: 2232720
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
CSR: Small: Squeezing More Performance Out of Distributed Storage Systems With a Transparent Ordering-Control Layer
- Award number: 2327609
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
Collaborative Research: SHF: Small: Technical Debt Management in Dynamic and Distributed Systems
- Award number: 2232721
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
CNS Core: Small: Testing and detecting software upgrade failures in data-intensive distributed systems
- Award number: 2300562
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
SaTC: CORE: Small: Efficient Trustless Distributed Cryptography
- Award number: 2240976
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
NSF-AoF: CIF: Small: Distributed AI for enhanced security in satellite-aided wireless navigation (RESILIENT)
- Award number: 2326559
- Fiscal year: 2023
- Amount: $407,500
- Project type: Standard Grant
SHF: Small: A Distributed Scalable End-to-End Tail Latency SLO Guaranteed Resource Management Framework for Microservices
- Award number: 2226117
- Fiscal year: 2022
- Amount: $407,500
- Project type: Standard Grant
CNS Core: Small: Automated testing for data- and compute-intensive distributed systems through feedback-based fuzzing
- Award number: 2140305
- Fiscal year: 2022
- Amount: $407,500
- Project type: Standard Grant
Collaborative Research: SHF: Small: Distributed Fragmented Software Design Meetings
- Award number: 2210812
- Fiscal year: 2022
- Amount: $407,500
- Project type: Standard Grant
CNS: Core: Small: Consistent, Geo-Distributed Data Stores on the Public Cloud Using Erasure Coding
- Award number: 2211045
- Fiscal year: 2022
- Amount: $407,500
- Project type: Standard Grant