
Reinforcement Learning and Kullback-Leibler Stochastic Optimal Control for Complex Networks

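Basic information

Award number:
1935389
Principal investigator:
Sean Meyn
Amount:
$380,000
Host institution:
University of Florida
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Project period:
2019-09-15 to 2023-08-31
Project status:
Completed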

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1935389&HistoricalAwards=false
Keywords:
Reinforcement Learning; Kullback-Leibler; Stochastic

Project abstract

Natural and man-made networked systems are all around us. The power grid and the Internet are two examples of apparently complex interconnected systems, in which millions of "agents" are eager to extract value in the form of energy or bandwidth. While these systems are complex when measured in graph-theoretic terms, the behavior of communication and energy systems appears simple and highly predictable to the end users (in most of the world). This success is due in part to distributed control loops that manage system-wide supply-demand balance. Examples of distributed control are TCP/IP in the Internet and automatic generation control (AGC) in most electric power grids. While distributed control protocols are highly developed and widely accepted in communication applications, this is less true in other networked systems such as electric power and natural gas distribution. This project aims to advance control theory for complex interconnected systems. The application focus is on power systems, but the control techniques are general and are likely to have far broader impact. Recent control innovations are highlighted in the project as building blocks in the construction of algorithms for control, based on a combination of local decision making and global management of the ensemble:

1. Control techniques for local decision making will be a theme of the project, using a new Kullback-Leibler-Quadratic optimal control approach introduced by the PI's group.
2. Reinforcement learning (RL) is the engine behind Google's recent computer game successes and is a natural framework for control synthesis in an uncertain complex environment. The Zap Q-learning algorithms introduced recently by the PI and his colleagues are a new class of RL algorithms that are virtually universally stable and have provably optimal convergence rate.
3. Mean field models have a long history in power systems (with roots in statistical physics). They will be used to approximate aggregate behavior and as a foundation for constructing algorithms to control the aggregate.

Algorithm design will be complemented with simulation studies, focusing initially on applications to power systems. A course in smart grid technologies will be augmented and the project will include participation from undergraduate students.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal papers (21)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Quasi-Stochastic Approximation: Design Principles With Applications to Extremum Seeking Control

DOI:
10.1109/mcs.2023.3291884
Published:
2023
Journal:
IEEE Control Systems
Impact factor:
0
Authors:
Lauand, Caio Kalil;Meyn, Sean
Corresponding author:
Meyn, Sean

The Curse of Memory in Stochastic Approximation

DOI:
Published:
2023
Journal:
Proceedings of the IEEE Conference on Decision and Control
Impact factor:
0
Authors:
Lauand, Caio Kalil
Corresponding author:
Lauand, Caio Kalil

Approaching Quartic Convergence Rates for Quasi-Stochastic Approximation with Application to Gradient-Free Optimization

DOI:
Published:
2022
Journal:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Kalil Lauand, Caio and
Corresponding author:
Kalil Lauand, Caio and

Model-Free Primal-Dual Methods for Network Optimization with Application to Real-Time Optimal Power Flow

DOI:
10.23919/acc45564.2020.9147814
Published:
2019-09
Journal:
2020 American Control Conference (ACC)
Impact factor:
0
Authors:
Yue-Chun Chen;A. Bernstein;Adithya M. Devraj;Sean P. Meyn
Corresponding author:
Yue-Chun Chen;A. Bernstein;Adithya M. Devraj;Sean P. Meyn

Load-Level Control Design for Demand Dispatch With Heterogeneous Flexible Loads

DOI:
10.1109/tcst.2023.3245287
Published:
2023
Journal:
IEEE Transactions on Control Systems Technology
Impact factor:
4.8
Authors:
Mathias, Joel;Bušić, Ana;Meyn, Sean
Corresponding author:
Meyn, Sean

Other publications by Sean Meyn

Coding and control for communication networks

DOI:
10.1007/s11134-009-9148-3
Published:
2009-11-25
Journal:
QUEUEING SYSTEMS
Impact factor:
0.700
Authors:
Wei Chen;Danail Traskov;Michael Heindlmaier;Muriel Médard;Sean Meyn;Asuman Ozdaglar
Corresponding author:
Asuman Ozdaglar