CAREER: Reinforcement Learning-Based Control of Heterogeneous Multi-Agent Systems in Structured Environments: Algorithms and Complexity
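Basic Information
- Award number: 2237830
- Principal investigator:
- Amount: $541,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-07-01 to 2028-06-30
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract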
Reinforcement learning (RL) is a popular framework for learning optimal decision-making in complex environments, and many RL algorithms have been developed to improve decision-making of a single agent in normal environments. However, modern large-scale distributed learning applications usually involve multiple heterogeneous agents that interact with complex environments, making optimal decision-making fundamentally more challenging to learn. For example, when navigating multiple drones in an open area, the drones need to cooperate properly with each other and take environmental uncertainty into account. As another example, in distributed wireless networks, the interactions of the agents (e.g., base stations or mobile phones) are subject to heterogeneous constraints on power, bandwidth, etc. This project aims to develop a resilient RL framework for managing heterogeneous multi-agent systems in complex environments, and to systematically design efficient multi-agent RL algorithms with comprehensive convergence and complexity analysis. The project will produce RL algorithm packages that are fully accessible to the public. The research activities will also generate positive educational impacts on undergraduate and graduate students. The materials developed by this project will be integrated into courses on machine learning and optimization, and will benefit interdisciplinary students majoring in electrical and computer engineering, statistics and computer science. The project will actively involve underrepresented students and integrate research with education for undergraduate and graduate students in STEM. It will also produce introductory materials for K-12 students to be used in engineering summer research programs.
The overarching goal of this project is to develop a resilient RL framework for managing multi-agent systems that involve heterogeneous agents in complex and structured environments, and to systematically design scalable and computation-efficient RL algorithms with rigorous and comprehensive convergence and complexity analysis for managing such systems. The proposed research includes three major thrusts. First, to manage cooperative agents with heterogeneous constraints in various types of structured environments (e.g., homogeneity and uncertainty), the environment model structure will be leveraged to develop fully decentralized policy optimization algorithms with convergence and complexity analysis. Second, to manage competitive agents with heterogeneous constraints in uncertain environments, new tractable notions of constrained and robust equilibrium will be proposed. Their fundamental structures and properties will be studied, based on which fully decentralized primal-dual type policy optimization algorithms and robust value-based algorithms with convergence guarantees will be developed. Lastly, to improve the generalizability of agents' policies across heterogeneous environments, a new assistive RL framework that can substantially enhance the generalizability using few rounds of information exchange without data sharing will be developed. These RL algorithms will be applied to learn resilient and optimal control policies for interference management in wireless networks and energy control in power networks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yi Zhou
RNN-Based Sequence-Preserved Attention for Dependency Parsing
- DOI: 10.1609/aaai.v32i1.12011
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Yi Zhou; Junying Zhou; Lu Liu; Jiangtao Feng; Haoyuan Peng; Xiaoqing Zheng
- Corresponding author: Xiaoqing Zheng
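NMDA receptor-dependent synaptic activity in the dorsal vagal nucleus mediates the enhancement of gastric motility by acupuncture at Zusanli
- DOI:
- Publication date: 2012
- Journal:
- Impact factor: 0
- Authors: Qiwen Tan; Yi Zhou; Bing Zhu; Haifa Qiao
- Corresponding author: Haifa Qiao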
Phylogenetic study of Ameiurus melas based on complete mitochondrial DNA sequence
- DOI: 10.3109/19401736.2015.1106511
- Publication date: 2016-11
- Journal:
- Impact factor: 0
- Authors: Fan Yu; Juhua Yu; Yi Zhou; Jinpeng Yan; Yanhong Fang; Wenjun Wang; Zhong Yang
- Corresponding author: Zhong Yang
Inherent Oxygen Vacancies Boost Surface Reconstruction of Ultrathin Ni-Fe Layered-Double-Hydroxides toward Efficient Electrocatalytic Oxygen Evolution
- DOI: 10.1021/acssuschemeng.1c02256
- Publication date: 2021-05
- Journal:
- Impact factor: 8.4
- Authors: Yi Zhou; Wenbiao Zhang; Jialai Hu; Dan Li; Xing Yin; Qingsheng Gao
- Corresponding author: Qingsheng Gao
Identification of Flavonoid 3′-Hydroxylase Genes from Red Chinese Sand Pear (Pyrus pyrifolia Nakai) and Their Regulation of Anthocyanin Accumulation in Fruit Peel
- DOI: 10.3390/horticulturae10060535
- Publication date: 2024
- Journal:
- Impact factor: 3.1
- Authors: Yi Zhou; Ruiyan Tao; J. Ni; Minjie Qian; Yuanwen Teng
- Corresponding author: Yuanwen Teng
Other grants by Yi Zhou
Collaborative Research: SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability
- Award number: 2134223
- Fiscal year: 2021
- Funding amount: $541,000
- Award type: Continuing Grant
CIF: Small: Self-Adaptive Optimization Algorithms with Fast Convergence via Geometry-Adapted Hyper-Parameter Scheduling
- Award number: 2106216
- Fiscal year: 2021
- Funding amount: $541,000
- Award type: Standard Grant
Collaborative Research: Neural-cognitive analysis of spatial scenes with competing, dynamic sound sources
- Award number: 1539376
- Fiscal year: 2015
- Funding amount: $541,000
- Award type: Standard Grant
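Similar NSFC (National Natural Science Foundation of China) grants
Testing and genetic basis of reinforcement in hybrid zones of Sonneratia
- Award number: 30800060
- Approval year: 2008
- Funding amount: 230,000 yuan
- Award type: Young Scientists Fund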
Similar overseas grants
CAREER: Stochasticity and Resilience in Reinforcement Learning: From Single to Multiple Agents
- Award number: 2339794
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Towards Real-world Reinforcement Learning
- Award number: 2339395
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Robust Reinforcement Learning Under Model Uncertainty: Algorithms and Fundamental Limits
- Award number: 2337375
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Temporal Causal Reinforcement Learning and Control for Autonomous and Swarm Cyber-Physical Systems
- Award number: 2339774
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Structure Exploiting Multi-Agent Reinforcement Learning for Large Scale Networked Systems: Locality and Beyond
- Award number: 2339112
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Intelligent Battery Management with Safe, Efficient, Fast-Adaption Reinforcement Learning and Physics-Inspired Machine Learning: From Cells to Packs
- Award number: 2340194
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Dual Reinforcement Learning: A Unifying Framework with Guarantees
- Award number: 2340651
- Fiscal year: 2024
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Foundations of Reinforcement Learning under Partial Observability
- Award number: 2239297
- Fiscal year: 2023
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: OneSense: One-Rule-for-All Combinatorial Boolean Synthesis via Reinforcement Learning
- Award number: 2349670
- Fiscal year: 2023
- Funding amount: $541,000
- Award type: Continuing Grant
CAREER: Reconfigurable and Predictive Control with Reinforcement Learning Supervisor for Active Battery Cell Balancing
- Award number: 2237317
- Fiscal year: 2023
- Funding amount: $541,000
- Award type: Continuing Grant