III: Small: Distributed Reinforcement Learning over Complex Networks

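Basic information

Award number:
2230101
Principal investigator:
Ji Liu
Amount:
$600,000
Institution:
SUNY at Stony Brook
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Project period:
2022-09-01 to 2025-08-31
Project status:
Ongoing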

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2230101&HistoricalAwards=false
Keywords:
III Small Distributed Reinforcement Learning

Project abstract

In many distributed systems, a team of autonomous agents must collaborate in a complex environment, process massive amounts of streaming data, and simultaneously make optimal decisions. Traditional decision-making techniques can hardly tackle such a scenario, and reinforcement learning (RL) has been recently shown to be a promising decision-making technique for large-scale distributed systems. However, previous distributed RL models have failed to account for sensing and observing capabilities of agents, and thus rely on global information, which is not readily available in distributed environments. To fill this gap, this project aims to build a revolutionary, fully distributed RL system for large-scale networked systems without using global information. Toward this end, the project develops a novel theoretical framework, computational models, and scientific software tools needed to design, analyze, and test fully distributed RL algorithms. The algorithms will be further designed to be robust against dynamic environments and resilient to adversarial attacks, which will enable teams of multiple autonomous agents to reliably achieve their goals. The research will greatly impact real-world application areas where distributed machine learning algorithms and decision-making methods are needed. Typical examples include motion planning of teams of mobile robots, and coordination of networked smart devices in an IoT environment. The project promotes education and outreach activities, including broadening participation of female students in the field of machine learning, creating new courses, and designing research projects for K-12 students and undergraduates. The publications and software tools will be shared with the community to foster further research on distributed RL.

The central goal of this project is to establish theoretical foundations for fully distributed RL algorithm design, analysis, and applications over large-scale networks. The key technical challenges include bridging the gap between the global and local observability settings and achieving resiliency in the presence of dynamic and untrustworthy communications. To achieve the technical objective and tackle technical challenges, the project investigates three main thrusts. The first thrust establishes the fundamental novel theory for the design of fully distributed RL by approximating global information via distributed estimation. The second thrust develops robust distributed RL algorithms against time-varying communication and sensing capabilities, communication delays, and asynchronous updating. The third thrust designs distributed RL algorithms that are resilient to adversaries and malicious attacks capable of introducing untrustworthy information into the communication network, by first designing communication-efficient RL algorithms in which each agent can transmit only low-dimensional states, and then designing resilient information fusion/aggregation approaches for small- and even single-dimensional cases. The project provides a suite of novel distributed RL algorithms which can be used in any applied area where fully distributed decision making and learning with streaming data and in adversarial environments are needed. Concurrently with the three main thrusts, the project also designs, develops, and maintains a software framework for empirically validating and studying distributed RL algorithms that the entire distributed RL community can use.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
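
The abstract mentions two ideas that a small example may make more concrete: approximating global information through distributed estimation, and resilient aggregation of scalar values. The sketch below is illustrative only and is not taken from the project. It assumes textbook average consensus as the estimation step and a trimmed mean as the resilient aggregator; the function names, the 4-agent ring, the reward values, and the step size are all hypothetical.

```python
from typing import Dict, List


def consensus_step(values: List[float],
                   neighbors: Dict[int, List[int]],
                   step: float) -> List[float]:
    """One synchronous consensus update.

    Each agent nudges its estimate toward the values held by its
    neighbors, using only locally available information.
    """
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


def trimmed_mean(received: List[float], f: int) -> float:
    """Discard the f smallest and f largest scalars, then average the rest.

    Assumption: at most f of the received values come from adversaries.
    """
    if len(received) <= 2 * f:
        raise ValueError("need more than 2*f values")
    kept = sorted(received)[f:len(received) - f]
    return sum(kept) / len(kept)


# Hypothetical undirected ring of 4 agents; each knows only its own reward.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
local_rewards = [1.0, 2.0, 3.0, 6.0]

# Repeated local averaging drives every estimate toward the global mean.
estimates = local_rewards
for _ in range(100):
    estimates = consensus_step(estimates, ring, step=0.25)

# Two wildly wrong reports are filtered out when f = 1 on each side.
robust = trimmed_mean([2.9, 3.0, 3.1, 1000.0, -1000.0], f=1)
```

In this toy setup every agent ends up with an estimate of the team-average reward without any node ever seeing all four values, and the trimmed mean ignores one extreme report on each side. Real distributed RL algorithms would need to handle the delays, asynchrony, and time-varying links that the abstract lists, which this sketch does not attempt.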
