Learning Methods for Decentralized Control in Multi-Agent Systems

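Basic Information

  • Award number:
    2025732
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-09-01 to 2024-08-31
  • Project status:
    Completed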

Project Abstract

Multi-agent systems (MAS) are expected to become increasingly prevalent in military and civilian domains. Decentralized control and decision-making by agents is a fundamental driver of the diverse applications of multi-agent systems. Agents are expected to act and make decisions without relying on a centralized command structure. Communication and coordination among agents may have to be carried out over sparse, intermittent, unreliable, low-data-rate, and/or noisy communication networks that preclude centralized information and decision-making. A key design challenge is to find efficient ways of computing decentralized control and decision strategies for a team of agents. The problem is further compounded by various kinds of uncertainty: uncertainty about the environment, noisy observations, unreliable communication, and uncertainty in the system model. In this project, we aim to develop learning-based methods for decentralized control in multi-agent systems.
Intellectual merit: The research develops the following: (i) practical learning-based methods for computing near-optimal decentralized control policies for multi-agent systems with a known system model; (ii) online decentralized learning algorithms for control of multi-agent systems with an unknown system model. We aim to develop decentralized algorithms that asymptotically find the optimal decentralized policy for such systems and learn in the most efficient way possible. The proposed research will lay the foundations for Learning-based Decentralized Optimal Control, which is expected to become increasingly important for emerging multi-agent system applications.
Broader Impact: The research will significantly impact the science of multi-agent systems, autonomous robotic systems, and reinforcement learning. It will introduce a systematic and practical learning-based approach to the design of multi-agent systems that has long been lacking in the literature.
The educational impact of the proposed research will include: (i) providing graduate students with multidisciplinary training in stochastic control, online learning, and optimization; (ii) involving undergraduate students during the summer in computational and lab experiments; (iii) efforts to recruit female and under-represented minority students to our projects; and (iv) incorporating the research results into classes on reinforcement learning, stochastic systems, and decentralized control taught by the principal investigators.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Online Learning for Unknown Partially Observable MDPs
  • DOI:
  • Published:
    2021-02
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mehdi Jafarnia-Jahromi;Rahul Jain;A. Nayyar
  • Corresponding author:
    Mehdi Jafarnia-Jahromi;Rahul Jain;A. Nayyar
Optimal Control of Partially Observable Markov Decision Processes with Finite Linear Temporal Logic Constraints
  • DOI:
    10.48550/arxiv.2203.09038
  • Published:
    2022-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    K. C. Kalagarla;D. Kartik;Dongming Shen;Rahul Jain;A. Nayyar;P. Nuzzo
  • Corresponding author:
    K. C. Kalagarla;D. Kartik;Dongming Shen;Rahul Jain;A. Nayyar;P. Nuzzo
A modified Thompson sampling-based learning algorithm for unknown linear systems
  • DOI:
    10.1109/cdc51059.2022.9992683
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gagrani, Mukul;Sudhakara, Sagar;Mahajan, Aditya;Nayyar, Ashutosh;Ouyang, Yi
  • Corresponding author:
    Ouyang, Yi
Online Learning for Cooperative Multi-Player Multi-Armed Bandits
Thompson sampling for linear quadratic mean-field teams

Other publications by Ashutosh Nayyar

Correction to: Upper and Lower Values in Zero-Sum Stochastic Games with Asymmetric Information
  • DOI:
    10.1007/s13235-020-00366-9
  • Published:
    2020-09-16
  • Journal:
  • Impact factor:
    1.600
  • Authors:
    Dhruva Kartik;Ashutosh Nayyar
  • Corresponding author:
    Ashutosh Nayyar

Other grants awarded to Ashutosh Nayyar

CAREER: Strategic decision-making for communication and control in decentralized systems
  • Award number:
    1750041
  • Fiscal year:
    2018
  • Amount:
    $400,000
  • Project type:
    Standard Grant
Stochastic Control for Decentralized Systems: A Common Information Approach
  • Award number:
    1509812
  • Fiscal year:
    2015
  • Amount:
    $400,000
  • Project type:
    Standard Grant

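Similar NSFC (National Natural Science Foundation of China) grants

Computational Methods for Analyzing Toponome Data
  • Award number:
    60601030
  • Approval year:
    2006
  • Amount:
    CNY 170,000
  • Project type:
    Young Scientists Fund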

Similar overseas grants

Collaborative Research: Inference and Decentralized Computing for Quantile Regression and Other Non-Smooth Methods
  • Award number:
    2401268
  • Fiscal year:
    2023
  • Amount:
    $400,000
  • Project type:
    Standard Grant
Decentralized differentially-private methods for dynamic data release and analysis
  • Award number:
    10740597
  • Fiscal year:
    2023
  • Amount:
    $400,000
  • Project type:
Decentralized Data Analytics and Optimization Methods for Physical Asset Management
  • Award number:
    RGPIN-2020-05477
  • Fiscal year:
    2022
  • Amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
Decentralized differentially-private methods for dynamic data release and analysis
  • Award number:
    10367349
  • Fiscal year:
    2022
  • Amount:
    $400,000
  • Project type:
Decentralized Data Analytics and Optimization Methods for Physical Asset Management
  • Award number:
    RGPIN-2020-05477
  • Fiscal year:
    2021
  • Amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
Collaborative Research: Inference and Decentralized Computing for Quantile Regression and Other Non-Smooth Methods
  • Award number:
    2113346
  • Fiscal year:
    2021
  • Amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: Inference and Decentralized Computing for Quantile Regression and Other Non-Smooth Methods
  • Award number:
    2113409
  • Fiscal year:
    2021
  • Amount:
    $400,000
  • Project type:
    Standard Grant
Decentralized Data Analytics and Optimization Methods for Physical Asset Management
  • Award number:
    RGPIN-2020-05477
  • Fiscal year:
    2020
  • Amount:
    $400,000
  • Project type:
    Discovery Grants Program - Individual
The establishment of the trust among stakeholders and patients for sharing the medical and genomic records by centralized and decentralized storage methods
  • Award number:
    19KT0019
  • Fiscal year:
    2019
  • Amount:
    $400,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Decentralized differentially-private methods for dynamic data release and analysis
  • Award number:
    9239100
  • Fiscal year:
    2017
  • Amount:
    $400,000
  • Project type: