Decentralized stochastic control of multi-agent teams: approximation, learning, and signaling
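Basic information
- Grant number: RGPIN-2021-03511
- Principal investigator:
- Amount: $33,500
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Status: Completed
- Source:
- Keywords: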
Project abstract
We are moving towards a future in which multiple interconnected autonomous agents will interact with humans in shared environments. Examples include self-driving cars; robotic assistants in homes, factory floors, and warehouses; and Industry 4.0, where automated control algorithms supervised by human operators control multiple interconnected industrial plants. A salient feature of such environments is that the agents have different information, yet they need to cooperate and coordinate their actions to achieve a common goal. The agents may be uncertain about the system model and must be able to adapt to stochastic changes in the environment. The long-term goal of the proposed research program is to develop theory and algorithms that address these salient features, provide a systematic methodology for designing multiple agents operating in dynamic, stochastic, and uncertain environments, and thereby enable the technologies of the future. The proposal maps out a five-year research program that pursues three research directions:
(i) Approximation guarantees in decentralized control: Quantify the effect of model uncertainty and model approximation on the performance of decentralized systems. Use these results to develop a solution framework that provides approximately optimal policies for hitherto unsolved information structures, and apply it to networked control systems.
(ii) Learning with decentralized information: Develop a multi-agent reinforcement learning (MARL) framework for the decentralized learning and decentralized execution paradigm. Characterize the asymptotic optimality and regret of MARL algorithms. Identify trade-offs between the speed of convergence and the performance of the converged policies by restricting attention to policies with specific structure. Use this trade-off to investigate explainable and interpretable decision making in human-robot teams.
(iii) Role of signaling in multi-agent systems: Characterize what and when to communicate over explicit communication channels when communication is costly and the system model is potentially unknown. Build on these results to characterize when and how to signal information via implicit communication. Determine the impact of implicit communication on explainable and interpretable decision making in human-robot teams.
The proposed research program will provide broad training to 5 PhD, 3 MEng, and 5 undergraduate students in fundamental areas of Systems and Control and Reinforcement Learning, thereby giving them a solid foundation to be at the forefront of innovation in a growing and transformative research field. The results will advance the state of knowledge in decentralized stochastic control and multi-agent reinforcement learning, and will contribute to the emergence of new technologies that will maintain Canada's position as an innovator in the machine learning, energy, automotive, aerospace, and information technology sectors.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Mahajan, Aditya
Remote Estimation Over a Packet-Drop Channel With Markovian State
- DOI: 10.1109/tac.2019.2926160
- Publication date: 2020-05-01
- Journal:
- Impact factor: 6.8
- Authors: Chakravorty, Jhelum; Mahajan, Aditya
- Corresponding author: Mahajan, Aditya
Scalable Regret for Learning to Control Network-Coupled Subsystems With Unknown Dynamics
- DOI: 10.1109/tcns.2022.3184107
- Publication date: 2023
- Journal:
- Impact factor: 4.2
- Authors: Sudhakara, Sagar; Mahajan, Aditya; Nayyar, Ashutosh; Ouyang, Yi
- Corresponding author: Ouyang, Yi
Thompson sampling for linear quadratic mean-field teams
- DOI: 10.1109/cdc45484.2021.9683030
- Publication date: 2021
- Journal:
- Impact factor: 0
- Authors: Gagrani, Mukul; Sudhakara, Sagar; Mahajan, Aditya; Nayyar, Ashutosh; Ouyang, Yi
- Corresponding author: Ouyang, Yi
Transmission of Bursty Traffic over Fading Channels with Adaptive Decision Feedback
- DOI:
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: Sayedana, Borna; Mahajan, Aditya; Yeh, Edmund
- Corresponding author: Yeh, Edmund
A modified Thompson sampling-based learning algorithm for unknown linear systems
- DOI: 10.1109/cdc51059.2022.9992683
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Gagrani, Mukul; Sudhakara, Sagar; Mahajan, Aditya; Nayyar, Ashutosh; Ouyang, Yi
- Corresponding author: Ouyang, Yi
Other grants held by Mahajan, Aditya
Overload protection in mobile edge computing using multi-agent reinforcement learning
- Grant number: 571054-2021
- Fiscal year: 2021
- Funding amount: $33,500
- Program: Alliance Grants
Decentralized stochastic control of multi-agent teams: approximation, learning, and signaling
- Grant number: RGPIN-2021-03511
- Fiscal year: 2021
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2020
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2019
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: 493011-2016
- Fiscal year: 2018
- Funding amount: $33,500
- Program: Discovery Grants Program - Accelerator Supplements
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2018
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: 493011-2016
- Fiscal year: 2017
- Funding amount: $33,500
- Program: Discovery Grants Program - Accelerator Supplements
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2017
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2016
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Optimal control of dynamic teams under constraints and uncertainty
- Grant number: 402753-2011
- Fiscal year: 2015
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
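Similar grants from the National Natural Science Foundation of China (NSFC)
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Grant number:
- Approval year: 2020
- Funding amount: 400,000 CNY
- Program:
Non-intrusive uncertainty quantification for CFD based on gradient-enhanced stochastic co-Kriging
- Grant number: 11902320
- Approval year: 2019
- Funding amount: 240,000 CNY
- Program: Young Scientists Fund
Strength prediction for the blast resistance of high-performance fiber-reinforced concrete members
- Grant number: 51708391
- Approval year: 2017
- Funding amount: 250,000 CNY
- Program: Young Scientists Fund
Optimal dynamic policies for non-standard stochastic scheduling models
- Grant number: 71071056
- Approval year: 2010
- Funding amount: 280,000 CNY
- Program: General Program
Wireless opportunistic scheduling algorithms based on stochastic network calculus
- Grant number: 60702009
- Approval year: 2007
- Funding amount: 240,000 CNY
- Program: Young Scientists Fund
Network vulnerability analysis based on stochastic model checking
- Grant number: 60573144
- Approval year: 2005
- Funding amount: 50,000 CNY
- Program: General Program
Parallel methods for two-stage stochastic optimization
- Grant number: 10161002
- Approval year: 2001
- Funding amount: 45,000 CNY
- Program: Fund for Less Developed Regions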
Similar overseas grants
Decentralized stochastic control of multi-agent teams: approximation, learning, and signaling
- Grant number: RGPIN-2021-03511
- Fiscal year: 2021
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2020
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2019
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: 493011-2016
- Fiscal year: 2018
- Funding amount: $33,500
- Program: Discovery Grants Program - Accelerator Supplements
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2018
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: 493011-2016
- Fiscal year: 2017
- Funding amount: $33,500
- Program: Discovery Grants Program - Accelerator Supplements
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2017
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized stochastic control: information structures, communication, and learning
- Grant number: RGPIN-2016-05165
- Fiscal year: 2016
- Funding amount: $33,500
- Program: Discovery Grants Program - Individual
Decentralized Stochastic Control of Stochastic Games
- Grant number: 492891-2015
- Fiscal year: 2015
- Funding amount: $33,500
- Program: Alexander Graham Bell Canada Graduate Scholarships - Master's
Stochastic Control for Decentralized Systems: A Common Information Approach
- Grant number: 1509812
- Fiscal year: 2015
- Funding amount: $33,500
- Program: Standard Grant