CPS: Medium: Collaborative Research: Provably Safe and Robust Multi-Agent Reinforcement Learning with Applications in Urban Air Mobility
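Basic Information
- Award number: 2312094
- Principal investigator:
- Amount: $400,000
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-06-01 to 2026-05-31
- Status: Ongoing
- Source:
- Keywords:
Project Abstract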
This Cyber-Physical Systems (CPS) project aims to design theories and algorithms for scalable multi-agent planning and control to support safety-critical autonomous eVTOL aircraft in high-throughput, uncertain, and dynamic environments. Urban Air Mobility (UAM) is an emerging air transportation mode in which electric vertical take-off and landing (eVTOL) aircraft will safely and efficiently transport passengers and cargo within urban areas. Guidance from the White House, the National Academy of Engineering, and the US Congress has encouraged fundamental research in UAM to maintain US global leadership in this field. The success of UAM will depend on safe and robust multi-agent autonomy to scale operations up to high-throughput urban air traffic. Learning-based techniques such as deep reinforcement learning and multi-agent reinforcement learning are being developed to support planning and control for these eVTOL vehicles. However, providing theoretical safety and robustness guarantees for these learning-based, neural-network-in-the-loop models in multi-agent autonomous UAM applications remains a major challenge. In this project, the researchers will collaborate with committed government and industry partners on use-case-inspired fundamental research, with a focus on promoting safety and reliability of AI, machine learning and autonomy in students with diverse backgrounds.
The technical objectives of this project include: (1) Safety and Robustness of Single-Agent Reinforcement Learning: to address the “safety-critical” UAM challenge, the PIs plan min-max optimization for single-agent reinforcement learning to formally build a sufficient safety margin, constrained reinforcement learning to formulate safety as physical constraints in the state and action spaces, and a novel cautious reinforcement learning approach that uses variational policy gradient to plan the safest aircraft trajectory with minimum distributional risk; (2) Safety and Robustness of Multi-Agent Reinforcement Learning: to address the “heterogeneous agents and scalability” challenge, the PIs plan a novel federated reinforcement learning framework in which a central agent coordinates with decentralized safe agents to improve traffic throughput while guaranteeing safety, and a scaling mechanism to accommodate a varying number of decentralized aircraft; (3) Safety and Robustness from Simulations to the Real World: to address the “high-dimensionality and environment uncertainty” challenge, the researchers will focus on the robustness of the agents’ policies under distribution shift and on fast adaptation from simulation to the real world. Specifically, the plan includes value-targeted model learning to incorporate domain knowledge such as aircraft and environment physics, and a safe adaptation mechanism for use after the RL model is deployed online for flight testing or execution.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
- DOI: 10.48550/arxiv.2310.00968
- Published: 2023-10
- Journal:
- Impact factor: 0
- Authors: Qiwei Di; Tao Jin; Yue Wu; Heyang Zhao; Farzad Farnoud; Quanquan Gu
- Corresponding authors: Qiwei Di; Tao Jin; Yue Wu; Heyang Zhao; Farzad Farnoud; Quanquan Gu
Other publications by Quanquan Gu
Different patterns of gray matter density in early- and middle-late-onset Parkinson’s disease: a voxel-based morphometry study
- DOI: 10.1007/s11682-017-9745-4
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Min Xuan; Xiaojun Guan; Peiyu Huang; Zhujing Shen; Quanquan Gu; Xinfeng Yu; Xiaojun Xu; Wei Luo; Minming Zhang
- Corresponding author: Minming Zhang
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback
- DOI: 10.48550/arxiv.2404.10776
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Qiwei Di; Jiafan He; Quanquan Gu
- Corresponding author: Quanquan Gu
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Huizhuo Yuan; Zixiang Chen; Kaixuan Ji; Quanquan Gu
- Corresponding author: Quanquan Gu
Provable Multi-Objective Reinforcement Learning with Generative Models
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Dongruo Zhou; Jiahao Chen; Quanquan Gu
- Corresponding author: Quanquan Gu
Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent
- DOI: 10.48550/arxiv.2404.12376
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Yiwen Kou; Zixiang Chen; Quanquan Gu; S. Kakade
- Corresponding author: S. Kakade
Other grants by Quanquan Gu
Collaborative Research: Towards the Foundation of Approximate Sampling-Based Exploration in Sequential Decision Making
- Award number: 2323113
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
III: Small: Towards the Foundations of Training Deep Neural Networks: New Theory and Algorithms
- Award number: 2008981
- Fiscal year: 2020
- Amount: $400,000
- Award type: Continuing Grant
CIF: Small: Collaborative Research: Rank Aggregation with Heterogeneous Information Sources: Efficient Algorithms and Fundamental Limits
- Award number: 1911168
- Fiscal year: 2019
- Amount: $400,000
- Award type: Standard Grant
III: Small: Collaborative Research: High-Dimensional Machine Learning Methods for Personalized Cancer Genomics
- Award number: 1903202
- Fiscal year: 2018
- Amount: $400,000
- Award type: Continuing Grant
BIGDATA: F: Collaborative Research: Taming Big Networks via Embedding
- Award number: 1855099
- Fiscal year: 2018
- Amount: $400,000
- Award type: Standard Grant
CAREER: Scaling Up Knowledge Discovery in High-Dimensional Data Via Nonconvex Statistical Optimization
- Award number: 1906169
- Fiscal year: 2018
- Amount: $400,000
- Award type: Continuing Grant
BIGDATA: F: Collaborative Research: Taming Big Networks via Embedding
- Award number: 1741342
- Fiscal year: 2018
- Amount: $400,000
- Award type: Standard Grant
III: Small: Collaborative Learning with Incomplete and Noisy Knowledge
- Award number: 1904183
- Fiscal year: 2018
- Amount: $400,000
- Award type: Standard Grant
III: Small: Collaborative Research: High-Dimensional Machine Learning Methods for Personalized Cancer Genomics
- Award number: 1717206
- Fiscal year: 2017
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Scaling Up Knowledge Discovery in High-Dimensional Data Via Nonconvex Statistical Optimization
- Award number: 1652539
- Fiscal year: 2017
- Amount: $400,000
- Award type: Continuing Grant
Similar NSFC (National Natural Science Foundation of China) grants
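Development and application of key technologies for screening and assessing typical emerging contaminants in water-soil-solid-waste multimedia and for multi-scenario collaborative remediation
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Award type: Provincial/municipal project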
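Data-driven, multi-media synergistic selective removal of emerging contaminants by carbon-nanotube-supported transition-metal compounds
- Award number:
- Approval year: 2025
- Amount: CNY 100,000
- Award type: Provincial/municipal project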
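Mechanisms of co-transport of the radionuclide Sr and colloids in fractured media
- Award number: 42302274
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund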
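Hydrogen permeation mechanisms and hydrogen damage criteria for pipeline steel under multi-medium synergy in hydrogen-blended natural gas transport environments
- Award number: 52301075
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund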
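Mechanisms for jointly improving short-term breakdown strength and long-term endurance of electrophile-grafted energy-storage dielectrics under high temperature and strong electric fields
- Award number: 52307022
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund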
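Dual-frequency harmonic control of atmospheric-pressure dielectric barrier discharge based on multi-objective collaborative parameter optimization
- Award number: 52377141
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program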
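Principles and methods for co-design of material distribution and tendon layout in heterogeneous soft robots
- Award number: 52305014
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund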
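Fluidized catalysts for enhancing the synergy between dielectric barrier discharge and catalysts, with application to biomass tar conversion
- Award number: 52377147
- Approval year: 2023
- Amount: CNY 520,000
- Award type: General Program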
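Synergistic enhancement mechanisms of multiphase reaction and transport in preparing potassium manganate by low-oxygen-pressure alkaline leaching of pyrolusite in sub-molten salt media
- Award number: 52364045
- Approval year: 2023
- Amount: CNY 330,000
- Award type: Regional Science Fund Project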
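Grain-boundary diffusion and migration behavior in Nd-Fe-B induced by media and defects, and its cooperative regulation mechanisms
- Award number: 52361033
- Approval year: 2023
- Amount: CNY 320,000
- Award type: Regional Science Fund Project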
Similar overseas grants
Collaborative Research: CPS: Medium: Automating Complex Therapeutic Loops with Conflicts in Medical Cyber-Physical Systems
- Award number: 2322534
- Fiscal year: 2024
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: Automating Complex Therapeutic Loops with Conflicts in Medical Cyber-Physical Systems
- Award number: 2322533
- Fiscal year: 2024
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: Physics-Model-Based Neural Networks Redesign for CPS Learning and Control
- Award number: 2311084
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
CPS: Medium: Collaborative Research: Provably Safe and Robust Multi-Agent Reinforcement Learning with Applications in Urban Air Mobility
- Award number: 2312092
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: Sensor Attack Detection and Recovery in Cyber-Physical Systems
- Award number: 2333980
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: An Online Learning Framework for Socially Emerging Mixed Mobility
- Award number: 2401007
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
CPS: Medium: Collaborative Research: Robust Sensing and Learning for Autonomous Driving Against Perceptual Illusion
- Award number: 2235231
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
- Award number: 2223987
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
Collaborative Research: CPS: Medium: Mutualistic Cyber-Physical Interaction for Self-Adaptive Multi-Damage Monitoring of Civil Infrastructure
- Award number: 2305882
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant
CPS Medium: Collaborative Research: Physics-Informed Learning and Control of Passive and Hybrid Conditioning Systems in Buildings
- Award number: 2241796
- Fiscal year: 2023
- Amount: $400,000
- Award type: Standard Grant