Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Distributionally Robust Policy Learning
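Basic information
- Award number: 2312204
- Principal investigator:
- Amount: $800,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-10-01 to 2027-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project abstract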
Efficient data-driven policy learning and deployment techniques are transforming many facets of society because of their broad applicability in engineering, scientific, and societal applications. Given access to high-performance computing, simulators and digital twins, for example, have emerged as practical alternatives for testing and learning complex optimization policies. As a result, significant scholarly effort has been devoted to this research area in the past decade. However, despite landmark progress, existing work in this area often makes a key (implicit) assumption: that the environment in which the policy is trained is the same as the environment in which it is deployed. Policies learned under this assumption can be fragile, because the assumption often fails in practice, owing either to simulator model misspecification or to environment shifts. The goal of this project is to study statistical and algorithmic foundations for provably efficient robust policy learning in unknown environments, under a possibly misspecified generative model. The project studies comprehensive statistical and algorithmic foundations for distributionally robust policy learning in contextual bandit and reinforcement learning (RL) environments and develops statistically optimal and computationally efficient algorithms across a wide range of non-parametric distributional shifts. These shifts provide a powerful framework for capturing model-agnostic environment changes but, at the same time, pose intellectual challenges because the unknown worst-case environment lies in an infinite-dimensional space. The program opens up several fundamental research directions that call for novel and principled developments.
First, the project develops information-theoretic tools to understand the fundamental learning limits for distributionally robust policy learning and to characterize how the distributional uncertainty contributes to the difficulty of learning. Additionally, the project develops computationally efficient and statistically optimal estimation schemes for distributionally robust performance analysis of a given policy. Lastly, the project translates the efficiency gains in estimation due to learning a distributionally robust policy. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Jose Blanchet
Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Shengbo Wang; Jose Blanchet; Peter Glynn
- Corresponding author: Peter Glynn
A Model of Bed Demand to Facilitate the Implementation of Data-driven Recommendations for COVID-19 Capacity Management
- DOI: 10.21203/rs.3.rs-31953/v1
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Teng Zhang; Kelly A McFarlane; J. Vallon; Linying Yang; Jin Xie; Jose Blanchet; P. Glynn; Kristan Staudenmayer; K. Schulman; D. Scheinker
- Corresponding author: D. Scheinker
When are Unbiased Monte Carlo Estimators More Preferable than Biased Ones?
- DOI: 10.48550/arxiv.2404.01431
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Guanyang Wang; Jose Blanchet; P. Glynn
- Corresponding author: P. Glynn
Modeling shortest paths in polymeric networks using spatial branching processes
- DOI: 10.1016/j.jmps.2024.105636
- Published: 2023
- Journal:
- Impact factor: 5.3
- Authors: Zhenyuan Zhang; Shaswat Mohanty; Jose Blanchet; Wei Cai
- Corresponding author: Wei Cai
Efficient Steady-State Simulation of High-Dimensional Stochastic Networks
- DOI: 10.1287/stsy.2021.0077
- Published: 2020-01
- Journal:
- Impact factor: 0
- Authors: Jose Blanchet; Xinyun Chen; Nian Si; Peter W. Glynn
- Corresponding author: Peter W. Glynn
Other grants held by Jose Blanchet
Collaborative Research: AMPS: Rare Events in Power Systems: Novel Mathematics, Statistics and Algorithms.
- Award number: 2229011
- Fiscal year: 2023
- Funding amount: $800,000
- Award type: Standard Grant
DMS-EPSRC: Fast Martingales, Large Deviations, and Randomized Gradients for Heavy-tailed Distributions
- Award number: 2118199
- Fiscal year: 2021
- Funding amount: $800,000
- Award type: Continuing Grant
Robust Wasserstein Profile Inference
- Award number: 1915967
- Fiscal year: 2019
- Funding amount: $800,000
- Award type: Continuing Grant
An Approach to Robust Performance Analysis Using Optimal Transport
- Award number: 1820942
- Fiscal year: 2018
- Funding amount: $800,000
- Award type: Continuing Grant
Collaborative Proposal: Strong Stochastic Simulation of Stochastic Processes Theory and Applications
- Award number: 1838576
- Fiscal year: 2018
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Proposal: Strong Stochastic Simulation of Stochastic Processes Theory and Applications
- Award number: 1720451
- Fiscal year: 2017
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: Perfect Simulation of Stochastic Networks
- Award number: 1538217
- Fiscal year: 2015
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: Modeling and Analyzing Extreme Risks in Insurance and Finance
- Award number: 1436700
- Fiscal year: 2014
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: Optimal Monte Carlo Estimation via Randomized Multilevel Methods
- Award number: 1320550
- Fiscal year: 2013
- Funding amount: $800,000
- Award type: Continuing Grant
CAREER: Efficient Monte Carlo Methods in Engineering and Science: From Coarse Analysis to Refined Estimators
- Award number: 0846816
- Fiscal year: 2009
- Funding amount: $800,000
- Award type: Standard Grant
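Similar NSFC (National Natural Science Foundation of China) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Award type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Award type: General Program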
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Funding amount: $800,000
- Award type: Standard Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Funding amount: $800,000
- Award type: Standard Grant