Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
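Basic Information
- Award number: 2221009
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-01-01 to 2025-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract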
As a data-driven paradigm for sequential decision making in unknown environments, Reinforcement Learning (RL) has received significant interest in recent years owing to its potential to solve difficult problems associated with future societal and scientific developments. However, the explosion of both model dimensionality and complexity in current and emerging applications exacerbates the challenge of achieving efficient RL in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes, e.g., in clinical trials, online advertising, and autonomous systems. As a result, understanding and improving the sample and computational efficiencies of RL algorithms, sometimes under additional resource and system-level constraints, are rightly understood as critical to the successful deployment of RL in the future. In this project, the PIs are involving students at all levels with diverse backgrounds in Electrical and Computer Engineering and in Statistics, are developing education modules on RL to enrich the curriculum, and are co-organizing workshops and outreach activities to enable the broader dissemination of the project outcomes.
Despite decades-long research efforts, the statistical and computational underpinnings of RL are still far from being well understood, especially when it comes to finite-sample and finite-time issues, which are of crucial operational value. This research project is bridging the theory-practice gap of modern algorithmic approaches to RL. It is doing so by (i) characterizing fundamental limits for the sample and computational complexities in various RL settings, (ii) developing performance guarantees and uncertainty quantification schemes, and (iii) designing new computationally efficient algorithms that are provably near-optimal in terms of sample complexity in both single-agent and multi-agent settings. The expected outcomes will enable the trustworthy adoption of RL algorithms in sample-starved environments. The complementary expertise of the research team is being leveraged to enrich the statistical and algorithmic foundations of RL through model-, policy-, and value-based approaches. New efficient algorithms that rely on function approximation schemes are being developed in order to address the curse of dimensionality; the resulting techniques are intended to lead to non-asymptotic analysis tools that deal with the complicated statistical dependencies present in RL. This rich research agenda is expected to foster multidisciplinary efforts at the intersection of high-dimensional statistics, non-convex optimization, control theory, information theory, and machine learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal papers (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization
- DOI: 10.1287/opre.2021.2151
- Publication date: 2020-07
- Journal:
- Impact factor: 0
- Authors: Shicong Cen; Chen Cheng; Yuxin Chen; Yuting Wei; Yuejie Chi
- Corresponding authors: Shicong Cen; Chen Cheng; Yuxin Chen; Yuting Wei; Yuejie Chi
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
- DOI:
- Publication date: 2022-02
- Journal:
- Impact factor: 0
- Authors: Laixi Shi; Gen Li; Yuting Wei; Yuxin Chen; Yuejie Chi
- Corresponding authors: Laixi Shi; Gen Li; Yuting Wei; Yuxin Chen; Yuejie Chi
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
- DOI: 10.1137/21m1456789
- Publication date: 2021-05
- Journal:
- Impact factor: 0
- Authors: Wenhao Zhan; Shicong Cen; Baihe Huang; Yuxin Chen; Jason D. Lee; Yuejie Chi
- Corresponding authors: Wenhao Zhan; Shicong Cen; Baihe Huang; Yuxin Chen; Jason D. Lee; Yuejie Chi
Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning
- DOI: 10.1093/imaiai/iaac034
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Li, Gen; Shi, Laixi; Chen, Yuxin; Chi, Yuejie
- Corresponding author: Chi, Yuejie
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
- DOI: 10.1109/tit.2021.3120096
- Publication date: 2020-06
- Journal:
- Impact factor: 2.5
- Authors: Gen Li; Yuting Wei; Yuejie Chi; Yuantao Gu; Yuxin Chen
- Corresponding authors: Gen Li; Yuting Wei; Yuejie Chi; Yuantao Gu; Yuxin Chen
Other publications by Yuxin Chen
Settling the Sample Complexity of Model-Based Offline Reinforcement Learning
- DOI: 10.48550/arxiv.2204.05275
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Gen Li; Laixi Shi; Yuxin Chen; Yuejie Chi; Yuting Wei
- Corresponding author: Yuting Wei
Simultaneous determination of polycyclic musks in blood and urine by solid supported liquid–liquid extraction and gas chromatography–tandem mass spectrometry
- DOI: 10.1016/j.jchromb.2015.04.028
- Publication date: 2015
- Journal:
- Impact factor: 3
- Authors: Hongtao Liu; Liping Huang; Yuxin Chen; Liman Guo; Limin Li; Haiyun Zhou; Tiangang Luan
- Corresponding author: Tiangang Luan
Intelligent GP fusion from multiple sources for text classification
- DOI: 10.1145/1099554.1099688
- Publication date: 2005
- Journal:
- Impact factor: 0
- Authors: Baoping Zhang; Yuxin Chen; Weiguo Fan; E. Fox; Marcos André Gonçalves; Marco Cristo; P. Calado
- Corresponding author: P. Calado
UNDERSTANDING USER INTENTIONS IN VERTICAL IMAGE SEARCH
- DOI:
- Publication date: 2011
- Journal:
- Impact factor: 0
- Authors: Yuxin Chen
- Corresponding author: Yuxin Chen
Coptis chinensis inflorescence extract protection against ultraviolet-B-induced phototoxicity, and HPLC-MS analysis of its chemical composition.
- DOI:
- Publication date: 2012
- Journal:
- Impact factor: 4.3
- Authors: Lingxin Zhu; Bo Huang; Xiaoquan Ban; Jingsheng He; Yuxin Chen; Li Han; Youwei Wang
- Corresponding author: Youwei Wang
Other grants held by Yuxin Chen
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Award amount: $400,000
- Project type: Standard Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
- Award number: 2218713
- Fiscal year: 2022
- Award amount: $400,000
- Project type: Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2218773
- Fiscal year: 2022
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
- Award number: 2106739
- Fiscal year: 2021
- Award amount: $400,000
- Project type: Continuing Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2100158
- Fiscal year: 2021
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: Fine-Grained Statistical Inference in High Dimension: Actionable Information, Bias Reduction, and Optimality
- Award number: 2014279
- Fiscal year: 2020
- Award amount: $400,000
- Project type: Standard Grant
CIF: Small: Taming Nonconvexity in High-Dimensional Statistical Estimation
- Award number: 1907661
- Fiscal year: 2019
- Award amount: $400,000
- Project type: Standard Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
- Award number: 1900140
- Fiscal year: 2019
- Award amount: $400,000
- Project type: Standard Grant
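Similar grants from the National Natural Science Foundation of China
Construction of Firmly Bonded Interfaces in Hydrogel-Modified Ceramic Artificial Joints and Their Wear-Reduction and Lubrication Mechanisms
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Domain Dynamics of Lead Zirconate-Based Antiferroelectrics and Their Regulation Mechanisms
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Adsorption and Immobilization of Soil Cadmium Contamination by Iron-Loaded Biochar and the Mechanism of Microbial Synergy
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Mechanism by Which the SREBP Transcription Factor BbSre1 Negatively Regulates the Production of Antifungal Substances in Beauveria bassiana
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Time-Varying Encoding of Joint-Motion Feedback in Myoelectric Prosthetic Hands for Restoring Movement Perception in Amputees
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Carbon Dot/Microfluidic Confinement-Enhanced Luminescence Sensing for Rapid Emergency Water-Quality Testing
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Physics-Informed Hybrid Modeling and Non-Collocated Control Methods for Flexible Piezoelectric Solar Arrays
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Regularity of the Stochastic Three-Dimensional Burgers Equation
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Mechanism by Which Kynurenine Activates Granulocytic MDSCs via the AhR/STAT3 Axis to Promote Cardiac Fibrosis in Chronic Kidney Disease
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project
Machine Learning Studies of Magnetism: A Focus on Graph Neural Networks
- Award number:
- Approval year: 2025
- Award amount: 0.0 (unit: CNY 10,000)
- Project type: Provincial/municipal project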
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Award amount: $400,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Award amount: $400,000
- Project type: Standard Grant