Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning

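Basic information

  • Award number:
    2106739
  • Principal investigator:
  • Amount:
    $400,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-10-01 to 2022-03-31
  • Project status:
    Completed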

Project abstract

As a data-driven paradigm for sequential decision making in unknown environments, Reinforcement Learning (RL) has received significant interest in recent years owing to its potential to solve difficult problems associated with future societal and scientific developments. However, the explosion of both model dimensionality and complexity in current and emerging applications exacerbates the challenge of achieving efficient RL in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes, e.g., in clinical trials, online advertising, and autonomous systems. As a result, understanding and improving the sample and computational efficiencies of RL algorithms, sometimes under additional resource and system-level constraints, are rightly understood as critical to the successful deployment of RL in the future. In this project the PIs are involving students at all levels with diverse backgrounds in Electrical and Computer Engineering and in Statistics, are developing education modules on RL to enrich the curriculum, and are co-organizing workshops and outreach activities to enable the broader dissemination of the project outcomes.

Despite decades-long research efforts, the statistical and computational underpinnings of RL are still far from being well understood, especially when it comes to finite-sample and finite-time issues, which are of crucial operational value. This research project is bridging the theory-practice gap of modern algorithmic approaches to RL. It is doing so by (i) characterizing fundamental limits for the sample and computational complexities in various RL settings, (ii) developing performance guarantees and uncertainty quantification schemes, and (iii) designing new computationally efficient algorithms that are provably near-optimal in terms of sample complexity in both single-agent and multi-agent settings. The expected outcomes will enable the trustworthy adoption of RL algorithms in sample-starved environments. The complementary expertise of the research team is being leveraged to enrich the statistical and algorithmic foundations of RL through model-, policy-, and value-based approaches. New efficient algorithms that rely on function approximation schemes are being developed in order to address the curse of dimensionality; the resulting techniques are intended to lead to non-asymptotic analysis tools that deal with the complicated statistical dependencies present in RL. This rich research agenda is expected to foster multidisciplinary efforts at the intersection of high-dimensional statistics, non-convex optimization, control theory, information theory, and machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
  • DOI:
    10.1137/21m1456789
  • Published:
    2021-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wenhao Zhan;Shicong Cen;Baihe Huang;Yuxin Chen;Jason D. Lee;Yuejie Chi
  • Corresponding authors:
    Wenhao Zhan;Shicong Cen;Baihe Huang;Yuxin Chen;Jason D. Lee;Yuejie Chi
Tackling Small Eigen-Gaps: Fine-Grained Eigenvector Estimation and Inference Under Heteroscedastic Noise
Softmax policy gradient methods can take exponential time to converge
  • DOI:
    10.1007/s10107-022-01920-6
  • Published:
    2021-02
  • Journal:
  • Impact factor:
    2.7
  • Authors:
    Gen Li;Yuting Wei;Yuejie Chi;Yuantao Gu;Yuxin Chen
  • Corresponding authors:
    Gen Li;Yuting Wei;Yuejie Chi;Yuantao Gu;Yuxin Chen
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
  • DOI:
    10.1109/tit.2021.3120096
  • Published:
    2020-06
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Gen Li;Yuting Wei;Yuejie Chi;Yuantao Gu;Yuxin Chen
  • Corresponding authors:
    Gen Li;Yuting Wei;Yuejie Chi;Yuantao Gu;Yuxin Chen
Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning

Other publications by Yuxin Chen

Plant trait differences and soil moisture jointly affect insect herbivory on seedling young leaves in a subtropical forest
  • DOI:
    10.1016/j.foreco.2020.118878
  • Published:
    2021-02
  • Journal:
  • Impact factor:
    3.7
  • Authors:
    Wenbin Li;Yuxin Chen;Yong Shen;Yandan Lu;Shixiao Yu
  • Corresponding author:
    Shixiao Yu
Discovery of novel biphenyl-sulfonamide analogues as NLRP3 inflammasome inhibitors.
  • DOI:
    10.1016/j.bioorg.2024.107263
  • Published:
    2024
  • Journal:
  • Impact factor:
    5.1
  • Authors:
    Chao Huang;Jinyu Liu;Yuxin Chen;Simin Sun;Tongtong Kang;Yuqi Jiang;Xiaoyang Li
  • Corresponding author:
    Xiaoyang Li
Genicular artery embolization for the treatment of knee pain secondary to mild to severe knee osteoarthritis: One year clinical outcomes.
  • DOI:
    10.1016/j.ejrad.2024.111443
  • Published:
    2024
  • Journal:
  • Impact factor:
    3.3
  • Authors:
    Changhao Sun;Yuxin Chen;Zhiling Gao;Longyun Wu;Rong Lu;Chaoyun Zhao;Hao Yang;Yong Chen
  • Corresponding author:
    Yong Chen
Maximizing Throughput for Coexisting Wireless Body Area Networks (WBANs) Based on Optimal Clustering
  • DOI:
    10.1109/jiot.2023.3268049
  • Published:
    2023
  • Journal:
  • Impact factor:
    10.6
  • Authors:
    Xiaokang Hu;Kunqi Guo;Chenyang Wang;Yuxin Chen;Yuting Qian;Jiajun Zhang
  • Corresponding author:
    Jiajun Zhang
Chip-scale metalens microscope for wide-field and depth-of-field imaging
  • DOI:
    10.1117/1.ap.4.4.046006
  • Published:
    2022-07
  • Journal:
  • Impact factor:
    17.3
  • Authors:
    Xin Ye;Xiao Qian;Yuxin Chen;Rui Yuan;Xingjian Xiao;Chen Chen;Wei Hu;Chunyu Huang;Shining Zhu;Tao Li
  • Corresponding author:
    Tao Li

Other grants held by Yuxin Chen

Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
  • Award number:
    2313131
  • Fiscal year:
    2023
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
  • Award number:
    2221009
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Continuing Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
  • Award number:
    2218713
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Award number:
    2218773
  • Fiscal year:
    2022
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
  • Award number:
    2100158
  • Fiscal year:
    2021
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: Fine-Grained Statistical Inference in High Dimension: Actionable Information, Bias Reduction, and Optimality
  • Award number:
    2014279
  • Fiscal year:
    2020
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
CIF: Small: Taming Nonconvexity in High-Dimensional Statistical Estimation
  • Award number:
    1907661
  • Fiscal year:
    2019
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
  • Award number:
    1900140
  • Fiscal year:
    2019
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant

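Similar NSFC (National Natural Science Foundation of China) grants

Electrodeposited magnesium-hydrogen coatings on titanium-based bone implant surfaces and their osteogenesis-promoting properties
  • Award number:
    52371195
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
Pathogenic mechanism of the CLMP-mediated Connexin45-β-catenin complex in congenital short bowel syndrome
  • Award number:
    82370525
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Key technologies for highly sensitive sensing with spoof localized surface plasmons and for miniaturizing the sensing system
  • Award number:
    62371132
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Mechanisms by which preferential flow affects the hydrothermal stability of permafrost along the China-Russia crude oil pipeline
  • Award number:
    42301138
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Bidirectional regulation of the interfacial layer and electrolyte for stabilizing zinc anodes
  • Award number:
    52302289
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund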

Similar overseas grants

Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
  • Award number:
    2403122
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
  • Award number:
    2402815
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
  • Award number:
    2343599
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
  • Award number:
    2343600
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
  • Award number:
    2326905
  • Fiscal year:
    2024
  • Funding amount:
    $400,000
  • Project type:
    Standard Grant