Collaborative Research: Towards the Foundation of Approximate Sampling-Based Exploration in Sequential Decision Making
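Basic Information
- Award number: 2323112
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-10-01 to 2026-09-30
- Project status: Ongoing
- Source:
- Keywords:
Abstract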
Sequential decision-making problems, such as bandits and reinforcement learning, play a crucial role in various AI applications, including recommendation systems, robotics, games, and personalized healthcare. The main challenge lies in finding the optimal exploration strategy that strikes a balance between choosing actions with the best performance and choosing actions with high uncertainties. However, existing exploration strategies heavily depend on specific cases, requiring prior knowledge of reward distribution, function approximation, and the task at hand. This creates computational obstacles and hampers real-world applicability. This project aims to establish a theoretical foundation for using approximate sampling-based techniques to unify exploration strategies across different sequential decision problems. The goal is to develop efficient and provable algorithms applicable to diverse learning problems under a unified algorithmic framework based on approximate sampling. This project also provides research training opportunities for graduate students. The project consists of three tasks. Task one focuses on developing fast approximate sampling-based exploration strategies for contextual bandit problems, accompanied by theoretical guarantees. Task two involves implementing and generalizing these exploration algorithms to more complex sequential decision-making applications, leveraging deep neural networks. Task three aims to establish efficient and provably effective exploration strategies for reinforcement learning problems. These advancements will be translated into accessible tools for various bandit and reinforcement learning applications, providing verifiable guarantees. 
The open-source software and course materials resulting from this project will be made publicly available, benefiting research, education, and society at large. This award by the Division of Mathematical Sciences is jointly supported by the NSF Office of Advanced Cyberinfrastructure. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
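To make the idea of approximate sampling-based exploration concrete, the following is a minimal sketch and not the project's own algorithm. It runs Thompson-sampling-style exploration on a multi-armed bandit, drawing each arm's posterior sample with a few steps of unadjusted Langevin dynamics instead of an exact posterior draw. The assumptions are ours: unit-variance Gaussian rewards, a N(0, 1) prior on each arm mean, and the hypothetical function names `langevin_sample` and `run_bandit`.

```python
import math
import random


def langevin_sample(reward_sum, pulls, steps=50, rng=random):
    """Approximately sample an arm mean from its posterior via unadjusted Langevin dynamics.

    Assumes a N(0, 1) prior and unit-variance Gaussian rewards, so the gradient of
    the log-posterior at mu is reward_sum - (pulls + 1) * mu.
    """
    precision = pulls + 1.0
    step = 0.5 / precision          # step size scaled to the posterior curvature
    mu = reward_sum / precision     # warm start at the posterior mean
    for _ in range(steps):
        grad = reward_sum - precision * mu
        # Gradient step plus injected Gaussian noise (the Langevin update).
        mu += step * grad + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return mu


def run_bandit(true_means, horizon=2000, seed=0):
    """Play the bandit for `horizon` rounds and return the pull count of each arm."""
    rng = random.Random(seed)
    k = len(true_means)
    sums = [0.0] * k    # running sum of observed rewards per arm
    pulls = [0] * k     # number of times each arm was played
    for _ in range(horizon):
        # One approximate posterior sample per arm, then act greedily on the samples.
        samples = [langevin_sample(sums[a], pulls[a], rng=rng) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = rng.gauss(true_means[arm], 1.0)
        sums[arm] += reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(run_bandit([0.1, 0.5, 0.9]))
```

Arms with few pulls have wide posteriors, so their samples are occasionally large and they get explored; arms with many pulls are chosen mainly when their estimated mean is high. In this toy Gaussian case the posterior could be sampled exactly. The Langevin update is used because it needs only gradients of the log-posterior, which is the property that allows the same recipe to be applied to models without closed-form posteriors.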
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Pan Xu
Rhodium-Catalyzed Direct C7 Alkynylation of Indolines.
- DOI: 10.1002/chin.201533157
- Published: 2015
- Journal:
- Impact factor: 0
- Authors: N. Jin; C. Pan; Honglin Zhang; Pan Xu; Yixiang Cheng; Chengjian Zhu
- Corresponding author: Chengjian Zhu
Two-Sided Capacitated Submodular Maximization in Gig Platforms
- DOI: 10.48550/arxiv.2309.09098
- Published: 2023-09
- Journal:
- Impact factor: 0
- Authors: Pan Xu
- Corresponding author: Pan Xu
A promising heat-induced supramolecular metallogel electrolyte for quasi-solid-state dye-sensitized solar cells
- DOI: 10.1007/s10008-019-04258-w
- Published: 2019-04
- Journal:
- Impact factor: 2.5
- Authors: Zhang Wei; Wang Zhiyuan; Tao Li; Duan Keyu; Wang Hao; Zhang Jun; Pan Xu; Huo Zhipeng
- Corresponding author: Huo Zhipeng
Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Aditya Chaudhry; Pan Xu; Quanquan Gu
- Corresponding author: Quanquan Gu
41‐1: Invited Paper: Internal Compensation Type OLED Display Using a‐IGZO TFTs
- DOI: 10.1002/sdtp.17077
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Pan Xu; Ying Han; Fengjuan Liu; Guang Yan; Mingyi Zhu; Linlin Wang; Yongqian Li; Jianwei Yu; Xue Dong
- Corresponding author: Xue Dong
Other grants by Pan Xu
CRII: RI: Fairness and Profitability in Online Matching Markets
- Award number: 1948157
- Fiscal year: 2020
- Funding amount: $300,000
- Project type: Standard Grant
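Similar National Natural Science Foundation of China (NSFC) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Project type: General Program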
Similar overseas grants
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
- Award number: 2349935
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Continuing Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
- Award number: 2349934
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Continuing Grant
Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
- Award number: 2411152
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
- Award number: 2349936
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Continuing Grant
Collaborative Research: Multiple Team Membership (MTM) through Technology: A path towards individual and team wellbeing?
- Award number: 2345652
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
- Award number: 2411153
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: Maritime to Inland Transitions Towards ENvironments for Convection Initiation (MITTEN CI)
- Award number: 2349937
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Continuing Grant
Collaborative Research: Multiple Team Membership (MTM) through Technology: A path towards individual and team wellbeing?
- Award number: 2345651
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: Frameworks: MobilityNet: A Trustworthy CI Emulation Tool for Cross-Domain Mobility Data Generation and Sharing towards Multidisciplinary Innovations
- Award number: 2411151
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant
Collaborative Research: SaTC: CORE: Small: Towards Secure and Trustworthy Tree Models
- Award number: 2413046
- Fiscal year: 2024
- Funding amount: $300,000
- Project type: Standard Grant