Collaborative Research: CIF: Medium: Learning to Control from Data: from Theory to Practice
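Basic information
- Award number: 2211210
- Principal investigator:
- Amount: $398,900
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-10-01 to 2026-09-30
- Status: Ongoing
- Source:
- Keywords:
Project abstract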
Data-driven decision-making is playing an increasingly critical role in today's world with examples ranging from epidemic response to ridesharing optimization. However, learning an optimal control policy from data faces challenges in both the offline and online settings: (a) (Offline) It is unclear how to most efficiently utilize the available dataset which was collected a priori, especially when it does not cover all possible scenarios of interest. (b) (Online) It is unclear how to collect a dataset through minimal interactions with the environment in situations where it may be costly and unsafe to do so. Driven by the need to address these two challenges, this project aims to improve the sample efficiency of reinforcement learning (RL) in both settings. In addition, the project plans to incorporate adaptivity and trustworthiness that are required in practice. Activities complementary to these research thrusts include the training of future leaders of academia, industry, and government by equipping them with fundamental skills in data-driven decision making.

The goal of this project is to develop the theory and algorithms for a new generation of data-driven decision rules in order to address critical challenges in modern RL. Specifically, the research agenda aims (i) to design sample-efficient and computationally-efficient algorithms for online and offline RL with function approximation, and (ii) to enhance the adaptivity and trustworthiness of existing RL paradigms. To achieve the first goal, we propose to incorporate optimistic exploration for online RL and pessimistic exploitation for offline RL into existing approaches with the help of faithful uncertainty quantification for neural networks. To achieve the second goal, we propose to incorporate model selection into existing approaches with the help of tight sample complexity characterizations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Zhaoran Wang
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
- DOI: 10.48550/arxiv.2405.19332
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Shenao Zhang; Donghan Yu; Hiteshi Sharma; Ziyi Yang; Shuohang Wang; Hany Hassan; Zhaoran Wang
- Corresponding author: Zhaoran Wang
Safe MPC Alignment with Human Directional Feedback
- DOI:
- Year: 2024
- Journal:
- Impact factor: 0
- Authors: Zhixian Xie; Wenlong Zhang; Yi Ren; Zhaoran Wang; George J. Pappas; Wanxin Jin
- Corresponding author: Wanxin Jin
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Shenao Zhang; Wanxin Jin; Zhaoran Wang
- Corresponding author: Zhaoran Wang
Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information
- DOI:
- Year: 2022
- Journal:
- Impact factor: 0
- Authors: Zuyue Fu; Zhengling Qi; Zhuoran Yang; Zhaoran Wang; Lan Wang
- Corresponding author: Lan Wang
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes
- DOI:
- Year: 2022
- Journal:
- Impact factor: 0
- Authors: Zuyue Fu; Zhengling Qi; Zhaoran Wang; Zhuoran Yang; Yanxun Xu; Michael R. Kosorok
- Corresponding author: Michael R. Kosorok
Other grants awarded to Zhaoran Wang
CAREER: Principled Deep Reinforcement Learning for Societal Systems
- Award number: 2048075
- Fiscal year: 2021
- Amount: $398,900
- Award type: Continuing Grant
Collaborative Research: CIF: Small: A Unified Framework of Distributional Optimization via Variational Transport
- Award number: 2008827
- Fiscal year: 2020
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: High-Dimensional Decision Making and Inference with Applications for Personalized Medicine
- Award number: 2015568
- Fiscal year: 2020
- Amount: $398,900
- Award type: Continuing Grant
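Similar National Natural Science Foundation of China (NSFC) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Award type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Award type: General Program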
Similar overseas grants
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402815
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343599
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Award number: 2343600
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402817
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326622
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
- Award number: 2402816
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403123
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Award number: 2326621
- Fiscal year: 2024
- Amount: $398,900
- Award type: Standard Grant
Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
- Award number: 2312872
- Fiscal year: 2023
- Amount: $398,900
- Award type: Standard Grant