
RI: Medium: Provable Reinforcement Learning with Function Approximation and Neural Networks


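Basic information

Award number:
2107304
Principal investigator:
Chi Jin
Amount:
$1.2 million
Institution:
Princeton University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2021
Funding country:
United States
Period:
2021-10-01 to 2024-09-30
Project status:
Completed

Source: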
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2107304&HistoricalAwards=false
Keywords:
RI Medium Provable Reinforcement Learning

Project abstract

Reinforcement Learning (RL) is a generic and flexible framework for sequential decision-making problems. Modern RL commonly engages practical problems with an enormous number of states, where function approximation must be deployed to generalize knowledge from the visited states to the unvisited ones. Function approximation, particularly in the form of deep neural networks, lies at the heart of the recent practical successes of RL in domains such as robotics, autonomous vehicles, business management, and production systems. However, most existing theoretical understanding of RL has been restricted to the problems with a small number of states without using function approximation, and a significant gap remains between theory and practice of RL. This project seeks to bridge this gap by identifying and addressing the fundamental challenges that are persistent in RL with function approximation.

To accomplish this goal, this project will develop a comprehensive set of fundamental theory and methodologies for RL with function approximation, with a special emphasis on its applicability to modern deep RL. Concretely, this project will proceed with two parallel thrusts. The first thrust investigates model-free RL with general function approximation. This thrust will identify the general structure of the function classes where RL problems are tractable, design new provably efficient algorithms for those general function classes, and address the challenging issues such as model misspecification. This thrust will further integrate these results with recent advances in representation, optimization and generalization of deep learning. The second thrust concerns model-based RL to incorporate domain knowledge. This thrust will first develop a general-purpose model-based RL method using the idea of value-targeted system identification. This thrust will also develop stochastic-approximation variants of the methods for tractable computation, and deep model reduction or feature learning methods for analyzing off-policy data prior to on-policy model-based RL. Important outcomes of this project will be new general and reliable RL algorithms that are guaranteed to perform well for a wide range of applications with both computational and statistical efficiency.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.


Project outcomes

Journal articles (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms

DOI:
Published:
2021-02
Journal:
Impact factor:
0
Authors:
Chi Jin; Qinghua Liu; Sobhan Miryoosefi
Corresponding authors:
Chi Jin; Qinghua Liu; Sobhan Miryoosefi

A Simple Reward-free Approach to Constrained Reinforcement Learning

DOI:
Published:
2022
Journal:
International Conference on Machine Learning
Impact factor:
0
Authors:
Sobhan Miryoosefi; Chi Jin
Corresponding authors:
Sobhan Miryoosefi; Chi Jin


Other publications by Chi Jin

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

DOI:
10.48550/arxiv.2203.06803
Published:
2022
Journal:
Proceedings of the forty-seventh annual ACM symposium on Theory of Computing
Impact factor:
0
Authors:
Qinghua Liu; Yuanhao Wang; Chi Jin
Corresponding author:
Chi Jin

The stability control for isolated wind‐diesel power system based on the cross coupling effect model

DOI:
10.1049/gtd2.12089
Published:
2020-12
Journal:
IET Generation, Transmission & Distribution
Impact factor:
2.5
Authors:
Yang Mi; Lang Zhongjie; Chen Xin; Yang Fu; Chi Jin; Shi Shuai; Zhao Yao; Enyu Jiang
Corresponding author:
Enyu Jiang