RI: Small: Feature Encoding for Reinforcement Learning
Basic Information
- Award number: 1815300
- Principal investigator: Ronald Parr
- Amount: $0.5 million
- Host institution: Duke University
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2018
- Funding country: United States
- Project period: 2018-08-01 to 2023-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract
This project focuses on the subfield of machine learning referred to as Reinforcement Learning (RL), in which algorithms or robots learn by trial and error. As in many areas of machine learning, there has been a surge of interest in "deep learning" approaches to reinforcement learning, i.e., "Deep RL." Deep learning uses computational models motivated by structures found in the brains of animals. Deep RL has enjoyed some stunning successes, including a recent advance in which a program learned to play the Asian game of Go better than the best human player. Notably, this level of performance was achieved without any human guidance: given only the rules of the game, the program learned by playing against itself. Although games are intriguing and attention-grabbing, this feat was merely a technology demonstration. Firms are seeking to deploy Deep RL methods to increase the efficiency of their operations across a range of applications such as data center management and robotics. To realize the full potential of Deep RL, further research is required to make the training process more predictable, reliable, and efficient. Current techniques require massive amounts of training data and computation, and subtle changes in the configuration of the system can cause huge differences in the quality of the results obtained. Thus, even though RL systems can learn autonomously by trial and error, a large amount of human intuition, experience, and experimentation may be required to lay the groundwork for these systems to succeed. This proposal seeks to develop new techniques and theory to make high-quality deep RL results more widely and easily obtainable. In addition, this proposal will provide opportunities for undergraduates to be involved in research through Duke's Data+ initiative.

The proposed research is partly inspired by past work on feature selection and discovery for reinforcement learning. Much of that work focused primarily on linear value function approximation. Its relevance to deep reinforcement learning is that methods such as Deep Q-learning have a linear final layer. The preceding, nonlinear layers can therefore be interpreted as performing feature discovery for what is ultimately a linear value function approximation process. Sufficient conditions on the features that were specified for successful linear value function approximation in earlier work can now be re-interpreted as an intermediate objective function for the penultimate layer of a deep network. The proposed research aims to achieve the following objectives: 1) develop a theory of feature construction that explains and informs deep reinforcement learning methods, 2) develop improved approaches to value function approximation that are applicable to deep reinforcement learning, 3) develop improved approaches to policy search that are applicable to deep reinforcement learning, 4) develop new algorithms for exploration in reinforcement learning that take advantage of learned feature representations, and 5) perform computational experiments demonstrating the efficacy of the new algorithms on benchmark problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
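To make the feature-discovery interpretation above concrete, here is a minimal illustrative sketch (not code from the project): a tiny Q-network whose final layer is exactly linear in the activations phi(s) of the penultimate layer, so that, with the features frozen, fitting the last layer reduces to ordinary linear value function approximation. The layer sizes, the ReLU feature layer, and the random placeholder regression targets are all assumptions made for the example.

```python
import numpy as np

# Minimal illustrative sketch (hypothetical, not the project's code): a small
# Q-network whose final layer is linear in the features phi(s) produced by the
# preceding nonlinear layer, so the network can be read as linear value
# function approximation over learned features.

rng = np.random.default_rng(0)
STATE_DIM, NUM_FEATURES, NUM_ACTIONS = 4, 16, 2

# "Feature discovery" layers: here a single hidden layer, phi(s) = relu(W1 s + b1).
W1 = rng.normal(scale=0.1, size=(NUM_FEATURES, STATE_DIM))
b1 = np.zeros(NUM_FEATURES)

# Linear final layer: Q(s, a) = w_a . phi(s) + c_a, one row of W2 per action.
W2 = rng.normal(scale=0.1, size=(NUM_ACTIONS, NUM_FEATURES))
b2 = np.zeros(NUM_ACTIONS)

def features(state):
    """Penultimate-layer activations, interpreted as learned features phi(s)."""
    return np.maximum(0.0, W1 @ state + b1)

def q_values(state):
    """Q(s, .) is an exactly linear function of the learned features."""
    return W2 @ features(state) + b2

# With the features held fixed, fitting the final layer is ordinary linear
# value function approximation: e.g., least-squares regression of (placeholder)
# return targets onto phi(s) for one action.
states = rng.normal(size=(100, STATE_DIM))
targets = rng.normal(size=100)  # stand-in for sampled returns, illustration only
Phi = np.stack([features(s) for s in states])
w_fit, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
print("example Q-values for one state:", q_values(states[0]))
print("least-squares weights for the final layer:", np.round(w_fit[:4], 3), "...")
```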
Project Outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Policy Caches with Successor Features
- DOI:
- Publication year: 2021
- Journal:
- Impact factor: 0
- Authors: Mark W. Nemecek; R. Parr
- Corresponding author: Mark W. Nemecek; R. Parr
Other publications by Ronald Parr
Amazing Things Come From Having Many Good Models
- DOI:
- Publication year:
- Journal:
- Impact factor: 0
- Authors: Cynthia Rudin; Chudi Zhong; Lesia Semenova; Margo Seltzer; Ronald Parr; Jiachang Liu; Srikar Katta; Jon Donnelly; Harry Chen; Zachery Boner
- Corresponding author: Zachery Boner
An Optimal Tightness Bound for the Simulation Lemma
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Sam Lobel; Ronald Parr
- Corresponding author: Ronald Parr
Other grants by Ronald Parr
EAGER: Collaborative Research: An Unified Learnable Roadmap for Sequential Decision Making in Relational Domains
- Award number: 1836575
- Fiscal year: 2018
- Amount: $0.5 million
- Project type: Standard Grant
RI: Small: Non-parametric Approximate Dynamic Programming for Continuous Domains
- Award number: 1218931
- Fiscal year: 2012
- Amount: $0.5 million
- Project type: Standard Grant
EAGER: IIS: RI: Learning in Continuous and High Dimensional Action Spaces
- Award number: 1147641
- Fiscal year: 2011
- Amount: $0.5 million
- Project type: Standard Grant
Collaborative: RI: Feature Discovery and Benchmarks for Exportable Reinforcement Learning
- Award number: 0713435
- Fiscal year: 2007
- Amount: $0.5 million
- Project type: Standard Grant
CAREER: Observing to Plan - Planning to Observe
- Award number: 0546709
- Fiscal year: 2006
- Amount: $0.5 million
- Project type: Continuing Grant
Prediction and Planning: Bridging the Gap
- Award number: 0209088
- Fiscal year: 2002
- Amount: $0.5 million
- Project type: Standard Grant
Similar NSFC grants
Forensic application of circadian small RNAs in estimating the time of bloodstain formation
- Award number:
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNAs upregulate the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number:
- Approval year: 2022
- Amount: CNY 100,000
- Project type: Provincial/municipal project
Response and molecular mechanisms of small RNA regulation of type I-F CRISPR-Cas adaptive immunity
- Award number: 32000033
- Approval year: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanisms by which small RNAs regulate the biocontrol functions of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Amount: CNY 580,000
- Project type: General Program
Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing and biofilm formation
- Award number: 81900988
- Approval year: 2019
- Amount: CNY 210,000
- Project type: Young Scientists Fund
Function and mechanisms of key gut-bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Amount: CNY 560,000
- Project type: General Program
Molecular mechanisms of pigeon crop-milk secretion resolved by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Amount: CNY 260,000
- Project type: Young Scientists Fund
Pathogenic mechanism of Rice grassy stunt virus mediated by small RNA-directed DNA methylation
- Award number: 31772128
- Approval year: 2017
- Amount: CNY 600,000
- Project type: General Program
Immunoregulatory mechanisms of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its modulation of disease resistance
- Award number: 91640114
- Approval year: 2016
- Amount: CNY 850,000
- Project type: Major Research Plan
Similar overseas grants
Collaborative Research: SHF: Small: Sub-millisecond Topological Feature Extractor for High-Rate Machine Learning
- Award number: 2234921
- Fiscal year: 2023
- Amount: $0.5 million
- Project type: Standard Grant
Collaborative Research: SHF: Small: Sub-millisecond Topological Feature Extractor for High-Rate Machine Learning
- Award number: 2234920
- Fiscal year: 2023
- Amount: $0.5 million
- Project type: Standard Grant
Collaborative Research: SHF: Small: Sub-millisecond Topological Feature Extractor for High-Rate Machine Learning
- Award number: 2234919
- Fiscal year: 2023
- Amount: $0.5 million
- Project type: Standard Grant
III: Small: Deep Interactive Reinforcement Learning for Self-optimizing Feature Selection
- Award number: 2152030
- Fiscal year: 2022
- Amount: $0.5 million
- Project type: Standard Grant
Development of a Frontier Magnetic Resonance (MR) Imaging Technology As a Tool for Visualization and Quantified Vascular-Feature Measurement for Use in Brain and Behavioral Research on Small Animals
- Award number: 10384839
- Fiscal year: 2022
- Amount: $0.5 million
- Project type:
Odor sensing, feature extraction and its classification by machine learning and fabrication of small and lightweight e-noses
- Award number: 20K11888
- Fiscal year: 2020
- Amount: $0.5 million
- Project type: Grant-in-Aid for Scientific Research (C)
CSR: Small: Ultra-Low Power Analog Computing and Dry Skin-Electrode Contact Interface Design Techniques for Systems-On-A-Chip with EEG Sensing and Feature Extraction
- Award number: 1812588
- Fiscal year: 2018
- Amount: $0.5 million
- Project type: Standard Grant
CCSS: Small: Universal Feature Selection in Integrated Monitoring of Large Networks
- Award number: 1711027
- Fiscal year: 2017
- Amount: $0.5 million
- Project type: Standard Grant
III: Small: Collaborative Research: A General Feature Learning Framework for Dynamic Attributed Networks
- Award number: 1718840
- Fiscal year: 2017
- Amount: $0.5 million
- Project type: Standard Grant
III: Small: Unsupervised Feature Selection in the Era of Big Data
- Award number: 1714741
- Fiscal year: 2017
- Amount: $0.5 million
- Project type: Standard Grant