Fast Reinforcement Learning Using Multiple Models and State Decomposition
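Basic Information
- Award number: 1407925
- Principal investigator:
- Amount: $154,200
- Institution:
- Institution country: United States
- Program type: Standard Grant
- Fiscal year: 2014
- Funding country: United States
- Period: 2014-08-15 to 2017-07-31
- Project status: Completed
- Source:
- Keywords: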
Project Abstract
This project aims to develop better methods for Reinforcement Learning and Approximate Dynamic Programming (RLADP) that can handle decision tasks of greater complexity in both time and space. Reinforcement learning systems learn to maximize any measure of performance or satisfaction from their experience of observing their environment, acting on it, and receiving feedback on performance, similar to the pain or pleasure that reinforces animal behavior. Current reinforcement learning methods do not learn fast enough to perform well when the environment is too complex in space or in time. This project will develop new methods to handle that kind of complexity. The team will also collaborate with IBM Research and will try to address a testbed problem involving the management of a fleet of plug-in hybrid cars.
Complexity in time will be handled by a multiple-model approach that connects various options or skills by evaluating and updating the landmark states that mark transitions between different regions of state space. This is similar to earlier work on decision blocks and modified Bellman equations presented at the PI's workshop on learning and adaptive systems, but is otherwise a unique, new, and important direction. Complexity in space is addressed by a multiagent approach based on a kind of spatial decomposition.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Snehasis Mukhopadhyay
Homogeneous Agent-Based Distributed Information Filtering
- DOI: 10.1023/a:1019760221121
- Published: 2002-10-01
- Journal:
- Impact factor: 4.100
- Authors: Rajeev R. Raje; Mingyong Qiao; Snehasis Mukhopadhyay; Mathew Palakal; Shengquan Peng; Javed Mostafa
- Corresponding author: Javed Mostafa
COBioSIFTER – A CORBA-Based Distributed Multi-Agent Biological Information Management System
- DOI: 10.1023/b:clus.0000039496.64629.32
- Published: 2004-10-01
- Journal:
- Impact factor: 4.100
- Authors: Rajeev R. Raje; Daocheng Zhu; Snehasis Mukhopadhyay; Liying Tang; Mathew Palakal; Javed Mostafa
- Corresponding author: Javed Mostafa
A bidding mechanism for Web-based agents involved in information classification
- DOI: 10.1023/a:1019215815209
- Published: 1998-01-01
- Journal:
- Impact factor: 3.400
- Authors: Rajeev R. Raje; Snehasis Mukhopadhyay; Michael Boyles; Artur Papiez; Nila Patel; Mathew Palakal; Javed Mostafa
- Corresponding author: Javed Mostafa
Other Grants by Snehasis Mukhopadhyay
Collaborative Research: Mutual Learning: A Systems Theoretic Investigation
- Award number: 1930606
- Fiscal year: 2019
- Award amount: $154,200
- Program type: Standard Grant
ITR: An Active, Personalized, Adaptive, Multi-format Biological Information Delivery System
- Award number: 0081944
- Fiscal year: 2000
- Award amount: $154,200
- Program type: Continuing Grant
Career: Adaptation and Learning in Distributed Systems Using Neural Networks
- Award number: 9623971
- Fiscal year: 1996
- Award amount: $154,200
- Program type: Continuing Grant
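Similar NSFC Grants
Testing and Genetic Basis of Reinforcement in Hybrid Zones of the Genus Sonneratia
- Award number: 30800060
- Approval year: 2008
- Award amount: 230,000 yuan
- Program type: Young Scientists Fund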
Similar Overseas Grants
Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
- Award number: 2347423
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Standard Grant
CAREER: Stochasticity and Resilience in Reinforcement Learning: From Single to Multiple Agents
- Award number: 2339794
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Continuing Grant
Learning to Reason in Reinforcement Learning
- Award number: DP240103278
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Discovery Projects
CAREER: Towards Real-world Reinforcement Learning
- Award number: 2339395
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Continuing Grant
CAREER: Robust Reinforcement Learning Under Model Uncertainty: Algorithms and Fundamental Limits
- Award number: 2337375
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Continuing Grant
Optimizing Intelligent Vehicular Routing with Edge Computing through Multi-Agent Reinforcement Learning
- Award number: 24K14913
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Grant-in-Aid for Scientific Research (C)
CAREER: Temporal Causal Reinforcement Learning and Control for Autonomous and Swarm Cyber-Physical Systems
- Award number: 2339774
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Continuing Grant
Federated Reinforcement Learning Empowered Point Cloud Video Streaming
- Award number: 24K14927
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Grant-in-Aid for Scientific Research (C)
Collaborative Research: CDS&E: Generalizable RANS Turbulence Models through Scientific Multi-Agent Reinforcement Learning
- Award number: 2347422
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Standard Grant
CAREER: Structure Exploiting Multi-Agent Reinforcement Learning for Large Scale Networked Systems: Locality and Beyond
- Award number: 2339112
- Fiscal year: 2024
- Award amount: $154,200
- Program type: Continuing Grant