Robust Decision-Aware Model-based Reinforcement Learning

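Basic Information

  • Grant number:
    RGPIN-2021-03701
  • Principal investigator:
  • Amount:
    $21,100
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Period:
    2022-01-01 to 2023-12-31
  • Status:
    Completed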

Project Abstract

Reinforcement learning (RL) is the problem of designing an agent that interacts with its environment and adaptively improves its long-term performance. Many complex real-world decision-making problems can be formulated as RL problems. Example applications include energy management systems for hybrid cars, dynamic treatment regimes in healthcare, and many others in robotics, finance, and beyond. RL is at the core of AI and has the potential to have a huge impact on our economy and society, arguably more so than any other area of machine learning. Despite this promise, RL as a technology is not ready for most real-world applications. A major source of difficulty is the high sample complexity of RL agents, that is, the number of interactions (or data points) required to achieve a certain level of performance. An RL agent that requires too many samples before performing well is unsuitable for real-world applications, in which obtaining new samples is often costly and time-consuming. Model-based RL (MBRL) is a promising approach to designing sample-efficient agents for problems where the number of interactions with the real world cannot be very large. The basic idea of MBRL is to learn a model of the environment and then use the model as an internal simulator to plan a good policy, i.e., a strategy for selecting actions. This may improve the sample complexity of the agent, but only if the learned model of the real world is accurate. The conventional approach to model learning, which is based on learning a good predictive model of the environment, has an important shortcoming: it rests on the belief that an accurate predictor is sufficient for planning. The often-unnoticed fact is that no model can be completely accurate; there are always some errors between the real world and the model, and the real world is sometimes too complex for our models. What I suggest in my research program is to rethink how we should do MBRL.
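The basic MBRL recipe described above (learn a model from interaction data, then plan inside it) can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration and does not come from the project: the four-state noisy chain `true_step`, the count-based maximum-likelihood model in `learn_model`, and the use of value iteration in `plan`.

```python
import random
from collections import defaultdict

N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9

def true_step(state, action, rng):
    """Hypothetical 'real world': a noisy chain; action 1 moves right, 0 moves left."""
    move = 1 if action == 1 else -1
    if rng.random() < 0.1:  # 10% of the time the move fails and the agent stays put
        move = 0
    next_state = min(max(state + move, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def learn_model(samples):
    """Maximum-likelihood tabular model: transition frequencies and mean rewards."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s2 in samples:
        counts[(s, a)][s2] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, nxt in counts.items():
        n = sum(nxt.values())
        model[sa] = ({s2: c / n for s2, c in nxt.items()}, reward_sum[sa] / n)
    return model

def plan(model, iters=200):
    """Value iteration run entirely inside the learned model (no real interaction)."""
    v = [0.0] * N_STATES

    def q(s, a):
        # Unseen (state, action) pairs are assumed to self-loop with zero reward.
        probs, r = model.get((s, a), ({s: 1.0}, 0.0))
        return r + GAMMA * sum(p * v[s2] for s2, p in probs.items())

    for _ in range(iters):
        v = [max(q(s, a) for a in range(N_ACTIONS)) for s in range(N_STATES)]
    # Greedy policy with respect to the planned values.
    return [max(range(N_ACTIONS), key=lambda a: q(s, a)) for s in range(N_STATES)]

# Collect a fixed budget of real interactions, then plan in the learned model.
rng = random.Random(0)
samples = []
for _ in range(500):
    s, a = rng.randrange(N_STATES), rng.randrange(N_ACTIONS)
    s2, r = true_step(s, a, rng)
    samples.append((s, a, r, s2))

policy = plan(learn_model(samples))
print(policy)
```

The only contact with the "real" environment is the 500 sampled transitions; all planning iterations run against the learned model, which is where the potential sample-efficiency gain comes from.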
Trying to learn complex dynamics that are irrelevant to the underlying decision problem is pointless. A conventional model learning approach cannot discriminate between decision-relevant and decision-irrelevant aspects of the environment, and hence wastes the capacity of a model on unnecessary detail. The fundamental idea of this research program is that instead of trying to learn a model that is a good predictor of the environment, one should only learn about the aspects that are relevant to the decision problem. The scientific impact of this research program is that it opens up and explores an unorthodox way of thinking about how an agent should learn about its environment. I expect my research team's progress in this direction to provide the theoretical and foundational groundwork for the future of model-based RL. I also expect it to lead to sample-efficient RL agents that can be used for real-world applications.
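One possible way to make "decision-relevant" concrete is to score a model by how much it distorts the expected value of the next state rather than by how well it predicts the next state itself. The abstract does not specify any particular loss, so the sketch below is an assumption: the function names, the value vector `value`, and the two candidate models are hypothetical and chosen only to show that the two criteria can disagree.

```python
def l1_prediction_error(p_true, p_model):
    """Conventional criterion: L1 distance between next-state distributions."""
    return sum(abs(p - q) for p, q in zip(p_true, p_model))

def value_aware_error(p_true, p_model, value):
    """Decision-oriented criterion (assumed): gap in expected next-state value."""
    return abs(sum((p - q) * v for p, q, v in zip(p_true, p_model, value)))

# Four next states. States 0/1 and states 2/3 differ only in a detail that
# does not affect the value, so each pair shares the same value (hypothetical).
value = [1.0, 1.0, 5.0, 5.0]
p_true = [0.5, 0.0, 0.5, 0.0]

model_a = [0.0, 0.5, 0.0, 0.5]  # wrong only about the value-irrelevant detail
model_b = [0.3, 0.0, 0.7, 0.0]  # closer as a predictor, but shifts mass across values

for name, m in [("model_a", model_a), ("model_b", model_b)]:
    print(name, l1_prediction_error(p_true, m), value_aware_error(p_true, m, value))
```

Here `model_a` is the worse predictor yet introduces no error in the quantity a planner actually uses, while `model_b` predicts better but biases the expected value. The sketch takes a value function as given, which is itself an assumption; in practice it would have to be estimated.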

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Farahmand, Amirmassoud

Robust Decision-Aware Model-based Reinforcement Learning
  • Grant number:
    DGECR-2021-00419
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program:
    Discovery Launch Supplement
Robust Decision-Aware Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2021-03701
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program:
    Discovery Grants Program - Individual

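Similar NSFC Grants

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
    10,000 CNY (unit only; amount not given)
  • Program:
    Collaborative Innovation Research Team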

Similar Overseas Grants

Collaborative Research: RI: Medium: Informed, Fair, Efficient, and Incentive-Aware Group Decision Making
  • Grant number:
    2313137
  • Fiscal year:
    2023
  • Funding amount:
    $21,100
  • Program:
    Standard Grant
Collaborative Research: RI: Medium: Informed, Fair, Efficient, and Incentive-Aware Group Decision Making
  • Grant number:
    2313136
  • Fiscal year:
    2023
  • Funding amount:
    $21,100
  • Program:
    Standard Grant
EAGER: DCL: SaTC: Enabling Interdisciplinary Collab: Impact-aware Machine Learning for Fair and Private Decision Making: Algorithms and Applications in Juvenile Justice Systems
  • Grant number:
    2209951
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Program:
    Standard Grant
Robust Decision-Aware Model-based Reinforcement Learning
  • Grant number:
    DGECR-2021-00419
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program:
    Discovery Launch Supplement
Robust Decision-Aware Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2021-03701
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program:
    Discovery Grants Program - Individual
RII Track-1: Data Analytics that are Robust and Trusted (DART): From Smart Curation to Socially Aware Decision Making
  • Grant number:
    1946391
  • Fiscal year:
    2020
  • Funding amount:
    $21,100
  • Program:
    Cooperative Agreement
Fairness aware data mining for discrimination free decision-making
  • Grant number:
    DP200101210
  • Fiscal year:
    2020
  • Funding amount:
    $21,100
  • Program:
    Discovery Projects
RTML: Large: Real-Time Autonomic Decision Making on Sparsity-Aware Accelerated Hardware via Online Machine Learning and Approximation
  • Grant number:
    1937403
  • Fiscal year:
    2019
  • Funding amount:
    $21,100
  • Program:
    Standard Grant
NRI: Collaborative Research: Enabling Risk-Aware Decision Making in Human-Guided Unmanned Surface Vehicle Teams
  • Grant number:
    1634433
  • Fiscal year:
    2016
  • Funding amount:
    $21,100
  • Program:
    Standard Grant
NRI: Collaborative Research: Enabling Risk-Aware Decision Making in Human-Guided Unmanned Surface Vehicle Teams
  • Grant number:
    1526016
  • Fiscal year:
    2015
  • Funding amount:
    $21,100
  • Program:
    Standard Grant