Compositional Causal Model-based Reinforcement Learning

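Basic information

  • Grant number:
    RGPIN-2020-06904
  • Principal investigator:
  • Amount:
    $24,800
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed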

Project abstract

One of the most important unsolved problems in artificial intelligence is to build agents with human-like creativity, curiosity, self-assessment, and commonsense reasoning. Recently, model-free reinforcement learning (MFRL) with deep neural networks has shown impressive performance in video game playing and locomotion control. Despite this success, MFRL methods are fundamentally limited by their trial-and-error nature, which requires millions of training examples to learn a reliable policy. A model-based reinforcement learning (MBRL) agent, by contrast, is capable of deliberate reasoning to achieve its goal. Unlike model-free agents, an MBRL agent iteratively learns a model of the world and plans its actions according to that model. MBRL is appealing because the learned model allows the agent to predict its future and reason about the consequences of its own actions.

One of the ultimate goals of reinforcement learning research is to have agents act in multiple environments and generalize previous learning experience to new situations. The ability to transfer knowledge across tasks is considered a critical aspect of any intelligent agent.

The main objective of the proposed research is to introduce a general model-based reinforcement learning algorithm that brings together three key ideas (compositionality, causality, and intrinsic curiosity) that have been separately influential in machine learning over the past several decades. The objectives of this 5-year project are as follows:

1. Establish baselines for comparison: train and evaluate state-of-the-art model-based reinforcement learning agents in the latest locomotion control physics simulators.
2. Derive a compositional forward dynamics model whose internal representations are object-based.
3. Explore and evaluate different types of causal inference methods in the proposed compositional model, including linear independent component analysis, mutual information-based independence tests, and variational inference.
4. Develop planning-based algorithms to overcome non-stationary intrinsic rewards in exploration.
5. Test the hypothesis that causal representations simplify learning on new downstream tasks, help end-users interpret data, and generalize to novel test examples.

I anticipate that this project will benefit both the deep learning and reinforcement learning communities in several ways, ranging from establishing a new approach to actively inferring causal factors, to elucidating new knowledge of exploration algorithms, to providing benchmarks and open-source implementations of state-of-the-art MFRL and MBRL agents to facilitate future research in machine learning.
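
The model-based loop described in the abstract (learn a model of the world, then plan actions with it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the project's method: the one-dimensional toy environment `true_step`, the linear least-squares dynamics model, the random-shooting planner, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(x, a):
    # Hypothetical environment, hidden from the agent: a 1-D point moved by 0.5 * a.
    return x + 0.5 * a

def fit_model(transitions):
    # Learn the world model: least-squares fit of x_next ~ w[0]*x + w[1]*a + w[2].
    data = np.array(transitions)
    features = np.column_stack([data[:, 0], data[:, 1], np.ones(len(data))])
    w, *_ = np.linalg.lstsq(features, data[:, 2], rcond=None)
    return w

def plan(w, x, goal, horizon=5, n_candidates=500):
    # Random-shooting planner: imagine rollouts inside the learned model
    # and return the first action of the cheapest imagined trajectory.
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    xs = np.full(n_candidates, x, dtype=float)
    cost = np.zeros(n_candidates)
    for t in range(horizon):
        xs = w[0] * xs + w[1] * actions[:, t] + w[2]
        cost += (xs - goal) ** 2
    return actions[np.argmin(cost), 0]

def run_agent(goal=3.0, n_random=20, n_steps=60):
    x = 0.0
    transitions = []
    # Phase 1: collect initial experience with random actions.
    for _ in range(n_random):
        a = rng.uniform(-1.0, 1.0)
        x_next = true_step(x, a)
        transitions.append((x, a, x_next))
        x = x_next
    # Phase 2: iterate between refitting the model and planning with it.
    for _ in range(n_steps):
        w = fit_model(transitions)
        a = plan(w, x, goal)
        x_next = true_step(x, a)
        transitions.append((x, a, x_next))
        x = x_next
    return x, w

final_x, w = run_agent()
print("final state:", final_x)
print("learned model weights:", w)
```

The agent never calls `true_step` while planning; it only uses the fitted weights `w`, which is what separates this loop from model-free trial and error. Real MBRL agents would replace the linear fit with a learned neural dynamics model and the random-shooting planner with a stronger optimizer.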

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Ba, Jimmy

Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2020-06904
  • Fiscal year:
    2021
  • Funding amount:
    $24,800
  • Program type:
    Discovery Grants Program - Individual
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    RGPIN-2020-06904
  • Fiscal year:
    2020
  • Funding amount:
    $24,800
  • Program type:
    Discovery Grants Program - Individual
Compositional Causal Model-based Reinforcement Learning
  • Grant number:
    DGECR-2020-00309
  • Fiscal year:
    2020
  • Funding amount:
    $24,800
  • Program type:
    Discovery Launch Supplement

Similar overseas grants

Bayesian causal estimation via model misspecification
  • Grant number:
    EP/Y029755/1
  • Fiscal year:
    2024
  • Funding amount:
    $24,800
  • Program type:
    Research Grant
Model-Based and Design-Based Approaches to Longitudinal Causal Decomposition Analysis
  • Grant number:
    2243119
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program type:
    Standard Grant
Constructing a causal model of ADHD from an embodied perspective
  • Grant number:
    23KJ0064
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program type:
    Grant-in-Aid for JSPS Fellows
Realization of causal model-based machine learning that autonomously responds to change
  • Grant number:
    23K16951
  • Fiscal year:
    2023
  • Funding amount:
    $24,800
  • Program type:
    Grant-in-Aid for Early-Career Scientists
Exploring the Critical Period of Health Effects of the COVID-19 Disaster on Children - Creating a "Critical Period Causal Model" -
  • Grant number:
    22K19646
  • Fiscal year:
    2022
  • Funding amount:
    $24,800
  • Program type:
    Grant-in-Aid for Challenging Research (Exploratory)
Developing methods for model selection in causal health analyses.
  • Grant number:
    2741534
  • Fiscal year:
    2022
  • Funding amount:
    $24,800
  • Program type:
    Studentship
Examining payment and delivery model impacts on health equity using novel quasi-experimental causal inference methods
  • Grant number:
    10674818
  • Fiscal year:
    2022
  • Funding amount:
    $24,800
  • Program type:
Determining causal effects of variants of chromatin regulators on neurodevelopmental disorders in a 'humanized' Drosophila model
  • Grant number:
    559355-2021
  • Fiscal year:
    2022
  • Funding amount:
    $24,800
  • Program type:
    Postgraduate Scholarships - Doctoral
Clarification of difficulty causal model and palliative intervention for nurses in with-COVID-19 society
  • Grant number:
    22K10687
  • Fiscal year:
    2022
  • Funding amount:
    $24,800
  • Program type:
    Grant-in-Aid for Scientific Research (C)
Validation of a Causal Model of Implementation
  • Grant number:
    10402919
  • Fiscal year:
    2021
  • Funding amount:
    $24,800
  • Program type: