Model-based reinforcement learning: brain implementation and engineering applications

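Basic information

  • Grant number:
    15300102
  • Principal investigator:
  • Amount:
    $76,800
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (B)
  • Fiscal year:
    2003
  • Funding country:
    Japan
  • Duration:
    2003 to 2005
  • Project status:
    Completed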

Project abstract

[On-line Bayesian learning schemes] We devised an on-line Bayesian learning algorithm that can be applied to Gaussian stochastic processes and can estimate the system dimensionality and the occurrence of changes in the target dynamics (Hirayama et al., 2004). We also devised a sequential Monte Carlo-based method that can be applied to non-Gaussian stochastic processes, and applied it to visual tracking problems (Bando et al., in press).

[Applications of model-based reinforcement learning and on-line learning] We succeeded in making a simulated biped robot walk autonomously, based on the combination of a central pattern generator and reinforcement learning. We later extended this approach to incorporate policy-gradient-based reinforcement learning. By further introducing an on-line model identification method, the autonomous learning of the biped simulator was accelerated (Nakamura et al., 2005). Our reinforcement learning for a switching controller succeeded in swinging up and stabilizing an underactuated real robot, the acrobot. An autonomous training scheme that combines model-based reinforcement learning with on-line model learning can construct an agent for a multi-agent card game that is as strong as a human expert player (Ishii et al., 2005).

[Reward-related prefrontal neural activities of primates] An electrophysiological study using a memory-based sensorimotor processing task in primates revealed that reward expectation significantly enhanced the selectivity of sensory working memory but not that of motor memory (Amemori et al., 2005).

[Neuropsychological study of human prefrontal information processing] We developed a model of information processing in a human performing a Markov decision process, and evaluated the model's plausibility by means of neuropsychological studies with functional magnetic resonance imaging. We found the engagement of the dorsolateral prefrontal cortex (Yoshida et al., 2005). When the Markov decision environment involves uncertainty, its resolution could be performed in the frontopolar prefrontal cortex (Yoshida et al., in press).

Project outcomes

Journal articles (96)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller
  • DOI:
    10.20965/jrm.2005.p0636
  • Publication date:
    2005-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yutaka Nakamura; Takeshi Mori; Yoichi Tokita; T. Shibata; S. Ishii
  • Corresponding authors:
    Yutaka Nakamura; Takeshi Mori; Yoichi Tokita; T. Shibata; S. Ishii
A model of smooth pursuit in primates based on learning the target dynamics
Acrobot control by learning the switching of multiple controllers
Bayesian noisy ICA for source switching environments

Other grants by ISHII Shin

Uncovering neural correlates in human decision making based on brain decoding
  • Grant number:
    24300114
  • Fiscal year:
    2012
  • Funding amount:
    $76,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
A study of modular models of decision making in uncertain and non-stationary environments
  • Grant number:
    21300113
  • Fiscal year:
    2009
  • Funding amount:
    $76,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Computational model of human decision-making in complicated environments and its applications
  • Grant number:
    18300101
  • Fiscal year:
    2006
  • Funding amount:
    $76,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Research for stable bioinformatics method based on hierarchical Bayes inference.
  • Grant number:
    18079011
  • Fiscal year:
    2006
  • Funding amount:
    $76,800
  • Project category:
    Grant-in-Aid for Scientific Research on Priority Areas

Similar overseas grants

The roles of prefrontal cortex on the endogenous analgesic systems
  • Grant number:
    23H03000
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Functional, structural, and computational consequences of NMDA receptor ablation at medial prefrontal cortex synapses
  • Grant number:
    10677047
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
Implications of Prefrontal Cortex Development for Adolescent Reward Seeking Behavior
  • Grant number:
    10739548
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
NIH Gonzalez-Amoretti Characterizing Population Dynamics of Prefrontal Cortex which Govern the Modulation of Visual Processing
  • Grant number:
    10748169
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
Layer 1 interneurons as master regulators of prefrontal cortex circuit development
  • Grant number:
    BB/X016331/1
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
    Research Grant
What role do different interneurons in medial prefrontal cortex play in associative recognition memory?
  • Grant number:
    BB/X000915/1
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
    Research Grant
Alcohol and Interneurons in the Prefrontal Cortex
  • Grant number:
    10567414
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
Investigation of non-canonical opioid signaling in the prefrontal cortex of alcohol-dependent rats
  • Grant number:
    10811444
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
Role of CB1R expressed in the prefrontal cortex in the control of locomotion
  • Grant number:
    10590320
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category:
The Role of Medial Prefrontal Cortex in Context-Dependent Valuation and Decision Processes
  • Grant number:
    10658093
  • Fiscal year:
    2023
  • Funding amount:
    $76,800
  • Project category: