共進化的環境創造による自律移動ロボットのメタレベル行動学習

Meta-Level Behavior Learning for Autonomous Mobile Robots through Co-evolutionary Environment Creation

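Basic information

Grant number:
14750362
Principal investigator:
Toshiyuki Kondo
Amount:
$21,100
Host institution:
Tokyo Institute of Technology
Host institution country:
Japan
Project category:
Grant-in-Aid for Young Scientists (B)
Fiscal year:
2002
Funding country:
Japan
Duration:
2002 to 2003
Project status:
Completed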

Project abstract

In this study, we take an autonomous mobile robot as a controlled system with high-dimensional, continuous state inputs and outputs. Allocating computational resources becomes a problem when reinforcement learning is applied to identify the robot's mapping from sensing to action. As one way to address this problem, we proposed a method that introduces an evolutionary recruitment strategy, which simultaneously searches the structural parameters of the learner, into Actor-Critic reinforcement learning implemented with an NGnet. The validation of the proposed algorithm carried out through last year and the demonstration experiments on a real robot were published in the Transactions of the Society of Instrument and Control Engineers and in the Journal of Robotics and Autonomous Systems. This year, in addition to optimizing the structure of the learner, we worked in parallel on learning scheduling, that is, on the question "How can a complex learning task be learned efficiently?" Drawing on Piaget's pioneering research in developmental psychology, we focused on the relationship between the co-evolutionary development of the human body and nervous system and cognitive developmental robotics, a field in which research has recently become active. Specifically, when the controller of a mobile robot with many-degree-of-freedom sensorimotor coupling is trained by reinforcement learning, constraints that serve as "learning tips" are extracted from past learning cases and stored, and are then applied as constraints when acquiring tasks that have not yet been learned. This reduces wasted trial and error and, as a result, speeds up reinforcement learning. We proposed this approach as the "constraint-extraction reinforcement learning method".

Project outcomes

Journal papers (10)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

近藤敏之, 伊藤宏司: "環境共創による適応的行動学習 -実移動ロボットによる押し動作獲得"計測自動制御学会システム・情報部門学術講演会2002講演論文集(優秀論文賞受賞). 423-428 (2002)

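Toshiyuki Kondo, Koji Ito: "Adaptive Behavior Learning through Environment Co-creation: Acquisition of a Pushing Motion by a Real Mobile Robot" Proceedings of the Society of Instrument and Control Engineers Systems and Information Division Annual Conference 2002 (Best Paper Award). 423-428 (2002)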

近藤敏之, 伊藤宏司: "共進化環境創造による実移動ロボットのPeg押し動作学習"日本ロボット学会創立20周年記念学術講演会. (CD-ROM). 3H32 (2002)

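Toshiyuki Kondo, Koji Ito: "Learning a Peg-Pushing Motion on a Real Mobile Robot through Co-evolutionary Environment Creation" 20th Anniversary Academic Conference of the Robotics Society of Japan. (CD-ROM). 3H32 (2002)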

Toshiyuki Kondo, Koji Ito: "A Study on Designing Robot Controllers by Using Reinforcement Learning with Evolutionary State Recruitment Strategy" Proceedings of the First International Workshop on Biologically Inspired Approaches to Advanced Information Technology

Toshiyuki Kondo, Norihiko Itoh, Koji Ito: "An Incremental Learning using Schema Extraction Mechanism for Autonomous Mobile Robot" Proceedings of 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation. (CD-ROM). 1126-1131