
Self-Organization of Hierarchical Reinforcement Learning System

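Basic information

Grant number: 13650480
Principal investigator: ABE Kenichi
Funding amount: $21,800
Host institution: Tohoku University
Host institution country: Japan
Project category: Grant-in-Aid for Scientific Research (C)
Fiscal year: 2001
Funding country: Japan
Project period: 2001 to 2002
Project status: Completed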

Project abstract

Previously, we proposed two learning algorithms, Labeling Q-learning (LQ-learning) and Switching Q-learning (SQ-learning). The former has a simple structure consisting of a single agent, yet it learns well in certain kinds of POMDP environments. The latter is a type of hierarchical Q-learning (HQ-learning) method that switches Q-modules by means of a hierarchical learning automaton, and it also works well in more complicated POMDP environments. In this study, we improved these two algorithms and developed more effective HQ-learning algorithms. Further, to cope with more realistic environments in which observations, actions, or both take continuous values, we conducted a basic study of function approximation by neural networks. The results are as follows.

1) We improved SQ-learning so that it works well in noisy environments. We also demonstrated that SQ-learning performs better than Wiering's HQ-learning.
2) We enhanced the performance of LQ-learning by introducing Kohonen's self-organizing map (SOM).
3) We improved the self-segmentation of sequences (SSS) algorithm by Sun and Sessions. Further, we developed a new algorithm, called SSS(λ).
4) We examined the effectiveness of SSS(λ) by applying it to the navigation task of a mobile robot. Here, the SOM was used for self-classification of continuous sonar observations.
5) We proposed a statistical approximation learning (SAL) method for simultaneous recurrent neural networks and demonstrated that it achieves high accuracy in nonlinear function approximation. Further, we presented a novel neural network model for incremental learning.
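Item 4 of the abstract mentions using a SOM to classify continuous sonar observations before learning. The following is only a minimal sketch of that general idea, not the project's implementation: it assumes a one-dimensional chain SOM that maps an observation vector to a discrete label, and a plain tabular Q-learning update on those labels (it does not implement LQ-learning, SQ-learning, or SSS(λ)). The function names `train_som`, `best_matching_unit`, and `q_update`, and all parameter values, are hypothetical.

```python
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_som(samples, n_units=4, epochs=30, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D chain SOM (assumed topology) on observation vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise every unit from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)          # linearly decaying learning rate
        radius = radius0 * (1.0 - frac)  # shrinking neighbourhood radius
        for x in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                if abs(j - bmu) <= radius:  # unit lies in the neighbourhood
                    for k in range(dim):
                        w[k] += lr * (x[k] - w[k])
    return weights


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


if __name__ == "__main__":
    # Hypothetical two-sensor readings scaled to [0, 1].
    readings = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95]]
    som = train_som(readings)
    q_table = [[0.0, 0.0] for _ in som]  # one row per SOM label, two actions
    s = best_matching_unit(som, [0.12, 0.18])
    s_next = best_matching_unit(som, [0.88, 0.9])
    q_update(q_table, s, a=1, reward=1.0, s_next=s_next)
    print(s, s_next, q_table[s])
```

In this sketch the SOM label stands in for the discrete state that a tabular learner needs, so the continuous readings never enter the Q-table directly.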

Project outcomes

Journal papers (42)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

M. Sakai: "Control of Chaos Dynamics in Jordan Recurrent Neural Networks." Proc. of the International Conference on Control, Automation and Systems, 292-295 (2001)

M. Sakai: "A Statistical Approximation Learning Method for Simultaneous Recurrent Networks." Proc. of the 15th IFAC World Congress on Automatic Control, 2491-2496 (2002)

H. Y. Lee: "Labeling Q-learning with SOM." Int. Conf. on Control, Automation, and Systems (ICCAS 2002), 105-109 (2002)

Other publications by ABE Kenichi

鎮圧の後で

After the Suppression

Year: 2004
Journal: 情況, vol. 5, no. 9
Authors:
NISHITANI Osamu;NAKAYAMA Chikako (as editors);田島達也;川村邦光;田島達也;NAKAYAMA Chikako;荻野美穂;成澤勝嗣;NAKAYAMA Chikako;NAKAYAMA Chikako;島薗進;五十嵐公一;HAYASHI Midori;YONETANI Masafumi;杉原達;五十嵐公一;YONETANI Masafumi;野口剛;中村生雄;井田太郎;YONETANI Masafumi;赤坂憲雄;大久保純一;ABE Kenichi;Junichi Okubo;池上良正;ABE Kenichi;島薗進;並木誠士;ABE Kenichi;Seishi Namiki;島薗進;SAKAI Takashi;玉蟲敏子;SAKAI Takashi;玉蟲敏子;冨山一郎;Satoko Tamamushi;SAKAI Takashi;冨山一郎
Corresponding author: 冨山一郎

理性の探求(5)名づけと所有--アメリカという制度空間

The Quest for Reason (5): Naming and Possession: America as an Institutional Space

Year: 2005
Journal: UP, May issue
Authors:
NISHITANI Osamu;NAKAYAMA Chikako (as editors);田島達也;川村邦光;田島達也;NAKAYAMA Chikako;荻野美穂;成澤勝嗣;NAKAYAMA Chikako;NAKAYAMA Chikako;島薗進;五十嵐公一;HAYASHI Midori;YONETANI Masafumi;杉原達;五十嵐公一;YONETANI Masafumi;野口剛;中村生雄;井田太郎;YONETANI Masafumi;赤坂憲雄;大久保純一;ABE Kenichi;Junichi Okubo;池上良正;ABE Kenichi;島薗進;並木誠士;ABE Kenichi;Seishi Namiki;島薗進;SAKAI Takashi;玉蟲敏子;SAKAI Takashi;玉蟲敏子;冨山一郎;Satoko Tamamushi;SAKAI Takashi;冨山一郎;西谷修;Satoko Tamamushi;玉蟲敏子;中村生雄;西谷修
Corresponding author: 西谷修

視覚のジオポリティクス : メディアウォールを突き崩す

The Geopolitics of Vision: Breaking Down the Media Wall

Year: 2005
Authors:
西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;中山智香子;安村直己;林みどり;大川正彦;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NAKAYAMA Chikako;YASUMURA Naoki;HAYASHI Midori;OKAWA Masahiko;安村直己;林みどり;林みどり;阿部賢一;YASUMURA Naoki;HAYASHI Midori;林みどり;安村直己;阿部賢一;ABE Kenichi;西谷修・中山智香子(編集)
Corresponding author: 西谷修・中山智香子(編集)

A Tikopia in the Global Era: Using Mediation to Empower Coffee Growing Communities in East Timor

Year: 2009
Authors:
Tarsitani;Belle Asante;ABE Kenichi
Corresponding author: ABE Kenichi

暴力の哲学

The Philosophy of Violence

Year: 2004
Authors:
西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;西谷修;中山智香子;安村直己;林みどり;大川正彦;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NISHITANI Osamu;NAKAYAMA Chikako;YASUMURA Naoki;HAYASHI Midori;OKAWA Masahiko;安村直己;林みどり;林みどり;阿部賢一;YASUMURA Naoki;HAYASHI Midori;林みどり;安村直己;阿部賢一;ABE Kenichi;西谷修・中山智香子(編集);西谷修・中山智香子(共編著);NISHITANI Osamu;大川正彦;酒井隆史
Corresponding author: 酒井隆史