未知の協調・環境を想定したマルチエージェント強化学習の知識転移

Knowledge Transfer in Multi-Agent Reinforcement Learning for Unknown Cooperation and Unknown Environments

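Basic information

Grant number:
21K17807
Principal investigator:
Fumito Uwano (上野史)
Amount:
$30,000
Host institution:
Okayama University
Host institution country:
Japan
Project category:
Grant-in-Aid for Early-Career Scientists
Fiscal year:
2021
Funding country:
Japan
Project period:
2021-04-01 to 2024-03-31
Project status:
Completed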

Project abstract

The project aims to adapt multi-agent reinforcement learning to unknown cooperation and unknown environments through three themes: (1) modularizing learning results, (2) proposing a method for learning unknown cooperative behavior based on knowledge modules, and (3) proposing a method for reconstructing knowledge for unknown environments. This fiscal year we worked mainly on themes (2) and (3). For knowledge-module extraction, we proposed a method built on the deep reinforcement learning algorithm A3C. It models the changes in acquired reward as a parametric distribution, compares the distributions for each goal for which reward was obtained in order to extract the other agents' goals, and then learns cooperative behavior suited to each goal. We verified the method's effectiveness in simulated robot-navigation experiments. The method is a breakthrough in that, when the cooperative behavior an agent should take is unclear because of an unknown environment or an unforeseen situation, it can learn cooperative behavior that achieves an appropriate goal by combining the agents' goals extracted as knowledge from their learning results. For combining knowledge modules, we proposed a method that extracts each agent's own state, concatenates these states to generate new knowledge, and then optimizes that knowledge for the environment through learning. The navigation experiments above demonstrated its effectiveness and confirmed that changing how knowledge modules are concatenated makes it possible to reconstruct knowledge to fit the environment. This result is important because it demonstrates the benefit of combining knowledge modules once they have been extracted. The results were presented at the international conference ICAART 2023 and at domestic meetings including the Annual Conference of the Japanese Society for Artificial Intelligence.
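
The goal-extraction step described in the abstract (modeling acquired reward as a parametric distribution and comparing the per-goal distributions) might look roughly like the minimal Python sketch below. The one-dimensional Gaussian model, the symmetric KL divergence used for comparison, and all names and values (`fit_gaussian`, `infer_goal`, the goal labels, the reward samples) are assumptions made for illustration. The abstract does not specify them, and this is not the project's A3C-based implementation.

```python
import math
from statistics import fmean, pstdev


def fit_gaussian(rewards, min_std=1e-3):
    """Fit a 1-D Gaussian (mean, std) to reward samples (assumed model)."""
    return fmean(rewards), max(pstdev(rewards), min_std)


def kl_gaussian(p, q):
    """KL(p || q) for two 1-D Gaussians, each given as (mean, std)."""
    (mu_p, sd_p), (mu_q, sd_q) = p, q
    return (
        math.log(sd_q / sd_p)
        + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2)
        - 0.5
    )


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as the comparison measure."""
    return kl_gaussian(p, q) + kl_gaussian(q, p)


def infer_goal(goal_rewards, observed_rewards):
    """Pick the goal whose reward distribution is closest to the observed one.

    goal_rewards: dict mapping a goal label to the rewards this agent
                  collected while pursuing that goal (hypothetical data).
    observed_rewards: rewards attributed to another agent's behavior.
    """
    observed = fit_gaussian(observed_rewards)
    scores = {
        goal: symmetric_kl(fit_gaussian(samples), observed)
        for goal, samples in goal_rewards.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical reward samples for two goals in a navigation task.
goal_rewards = {
    "goal_A": [1.0, 1.2, 0.9, 1.1],
    "goal_B": [5.0, 5.5, 4.8, 5.2],
}
observed = [4.9, 5.1, 5.3]

best_goal, scores = infer_goal(goal_rewards, observed)
print(best_goal, scores)
```

In this toy setup the other agent's inferred goal is simply the label with the smallest divergence. A learner could then condition its cooperative policy on that label, which is one plausible reading of "learning cooperative behavior suited to each goal."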

Project outputs

Journal articles (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

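マルチエージェント強化学習の報酬設計による知識の蒸留と転移に関する一考察

A Study on Knowledge Distillation and Transfer through Reward Design in Multi-Agent Reinforcement Learning

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
Fumito Uwano (上野史)
Corresponding author:
Fumito Uwano (上野史)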

Queensland University of Technology (Australia)

DOI:
Published:
Journal:
Impact factor:
0
Authors:
Corresponding author:

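獲得報酬の分布に基づくエージェント間の暗黙的協調行動学習とその効果の検証

Learning Implicit Cooperative Behavior among Agents Based on the Distribution of Acquired Rewards, and Verification of Its Effect

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
Fumito Uwano (上野史)
Corresponding author:
Fumito Uwano (上野史)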

Design of Human-Agent-Group Interaction for Correct Opinion Sharing on Social Media

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
Fumito Uwano (上野史)
Corresponding author:
Fumito Uwano

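マルチエージェント強化学習における知識とその境界

Knowledge and Its Boundaries in Multi-Agent Reinforcement Learning

DOI:
Published:
2022
Journal:
Impact factor:
0
Authors:
Shuhei Aoyama; Takuma Miwa; Takanobu Otsuka; Fumito Uwano (上野史)
Corresponding author:
Fumito Uwano (上野史)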

Other publications by Fumito Uwano (上野史)

SLIM Spacecraft Location Estimation by Crater Matching Based on Similar Triangles and Its Improvement

DOI:
10.2322/astj.jsass-d-17-00011
Published:
2018
Journal:
AEROSPACE TECHNOLOGY JAPAN, THE JAPAN SOCIETY FOR AERONAUTICAL AND SPACE SCIENCES
Impact factor:
0
Authors:
石井晴之; 福田盛介; 澤井秀次郎; 坂井真一郎; 村田暁紀; 上野史; 辰巳嵩豊; 梅内祐太; 高玉圭樹; 原田智広; 鎌田弘之; 石田貴行
Corresponding author:
石田貴行