環境モデル徒弟学習の抜本的高速化技術の開発と実用的対話システムのプロトタイプ構築

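Development of Techniques for Drastically Accelerating Environment-Model Apprenticeship Learning and Construction of a Prototype Practical Dialogue System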

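Basic Information

Grant number:
25730128
Principal investigator:
牧野貴樹 (Takaki Makino)
Amount:
USD 26,600 (approx.)
Host institution:
The University of Tokyo
Host institution country:
Japan
Project category:
Grant-in-Aid for Young Scientists (B)
Fiscal year:
2013
Funding country:
Japan
Project period:
2013-04-01 to 2014-03-31
Project status:
Completed

Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-25730128/
Keywords:
reinforcement learning; inverse reinforcement learning; apprenticeship learning; LUKE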

Project Summary

The goal for the first year was to develop techniques that drastically accelerate environment-model apprenticeship learning. Environment-model apprenticeship learning requires computing the optimal solution of a partially observable Markov decision process (POMDP) many times. This made it so slow that it could not be applied to practical problems, so acceleration was essential. In this project we developed two acceleration techniques: one based on column-gradient computation (列勾配計算 in the original) of the posterior probability of the policy, and one based on reusing the previous solution. By implementing both, we achieved the intended speed-up. The techniques developed in this project were released to the public as the open-source software LUKE, and were presented at venues including the Annual Conference of the Japanese Society for Artificial Intelligence.
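
The summary names reuse of the previous solution as one of the two acceleration techniques. The following is a minimal sketch of that general idea only, not the project's or LUKE's actual implementation: it assumes a fully observable two-state MDP rather than a POMDP, and the function `value_iteration`, the reward tables, and the transition table are all hypothetical. It compares how many sweeps value iteration needs when started from zeros versus from the solution of a slightly different model.

```python
def value_iteration(rewards, next_state, gamma=0.9, tol=1e-8, init=None):
    """Value iteration on a small deterministic MDP (illustrative only).

    rewards[s][a] and next_state[s][a] give the reward and successor state
    for taking action a in state s. Returns (values, number_of_sweeps).
    """
    n = len(rewards)
    # Warm start from `init` if given, otherwise cold start from zeros.
    values = list(init) if init is not None else [0.0] * n
    sweeps = 0
    while True:
        new_values = [
            max(r + gamma * values[s2]
                for r, s2 in zip(rewards[s], next_state[s]))
            for s in range(n)
        ]
        sweeps += 1
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, sweeps


# Hypothetical two-state problem: action 0 stays, action 1 moves to the other state.
next_state = [[0, 1], [1, 0]]
rewards_old = [[1.0, 0.0], [2.0, 0.0]]
rewards_new = [[1.0, 0.0], [2.1, 0.0]]  # slightly changed model parameters

# Solve the old model once, then solve the new model cold and warm.
old_values, _ = value_iteration(rewards_old, next_state)
cold_values, cold_sweeps = value_iteration(rewards_new, next_state)
warm_values, warm_sweeps = value_iteration(rewards_new, next_state, init=old_values)

print("cold start sweeps:", cold_sweeps)
print("warm start sweeps:", warm_sweeps)
```

In this toy setting the warm start converges to the same values in fewer sweeps, because the old solution is already close to the new fixed point. When a learning loop solves many nearby models in sequence, that saving is incurred on every solve.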

Project Outputs

Journal papers (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

強化学習をベイズで理解する

Understanding Reinforcement Learning from a Bayesian Perspective

DOI:
Publication year:
2014
Journal:
Impact factor:
0
Authors:
Hiroshi Nakano;Hiroshi Ono;Norio Iwasawa;Toshiyuki Takai;Yumiko Arai-Sanoh;Motohiko Kondo;Takaki Makino;中野洋，小野裕嗣，岩澤紀生，髙井俊之，荒井裕見子，近藤始彦;牧野　貴樹
Corresponding author:
牧野　貴樹

LUKE (Learning Underlying Knowledge of Experts)

DOI:
Publication year:
Journal:
Impact factor:
0
Authors:
Corresponding author:

Estimation of POMDP Parameters by Apprenticeship Learning

DOI:
Publication year:
2014
Journal:
Impact factor:
0
Authors:
Hiroshi Nakano;Hiroshi Ono;Norio Iwasawa;Toshiyuki Takai;Yumiko Arai-Sanoh;Motohiko Kondo;Takaki Makino
Corresponding author:
Takaki Makino