Improving Passenger Comfort Through Personalization of Automated Driving (自動運転のパーソナライゼーションによる乗員の快適性向上)

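Basic information

Grant number:
18J13910
Principal investigator:
石川翔太
Amount:
$12,200
Host institution:
Chiba University
Host institution country:
Japan
Project category:
Grant-in-Aid for JSPS Fellows
Fiscal year:
2018
Funding country:
Japan
Project period:
2018-04-25 to 2020-03-31
Project status:
Completed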

Source:
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-18J13910/
Keywords:
automated driving; reinforcement learning; inverse reinforcement learning

Project summary

In fiscal year 2018 (Heisei 30), we surveyed research on inverse reinforcement learning and automated driving, set up an experimental environment, proposed a method for extracting the features used in inverse reinforcement learning, and compared and examined inverse reinforcement learning algorithms. The survey work consisted mainly of attending international conferences and submitting a survey paper. We presented a paper at ATT2018, a workshop of IJCAI-ECAI 2018 held in Stockholm, and discussed the research with other participants. Because this workshop is co-located with IJCAI, a major international conference, it is considered to attract researchers of a very high caliber. The discussions there did in fact yield a variety of ideas. We were also able to survey research gathered from around the world, and we compiled the findings and submitted them as a survey paper.

The paper presented at the workshop proposed an algorithm that extracts the features needed for reinforcement learning of autonomous vehicles. In recent automated driving technology, deep reinforcement learning has attracted attention because it can learn appropriate action outputs for environment inputs that contain many objects to be observed. However, deep reinforcement learning is difficult to apply because its input-output relationship is a black box. The proposed method therefore analyzes the network after deep reinforcement learning to extract the features required by the learned driving policy. The basic idea is to compute the gradient of the output with respect to the input. The larger the gradient, the greater the presumed influence on the output, so features with large gradients are the important ones. In computational experiments, the effectiveness of the proposed method was confirmed using TORCS, a benchmark problem for automated driving tasks.

In the second half of fiscal year 2018, we compared and examined inverse reinforcement learning algorithms such as LogRegIRL and NNP-FIRL. We will continue the research based on the results of this examination.
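
The gradient-based idea in the summary can be illustrated with a minimal sketch. This is not the project's implementation: the summary describes analyzing a network trained by deep reinforcement learning on TORCS, whereas the toy network, its hand-picked weights, and the feature names below are all assumptions chosen only to show how the gradient of the output with respect to each input might be turned into a feature-importance score.

```python
import math
import random

def forward(x, W1, b1, w2, b2):
    """Return (action, hidden activations) for a one-hidden-layer tanh network."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    action = sum(w * h for w, h in zip(w2, hidden)) + b2
    return action, hidden

def input_gradient(x, W1, b1, w2, b2):
    """Gradient of the scalar action with respect to each input feature."""
    _, hidden = forward(x, W1, b1, w2, b2)
    # Chain rule: d(action)/d(x_i) = sum_j w2[j] * (1 - h_j^2) * W1[j][i]
    return [
        sum(w2[j] * (1.0 - hidden[j] ** 2) * W1[j][i] for j in range(len(w2)))
        for i in range(len(x))
    ]

def feature_saliency(states, params):
    """Mean absolute input gradient per feature over a batch of states."""
    totals = [0.0] * len(states[0])
    for x in states:
        for i, g in enumerate(input_gradient(x, *params)):
            totals[i] += abs(g)
    return [t / len(states) for t in totals]

# Hypothetical feature names and hand-picked weights (assumptions for
# illustration only; a real study would use a trained policy network).
FEATURES = ["track_offset", "speed", "unused_sensor"]
PARAMS = (
    [[2.0, 0.5, 0.0], [-1.5, 0.3, 0.0]],  # W1: the third input has zero weight
    [0.0, 0.1],                           # b1
    [1.0, -1.0],                          # w2
    0.0,                                  # b2
)

# Randomly sampled states stand in for observations collected while driving.
rng = random.Random(0)
STATES = [[rng.uniform(-1.0, 1.0) for _ in FEATURES] for _ in range(200)]

saliency = feature_saliency(STATES, PARAMS)
for name, score in sorted(zip(FEATURES, saliency), key=lambda p: -p[1]):
    print(f"{name}: {score:.4f}")
```

In this toy setup the third input never influences the action, so its score is zero, while the inputs with larger effective weights rank higher. Averaging the absolute gradient over many states is one possible aggregation choice; the summary does not say how the project aggregated gradients.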