Coordination of Multiple Behaviors for Competition Robots by Vision-Based Reinforcement Learning
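Basic Information
- Grant number: 07455112
- Principal investigator:
- Amount: $48,000
- Host institution:
- Host institution country: Japan
- Category: Grant-in-Aid for Scientific Research (B)
- Fiscal year: 1995
- Funding country: Japan
- Period: 1995 to 1996
- Project status: Completed
- Source:
- Keywords:
Project Abstract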
Coordinating multiple behaviors, each acquired independently by reinforcement learning, is one of the issues that must be solved to scale the method to larger and more complex robot learning tasks. Directly combining the state spaces of all individual modules (subtasks) requires enormous learning time and causes hidden states. In the first year of this project, we proposed a method that accomplishes a whole task consisting of multiple subtasks by coordinating behaviors acquired through vision-based reinforcement learning. In the second year, we modified the method by introducing modular learning, which coordinates multiple behaviors while taking into account the trade-off between learning time and performance.
The first year:
1. Individual behaviors that achieve the corresponding subtasks were acquired independently by Q-learning.
2. Three kinds of coordination of multiple behaviors were considered: simple summation of the different action-value functions, switching between action-value functions according to the situation, and learning with the previously obtained action-value functions as the initial values of a new action-value function.
3. A task of shooting a ball into the goal while avoiding collisions with an opponent was examined. The task can be decomposed into a ball-shooting subtask and a collision-avoidance subtask.
4. As a result, the learning method was the best of the three in shooting ratio, mean steps to the goal, and avoidance performance.
The second year:
1. To reduce the learning time, the whole state space was classified into two categories based on the action values separately obtained by Q-learning: the area where one of the learned behaviors was directly applicable (no-more-learning area), and the area where learning was necessary because of competition among multiple behaviors (re-learning area).
2. Hidden states were detected by fitting a model to the learned action values based on an information criterion.
3. The initial action values in the re-learning area were adjusted so that they would be consistent with the values in the no-more-learning area.
4. The method was applied to one-on-one soccer-playing robots, and its validity was shown by computer simulation and real robot experiments.
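The coordination schemes and the state-space split described in the abstract can be illustrated with a minimal tabular sketch. It is not the project's implementation: it assumes both behaviors' action values are already expressed over one shared, discretized state space, and the function names, the `alpha` and `gamma` values, the switching mask `use_b`, and the "greedy actions disagree" criterion for the re-learning area are all hypothetical choices made for illustration.

```python
def q_learning_update(q, s, a, r, s_next, alpha=0.25, gamma=0.9):
    """One tabular Q-learning step (alpha, gamma are illustrative values)."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q


def coordinate_by_sum(q_a, q_b):
    """Coordination 1: simple summation of two action-value tables."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(q_a, q_b)]


def coordinate_by_switch(q_a, q_b, use_b):
    """Coordination 2: per-state switching between the two tables.

    use_b is a hypothetical list of booleans, one per state, marking
    the situations in which behavior B takes over from behavior A.
    """
    return [list(row_b) if flag else list(row_a)
            for row_a, row_b, flag in zip(q_a, q_b, use_b)]


def greedy_action(row):
    """Index of the action with the largest value in one state."""
    return max(range(len(row)), key=row.__getitem__)


def split_state_space(q_a, q_b):
    """Assumed criterion: a state belongs to the re-learning area when
    the two behaviors prefer different greedy actions; otherwise it is
    treated as part of the no-more-learning area."""
    return [greedy_action(ra) != greedy_action(rb)
            for ra, rb in zip(q_a, q_b)]
```

Coordination 3 (learning from previously obtained values) then amounts to building an initial table, for example with `coordinate_by_sum`, and continuing to call `q_learning_update` on it; using the sum as the starting point is again an assumption. Under the second-year scheme, those updates would be restricted to the states flagged by `split_state_space`.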
Project Outcomes
Journal papers (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Minoru Asada: "Agents that learn from other competitive agents" Proc.of Machine Learning Conference Workshop on Agents That Learn from Other Agents. 1-7 (1995)
M. Asada: "Agents that learn from other competitive agents" Proc. of Machine Learning Conference Workshop on Agents That Learn from Other Agents. 1-7 (1995)
内部 英治: "競合エージェントの存在する環境での視覚に基づく強化学習によるロボットの行動獲得" 第8回自律分散システム・シンポジウム資料. 371-374 (1996)
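Eiji Uchibe: "Robot behavior acquisition by vision-based reinforcement learning in an environment with competing agents" Proc. of the 8th Symposium on Autonomous Decentralized Systems. 371-374 (1996)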
内部 英治: "サッカーロボットの技能学習" つくばソフトウェアシンポジウム予稿集. 43-46 (1996)
Eiji Uchibe: "Skill learning for soccer robots" Proc. of the Tsukuba Software Symposium. 43-46 (1996)
内部英治: "視覚を有する移動ロボットの強化学習による複数タスクの達成" ロボティクス・メカトロニクス講演会95予稿集. 700-703 (1995)
Eiji Uchibe: "Achieving multiple tasks by reinforcement learning for a mobile robot with vision" Proc. of the Robotics and Mechatronics Conference '95. 700-703 (1995)
Other grants of ASADA Minoru
Constructive Developmental Science Based on Understanding the Process from Neuro-Dynamics to Social Interaction
- Grant number: 24000012
- Fiscal year: 2012
- Amount: $48,000
- Category: Grant-in-Aid for Specially Promoted Research
Brap gene can be one of causal genes of the Ras-MAPK syndromes.
- Grant number: 22591141
- Fiscal year: 2010
- Amount: $48,000
- Category: Grant-in-Aid for Scientific Research (C)
Targeting the brap2, a candidate gene for ataxia caused by developmental defects of the cerebellum
- Grant number: 19591014
- Fiscal year: 2007
- Amount: $48,000
- Category: Grant-in-Aid for Scientific Research (C)
Cooperation and competitive learning of multi-humanoid based on mapping other's behavior on the self behavior space
- Grant number: 16200012
- Fiscal year: 2004
- Amount: $48,000
- Category: Grant-in-Aid for Scientific Research (A)
ロボカップを用いたマルチロボット環境における学習・発達・進化手法の共同開発
Joint development of learning, development, and evolution methods in multi-robot environments using RoboCup
- Grant number: 11694155
- Fiscal year: 1999
- Amount: $48,000
- Category: Grant-in-Aid for Scientific Research (B)
Outdoor World Modeling by Intelligent Integration of Multi-Visual Information
- Grant number: 03805029
- Fiscal year: 1991
- Amount: $48,000
- Category: Grant-in-Aid for General Scientific Research (C)
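Similar National Natural Science Foundation of China grants
老年人群视障风险VISION管控模式构建与实证研究
Construction and empirical study of the VISION management and control model for visual impairment risk in the elderly population
- Grant number: 71974198
- Approval year: 2019
- Amount: 485,000 CNY
- Category: General Program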
Similar overseas grants
N2Vision+: A robot-enabled, data-driven machine vision tool for nitrogen diagnosis of arable soils
- Grant number: 10091423
- Fiscal year: 2024
- Amount: $48,000
- Category: Collaborative R&D
Learning to create Intelligent Solutions with Machine Learning and Computer Vision: A Pathway to AI Careers for Diverse High School Students
- Grant number: 2342574
- Fiscal year: 2024
- Amount: $48,000
- Category: Standard Grant
Professional Visionの可視化による英語教師認知の形成・変容過程の解明
Elucidating the formation and transformation process of English teacher cognition through visualization of professional vision
- Grant number: 24K00089
- Fiscal year: 2024
- Amount: $48,000
- Category: Grant-in-Aid for Scientific Research (B)
CAREER: Teachers Learning to be Technology Accessibility Allies to Blind and Low-Vision Students in Science
- Grant number: 2334693
- Fiscal year: 2024
- Amount: $48,000
- Category: Continuing Grant
REU Site: Research Experience for Undergraduates in Computer Vision
- Grant number: 2349386
- Fiscal year: 2024
- Amount: $48,000
- Category: Standard Grant
Vision Servoing Based Micro Continuum Robot Actuated by SMA Wires for Precise Laser Irradiation during Transurethral Lithotripsy
- Grant number: 24K21116
- Fiscal year: 2024
- Amount: $48,000
- Category: Grant-in-Aid for Early-Career Scientists
2022BBSRC-NSF/BIO Generating New Network Analysis Tools for Elucidating the Functional Logic of 3D Vision Circuits of the Drosophila Brain
- Grant number: BB/Y000234/1
- Fiscal year: 2024
- Amount: $48,000
- Category: Research Grant
Vision-only structure-from-motion via acoustic video for extreme underwater environment sensing
- Grant number: 24K20867
- Fiscal year: 2024
- Amount: $48,000
- Category: Grant-in-Aid for Early-Career Scientists
Unifying Object Detection and Image Captioning using Vision-Language Knowledge Base for Open-World Comprehension
- Grant number: 24K20830
- Fiscal year: 2024
- Amount: $48,000
- Category: Grant-in-Aid for Early-Career Scientists
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
- Grant number: 2326905
- Fiscal year: 2024
- Amount: $48,000
- Category: Standard Grant