
Human-like continual robot learning based on three-level computational energy cost regulation


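Basic information

Grant number:
22H03670
Principal investigator:
OZTOP Erhan
Amount:
$95,700
Host institution:
Osaka University
Host institution country:
Japan
Project category:
Grant-in-Aid for Scientific Research (B)
Fiscal year:
2022
Funding country:
Japan
Project period:
2022-04-01 to 2025-03-31
Project status:
Not yet concluded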

Project abstract

In the first year of the project, a suitable robot simulator (PyBullet) was selected, and the software platform for the Lifelong Robot Learning (LRL) model was developed on it. The LRL setting considers a robotic arm with three tasks (T1, T2, T3). The robot's action is modeled as hitting objects at different angles. Each task is to predict the effects of these actions in one of three environments: free space (T1), a wall with changing orientation (T2), and an L-shaped wall with changing orientation (T3). Task execution is driven by Learning Progress (LP), whereas consideration of ‘neural cost’ is left for the next year. A basic architecture for knowledge transfer among the neural networks of the individual tasks was developed. The symbol formation component was also explored but has not been incorporated into the simulated LRL model. In parallel with the development of the LRL model, supporting work was conducted, several publications were produced, and a workshop at IROS 2022 was held together with collaborators. One line of research studied symbol formation using discrete units in the latent layers of deep neural networks [2]. In addition, work on robotic trust was conducted with collaborators, using ‘neural computational cost’ to form trust in social partners [3]. The LRL model can therefore be extended to include trust formation, even though this was not directly part of the initial proposal. Finally, to support human-robot tasks, some work was devoted to teaching robots how to correct errors based on human demonstration.
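
The abstract does not say how Learning Progress is computed or how it drives task execution. As a rough illustration only, the sketch below assumes that LP is the drop in mean prediction error between two consecutive windows of attempts, and that the next task is the one with the highest LP. The class name `LearningProgressSelector`, the `window` and `epsilon` parameters, and the error values are all hypothetical and are not taken from the project.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical sketch: pick the task whose prediction error drops fastest."""

    def __init__(self, tasks, window=5, epsilon=0.1, seed=0):
        self.window = window      # assumed: number of errors per comparison window
        self.epsilon = epsilon    # assumed: probability of picking a random task
        self.rng = random.Random(seed)
        # Keep only the most recent 2 * window prediction errors per task.
        self.errors = {t: deque(maxlen=2 * window) for t in tasks}

    def record(self, task, error):
        """Store the prediction error observed after one attempt at `task`."""
        self.errors[task].append(float(error))

    def learning_progress(self, task):
        """Mean error of the older window minus mean error of the recent one."""
        hist = list(self.errors[task])
        if len(hist) < 2 * self.window:
            # Too little data: treat the task as maximally interesting.
            return float("inf")
        older = sum(hist[: self.window]) / self.window
        recent = sum(hist[self.window:]) / self.window
        return older - recent

    def select(self):
        """Return the next task: mostly greedy on LP, sometimes random."""
        tasks = list(self.errors)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(tasks)
        return max(tasks, key=self.learning_progress)


if __name__ == "__main__":
    # Made-up error histories for the three tasks named in the abstract.
    selector = LearningProgressSelector(["T1", "T2", "T3"], window=2, epsilon=0.0)
    for task, errs in {
        "T1": [1.0, 1.0, 0.9, 0.9],   # slow improvement
        "T2": [1.0, 0.9, 0.5, 0.4],   # fast improvement
        "T3": [0.8, 0.8, 0.8, 0.8],   # no improvement
    }.items():
        for e in errs:
            selector.record(task, e)
    print(selector.select())
```

With these made-up histories and `epsilon=0.0`, the selector picks T2, the task whose error is falling fastest. A cost-aware variant, such as the ‘neural cost’ the abstract defers to the following year, would need an additional term in the selection criterion. Its form is not specified in the source.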
