
Reframing Deep Reinforcement Learning for Large-Scale, Real-World Implementation


Basic information

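Grant number:
2373874
Principal investigator:
Funding amount:
--
Host institution:
King's College London
Host institution country:
United Kingdom
Project category:
Studentship
Fiscal year:
2019
Funding country:
United Kingdom
Start and end dates:
2019 to (no data)
Project status:
Closed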

Source:
https://gtr.ukri.org/projects?ref=studentship-2373874
Keywords:
Reframing Deep Reinforcement Learning Large

Project abstract

Reinforcement learning applied to robotics systems has the potential to revolutionize many industries, but still lacks the flexibility, scalability and safety needed for deployment. One key bottleneck of such methods is their reliance on obtaining information from hand-engineered reward functions, requiring a non-trivial design effort in tasks where experience is expensive to collect.

The goal of this research project is to augment the classical reinforcement learning approach to develop efficient, practical and scalable algorithms for robotics control. The ultimate objective is to produce and deploy a versatile and safe system, able to learn a set of customizable tasks ranging from industrial processes to human support at run time. This entails providing autonomous agents with the ability to recover a structured representation of their environment, obtain hierarchical knowledge over higher-level behaviour and utilize social signals to best understand their objectives. The current plan is to approach each of these sub-goals individually and ultimately implement them jointly on the Toyota HSR Robot.

The expected research outcomes are:
- Develop a novel method to enable usage of human interaction to form a meaningful form of information for learning new tasks.
- Define a framework for the unsupervised recovery of a higher-level disentangled representation of the agent's mechanical inputs.
- Incorporate a system for meta-learning, bolstering training efficiency over a range of user-specified tasks.
- Port these systems on the HSR robot and evaluate real-world scenarios of interaction with human users.
