NRI: FND: Robust Inverse Learning for Human-Robot Collaboration

Basic Information

Award number: 1830421
Principal investigator: Prashant Doshi
Amount: $644,200
Awardee institution: University of Georgia Research Foundation Inc
Institution country: United States
Award type: Standard Grant
Fiscal year: 2018
Funding country: United States
Period: 2018-09-01 to 2022-08-31
Status: Completed

Source: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1830421&HistoricalAwards=false
Keywords: NRI FND Robust Inverse Learning

Project Abstract

Collaborative robots are robots that share work and personal spaces with humans. These robots are expected to work with humans on a variety of tasks in various situations with few changes to their software and hardware. To do this, the robot must understand what the human or other robot is doing and how it is performing the task, and then personalize its interaction. Currently, robots are programmed with much manual effort to perform specific tasks in controlled environments. This research is studying ways to substantially advance a robot's capabilities in all these aspects, to enable a collaboration that is as automatic and seamless as possible. It is building methods that allow the robot to observe the human or robot perform the task, understand the human's preferences and intent in the task, and then spontaneously collaborate with the human on the task. This approach relies on the insight that observing a human or robot perform the task provides information and facilitates learning the task. An application considered in this project is an agricultural robot that will observe and autonomously collaborate with a human in grading and packing onions in postharvest processing sheds. This has the potential to augment scarce human labor on our nation's farms in performing this repetitive task. Inverse reinforcement learning (IRL) refers to both the problem and the method by which an agent learns the goals and preferences of another agent that explain the latter's observed behavior. The technical approach of this research is first to identify the challenges that IRL faces when used to infer the goals and preferences of the observed agent, human or robot, in real-world contexts. The research is tractably generalizing IRL to meet key unmet challenges. It is developing new methods that will make IRL robust to real-world uncertainties involving hidden variables, occlusions, and imperfect observations by the robot.
Typically, IRL is one-sided and the reward is learned with the aim of imitating the observed behavior. This research will go a step further and investigate how the dynamics and learned preferences can be revised and incorporated in the robot's collaborative decision making and planning, to enable the robot to spontaneously collaborate with the previously observed agent on the task. Consequently, the focus is on domains where the subject robot can observe an agent performing well-defined tasks that benefit from teamwork. The research plan is expected to yield a portfolio of algorithms that take key steps toward enabling robots to autonomously learn how to perform tasks and deploy this knowledge toward optimally collaborating with others on the task. Being able to learn tasks simply from passive demonstrations adds to the appeal of this research, as it minimizes costly human interventions. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
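
The abstract's definition of IRL (learning the preferences that explain another agent's observed behavior) can be illustrated with a small sketch. The code below is not the project's method. It is a minimal maximum-entropy IRL loop, one standard formulation, on a hypothetical five-state chain with one reward weight per state. The chain, the horizon, the single demonstration, and every function name are assumptions made for illustration.

```python
import math

N_STATES = 5        # hypothetical chain of states 0..4
ACTIONS = (-1, 1)   # move left or right
HORIZON = 8         # number of states in each trajectory


def step(s, a):
    """Deterministic chain dynamics, clipped at both ends."""
    return min(max(s + a, 0), N_STATES - 1)


def soft_policy(reward):
    """Finite-horizon soft value iteration; returns policy[t][s][action index]."""
    v = [0.0] * N_STATES
    policy = []
    for _ in range(HORIZON):  # backward in time, last step first
        q = [[reward[s] + v[step(s, a)] for a in ACTIONS] for s in range(N_STATES)]
        # Soft maximum (log-sum-exp) over actions, computed stably.
        v = [max(row) + math.log(sum(math.exp(x - max(row)) for x in row)) for row in q]
        policy.append([[math.exp(x - v[s]) for x in q[s]] for s in range(N_STATES)])
    policy.reverse()  # policy[0] now belongs to the first time step
    return policy


def expected_visits(policy, start):
    """Expected number of visits to each state under the soft policy."""
    d = [0.0] * N_STATES
    d[start] = 1.0
    visits = [0.0] * N_STATES
    for t in range(HORIZON):
        visits = [n + p for n, p in zip(visits, d)]
        nxt = [0.0] * N_STATES
        for s in range(N_STATES):
            for i, a in enumerate(ACTIONS):
                nxt[step(s, a)] += d[s] * policy[t][s][i]
        d = nxt
    return visits


def maxent_irl(demos, start, lr=0.1, iters=200):
    """Gradient ascent on reward weights: expert visits minus expected visits."""
    expert = [0.0] * N_STATES
    for traj in demos:
        for s in traj:
            expert[s] += 1.0 / len(demos)
    reward = [0.0] * N_STATES
    for _ in range(iters):
        expected = expected_visits(soft_policy(reward), start)
        reward = [r + lr * (e - x) for r, e, x in zip(reward, expert, expected)]
    return reward


if __name__ == "__main__":
    demo = [0, 1, 2, 3, 4, 4, 4, 4]  # observed agent walks right and stays
    print([round(r, 2) for r in maxent_irl([demo], start=0)])
```

In this toy setting the observer sees only the state sequence, yet the recovered reward ends up largest at state 4, the state the demonstrator heads for and stays in. Handling occlusion, hidden variables, or noisy observations, which the abstract names as open challenges, is outside the scope of this sketch.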

Project Outcomes

Journal papers (8)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Min-Max Entropy Inverse RL of Multiple Tasks

DOI: (none listed)
Published: 2021
Journal: IEEE International Conference on Robotics and Automation
Impact factor: 0
Authors: Arora, Saurabh; Doshi, Prashant; Banerjee, Bikramjit
Corresponding author: Banerjee, Bikramjit

A survey of inverse reinforcement learning: Challenges, methods and progress

DOI: 10.1016/j.artint.2021.103500
Published: 2021-03-30
Journal: Artificial Intelligence
Impact factor: 14.4
Authors: Arora, Saurabh; Doshi, Prashant
Corresponding author: Doshi, Prashant

SA-Net: Robust State-Action Recognition for Learning from Observations

DOI: (none listed)
Published: 2020
Journal: IEEE International Conference on Robotics and Automation
Impact factor: 0
Authors: Soans, Nihal; Asali, Ehsan; Hong, Yi; Doshi, Prashant
Corresponding author: Doshi, Prashant

Online Inverse Reinforcement Learning under Occlusion

DOI: (none listed)
Published: 2019
Journal: Proceedings of the 18th International Conference on Autonomous Agents and Multi-Agent Systems
Impact factor: 0
Authors: Arora, Saurabh; Doshi, Prashant; Banerjee, Bikramjit
Corresponding author: Banerjee, Bikramjit

Model-Free IRL Using Maximum Likelihood Estimation
