NRI: FND: Improving Robot Learning from Feedback and Demonstration using Natural Language

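Basic Information

Award number:
1925082
Principal investigator:
Raymond Mooney
Amount:
$749,400 (approx.)
Host institution:
University of Texas at Austin
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Project period:
2019-09-01 to 2024-08-31
Project status:
Completed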

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1925082&HistoricalAwards=false
Keywords:
NRI FND Improving Robot Learning

Project Abstract

Deploying general purpose robots on a wide scale ranging from the home to the workplace requires a more sustainable model to quickly and robustly train them to perform novel tasks in unknown environments without the intervention of robotics experts. Toward this goal, various approaches have been explored to allow an ordinary human user to train a robot using various forms of instruction and interaction, specifically by providing evaluative feedback while a robot is learning to perform a task, or by explicitly demonstrating how to perform the task. When a person is providing feedback or demonstrating a task for another human, they typically describe what they are doing in natural language, providing context, clarification, and/or explanations for their evaluations or actions. Therefore, this project focuses on developing new computational methods that will enable robots to more efficiently and robustly learn from feedback and demonstration by leveraging accompanying natural language narration as context.

The project develops two new approaches to using language to aid interactive task learning by integrating ideas from language grounding, explanation for deep learning, and learning from rationales. The first approach uses language narration as a form of "supervised attention" that focuses learning on relevant features of the environment, thereby allowing effective learning from limited training data. First, the system learns to ground natural language in the robot's perceptions, utilizing prior work on automated video captioning and multi-modal linguistic grounding. Next, human linguistic narration is translated to a saliency map over the perceptual field using recent methods for visually explaining the processing of the resulting language-grounding networks. Finally, this saliency map is used to supervise the attention mechanism of a deep-reinforcement learning system that learns from feedback and/or demonstration, allowing it to learn faster and more effectively from limited interaction. The second approach uses natural language narrations to perform reward shaping. In this approach, natural language instructions are mapped to intermediate rewards, which can be seamlessly integrated into any standard reinforcement learning algorithm, again improving the speed and accuracy of learning. Both of these approaches are experimentally evaluated by using them to learn new tasks and quantitatively comparing the speed and effectiveness of learning with and without linguistic narration. The hypothesis is that the use of linguistic narration will improve the speed and effectiveness of learning. Tasks will include simulated ones employing video games typically used to evaluate reinforcement learning and real-world robot tasks involving navigation and object manipulation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
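
The reward-shaping idea in the abstract (instructions mapped to intermediate rewards that are added to a standard reinforcement learning update) can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not the project's actual method: the keyword table `KEYWORD_TO_ACTION` stands in for a learned language-grounding model, the six-state corridor is a toy environment, and the names `language_reward`, `shaped_reward`, `step`, and `train` are hypothetical.

```python
import random

# Hypothetical stand-in for a learned language-grounding model:
# a direction word in the instruction suggests one discrete action.
KEYWORD_TO_ACTION = {"left": 0, "right": 1}


def language_reward(instruction, action):
    """Return 1.0 if the action matches a direction word in the instruction, else 0.0."""
    for word in instruction.lower().split():
        if KEYWORD_TO_ACTION.get(word) == action:
            return 1.0
    return 0.0


def shaped_reward(env_reward, instruction, action, weight=0.1):
    """Add a weighted language-derived intermediate reward to the environment reward."""
    return env_reward + weight * language_reward(instruction, action)


def step(state, action, goal=5):
    """Toy corridor (assumed): states 0..goal, action 0 moves left, 1 moves right.

    The environment itself pays 1.0 only when the goal state is reached.
    """
    next_state = max(0, min(goal, state + (1 if action == 1 else -1)))
    done = next_state == goal
    return next_state, (1.0 if done else 0.0), done


def train(instruction, episodes=200, weight=0.1, seed=0):
    """Tabular Q-learning whose update uses the shaped reward instead of the raw one."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(6)]  # q[state][action]
    alpha, gamma, epsilon = 0.5, 0.9, 0.1
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on episode length
            # Epsilon-greedy selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            r = shaped_reward(reward, instruction, action, weight)
            target = r + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train("go right")
    for s, (q_left, q_right) in enumerate(q_table):
        print(s, round(q_left, 3), round(q_right, 3))
```

Because the shaping term is only added to the scalar reward, the same wrapper could in principle sit in front of any learner that consumes rewards; setting `weight=0.0` recovers the unshaped baseline for a with/without comparison of the kind the abstract describes.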

Project Outcomes

Journal articles (19)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Importance sampling in reinforcement learning with an estimated behavior policy

DOI:
10.1007/s10994-020-05938-9
Published:
2021-05
Journal:
Machine Learning
Impact factor:
7.5
Authors:
Josiah P. Hanna; S. Niekum; P. Stone
Corresponding authors:
Josiah P. Hanna; S. Niekum; P. Stone

PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

DOI:
Published:
2020-07
Journal:
ArXiv
Impact factor:
0
Authors:
Prasoon Goyal; S. Niekum; R. Mooney
Corresponding authors:
Prasoon Goyal; S. Niekum; R. Mooney

SCAPE: Learning Stiffness Control from Augmented Position Control Experiences

DOI:
Published:
2021
Journal:
Conference on Robot Learning
Impact factor:
0
Authors:
Kim, M; Niekum, S; Deshpande, A
Corresponding author:
Deshpande, A

Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks

DOI:
10.48550/arxiv.2210.04476
Published:
2022-10
Journal:
ArXiv
Impact factor:
0
Authors:
Albert Yu; R. Mooney
Corresponding authors:
Albert Yu; R. Mooney

Efficiently Guiding Imitation Learning Algorithms with Human Gaze

DOI:
Published:
2020-02
Journal:
ArXiv
Impact factor:
0
Authors:
Akanksha Saran; Ruohan Zhang; Elaine Schaertl Short; S. Niekum
Corresponding authors:
Akanksha Saran; Ruohan Zhang; Elaine Schaertl Short; S. Niekum

Other publications by Raymond Mooney

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

DOI:
Published:
2024
Journal:
arXiv.org
Impact factor:
0
Authors:
Casey Kennington; Malihe Alikhani; Heather Pon; Katherine Atwell; Yonatan Bisk; Daniel Fried; Felix Gervits; Zhao Han; Mert Inan; Michael Johnston; Raj Korpan; Diane Litman; M. Marge; Cynthia Matuszek; Ross Mead; Shiwali Mohan; Raymond Mooney; Natalie Parde; Jivko Sinapov; Angela Stewart; Matthew Stone; Stefanie Tellex; Tom Williams
Corresponding author:
Tom Williams