NRI: FND: Improving Robot Learning from Feedback and Demonstration using Natural Language

Basic information

  • Award number:
    1925082
  • Principal investigator:
  • Amount:
    $749,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Period of performance:
    2019-09-01 to 2024-08-31
  • Project status:
    Completed

Project abstract

Deploying general-purpose robots on a wide scale, from the home to the workplace, requires a more sustainable model to quickly and robustly train them to perform novel tasks in unknown environments without the intervention of robotics experts. Toward this goal, various approaches have been explored to allow an ordinary human user to train a robot using various forms of instruction and interaction, specifically by providing evaluative feedback while a robot is learning to perform a task, or by explicitly demonstrating how to perform the task. When a person is providing feedback or demonstrating a task for another human, they typically describe what they are doing in natural language, providing context, clarification, and/or explanations for their evaluations or actions. Therefore, this project focuses on developing new computational methods that will enable robots to more efficiently and robustly learn from feedback and demonstration by leveraging accompanying natural language narration as context.

The project develops two new approaches to using language to aid interactive task learning by integrating ideas from language grounding, explanation for deep learning, and learning from rationales. The first approach uses language narration as a form of "supervised attention" that focuses learning on relevant features of the environment, thereby allowing effective learning from limited training data. First, the system learns to ground natural language in the robot's perceptions, utilizing prior work on automated video captioning and multi-modal linguistic grounding. Next, human linguistic narration is translated to a saliency map over the perceptual field using recent methods for visually explaining the processing of the resulting language-grounding networks. Finally, this saliency map is used to supervise the attention mechanism of a deep reinforcement learning system that learns from feedback and/or demonstration, allowing it to learn faster and more effectively from limited interaction. The second approach uses natural language narrations to perform reward shaping. In this approach, natural language instructions are mapped to intermediate rewards, which can be seamlessly integrated into any standard reinforcement learning algorithm, again improving the speed and accuracy of learning. Both of these approaches are experimentally evaluated by using them to learn new tasks and quantitatively comparing the speed and effectiveness of learning with and without linguistic narration. The hypothesis is that the use of linguistic narration will improve the speed and effectiveness of learning. Tasks will include simulated ones employing video games typically used to evaluate reinforcement learning and real-world robot tasks involving navigation and object manipulation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
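The two approaches in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the abstract does not specify the loss used to supervise attention or the form of the shaping term, so the sketch assumes a KL-divergence loss between a language-derived saliency map and the agent's attention, and it substitutes standard potential-based reward shaping for the language-to-reward mapping. The function names and the word-overlap scorer `language_potential` are hypothetical stand-ins for a learned language-grounding model.

```python
import math


def attention_supervision_loss(attention, saliency, eps=1e-8):
    """KL(saliency || attention) over a flattened perceptual field.

    Both inputs are non-negative weights of equal length with a positive sum;
    they are normalized to probability distributions before comparison.
    """
    if len(attention) != len(saliency):
        raise ValueError("attention and saliency must have the same length")
    a_total, s_total = sum(attention), sum(saliency)
    if a_total <= 0 or s_total <= 0:
        raise ValueError("weights must have a positive sum")
    loss = 0.0
    for a, s in zip(attention, saliency):
        p, q = s / s_total, a / a_total
        if p > 0:  # terms with p == 0 contribute nothing to the KL sum
            loss += p * math.log(p / (q + eps))
    return loss


def language_potential(instruction, event_log):
    """Hypothetical stand-in for a learned grounding model (an assumption).

    Returns the fraction of instruction words that appear in a textual log of
    what the agent has done so far (a value in [0, 1]).
    """
    words = set(instruction.lower().split())
    seen = set(" ".join(event_log).lower().split())
    return len(words & seen) / len(words) if words else 0.0


def shaped_reward(env_reward, potential_prev, potential_next, gamma=0.99):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return env_reward + gamma * potential_next - potential_prev


if __name__ == "__main__":
    instruction = "pick up key"
    before = language_potential(instruction, ["agent moved left"])
    after = language_potential(instruction, ["agent moved left", "agent pick key"])
    print("shaped reward:", shaped_reward(0.0, before, after, gamma=0.9))
    print("attention loss:",
          attention_supervision_loss([0.7, 0.1, 0.1, 0.1], [0, 1, 0, 0]))
```

In a training loop, the attention loss would typically be added to the agent's usual objective with a weighting coefficient, and the shaped reward would replace the raw environment reward. Potential-based shaping is chosen here because it is known to leave the optimal policy unchanged, which makes it a conservative way to fold language-derived signals into a standard reinforcement learning algorithm.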

Project outcomes

Journal articles (19)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Importance sampling in reinforcement learning with an estimated behavior policy
  • DOI:
    10.1007/s10994-020-05938-9
  • Publication date:
    2021-05
  • Journal:
  • Impact factor:
    7.5
  • Authors:
    Josiah P. Hanna;S. Niekum;P. Stone
  • Corresponding authors:
    Josiah P. Hanna;S. Niekum;P. Stone
PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards
  • DOI:
  • Publication date:
    2020-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Prasoon Goyal;S. Niekum;R. Mooney
  • Corresponding authors:
    Prasoon Goyal;S. Niekum;R. Mooney
SCAPE: Learning Stiffness Control from Augmented Position Control Experiences
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kim, M;Niekum, S;Deshpande, A
  • Corresponding author:
    Deshpande, A
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
  • DOI:
    10.48550/arxiv.2210.04476
  • Publication date:
    2022-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Albert Yu;R. Mooney
  • Corresponding authors:
    Albert Yu;R. Mooney
Efficiently Guiding Imitation Learning Algorithms with Human Gaze
  • DOI:
  • Publication date:
    2020-02
  • Journal:
  • Impact factor:
    0
  • Authors:
    Akanksha Saran;Ruohan Zhang;Elaine Schaertl Short;S. Niekum
  • Corresponding authors:
    Akanksha Saran;Ruohan Zhang;Elaine Schaertl Short;S. Niekum

Other publications by Raymond Mooney

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Casey Kennington;Malihe Alikhani;Heather Pon;Katherine Atwell;Yonatan Bisk;Daniel Fried;Felix Gervits;Zhao Han;Mert Inan;Michael Johnston;Raj Korpan;Diane Litman;M. Marge;Cynthia Matuszek;Ross Mead;Shiwali Mohan;Raymond Mooney;Natalie Parde;Jivko Sinapov;Angela Stewart;Matthew Stone;Stefanie Tellex;Tom Williams
  • Corresponding author:
    Tom Williams
Sparse Meets Dense: A Hybrid Approach to Enhance Scientific Document Retrieval
  • DOI:
    10.48550/arxiv.2401.04055
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Priyanka Mandikal;Raymond Mooney
  • Corresponding author:
    Raymond Mooney

Other grants held by Raymond Mooney

NRI: Robots that Learn to Communicate through Natural Human Dialog
  • Award number:
    1637736
  • Fiscal year:
    2016
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
EAGER: Robots that Learn to Communicate with Humans Through Natural Dialog
  • Award number:
    1548567
  • Fiscal year:
    2015
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
RI: Small: Perceptually Grounded Learning of Instructional Language
  • Award number:
    1016312
  • Fiscal year:
    2010
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
RI: Learning Language Semantics from Perceptual Context
  • Award number:
    0712097
  • Fiscal year:
    2007
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
ITR: Feedback from Multi-Source Data Mining to Experimentation for Gene Network Discovery
  • Award number:
    0325116
  • Fiscal year:
    2003
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
Text Data Mining Using Information Extraction
  • Award number:
    0117308
  • Fiscal year:
    2001
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
Symbolic Learning for Natural Language Processing: Integrating Information Extraction and Querying
  • Award number:
    9704943
  • Fiscal year:
    1997
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
Learning Search-Control Heuristics for Logic Programs: Applications to Speedup Learning and Language Acquisition
  • Award number:
    9310819
  • Fiscal year:
    1994
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant
Refining Concepts And Domain Theories By Combining Explanation-Based And Empirical Learning
  • Award number:
    9102926
  • Fiscal year:
    1991
  • Award amount:
    $749,400
  • Project type:
    Continuing Grant

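Similar grants from the National Natural Science Foundation of China (NSFC)

Molecular mechanism of carbofuran degradation by Novosphingobium sp. FND-3
  • Award number:
    31670112
  • Approval year:
    2016
  • Award amount:
    CNY 620,000
  • Project type:
    General Program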

Similar overseas grants

Movement perception in Functional Neurological Disorder (FND)
  • Award number:
    MR/Y004000/1
  • Fiscal year:
    2024
  • Award amount:
    $749,400
  • Project type:
    Research Grant
NRI: FND: Collaborative Research: DeepSoRo: High-dimensional Proprioceptive and Tactile Sensing and Modeling for Soft Grippers
  • Award number:
    2348839
  • Fiscal year:
    2023
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
S&AS: FND: COLLAB: Planning and Control of Heterogeneous Robot Teams for Ocean Monitoring
  • Award number:
    2311967
  • Fiscal year:
    2022
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Collaborative Research: DeepSoRo: High-dimensional Proprioceptive and Tactile Sensing and Modeling for Soft Grippers
  • Award number:
    2024882
  • Fiscal year:
    2021
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Collaborative Research: DeepSoRo: High-dimensional Proprioceptive and Tactile Sensing and Modeling for Soft Grippers
  • Award number:
    2024646
  • Fiscal year:
    2021
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Foundations for Physical Co-Manipulation with Mixed Teams of Humans and Soft Robots
  • Award number:
    2024792
  • Fiscal year:
    2021
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Foundations for Physical Co-Manipulation with Mixed Teams of Humans and Soft Robots
  • Award number:
    2024670
  • Fiscal year:
    2021
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Natural Power Transmission through Unconstrained Fluids for Robotic Manipulation
  • Award number:
    2024409
  • Fiscal year:
    2020
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
NRI: FND: Multi-Manipulator Extensible Robotic Platforms
  • Award number:
    2024435
  • Fiscal year:
    2020
  • Award amount:
    $749,400
  • Project type:
    Standard Grant
Collaborative Research: NRI: FND: Flying Swarm for Safe Human Interaction in Unstructured Environments
  • Award number:
    2024615
  • Fiscal year:
    2020
  • Award amount:
    $749,400
  • Project type:
    Standard Grant