Collaborative Research: RI: Medium: Bootstrapping natural feedback for reinforcement learning
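Basic information
- Award number: 2212310
- Principal investigator:
- Amount: $1.2 million
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-09-01 to 2025-08-31
- Status: Not yet concluded
- Source:
- Keywords:
Project abstract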
Many modern applications of artificial intelligence---from industrial automation to content recommendation---depend on machine learning algorithms that train automated agents to interact with their environments. But the two main approaches to interactive learning, reinforcement learning and imitation, require so much supervision or training time that it is prohibitively expensive to apply them to most real-world problems. Human learning does not suffer from this shortcoming, in large part because humans learn not from rewards or demonstrations, but instead from extended interaction with skilled teachers who use signals like gesture and language. This project will lay a foundation for research on interactive learning with rich feedback, from the perspective of individual agents, human--agent teams, and multi-agent populations. It will yield new capabilities for interactive training of automated agents, expanding both the effectiveness and accessibility of such techniques. Support for natural, interactive feedback will also improve the customizability of such systems, making on-the-fly adaptation or retraining accessible to users without significant computing power, data annotation resources or even programming ability.
The project is organized into three broad research objectives. First, it will develop a formal framework for grounding feedback, using simple supervisory signals (provided during or after execution) to bootstrap learned interpretation of more complex feedback types. Second, it will develop algorithms for learning to solicit feedback. These algorithms will turn the one-way process of reinforcement learning into a two-way interaction, enabling agents to proactively query supervisors for information about the compositional and causal structure of the environment. Third, it will develop new mechanisms and techniques for providing feedback, via software tools that assist human supervisors in selecting or generating maximally informative feedback signals. Research under each of these objectives will be carried out in simulated environments, benchmarked using complex tasks spanning navigation, robot manipulation, and furniture assembly, and evaluated in terms of its benefits to sample efficiency, end-to-end development time, and usability.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Guiding Pretraining in Reinforcement Learning with Large Language Models
- DOI: 10.48550/arxiv.2302.06692
- Published: 2023-02
- Journal:
- Impact factor: 0
- Authors: Yuqing Du; Olivia Watkins; Zihan Wang; Cédric Colas; Trevor Darrell; P. Abbeel; Abhishek Gupta; Jacob Andreas
- Corresponding authors: Yuqing Du; Olivia Watkins; Zihan Wang; Cédric Colas; Trevor Darrell; P. Abbeel; Abhishek Gupta; Jacob Andreas
Other publications by Jacob Andreas
Good-Enough Compositional Data Augmentation
- DOI: 10.18653/v1/2020.acl-main.676
- Published: 2019-04
- Journal:
- Impact factor: 0
- Authors: Jacob Andreas
- Corresponding author: Jacob Andreas
Guided K-best Selection for Semantic Parsing Annotation
- DOI: 10.18653/v1/2022.acl-demo.11
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Anton Belyy; Huang Chieh; Jacob Andreas; Emmanouil Antonios Platanios; Sam Thomson; Richard Shin; Subhro Roy; Aleksandr Nisnevich; Charles C. Chen; Benjamin Van Durme
- Corresponding author: Benjamin Van Durme
From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: L. Wong; Gabriel Grand; Alexander K. Lew; Noah D. Goodman; Vikash K. Mansinghka; Jacob Andreas; J. Tenenbaum
- Corresponding author: J. Tenenbaum
Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Gabriel Grand; Valerio Pepe; Jacob Andreas; Joshua B. Tenenbaum
- Corresponding author: Joshua B. Tenenbaum
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Shikhar Murty; Pratyusha Sharma; Jacob Andreas; Christopher D. Manning
- Corresponding author: Christopher D. Manning
Other grants held by Jacob Andreas
CAREER: Learning Structured Models with Natural Language Supervision
- Award number: 2238240
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Continuing Grant
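Similar grants from the National Natural Science Foundation of China
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Amount: CNY 0
- Award type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Year approved: 2008
- Amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Amount: CNY 450,000
- Award type: General Program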
Similar overseas grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Small: Deep Constrained Learning for Power Systems
- Award number: 2345528
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award number: 2232298
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award number: 2232055
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313149
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Continuing Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312374
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant