Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
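Basic Information
- Award number: 2312956
- Principal investigator:
- Amount: $399,400
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-07-01 to 2026-06-30
- Project status: Not yet concluded
- Source:
- Keywords:
Project Abstract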
Learning from demonstrated behavior (i.e., imitation) is an effective means of knowledge transfer in animals and humans. Existing imitation learning methods for artificial intelligence (AI) systems typically assume the capabilities of the imitator match those of the demonstrator. This can lead to undesirable behavior when the imitator’s capabilities significantly exceed those of the demonstrator. This project reformulates imitation learning for AI systems that are more capable than (human) demonstrators in some aspects by seeking to make the AI system unambiguously better than human demonstrators. The project will train graduate students and undergraduates to develop artificial intelligence systems that are better aligned with safety and utility requirements in a broad range of highly impactful future applications.
The project approaches its reformulated imitation learning objective using a maximum margin optimization for guiding (deep) reinforcement learning of control/decision policies. It focuses on learning from heterogeneous demonstrations and tasks that differ in quality, difficulty, and structure. Initially, multiple metrics for assessing and comparing different behaviors are assumed to be available. Later in the project, these metrics will be learned from demonstrations and supplemental annotations using deep representation learning methods. The policies produced by the approach of this project will be evaluated on a diverse set of applications: open source simulators (e.g., Atari games), manipulation and mobility tasks for robotics platforms, and cancer treatment decisions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sanjiban Choudhury
Approximate Dynamic Programming
- DOI: 10.1007/978-1-4899-7687-1_100018
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Sanjiban Choudhury
- Corresponding author: Sanjiban Choudhury
Densification strategies for anytime motion planning over large dense roadmaps
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Shushman Choudhury; Oren Salzman; Sanjiban Choudhury; S. Srinivasa
- Corresponding author: S. Srinivasa
Game-Theoretic Algorithms for Conditional Moment Matching
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Gokul Swamy; Sanjiban Choudhury; J. Bagnell; Zhiwei Steven Wu
- Corresponding author: Zhiwei Steven Wu
MOTION PRIMITIVES FOR AN AUTOROTATING HELICOPTER
- DOI:
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: Sanjiban Choudhury
- Corresponding author: Sanjiban Choudhury
Generalized Lazy Search for Robot Motion Planning: Interleaving Search and Edge Evaluation via Event-based Toggles
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Aditya Mandalika; Sanjiban Choudhury; Oren Salzman; S. Srinivasa
- Corresponding author: S. Srinivasa
Other grants by Sanjiban Choudhury
Collaborative Research: Inverse Task Planning from Few-Shot Vision Language Demonstrations
- Award number: 2327973
- Fiscal year: 2024
- Amount: $399,400
- Project type: Standard Grant
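Similar grants from the National Natural Science Foundation of China (NSFC)
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Amount: CNY 240,000
- Project type: Special Fund Project
Cell Research (细胞研究)
- Award number: 30824808
- Year approved: 2008
- Amount: CNY 240,000
- Project type: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Amount: CNY 450,000
- Project type: General Program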
Similar overseas grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $399,400
- Project type: Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Small: Deep Constrained Learning for Power Systems
- Award number: 2345528
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Small: Motion Fields Understanding for Enhanced Long-Range Imaging
- Award number: 2232298
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Small: End-to-end Learning of Fair and Explainable Schedules for Court Systems
- Award number: 2232055
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313149
- Fiscal year: 2023
- Amount: $399,400
- Project type: Continuing Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312374
- Fiscal year: 2023
- Amount: $399,400
- Project type: Standard Grant