EAGER: Formal Models of Trainer Feedback for I-Learning Theoretical Guarantees


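Basic information

  • Award number:
    1643411
  • Principal investigator:
  • Amount:
    $70,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Period:
    2016-08-15 to 2017-12-31
  • Project status:
    Completed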

Project abstract

As virtual agents and physical robots become more common, there is an increasing number of complex tasks they can usefully perform to assist humans. These tasks are typically formalized as sequential decision tasks, where robots and agents perceive states, take actions, and receive a reward feedback signal. In practice, there is a critical need to learn directly from human users, because the majority of human users will not be able to directly program or fully specify a useful reward function. On the other hand, they can likely train an agent to perform tasks unanticipated by the original designer. Machine reinforcement learning (RL), a paradigm often used for solving sequential decision making tasks, was originally developed with inspiration from animal learning research from the applied behavior analysis (ABA) community. Existing RL approaches operationalize a limited set of ABA principles effectively; however, there are additional principles and properties from ABA research that are not well encapsulated in the existing RL formalisms, and that are likely sources of new inspiration for designing more effective RL techniques capable of learning from human teachers. The objective of this project is to leverage insights from animal training to reformulate the learning of sequential tasks from an agent learning alone in a fixed environment to an agent learning cooperatively with a competent, but not necessarily perfect, human teacher. Successful completion of this project will contribute a foundation of knowledge that will aid in the development of new technologies to allow end users to customize the functions of their gadgets. This project is a part of a larger collaborative effort between North Carolina State University (NCSU), Brown University, and Washington State University (WSU). The NCSU effort will include theoretical contributions along with empirical analyses and data collection.
The emphasis of the NCSU portion of the project will be on the development of theoretical models of human feedback. When humans provide rewards to learning machines, describing the properties of the algorithms those machines use requires knowledge of how the humans provide feedback: for example, when and how they make errors, and the circumstances in which they provide reinforcement or punishment or use extinction. Understanding the theoretical properties of I-Learning under different trainer paradigms will be the primary effort of NCSU project personnel. NCSU personnel will also work in concert with collaborators at Brown to use these models of feedback for describing the performance properties of I-Learning under different assumptions of trainer behavior. In addition, NCSU personnel will work with WSU collaborators to collect data from human trainers in virtual settings in order to validate and set the parameters of the theoretical models.

Project outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Other publications by David Roberts

Understanding Middle Neolithic food and farming in and around the Stonehenge World Heritage Site: An integrated approach
  • DOI:
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Fay Worley;R. Madgwick;R. Pelling;P. Marshall;J. Evans;A. Lamb;Inés López;C. Bronk Ramsey;E. Dunbar;P. Reimer;J. Vallender;David Roberts
  • Corresponding author:
    David Roberts
Neither Deep nor Shallow: A Classroom Experiment Testing the Orthographic Depth of Tone Marking in Kabiye (Togo)
  • DOI:
  • Publication year:
    2016
  • Journal:
  • Impact factor:
    1.8
  • Authors:
    David Roberts;Stephen L. Walter;Keith L. Snider
  • Corresponding author:
    Keith L. Snider
Work-based skills development: a context-engaged approach
KEK Preprint 2001-26
  • DOI:
  • Publication year:
    2001
  • Journal:
  • Impact factor:
    0
  • Authors:
    B. H. Behrens;W. T. Ford;A. Gritsan;H. Krieg;J. Roy;J. Smith;M. Zhao;J. Alexander;R. Baker;C. Bebek;B. Berger;Karl Berkelman;K. Bloom;V. Boisvert;D. G. Cassel;David S. Crowcroft;M. Dickson;S. V. Dombrowski;P. S. Drell;K. Ecklund;R. Ehrlich;A. D. Foland;Peter Gaidarev;L. Gibbons;B. Gittelman;S. W. Gray;D. L. Hartill;B. K. Heltsley;P. I. Hopman;J. Kandaswamy;Philip Kim;D. L. Kreinick;T. Lee;Yehan Liu;N. B. Mistry;C. Ng;E. Nordberg;M. Ogg;J. R. Patterson;Dean E. Peterson;D. Riley;A. Soffer;B. Valant;C. Ward;Michael Athanas;P. Avery;C. D. Jones;M. Lohner;S. Patton;C. Prescott;J. Yelton;J. Zheng;G. Brandenburg;R. A. Briere;A. Ershov;Y. S. Gao;D. Kim;R. Wilson;H. Yamamoto;T. Browder;Yan Li;Jorge Luis Rodriguez;T. Bergfeld;B. I. Eisenstein;J. Ernst;G. E. Gladding;G. D. Gollin;R. M. Hans;E. Johnson;I. Karliner;M. A. Marsh;M. Palmer;M. Selen;J. J. Thaler;K. Edwards;A. Bellerive;R. Janicek;D. B. Macfarlane;P. M. Patel;A. J. Sadoff;R. Ammar;P. Baringer;A. Bean;D. Besson;D. Coppage;Cynthia L. Darling;Robin E. P. Davis;S. A. Kotov;I. Kravchenko;N. Kwak;L. Zhou;Stuart B. Anderson;Y. Kubota;S. J. Lee;Jim O’Neill;R. Poling;T. Riehle;A. J. Smith;M. S. Alam;S. B. Athar;Ling Zhao;A. Mahmood;S. Timm;F. Wappler;A. Anastassov;J. E. Duboscq;D. Fujino;K. Gan;T. L. Hart;K. Honscheid;H. Kagan;R. Kass;Jason Sang Hun Lee;M. Spencer;M. Sung;A. Undrus;Andreas Wolf;M. M. Zoeller;B. Nemati;S. J. Richichi;W. R. Ross;H. Severini;P. Skubic;M. Bishai;J. Fast;J. W. Hinson;N. Menon;D. H. Miller;E. I. Shibata;I. Shipsey;M. Yurko;Steven M Glenn;Y. Kwon;S. Roberts;E. H. Thorndike;C. Jessop;K. Lingel;H. Marsiske;M. Perl;V. Savinov;D. Ugolini;R. Wang;X.;T. E. Coan;V. Fadeyev;I. Korolkov;Y. Maravin;I. Narsky;V. Shelkov;J. Staeck;R. Stroynowski;I. Volobouev;J. Ye;Marina Artuso;F. Azfar;A. O. Efimov;M. Goldberg;Dong‐Qiang He;S. Kopp;G. Moneti;R. Mountain;S. Schuh;Tomasz Skwarnicki;S. Stone;G. Viehhauser;X. Xing;J. Bartelt;S. E. Csorna;V. Jain;K. W. McLean;S. Marka;R. Godang;K. Kinoshita;I. Lai;P. 
Pomianowski;S. Schrenk;G. Bonvicini;D. Cinabro;R. Greene;L. Perera;G. Zhou;M. Chadha;Simon Chan;G. Eigen;Js Miller;Cp O'Grady;M. Schmidtler;J. Urheim;A. Weinstein;F. Würthwein;D. W. Bliss;G. Masek;H. Paar;S. Prell;Varun Sharma;D. Asner;J. Gronberg;T. Hill;D. J. Lange;R. J. Morrison;H. Nelson;T. Nelson;David Roberts;A. Ryd
  • Corresponding author:
    A. Ryd
The potential impacts of biofouling on a wave energy converter using an open loop seawater power take off system
  • DOI:
  • Publication year:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    S. Blair;David Roberts;M. Scantlebury;Bob Eden
  • Corresponding author:
    Bob Eden

Other grants by David Roberts

Twenty-First AAAI/SIGAI Doctoral Consortium
  • Award number:
    1611894
  • Fiscal year:
    2016
  • Amount:
    $70,000
  • Project type:
    Standard Grant
RUI: Hurwitz Number Fields
  • Award number:
    1601350
  • Fiscal year:
    2016
  • Amount:
    $70,000
  • Project type:
    Continuing Grant
Greenland in a warmer climate: What controls the advance & retreat of the NE Greenland Ice Stream
  • Award number:
    NE/N011228/1
  • Fiscal year:
    2016
  • Amount:
    $70,000
  • Project type:
    Research Grant
The 20th SIGART/AAAI Doctoral Consortium
  • Award number:
    1452078
  • Fiscal year:
    2014
  • Amount:
    $70,000
  • Project type:
    Standard Grant
iTrade Wildlife - software to detect illegal wildlife sales
  • Award number:
    NE/L00075X/1
  • Fiscal year:
    2013
  • Amount:
    $70,000
  • Project type:
    Research Grant
RI: Small: Collaborative Research: Speeding Up Learning through Modeling the Pragmatics of Training
  • Award number:
    1319305
  • Fiscal year:
    2013
  • Amount:
    $70,000
  • Project type:
    Continuing Grant
CPS: Synergy: Integrated Sensing and Control Algorithms for Computer-Assisted Training
  • Award number:
    1329738
  • Fiscal year:
    2013
  • Amount:
    $70,000
  • Project type:
    Standard Grant
URM: American Indian Research Opportunities in Ecology and Environmental Science
  • Award number:
    0840098
  • Fiscal year:
    2008
  • Amount:
    $70,000
  • Project type:
    Continuing Grant
Eye Catching: Supporting tele-communicational eye-gaze in Collaborative Virtual Environments
  • Award number:
    EP/E007406/1
  • Fiscal year:
    2006
  • Amount:
    $70,000
  • Project type:
    Research Grant
Studies of Compact Extragalactic Radio Sources
  • Award number:
    0307531
  • Fiscal year:
    2003
  • Amount:
    $70,000
  • Project type:
    Standard Grant

Similar international grants

EAGER: III: Learning with less data: Capitalizing on formal pedagogies and human performance to incorporate domain knowledge into deep learning models
  • Award number:
    2228910
  • Fiscal year:
    2022
  • Amount:
    $70,000
  • Project type:
    Standard Grant
EAGER: Formal Analysis of Stochastic Models in Systems Biology Under Uncertainty
  • Award number:
    2227898
  • Fiscal year:
    2022
  • Amount:
    $70,000
  • Project type:
    Continuing Grant
Formal Analysis of Abstract Behavioural Models Using Automated Deductive Reasoning
  • Award number:
    RGPIN-2016-03992
  • Fiscal year:
    2022
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual
CAREER: Explorable Formal Models of Privacy Policies and Regulations
  • Award number:
    2319894
  • Fiscal year:
    2022
  • Amount:
    $70,000
  • Project type:
    Continuing Grant
Models and algorithms for interactive machine learning applied to formal languages and geometric concepts
  • Award number:
    RGPIN-2017-05336
  • Fiscal year:
    2022
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual
Formal Analysis of Abstract Behavioural Models Using Automated Deductive Reasoning
  • Award number:
    RGPIN-2016-03992
  • Fiscal year:
    2021
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual
Models and algorithms for interactive machine learning applied to formal languages and geometric concepts
  • Award number:
    RGPIN-2017-05336
  • Fiscal year:
    2021
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual
Discovering formal business process models by process mining
  • Award number:
    21K11756
  • Fiscal year:
    2021
  • Amount:
    $70,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Formal Analysis of Abstract Behavioural Models Using Automated Deductive Reasoning
  • Award number:
    RGPIN-2016-03992
  • Fiscal year:
    2020
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual
Models and algorithms for interactive machine learning applied to formal languages and geometric concepts
  • Award number:
    RGPIN-2017-05336
  • Fiscal year:
    2020
  • Amount:
    $70,000
  • Project type:
    Discovery Grants Program - Individual