EAGER: Income Learning: A New Model for Behavior-Analysis-Inspired Learning from Human Feedback
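Basic information
- Award number: 1643614
- Principal investigator:
- Amount: $70,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-08-15 to 2017-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract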
As virtual agents and physical robots become more common, the number of complex tasks they can usefully perform to assist humans is growing. These tasks are typically formalized as sequential decision tasks, in which robots and agents perceive states, take actions, and receive a reward feedback signal. In practice, such machines need to learn directly from human users if they are to accomplish tasks beyond those pre-specified by the original developers. Machine reinforcement learning (RL), a paradigm often used for solving sequential decision-making tasks, was originally developed with inspiration from animal learning research in the applied behavior analysis (ABA) community. Existing RL approaches operationalize a limited set of ABA principles effectively; however, additional principles and properties from ABA research are not well captured by existing RL formalisms and are likely sources of new inspiration for designing more effective RL techniques capable of learning from human teachers. This project will (1) combine principles from ABA and RL to produce algorithms that can learn more effectively from humans, (2) evaluate these algorithms on both virtual agents and robot platforms, and (3) investigate whether and how non-expert humans can construct sequences of tasks of increasing difficulty, similar to how expert animal trainers shape tasks. Insights from these user studies will be leveraged to further improve our algorithms' ability to learn from human trainers. If successful, this project will make critical progress toward allowing non-technical users to teach virtual and physical agents to perform complex tasks in a natural setting, one familiar to many from previous experience training household pets.
This project is part of a larger effort among Washington State University (WSU), North Carolina State University, and Brown University. The WSU effort will focus on implementing the proposed family of machine learning algorithms, called Income Learning (I-Learning). As these algorithms are co-developed by the three universities, WSU will design user studies to evaluate when and how the principles behind I-Learning allow it to outperform existing algorithms at learning from human feedback. WSU will primarily focus on (1) virtual agents, which allow learning to be tested via crowdsourcing, and will also test on (2) physical robots to study whether embodiment changes users' perceptions and actions or the algorithms' learning efficacy. Additionally, WSU will investigate (3) human curriculum design. Expert trainers can shape the behavior of animals by increasing task complexity over time, so that the animals learn a sequence of tasks much faster than if they trained directly on the final, difficult task. WSU will run user studies on crowdsourcing platforms to better understand how non-expert humans design curricula for machine learning algorithms in sequential decision tasks, and will investigate how these design decisions can inform algorithm design.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Matthew Taylor
Ketamine PCA for Treatment of End-of-Life Neuropathic Pain in Pediatrics
- DOI: 10.1177/1049909114543640
- Year published: 2015
- Journal:
- Impact factor: 0
- Authors: Matthew Taylor; R. Jakacki; Carol May; D. Howrie; Scott H. Maurer
- Corresponding author: Scott H. Maurer
Radiation‐induced apoptosis in MOLT‐4 cells requires de novo protein synthesis independent of de novo RNA synthesis
- DOI:
- Year published: 2002
- Journal:
- Impact factor: 3.5
- Authors: Matthew Taylor; M. Buckwalter; Amen Craig Stephenson; Janet Leigh Hart; Benjamin James Taylor; K. O’Neill
- Corresponding author: K. O’Neill
Warm protons at comet 67P/Churyumov-Gerasimenko – Implications for the infant bow shock
- DOI: 10.5194/angeo-2020-66
- Year published: 2020
- Journal:
- Impact factor: 0
- Authors: C. Goetz; H. Gunell; F. L. Johansson; K. Llera; H. Nilsson; K. Glassmeier; Matthew Taylor
- Corresponding author: Matthew Taylor
Cluster Technical Challenges and Scientific Achievements
- DOI: 10.1007/978-3-319-03952-7_30
- Year published: 2015
- Journal:
- Impact factor: 2.7
- Authors: C. Escoubet; A. Masson; H. Laakso; Matthew Taylor; J. Volpp; D. Sieg; M. Hapgood; M. Goldstein
- Corresponding author: M. Goldstein
Antihypertensive Medications and Risk of Melanoma and Keratinocyte Carcinomas: A Systematic Review and Meta-Analysis
- DOI:
- Year published: 2024
- Journal:
- Impact factor: 0
- Authors: Olivia G. Cohen; Matthew Taylor; Cassandra Mohr; K. Nead; C. Hinkston; Sharon H Giordano; Sinéad M Langan; David J Margolis; M. Wehner
- Corresponding author: M. Wehner
Other grants by Matthew Taylor
DISES: Indigenous forest management in a non-stationary climate
- Award number: 2310797
- Fiscal year: 2023
- Award amount: $70,000
- Award type: Standard Grant
Pilot study to develop a novel model to investigate the mechanisms and consequences of foetal immune programming on immune fitness through life
- Award number: BB/S002987/1
- Fiscal year: 2018
- Award amount: $70,000
- Award type: Research Grant
Doctoral Mentoring Consortium at the Fourteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-16)
- Award number: 1620841
- Fiscal year: 2016
- Award amount: $70,000
- Award type: Standard Grant
19th Annual SIGART/AAAI Doctoral Consortium
- Award number: 1444754
- Fiscal year: 2014
- Award amount: $70,000
- Award type: Standard Grant
RI: Small: Collaborative Research: Speeding Up Learning through Modeling the Pragmatics of Training
- Award number: 1319412
- Fiscal year: 2013
- Award amount: $70,000
- Award type: Continuing Grant
Mechanisms of Th2 cell-intrinsic hypo-responsiveness, and its impact on protective immunity and memory to parasitic helminths
- Award number: MR/K020196/1
- Fiscal year: 2013
- Award amount: $70,000
- Award type: Research Grant
CAREER: A Multiagent Teacher/Student Framework for Sequential Decision Making Tasks
- Award number: 1348109
- Fiscal year: 2013
- Award amount: $70,000
- Award type: Standard Grant
Collaborative Research: Reconstructing Droughts in the Tropical Americas Using Tree-Ring Analysis
- Award number: 1263517
- Fiscal year: 2013
- Award amount: $70,000
- Award type: Continuing Grant
EAAI-12: The Third Symposium on Educational Advances in AI
- Award number: 1231124
- Fiscal year: 2012
- Award amount: $70,000
- Award type: Standard Grant
CAREER: A Multiagent Teacher/Student Framework for Sequential Decision Making Tasks
- Award number: 1149917
- Fiscal year: 2012
- Award amount: $70,000
- Award type: Standard Grant
Similar overseas grants
Comparative Study on Teaching Quality and Learning Achievements Inequalities in Low- and Middle-income Countries
- Award number: 23K12732
- Fiscal year: 2023
- Award amount: $70,000
- Award type: Grant-in-Aid for Early-Career Scientists
Multi-tiered Peer Mentoring and Experiential Learning for Academically Talented, Low-Income STEM Students
- Award number: 2220835
- Fiscal year: 2023
- Award amount: $70,000
- Award type: Standard Grant
Enhanced Learning and Training Experiences to Support Talented, Low-Income STEM Students
- Award number: 2221187
- Fiscal year: 2023
- Award amount: $70,000
- Award type: Standard Grant
Using Media and Texting to Foster STEM Learning in Low-Income and Latinx Families
- Award number: 2115669
- Fiscal year: 2021
- Award amount: $70,000
- Award type: Continuing Grant
CHS: Small: Experiential Learning Systems for Promoting Wellness in Low-Income Families
- Award number: 2050309
- Fiscal year: 2020
- Award amount: $70,000
- Award type: Standard Grant
EAGER: Training A Mobile Robot from Human Feedback via Income Learning
- Award number: 1643413
- Fiscal year: 2016
- Award amount: $70,000
- Award type: Standard Grant
CHS: Small: Experiential Learning Systems for Promoting Wellness in Low-Income Families
- Award number: 1618406
- Fiscal year: 2016
- Award amount: $70,000
- Award type: Standard Grant
Science in the Learning Gardens: Factors that Support Racial and Ethnic Minority Students' Success in Low-Income Middle Schools
- Award number: 1418270
- Fiscal year: 2014
- Award amount: $70,000
- Award type: Continuing Grant
A Japan-UK Comparative Study on the Effects of Household Income on Development, Learning Environments and Outcomes for Children
- Award number: 26780335
- Fiscal year: 2014
- Award amount: $70,000
- Award type: Grant-in-Aid for Young Scientists (B)