RI: Small: Collaborative Research: Speeding Up Learning through Modeling the Pragmatics of Training

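Basic Information

Award number:
1319618
Principal investigator:
Michael Littman
Amount:
About $148,000
Institution:
Brown University
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2013
Funding country:
United States
Duration:
2013-10-01 to 2016-09-30
Project status:
Completed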

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1319618&HistoricalAwards=false
Keywords:
RI Small Collaborative Research Speeding

Project Abstract

Years of effort to develop algorithms capable of learning from reward signals have produced a plethora of techniques that can leverage numerical signals whose value varies with performance. Recent efforts to use these techniques to learn from humans providing rewards have been slower to progress, in part because humans give feedback discretely rather than numerically. This project contributes new learning algorithms designed specifically to leverage the information contained in the choices humans make when providing such discrete feedback. The algorithms are inspired by the human-canine partnership, and by the incredible things that humans are able to teach dogs using only discrete feedback and carefully constructed sequences of tasks. The Bayesian learning framework being developed in this project will leverage the pragmatic implicatures contained in the feedback and task sequences to learn more quickly from human feedback. The ultimate goal of this work is to provide a more natural paradigm for humans to tell computers what they would like them to do. To that end, project efforts will result in a teaching module for Brown University's Learning Exchange (LE). The LE involves undergraduates working with underserved minority middle school students to engage them in STEM. These participants are a perfect audience to demonstrate the broader impacts of this work. LE participants learn to instruct computers using a combination of programming in the Scratch environment and the feedback paradigm, which shows how powerful the algorithms are.

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Michael Littman

Model-based reasoning

DOI:
10.1016/j.compedu.2012.11.014
Year published:
2013
Journal:
Comput. Educ.
Impact factor:
0
Authors:
Michael Jackson;Janusz Wojtusiak;Dayne Freitag;Eugene Subbotsky;Hans M. Nordahl;Jens C. Thimm;John Burgoyne;Roberto Poli;Thomas R. Guskey;Michael Davison;J. Magnotti;Adam M. Goodman;Jeffrey S. Katz;L. Verschaffel;W. Dooren;B. Smedt;Sean A. Fulop;Melva R. Grant;Leonid I. Perlovsky;B. De Smedt;P. Ghesquière;Dariusz Plewczynski;Leily Ziglari;P. Birjandi;Scott Rick;Roberto Weber;N. Seel;Maike Luhmann;Michael Eid;A. Antonietti;Barbara Colombo;Hamish Coates;Ali Radloff;P. Pirnay;Dirk Ifenthaler;Edward Swing;Craig A Anderson;David Tzuriel;Norman M. Weinberger;David C. Riccio;Patrick K. Cullen;J. Tallet;Megan L. Hoffman;David A. Washburn;Iván Izquierdo;Jorge H. Medina;M. Cammarota;A. Podolskiy;Joke Torbeyns;J. Kranzler;P. A. Kirschner;F. Kirschner;Kenn Apel;Julie A. Wolter;J. Masterson;JungMi Lee;Stefan N Groesser;Sabine Al;Philip Barker;Paul Schaik;I. Cutica;Monica Bucciarelli;K. Pata;Anna Strasser;A. Guillot;N. Hoyek;Christian Collet;Maria Opfermann;Roger Azevedo;Detlev Leutner;Thomas C. Toppino;Alice Y. Kolb;David A. Kolb;P. Brazdil;Ricardo Vilalta;Carlos Soares;C. Giraud;Jeffrey W. Bloom;Tyler Volk;Marwan A. Dwairy;Richard A. Swanson;Johanna Pöysä;K. Luwel;Theo Hug;Angélique Martin;Nicolas Guéguen;Craig Hassed;Fabio Alivernini;Michael Herczeg;M. Mastropieri;T. Scruggs;Angelika Rieder;S. Castillo;Gerardo Ayala;R. Low;R. Babuška;Barbara C. Buckley;Henry Markovits;Sungho Kim;In;Michael J. Spector;A. Towse;Charlie N. Lewis;Brian Francis;David N. Rapp;Pratim Sengupta;Sidney D’Mello;Serge Brand;J. Patry;Cees Klaassen;Sieglinde Weyringer;Alfred Weinberger;Marilla D. Svinicki;Jane S. Vogler;Andrew J. Martin;John M. Keller;ChanMin Kim;Gabriele Wulf;Lynne E. Parker;Michael Wunder;Michael Littman;Lisa J. Lehmberg;C. Victor Fung;Hannele Niemi;Steven Reiss;Piet Desmet;F. Cornillie;Helmut M. Niegemann;Steffi Heidig;Dominic W. Massaro;Charles Fadel;Cheryl Lemke;R. Grabner;Michael D. Basil;Daniel R. Little;Stephan Lewandowsky;Parmjit Singh;Zheng Liu;Marcelo H. Ang;W. Seah;Jack Heller;C. Randles;Kenneth S. Aigen
Corresponding author:
Kenneth S. Aigen