CAREER: Using Imperfect Predictions to Make Good Decisions
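Basic information
- Grant number: 1939827
- Principal investigator:
- Amount: $301,300
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-07-01 to 2023-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract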
As humans and other animals navigate the world they demonstrate remarkable flexibility in encountering unfamiliar systems, spaces and phenomena, learning to make predictions about how they will behave, and making good decisions based on those predictions. Crucial to this ability is the fact that one does not need to make perfectly accurate or fully detailed predictions to make good decisions. Though, due to our natural limitations, our predictions about the future are necessarily flawed, they are nevertheless sufficiently useful to make reasonable decisions. For artificial agents, in contrast, imperfect predictions often lead to catastrophic failures in decision making. Many existing approaches fundamentally assume that the agent will eventually learn to make perfect predictions and make perfect decisions, which is unreasonable in sufficiently rich, complex environments. This work considers the problem of developing artificial agents that are more aware of and more robust to their own limitations. Agents that can more robustly and flexibly learn from experience in truly complex environments have the potential to impact nearly any application in which decisions are made over time, for instance autonomous robots/vehicles, personal assistants, and medical/legal decision support. Furthermore, as the project will be undertaken at an undergraduate-only liberal arts college, undergraduate researchers will play an integral role in the work. The PI will also build on the strength of the liberal arts setting to enhance instruction of key discipline-specific research and writing skills throughout the Computer Science curriculum. Explicit development of these skills will not only improve students' preparation for a wide variety of career paths (including basic research) but is also aligned with best practices for broadening participation in the discipline. 
This project studies model-based reinforcement learning (MBRL) under the assumption that the agent has fundamental limitations that prevent it from learning a perfect model or from producing optimal plans. The central hypothesis is that in this context the MBRL problem cannot be decomposed into separate model-learning and planning problems, each treating the other as an idealized black box. Rather the optimization process for each component must be aware of its role in the overall architecture and of the limitations of its partner. One key aim of the work is to derive novel measures of model quality that are more tightly related to the true objective of control performance than standard measures of one-step prediction accuracy adapted from supervised learning settings. Another is to investigate how model learning objectives/algorithms can be adapted to account for the limitations of the specific planner that will use the model. Further, control algorithms will be investigated that can make effective use of models of non-homogeneous quality by mediating between model-based and model-free knowledge. The ultimate goal is to integrate these principles into novel MBRL agents that are significantly more robust to limitations in the model class and/or planner and are able to succeed in environments that are too complex and high-dimensional to be modeled or solved exactly.
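As a rough illustration of why one-step prediction accuracy can be a poor proxy for how useful a model is over a planning horizon, the sketch below rolls out a toy one-dimensional system and a slightly wrong model of it. Everything here is an assumption made for illustration (the linear dynamics, the names `rollout`, `TRUE_COEF`, and `MODEL_COEF`, and the 5% per-step error); it is not a method or result from the project.

```python
def rollout(coef, x0, horizon):
    """Iterate x_{t+1} = coef * x_t for `horizon` steps and return all states."""
    states = [x0]
    for _ in range(horizon):
        states.append(coef * states[-1])
    return states


# Hypothetical values chosen only for illustration.
TRUE_COEF = 1.00   # assumed true dynamics: the state stays constant
MODEL_COEF = 1.05  # assumed learned model: 5% too large on every step
HORIZON = 20

true_states = rollout(TRUE_COEF, 1.0, HORIZON)
model_states = rollout(MODEL_COEF, 1.0, HORIZON)

# Error after a single predicted step (what a one-step accuracy measure sees).
one_step_error = abs(model_states[1] - true_states[1])

# Error at the end of the rollout (what a planner relying on the model sees).
final_error = abs(model_states[-1] - true_states[-1])

print(f"one-step error: {one_step_error:.3f}")
print(f"error after {HORIZON} steps: {final_error:.3f}")
```

In this toy setting the model looks accurate when judged one step at a time, yet its predictions drift far from the true trajectory once they are fed back into themselves, which is the situation a planner faces when it simulates several steps ahead.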
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Erin Talvitie
CAREER: Using Imperfect Predictions to Make Good Decisions
- Grant number: 1552533
- Fiscal year: 2016
- Amount: $301,300
- Project category: Continuing Grant
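Similar grants from the National Natural Science Foundation of China (NSFC)
Molecular Interaction Reconstruction of Rheumatoid Arthritis Therapies Using Clinical Data
- Grant number: 31070748
- Approval year: 2010
- Amount: 340,000 CNY
- Project category: General Program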
Similar overseas grants
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
- Grant number: 2890513
- Fiscal year: 2027
- Amount: $301,300
- Project category: Studentship
Impact of Urban Environmental Factors on Momentary Subjective Wellbeing (SWB) using Smartphone-Based Experience Sampling Methods
- Grant number: 2750689
- Fiscal year: 2025
- Amount: $301,300
- Project category: Studentship
Design of metal structures of custom composition using additive manufacturing
- Grant number: 2593424
- Fiscal year: 2025
- Amount: $301,300
- Project category: Studentship
Scalable indoor power harvesters using halide perovskites
- Grant number: MR/Y011686/1
- Fiscal year: 2025
- Amount: $301,300
- Project category: Fellowship
Collaborative Research: Using Adaptive Lessons to Enhance Motivation, Cognitive Engagement, And Achievement Through Equitable Classroom Preparation
- Grant number: 2335802
- Fiscal year: 2024
- Amount: $301,300
- Project category: Standard Grant
Collaborative Research: Using Adaptive Lessons to Enhance Motivation, Cognitive Engagement, And Achievement Through Equitable Classroom Preparation
- Grant number: 2335801
- Fiscal year: 2024
- Amount: $301,300
- Project category: Standard Grant
CAREER: Investigating Biogeographic Hypotheses and Drivers of Diversification in Neotropical Harvestmen (Opiliones: Laniatores) Using Ultraconserved Elements
- Grant number: 2337605
- Fiscal year: 2024
- Amount: $301,300
- Project category: Continuing Grant
CAREER: Development of New Gas-Releasing Molecules Using a Thiol Carrier
- Grant number: 2338835
- Fiscal year: 2024
- Amount: $301,300
- Project category: Continuing Grant
Collaborative Research: NCS-FR: Individual variability in auditory learning characterized using multi-scale and multi-modal physiology and neuromodulation
- Grant number: 2409652
- Fiscal year: 2024
- Amount: $301,300
- Project category: Standard Grant
Collaborative Research: Ionospheric Density Response to American Solar Eclipses Using Coordinated Radio Observations with Modeling Support
- Grant number: 2412294
- Fiscal year: 2024
- Amount: $301,300
- Project category: Standard Grant