
RI: Small: Extracting Knowledge from Language Models for Decision Making


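Basic information

Award number:
2246811
Principal investigator:
Sergey Levine
Amount:
$600,000
Host institution:
University of California-Berkeley
Host institution country:
United States
Project type:
Standard Grant
Fiscal year:
2023
Funding country:
United States
Start and end dates:
2023-09-15 to 2026-08-31
Project status:
Not yet concluded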

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2246811&HistoricalAwards=false
Keywords:
RI Small Extracting Knowledge Language

Project abstract

This project aims to integrate semantic knowledge from large language models into automated decision-making and control systems while retaining reliability and robustness. The main principle of the proposed approach is to use language models to generate proposals and guidance, but still make the final decision or plan based on principled and robust planning and control methods, such that the language models are used when their semantic predictions are useful but not relied upon to always yield the correct answer. Large language models, such as ChatGPT, have garnered considerable attention in recent years due to their ability to respond to complex user queries and fulfill elaborate requests, such as writing code, composing stories, or providing educational explanations. Because of this, there is considerable interest in using them directly as decision-making systems (for example, if a language model can give “how to” instructions for repairing a car, perhaps it can also issue commands to a robot that actually repairs a car). However, there are also numerous concerns that such models might be too unreliable or too prone to generate false predictions to be useful as decision-making systems on their own. Therefore, this project aims to integrate these models into principled methods for planning and control to leverage the semantic knowledge in these models while providing a degree of robustness. This research has significant ramifications for automated decision-making systems that need to interact with complex real-world environments, where both semantic reasoning and intelligent planning are important. This includes robotic systems, including autonomous vehicles and service robots, intelligent assistants, decision support systems, and a range of automation technologies.

The technical approach in this project will be based around a probabilistic formulation that ties together the ungrounded semantic predictions from language models with grounded but non-semantic predictions from learned dynamics models. In this way, probabilistic inference machinery can be used to derive algorithms that make decisions that have a high likelihood of being semantically good according to the language model and a high likelihood of being physically (dynamically) optimal according to the learned dynamics model. In practice, this principle can be instantiated in the context of both model-based and model-free reinforcement-learning systems, learned prediction systems, and planning algorithms (by formulating planning as inference). The project will explore applications of this concept to prediction, planning and control, and exploration in reinforcement learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
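
The idea in the abstract of combining a language model's semantic likelihood with a dynamics model's physical likelihood can be illustrated with a toy sketch. The abstract does not specify the project's actual algorithm; the product-of-probabilities scoring rule, the candidate plans, and all numbers below are assumptions made up purely for illustration.

```python
import math

# Hypothetical candidate plans with made-up probabilities (illustration only).
# p_lm:  how semantically sensible a language model rates the plan (assumed).
# p_dyn: how likely a learned dynamics model predicts the plan is feasible (assumed).
CANDIDATES = {
    "open hood, then loosen bolt": {"p_lm": 0.60, "p_dyn": 0.70},
    "loosen bolt through closed hood": {"p_lm": 0.30, "p_dyn": 0.05},
    "open hood, then wave arm": {"p_lm": 0.10, "p_dyn": 0.90},
}


def joint_log_score(p_lm, p_dyn):
    """Log of the product p_lm * p_dyn; it is high only when both models agree."""
    return math.log(p_lm) + math.log(p_dyn)


def select_plan(candidates):
    """Return the name of the candidate that maximizes the joint log score."""
    return max(candidates, key=lambda name: joint_log_score(**candidates[name]))


best = select_plan(CANDIDATES)
print(best)
```

In this toy setup, a plan that the language model likes but the dynamics model considers infeasible scores poorly, as does a feasible plan with little semantic value, so neither model alone determines the decision.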


Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)


Other publications by Sergey Levine

Goal-oriented Vision-and-Dialog Navigation through Reinforcement Learning


DOI:
Publication year:
2022
Journal:
Impact factor:
0
Authors:
Peter Anderson;Qi Wu;Damien Teney;Jake Bruce;Mark Johnson;Niko Sünderhauf;Ian D. Reid;F. Bonin;Alberto Ortiz;Angel X. Chang;Angela Dai;T. Funkhouser;Ma;Matthias Niebner;M. Savva;David Chen;Raymond Mooney. 2011;Learning;Howard Chen;Alane Suhr;Dipendra Kumar Misra;T. Kollar;Nicholas Roy;Trajectory;Satwik Kottur;José M. F. Moura;Dhruv Devi Parikh;Sergey Levine;Chelsea Finn;Trevor Darrell;Jianfeng Li;Gao Yun;Chen;Ziming Li;Sungjin Lee;Baolin Peng;Jinchao Li;Julia Kiseleva;M. D. Rijke;Shahin Shayandeh;Weixin Liang;Youzhi Tian;Cheng;Yitao Liang;Marlos C. Machado;Erik Talvitie;Chih;Jiasen Lu;Zuxuan Wu;G. Al
Corresponding author:
G. Al