RI: Small: Extracting Knowledge from Language Models for Decision Making
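Basic information
- Award number: 2246811
- Principal investigator:
- Amount: $600,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-09-15 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract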
This project aims to integrate semantic knowledge from large language models into automated decision-making and control systems while retaining reliability and robustness. The main principle of the proposed approach is to use language models to generate proposals and guidance, but still make the final decision or plan based on principled and robust planning and control methods, such that the language models are used when their semantic predictions are useful but not relied upon to always yield the correct answer. Large language models, such as ChatGPT, have garnered considerable attention in recent years due to their ability to respond to complex user queries and fulfill elaborate requests, such as writing code, composing stories, or providing educational explanations. Because of this, there is considerable interest in using them directly as decision-making systems (for example, if a language model can give “how to” instructions for repairing a car, perhaps it can also issue commands to a robot that actually repairs a car). However, there are also numerous concerns that such models might be too unreliable or too prone to generate false predictions to be useful as decision-making systems on their own. Therefore, this project aims to integrate these models into principled methods for planning and control to leverage the semantic knowledge in these models while providing a degree of robustness. This research has significant ramifications for automated decision-making systems that need to interact with complex real-world environments, where both semantic reasoning and intelligent planning are important. This includes robotic systems, including autonomous vehicles and service robots, intelligent assistants, decision support systems, and a range of automation technologies.
The technical approach in this project will be based around a probabilistic formulation that ties together the ungrounded semantic predictions from language models with grounded but non-semantic predictions from learned dynamics models. In this way, probabilistic inference machinery can be used to derive algorithms that make decisions that have a high likelihood of being semantically good according to the language model and a high likelihood of being physically (dynamically) optimal according to the learned dynamics model. In practice, this principle can be instantiated in the context of both model-based and model-free reinforcement-learning systems, learned prediction systems, and planning algorithms (by formulating planning as inference). The project will explore applications of this concept to prediction, planning and control, and exploration in reinforcement learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
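As a rough illustration of the probabilistic idea in the abstract, the sketch below scores a few candidate plans by multiplying a semantic probability (standing in for a language model) by a feasibility probability (standing in for a learned dynamics model), then normalizes the result. It is not the project's algorithm: the function names, the candidate plans, and all the numbers are hypothetical and chosen only to show how the two sources of evidence might be combined.

```python
import math

def combine_log_scores(semantic_logp, dynamics_logp):
    """Return a normalized distribution over plans proportional to
    p_semantic(plan) * p_dynamics(plan), computed in log space."""
    if len(semantic_logp) != len(dynamics_logp):
        raise ValueError("score lists must have the same length")
    joint = [s + d for s, d in zip(semantic_logp, dynamics_logp)]
    peak = max(joint)  # subtract the max for numerical stability
    weights = [math.exp(j - peak) for j in joint]
    total = sum(weights)
    return [w / total for w in weights]

def select_plan(plans, semantic_logp, dynamics_logp):
    """Pick the plan with the highest combined probability."""
    posterior = combine_log_scores(semantic_logp, dynamics_logp)
    best = max(range(len(plans)), key=posterior.__getitem__)
    return plans[best], posterior

# Hypothetical candidates and scores (assumptions, not data from the project).
plans = [
    "open hood, then loosen bolt",
    "loosen bolt, then open hood",
    "paint the car",
]
semantic_logp = [math.log(0.6), math.log(0.3), math.log(0.1)]   # stand-in for a language model
dynamics_logp = [math.log(0.5), math.log(0.05), math.log(0.9)]  # stand-in for a dynamics model

best_plan, posterior = select_plan(plans, semantic_logp, dynamics_logp)
print(best_plan)
```

In this toy setting the third plan is the easiest to execute but semantically unlikely, and the second is semantically plausible but hard to execute, so the first plan wins on the combined score. Neither model is trusted alone.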
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sergey Levine
Goal-oriented Vision-and-Dialog Navigation through Reinforcement Learning
- DOI:
- Publication year: 2022
- Journal:
- Impact factor: 0
- Authors: Peter Anderson;Qi Wu;Damien Teney;Jake Bruce;Mark Johnson;Niko Sünderhauf;Ian D. Reid;F. Bonin;Alberto Ortiz;Angel X. Chang;Angela Dai;T. Funkhouser;Ma;Matthias Niebner;M. Savva;David Chen;Raymond Mooney. 2011;Learning;Howard Chen;Alane Suhr;Dipendra Kumar Misra;T. Kollar;Nicholas Roy;Trajectory;Satwik Kottur;José M. F. Moura;Dhruv Devi Parikh;Sergey Levine;Chelsea Finn;Trevor Darrell;Jianfeng Li;Gao Yun;Chen;Ziming Li;Sungjin Lee;Baolin Peng;Jinchao Li;Julia Kiseleva;M. D. Rijke;Shahin Shayandeh;Weixin Liang;Youzhi Tian;Cheng;Yitao Liang;Marlos C. Machado;Erik Talvitie;Chih;Jiasen Lu;Zuxuan Wu;G. Al
- Corresponding author: G. Al
Is Value Learning Really the Main Bottleneck in Offline RL?
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: Seohong Park;Kevin Frans;Sergey Levine;Aviral Kumar
- Corresponding author: Aviral Kumar
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
- DOI:
- Publication year: 2024
- Journal:
- Impact factor: 0
- Authors: J. Kuba;Masatoshi Uehara;Pieter Abbeel;Sergey Levine
- Corresponding author: Sergey Levine
Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 0
- Authors: Laura M. Smith;Yunhao Cao;Sergey Levine
- Corresponding author: Sergey Levine
HiLMa-Res: A General Hierarchical Framework via Residual RL for Combining Quadrupedal Locomotion and Manipulation
- DOI:
- Publication year:
- Journal:
- Impact factor: 0
- Authors: Xiaoyu Huang;Qiayuan Liao;Yiming Ni;Zhongyu Li;Laura Smith;Sergey Levine;Xue Bin Peng;K. Sreenath
- Corresponding author: K. Sreenath
Other grants by Sergey Levine
Robotic Learning with Reusable Datasets
- Award number: 2150826
- Fiscal year: 2022
- Funding amount: $600,000
- Project type: Standard Grant
CAREER: Deep Robotic Learning with Large Datasets: Toward Simple and Reliable Lifelong Learning Frameworks
- Award number: 1651843
- Fiscal year: 2017
- Funding amount: $600,000
- Project type: Continuing Grant
NRI: Collaborative Research: Learning Deep Sensorimotor Policies for Shared Autonomy
- Award number: 1637443
- Fiscal year: 2016
- Funding amount: $600,000
- Project type: Standard Grant
RI: Small: Model-Based Deep Reinforcement Learning for Domain Transfer
- Award number: 1700697
- Fiscal year: 2016
- Funding amount: $600,000
- Project type: Standard Grant
RI: Small: Model-Based Deep Reinforcement Learning for Domain Transfer
- Award number: 1614653
- Fiscal year: 2016
- Funding amount: $600,000
- Project type: Standard Grant
NRI: Collaborative Research: Learning Deep Sensorimotor Policies for Shared Autonomy
- Award number: 1700696
- Fiscal year: 2016
- Funding amount: $600,000
- Project type: Standard Grant
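Similar NSFC (National Natural Science Foundation of China) grants
Forensic application of circadian-rhythmic small RNAs in estimating bloodstain deposition time
- Award number:
- Approval year: 2024
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Approval year: 2022
- Funding amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Approval year: 2020
- Funding amount: CNY 240,000
- Project type: Young Scientists Fund project
Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Approval year: 2019
- Funding amount: CNY 580,000
- Project type: General Program
Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Approval year: 2019
- Funding amount: CNY 210,000
- Project type: Young Scientists Fund project
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
- Award number: 31802058
- Approval year: 2018
- Funding amount: CNY 260,000
- Project type: Young Scientists Fund project
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Approval year: 2018
- Funding amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Approval year: 2017
- Funding amount: CNY 600,000
- Project type: General Program
Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
- Award number: 81704176
- Approval year: 2017
- Funding amount: CNY 200,000
- Project type: Young Scientists Fund project
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its modulation of disease resistance
- Award number: 91640114
- Approval year: 2016
- Funding amount: CNY 850,000
- Project type: Major Research Plan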
Similar overseas grants
Powering Small Craft with a Novel Ammonia Engine
- Award number: 10099896
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Collaborative R&D
"Small performances": investigating the typographic punches of John Baskerville (1707-75) through heritage science and practice-based research
- Award number: AH/X011747/1
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Research Grant
Fragment to small molecule hit discovery targeting Mycobacterium tuberculosis FtsZ
- Award number: MR/Z503757/1
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Research Grant
Bacteriophage control of host cell DNA transactions by small ORF proteins
- Award number: BB/Y004426/1
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Research Grant
Windows for the Small-Sized Telescope (SST) Cameras of the Cherenkov Telescope Array (CTA)
- Award number: ST/Z000017/1
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Research Grant
CSR: Small: Leveraging Physical Side-Channels for Good
- Award number: 2312089
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Standard Grant
CSR: Small: Multi-FPGA System for Real-time Fraud Detection with Large-scale Dynamic Graphs
- Award number: 2317251
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Standard Grant
AF: Small: Problems in Algorithmic Game Theory for Online Markets
- Award number: 2332922
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Standard Grant
Collaborative Research: FET: Small: Algorithmic Self-Assembly with Crisscross Slats
- Award number: 2329908
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Standard Grant
NeTS: Small: ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates
- Award number: 2331111
- Fiscal year: 2024
- Funding amount: $600,000
- Project type: Standard Grant
