Collaborative Research: RI:Medium:MoDL:Mathematical and Conceptual Understanding of Large Language Models

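Basic Information

Award number:
2211780
Principal investigator:
Tengyu Ma
Amount:
$400,000
Host institution:
Stanford University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-10-01 to 2025-09-30
Project status:
Ongoing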

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2211780&HistoricalAwards=false
Keywords:
Collaborative Research RI Medium MoDL

Project Abstract

Large language models (LLMs) have achieved unprecedented success in natural language processing (NLP). Since language models are being seen as a cornerstone of artificial intelligence in the near future, there is a need to be able to understand them, and to convey that understanding to regulators as well as the general public. These models are based on deep neural networks that are trained from vast quantities of text and have been demonstrated to be highly useful in performing tasks such as question answering, text classification, machine translation and summarization. Despite the huge empirical success, there is little understanding about their inner workings. This project seeks to bridge the gap by developing conceptual and mathematical understanding about training and using LLMs. The project will advance such understanding. The project will also seek to develop and disseminate instructional materials and draw on ideas from the project to impact ongoing programs at their institution to help increase participation in computing by individuals from underrepresented groups. The project has three components. (1) We will first build simplified generative models that capture the intrinsic structures of text, and analyze language models that are trained on texts from such generative models. (2) We then analyze why the learned language models can encode useful information that helps a wide range of downstream tasks. (3) Finally, we analyze and design new adaptation methods for downstream tasks with quantitative sample and computational efficiency guarantees. 
Education and outreach plans are integrated into this project: the investigators will develop a new introductory course in machine learning and disseminate instructional materials, mentor graduate and undergraduate students from underrepresented groups (through the Princeton Freshman Scholars Institute, the Stanford Summer Teacher Research Program, and REUs), and organize research workshops to promote conversations between the theoretical machine learning and NLP communities.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
