Collaborative Research: CIF: Medium: Theoretical Foundations of Compositional Learning in Transformer Models

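Basic Information

Award number:
2403075
Principal investigator:
Samet Oymak
Amount:
$400,000
Host institution:
Regents of the University of Michigan - Ann Arbor
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2024
Funding country:
United States
Project period:
2024-07-01 to 2028-06-30
Project status:
Ongoing

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2403075&HistoricalAwards=false
Keywords:
Collaborative Research CIF Medium Theoretical

Project Abstract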

Large Language Models (LLMs) based on transformer architectures, such as GPT-4, Llama 2, and Claude 3, have demonstrated remarkable emergent capabilities in compositional reasoning, allowing them to tackle complex tasks by decomposing them into simpler intermediate steps. Examples of these tasks include text and code generation, basic arithmetic and problem solving, and answering complex questions. Despite these empirical advances, the underlying mechanics of these capabilities remain largely unexplored. This collaborative research project investigates the theoretical foundations of compositional learning in transformer models, focusing on three key areas: model expressivity, statistical learning theory, and optimization. It aims to develop novel learning guarantees, algorithms, architectures, and design principles that significantly advance the development of more capable and interpretable Artificial Intelligence (AI) and LLM systems. The research findings will be incorporated into educational curricula, fostering a diverse community around transformers, compositional learning, and their applications. The project will also engage the broader public through workshops and outreach activities, promoting responsible AI practices and AI education for undergraduate and K-12 students.

The first thrust will explore the expressive capacity of transformers augmented with loops, memory, and external tools, which are essential for compositional reasoning. The second thrust will examine the statistical properties of autoregressive training using compositional data to understand its limits, benefits, and ability to generalize to novel problem instances. This is expected to lead to new theories of compositional learning that will highlight the role of skill acquisition and composition. The third thrust will investigate the optimization principles of compositional learning with transformers. This research will shed light on the optimization landscape and identify techniques for more efficient training of transformers through compositional techniques.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Samet Oymak

Learning Feature Nonlinearities with Non-Convex Regularized Binned Regression

DOI:
Year published:
2017
Journal:
arXiv.org
Impact factor:
0
Authors:
Samet Oymak; M. Mahdavi; Jiasi Chen
Corresponding author:
Jiasi Chen

Phase retrieval for sparse signals using rank minimization

DOI:
10.1109/icassp.2012.6288658
Year published:
2012
Journal:
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Impact factor:
0
Authors:
K. Jaganathan; Samet Oymak; B. Hassibi
Corresponding author:
B. Hassibi