CAREER: Theoretical foundations for deep learning and large-scale AI models

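Basic information

Award number:
2339904
Principal investigator:
Song Mei
Amount:
$450,000
Institution:
University of California-Berkeley
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2024
Funding country:
United States
Project period:
2024-07-01 to 2029-06-30
Project status:
Ongoing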

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2339904&HistoricalAwards=false
Keywords:
CAREER Theoretical foundations deep learning

Project abstract

Generative AI models have shown remarkable capabilities across various domains, making a transformative societal impact. However, their powerful capabilities present substantial challenges and risks due to limited theoretical foundations, especially regarding sensitive applications. The primary objective of this project is to establish a theoretical foundation for generative AI models including language models and diffusion models. The project will examine the capabilities and limitations of neural networks such as transformers and ResNets within these models, and develop techniques to interpret the algorithms implicitly implemented in these black-box systems. The theoretical investigation will leverage a diverse range of subjects including variational inference, sampling methods, high-dimensional statistics, computational complexity theory, and reinforcement learning theory. The results will provide valuable theoretical insights and promote the safe utilization of prevailing foundation models such as ChatGPT and DALL-E.

This project will establish a theoretical foundation to elucidate the capabilities and limitations of language models and diffusion models. The project will investigate three key learning modalities: in-context learning, generative modeling, and decision making. For in-context learning, this project will analyze which algorithms transformers can implicitly implement, develop techniques to interpret the algorithms implemented in transformers, and provide guarantees on optimization and generalization during meta-training. For generative modeling, this project will derive conditions for neural networks to represent high-dimensional score functions for diffusion-based generative modeling. For decision making, the project will reveal how neural networks can be meta-trained to approximate bandit and reinforcement learning algorithms and investigate approaches to employing neural networks as decision-making agents. The outcomes will guide principled design and responsible deployment of AI models across disciplines. The activities include graduate student training and new course developments.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
