
CAREER: Theory and Algorithms for Learning with Frozen Pretrained Models

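Basic Information

Award number:
2339978
Principal investigator:
KANGWOOK LEE
Amount:
$584,000
Host institution:
University of Wisconsin-Madison
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2024
Funding country:
United States
Start and end dates:
2024-07-01 to 2029-06-30
Project status:
Ongoing

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2339978&HistoricalAwards=false
Keywords:
CAREER Theory Algorithms Learning Frozen

Project Abstract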

Adapting pretrained models to new tasks and new data is crucial for the democratization of modern machine learning. Applying large pretrained models to customized tasks or new data is difficult because full retraining of the model is not possible due to the huge size and inaccessibility of the original training data. The inaccessibility and large size of the training data also render traditional domain adaptation techniques impractical. To overcome this challenge, there has been a shift towards modular adaptation methods that enable the fine-tuning of "frozen" pretrained models with minimal or no modifications to their internal parameters. Despite the success of these methods in some practical applications, there is a gap in theoretical understanding of the factors affecting the effectiveness of fine-tuning pretrained models. This project develops a systematic framework that will lay theoretical foundations and lead to rigorous fine-tuning principles. The project has the potential to impact a wide range of scientific fields and industries that rely on adaptation of pretrained machine learning models. The project's broader impact includes the establishment of educational initiatives for advancing STEM, contributing to the development of the future workforce in modern machine learning.

The project's goal is to establish a unified theory and devise new algorithms with provable guarantees for the emerging paradigm of learning with frozen pretrained models. A mathematical framework will be developed to facilitate the theoretical analysis of the expressive power of frozen pretrained models under various adaptation and fine-tuning methods. The project will investigate three adaptation strategies: (1) parameter-efficient fine-tuning, which updates a minimal portion of the pretrained model's parameters while keeping the rest unchanged; (2) input/output processing, which involves modifying the input entering or the output produced by the pretrained models; and (3) model composition, which constructs a system of multiple pretrained models to address more complex tasks. In addition to analyzing these methods, the project aims to develop novel adaptation algorithms with provable performance guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
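
As a rough illustration of the three adaptation strategies named in the abstract, the following Python sketch applies each one to a toy frozen linear model. Everything in it is an assumption made for illustration: the names `FROZEN_W`, `frozen_model`, `tune_bias`, `with_io_processing`, and `composed`, the 2x2 weights, the data, and the choice of bias-only tuning as the parameter-efficient method. None of it is taken from the award or from the project's actual algorithms.

```python
# Toy sketch only: every name, size, and value below is hypothetical.

def matvec(W, x):
    """Multiply matrix W (a tuple of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# "Pretrained" weights, stored as nested tuples so they cannot be modified (frozen).
FROZEN_W = ((1.0, 0.0), (0.0, 2.0))

def frozen_model(x, bias):
    """Compute y = W x + bias, where W is frozen and only bias may be trained."""
    return [y + b for y, b in zip(matvec(FROZEN_W, x), bias)]

def tune_bias(data, steps=200, lr=0.1):
    """(1) Parameter-efficient fine-tuning: gradient descent on the bias only."""
    bias = [0.0, 0.0]
    for step in range(steps):
        x, target = data[step % len(data)]
        pred = frozen_model(x, bias)
        # The gradient of 0.5 * ||pred - target||^2 with respect to bias is (pred - target).
        bias = [b - lr * (p - t) for b, p, t in zip(bias, pred, target)]
    return bias

def with_io_processing(x, bias):
    """(2) Input/output processing: rescale the input, clip the output at zero."""
    scaled = [0.5 * xi for xi in x]
    return [max(0.0, y) for y in frozen_model(scaled, bias)]

def composed(x, bias):
    """(3) Model composition: feed one frozen model's output into another."""
    return frozen_model(frozen_model(x, bias), bias)

# Hypothetical new task: the frozen map W x, shifted by the vector (3, -1).
data = [([1.0, 1.0], [4.0, 1.0]), ([0.0, 2.0], [3.0, 3.0])]
bias = tune_bias(data)
print([round(b, 3) for b in bias])
print(with_io_processing([2.0, -4.0], bias))
print(composed([1.0, 1.0], bias))
```

In this sketch the weight matrix never changes. The adaptation comes from the two-element bias vector, from the wrappers around the model's input and output, or from chaining calls to the same frozen model.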
