
Collaborative Research: SCALE MoDL: Adaptivity of Deep Neural Networks


Basic Information

Award number: 2134106
Principal investigator: Simon Du
Amount: $300,000
Host institution: University of Washington
Institution country: United States
Award type: Continuing Grant
Fiscal year: 2021
Funding country: United States
Period: 2021-10-01 to 2024-09-30
Project status: Completed

Source: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2134106&HistoricalAwards=false
Keywords: Collaborative Research, SCALE MoDL, Adaptivity

Project Abstract

The overarching theme of the project is to systematically expand understanding of how deep neural networks (DNNs) work and why or when they are better than classical methods through the lens of "adaptivity." Adaptivity refers to the properties of an algorithm that take advantage of favorable structures in the input data without knowing that these structures exist. That is, adaptive algorithms are those that are free of tuning parameters and can automatically configure themselves to adapt to each input dataset. The anticipated outcome of the project includes a new theory that explains and quantifies the adaptivity of popular DNN models such as multi-layer perceptrons, self-attention mechanisms (namely, transformer models), and meta-learning. The theory could result in substantial savings in the statistical and computational complexity of these models, allowing them to be applied in resource-constrained settings and to have a more environmentally friendly energy footprint. This project will also provide opportunities for students and postdocs to explore interdisciplinary research topics related to deep learning.

Specifically, this project investigates (1) the "local adaptivity" of DNNs in estimating functions from noisy data; (2) the "relational adaptivity" of the self-attention mechanism, which parses a structured data point (such as an image or a chunk of text); and (3) the "task adaptivity" of multi-task and meta-learning algorithms that learn to share information across multiple tasks. The research covers some of the most popular DNN models. Technically, the project leverages multiple branches of mathematics (such as function classes, nonparametric statistics, statistical learning theory, optimization, and compressed sensing) and involves innovations in the approximation-theoretic understanding, algorithmic insights, and statistical theory of DNNs. The new analytical tools to be developed are also of independent interest to the broader machine learning theory community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
