III: Small: Towards the Foundations of Training Deep Neural Networks: New Theory and Algorithms

Basic Information

Award number:
2008981
Principal investigator:
Quanquan Gu
Amount:
$500,000
Host institution:
University of California-Los Angeles
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2020
Funding country:
United States
Period:
2020-10-01 to 2024-09-30
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2008981&HistoricalAwards=false
Keywords:
III Small Towards Foundations Training

Project Abstract

Deep learning has achieved tremendous successes in the past decade. Despite these empirical successes, the theoretical understanding of deep learning is still largely falling behind. There exists a huge gap between the empirical successes of deep learning and conventional optimization and machine learning theories. This project aims to bridge this gap by establishing the theoretical foundations of deep learning to understand why and how it works, and use this theory to develop new models and algorithms. The expected outcome of this project includes new theories and the state-of-the-art approaches for deep learning. The project will push the frontier of deep learning and train next-generation researchers and practitioners in artificial intelligence. Research demonstrations and lab tours will be given to K-12 school students by showing the wide range of applications of AI and their connection to society, to motivate them to pursue a STEM discipline.

This project consists of two synergistic research thrusts: (1) understanding the optimization dynamics of training algorithms such as stochastic gradient descent for deep learning models, and deriving algorithm-dependent generalization error bounds to assess their generalization performance; and (2) developing a new suite of faster training algorithms for deep learning, as well as principled neural architecture search algorithms guided by the generalization error bounds to design better neural network models. To evaluate the developed approaches, both theoretical analyses and extensive experimental evaluations will be performed on real-world benchmarks including but not limited to image classification and natural language processing. The open source software and course materials developed in this project will be made publicly available to the broader community, to help engineers and scientists better understand and apply deep learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (38)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

DOI:
Published:
2021-01
Journal:
arXiv
Impact factor:
0
Authors:
Spencer Frei;Yuan Cao;Quanquan Gu
Corresponding authors:
Spencer Frei;Yuan Cao;Quanquan Gu

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

DOI:
Published:
2021-04
Journal:
2023 IEEE International Conference on Quantum Computing and Engineering (QCE)
Impact factor:
0
Authors:
Difan Zou;Spencer Frei;Quanquan Gu
Corresponding authors:
Difan Zou;Spencer Frei;Quanquan Gu

Towards Understanding the Mixture-of-Experts Layer in Deep Learning

DOI:
Published:
2022
Journal:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Chen, Zixiang;Deng, Yihe;Wu, Yue;Gu, Quanquan;Li, Yuanzhi
Corresponding author:
Li, Yuanzhi

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?

DOI:
Published:
2019-11
Journal:
arXiv
Impact factor:
0
Authors:
Zixiang Chen;Yuan Cao;Difan Zou;Quanquan Gu
Corresponding authors:
Zixiang Chen;Yuan Cao;Difan Zou;Quanquan Gu

Understanding Train-Validation Split in Meta-Learning with Neural Networks

DOI:
Published:
2023
Journal:
International Conference on Learning Representations (ICLR)
Impact factor:
0
Authors:
Zuo, Xinzhe;Chen, Zixiang;Yao, Huaxiu;Cao, Yuan;Gu, Quanquan
Corresponding author:
Gu, Quanquan

Other Publications by Quanquan Gu

Different patterns of gray matter density in early- and middle-late-onset Parkinson’s disease: a voxel-based morphometry study

DOI:
10.1007/s11682-017-9745-4
Published:
2017
Journal:
Brain Imaging Behav
Impact factor:
0
Authors:
Min Xuan;Xiaojun Guan;Peiyu Huang;Zhujing Shen;Quanquan Gu;Xinfeng Yu;Xiaojun Xu;Wei Luo;Minming Zhang
Corresponding author:
Minming Zhang