Understanding Deep Learning through the Frequency Principle

Final report

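Grant number: 62002221
Project type: Young Scientists Fund
Funding: CNY 240,000 (24.0 万元)
Principal investigator: Zhi-Qin John Xu (许志钦)
Host institution: Shanghai Jiao Tong University
Discipline: Data science and big data computing
Approval year: 2020
Completion year: 2023
Project status: Completed
Project participants: Zhi-Qin John Xu (许志钦)
Keywords: deep learning theory; generalization; deep learning; gradient descent; frequency principle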

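Abstract (translated from the Chinese)

Deep learning has been successful in many fields. Because its theory is lacking, however, deep learning suffers from a number of problems, such as reliance on parameter tuning and poor security. In 2018 the applicant found that deep learning usually converges first on the low-frequency components of the target function, a phenomenon called the frequency principle. The frequency principle has inspired a series of basic studies by international peers on experiments, theory and algorithm design, and shows great potential for explaining the generalization performance of deep learning on different problems. Owing to high dimensionality and nonlinearity, however, whether the frequency principle holds generally in tasks with high-dimensional data remains to be verified, the underlying theoretical mechanism is still unclear, and existing work has not yet quantitatively connected the frequency principle with generalization. Building on the applicant's earlier work, this project will design high-dimensional experiments to verify the generality of the frequency principle on real data, clarify an important theoretical mechanism, namely that the decay of the activation function in the frequency domain gives rise to the frequency principle, and quantitatively establish the relationship between the frequency principle and the generalization error of deep learning. The project analyzes the training process and generalization error of deep learning in depth from both experimental and theoretical perspectives, which will help in understanding the strengths and weaknesses of deep learning, guide the design of neural networks, and improve the robustness and training speed of algorithms.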

Abstract (English original)

Deep learning has achieved great success in many fields. However, without theoretical support, many problems arise in deep learning, such as heavy dependence on parameter tuning and low security. In 2018, the applicant found that deep learning usually fits the low-frequency components of the target function first, a phenomenon known as the Frequency Principle (F-Principle). The F-Principle has inspired a series of basic research on experiments, theory and algorithm design in deep learning, and shows great potential to explain the generalization performance of deep learning in different problems. However, because of high dimensionality and strong nonlinearity, whether the F-Principle holds universally in tasks with high-dimensional data has yet to be verified, and the relevant theoretical mechanism is still unclear. Existing work has not yet quantitatively linked the F-Principle with the generalization of deep learning. Based on the applicant's preliminary work, this project will design high-dimensional experiments to verify the universality of the F-Principle on real data and demonstrate an important theoretical mechanism, namely that the decay of the activation function in the frequency domain leads to the F-Principle. It will also quantitatively establish the relationship between the F-Principle and the generalization error of deep learning. By illustrating an underlying mechanism of the generalization error and the training behavior of deep learning, the theory developed in this project will help to identify the advantages and disadvantages of deep learning, guide the design of neural networks, improve the robustness of algorithms and accelerate their training.
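As a rough illustration of the frequency principle described in the abstract, the sketch below trains a small two-layer tanh network with plain gradient descent on a one-dimensional target containing a low-frequency and a high-frequency sine, and records the relative error of each Fourier mode during training. It is not the project's experimental code: the target function, network width, learning rate, step count and the function name `train_and_track` are all assumptions chosen for illustration.

```python
import numpy as np


def train_and_track(steps=2000, width=64, lr=0.005, seed=0, record_every=200):
    """Fit sin(x) + 0.5*sin(5x) with a two-layer tanh network (full-batch
    gradient descent) and record (step, loss, rel_err_k1, rel_err_k5)."""
    rng = np.random.default_rng(seed)
    n = 128
    # Uniform grid over one period, so sin(x) and sin(5x) fall exactly
    # on FFT bins 1 and 5.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + 0.5 * np.sin(5 * x)
    y_f = np.fft.rfft(y[:, 0])

    # Illustrative initialization (an assumption, not taken from the source).
    W1 = rng.normal(0.0, 1.0, size=(1, width))
    b1 = rng.normal(0.0, 1.0, size=width)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, 1))
    b2 = np.zeros(1)

    history = []
    for step in range(steps + 1):
        # Forward pass.
        h = np.tanh(x @ W1 + b1)      # hidden activations, shape (n, width)
        out = h @ W2 + b2             # network output, shape (n, 1)
        err = out - y
        loss = float(np.mean(err ** 2))

        if step % record_every == 0:
            out_f = np.fft.rfft(out[:, 0])
            rel_low = float(abs(out_f[1] - y_f[1]) / abs(y_f[1]))
            rel_high = float(abs(out_f[5] - y_f[5]) / abs(y_f[5]))
            history.append((step, loss, rel_low, rel_high))

        # Backward pass for the mean squared error.
        g_out = 2.0 * err / n
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0)
        gz = (g_out @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ gz
        gb1 = gz.sum(axis=0)

        # Gradient descent update.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return history


if __name__ == "__main__":
    for step, loss, rel_low, rel_high in train_and_track():
        print(f"step {step:5d}  loss {loss:.4f}  "
              f"rel.err k=1 {rel_low:.3f}  rel.err k=5 {rel_high:.3f}")
```

Comparing the two relative-error columns over the training steps shows how quickly each frequency is captured in this toy setting. A single one-dimensional run like this says nothing about the high-dimensional tasks the project targets.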

Journal papers

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

DOI: 10.4208/jml.220108
Published: 2021-11
Journal: ArXiv
Impact factor: --
Authors: Yaoyu Zhang; Yuqing Li; Zhongwang Zhang; Tao Luo; Z. Xu
Corresponding authors: Yaoyu Zhang; Yuqing Li; Zhongwang Zhang; Tao Luo; Z. Xu

Theory of the Frequency Principle for General Deep Neural Networks

DOI: 10.4208/csiam-am.so-2020-0005
Published: 2019-06
Journal: ArXiv
Impact factor: --
Authors: Tao Luo; Zheng Ma; Zhi-Qin John Xu; Yaoyu Zhang
Corresponding authors: Tao Luo; Zheng Ma; Zhi-Qin John Xu; Yaoyu Zhang

Subspace decomposition based DNN algorithm for elliptic type multi-scale PDEs

DOI: 10.1016/j.jcp.2023.112242
Published: 2023
Journal: Journal of Computational Physics
Impact factor: --
Authors: Xi-An Li; Zhi-Qin John Xu; Lei Zhang
Corresponding author: Lei Zhang

Implicit Bias in Understanding Deep Learning for Solving PDEs Beyond Ritz-Galerkin Method

DOI: 10.4208/csiam-am.so-2020-0006
Published: 2022
Journal: CSIAM Transactions on Applied Mathematics
Impact factor: --
Authors: Jihong Wang; Zhi-Qin John Xu; Jiwei Zhang; Yaoyu Zhang
Corresponding author: Yaoyu Zhang

A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics

DOI: 10.1016/j.combustflame.2022.112319
Published: 2022-01
Journal: ArXiv
Impact factor: --
Authors: Tianhan Zhang; Yuxiao Yi; Yifan Xu; Z. X. Chen; Yaoyu Zhang; E. Weinan; Zhi-Qin John Xu
Corresponding authors: Tianhan Zhang; Yuxiao Yi; Yifan Xu; Z. X. Chen; Yaoyu Zhang; E. Weinan; Zhi-Qin John Xu
