EAGER: Quantifying the error landscape of deep neural networks
Basic Information
- Award number: 2132995
- Principal investigator: Stefano Martiniani
- Amount: $149,200
- Host institution:
- Host institution country: United States
- Grant type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-15 to 2022-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract
The remarkable success of deep learning systems across a broad range of applications can be attributed to their ability to approximate complex functions, their amenability to efficient training, and their strong performance in predicting outputs for unseen inputs. This last property, known as generalization, is particularly puzzling. Deep neural networks (DNNs) trained with the optimization algorithm known as stochastic gradient descent are observed to produce models that generalize well, particularly when the number of model parameters greatly exceeds the number of training samples. Traditional theory fails to explain these observations, and new perspectives and means of investigation are needed to elucidate these phenomena. To this end, statistical mechanics may provide methods and perspectives capable of addressing long-standing questions in deep learning. The energy landscape is a common paradigm at the intersection of these fields: training a DNN amounts to descending the so-called "error landscape" toward a minimum corresponding to a particular choice of model parameters. Understanding generalization performance in DNNs therefore amounts to understanding the interplay between the structure of the error landscape and the dynamics of the training algorithm that descends it. In particular, the concept of "flat minima" is gaining popularity as a possible explanation for these observations, but a rigorous approach for estimating flatness is lacking. We propose to employ a new class of methods developed within statistical mechanics to answer questions about the structure of the error landscapes of DNNs and to identify the relationship between the probability of finding a given solution, its flatness, and its generalization performance. This line of investigation should have a significant impact on our understanding of generalization in deep learning systems, with implications for high-stakes applications such as transportation, security, and medicine.

This proposal seeks to bring a new degree of rigor to the characterization of the error landscape of DNNs and of how the interplay between landscape structure and optimization dynamics yields generalizable solutions. As a result, we will be able to elucidate why DNNs exhibit low estimation error (i.e., high generalization performance). Such an understanding would represent a significant step forward in the development of a theory of deep learning. We aim to do so by exploiting state-of-the-art numerical techniques to measure the volume of basins of attraction in high-dimensional parameter spaces. We will measure the basin volume distributions and the associated flatness as a function of the number of parameters and the generalization performance of the network.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
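To make the notions of "flatness" and basin volume in the abstract concrete, here is a minimal, hypothetical sketch in Python (NumPy only). It assumes a toy quadratic error landscape and uses a crude perturbation-based proxy for flatness; it is not the statistical-mechanics basin-volume estimator the project proposes, and the names loss and flatness_proxy are illustrative only.

import numpy as np

rng = np.random.default_rng(0)

def loss(w, H):
    # Toy quadratic error landscape L(w) = 0.5 * w^T H w, with its minimum at w = 0.
    return 0.5 * w @ H @ w

def flatness_proxy(w_star, loss_fn, radius=0.1, threshold=0.01, n_samples=5000):
    # Fraction of random perturbations of norm `radius` around the minimum w_star
    # that keep the loss within `threshold` of its minimum value.
    # A larger fraction indicates a flatter minimum (a crude stand-in for basin volume).
    d = w_star.size
    base = loss_fn(w_star)
    hits = 0
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u *= radius / np.linalg.norm(u)  # random direction, fixed perturbation size
        if loss_fn(w_star + u) - base <= threshold:
            hits += 1
    return hits / n_samples

d = 50
w_star = np.zeros(d)                # the "trained" parameters (toy minimum)
H_flat = np.diag(np.full(d, 0.5))   # low curvature: a flat minimum
H_sharp = np.diag(np.full(d, 50.0)) # high curvature: a sharp minimum

print("flat minimum proxy: ", flatness_proxy(w_star, lambda w: loss(w, H_flat)))   # close to 1.0
print("sharp minimum proxy:", flatness_proxy(w_star, lambda w: loss(w, H_sharp)))  # close to 0.0

In the project itself, the corresponding quantities would be the training loss of a real DNN and the volume of its basin of attraction, estimated with dedicated high-dimensional volume-computation techniques rather than naive sampling as above.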
Project Outcomes
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other Publications by Stefano Martiniani
Transport and Energetics of Bacterial Rectification
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 0
- Authors: Satyam Anand; Xiaolei Ma; Shuo Guo; Stefano Martiniani; Xiang Cheng
- Corresponding author: Xiang Cheng
Monte Carlo sampling for stochastic weight functions
- DOI: 10.1073/pnas.1620497114
- Publication year: 2016
- Journal:
- Impact factor: 0
- Authors: D. Frenkel; K. J. Schrenk; Stefano Martiniani
- Corresponding author: Stefano Martiniani
On the complexity of energy landscapes: algorithms and a direct test of the Edwards conjecture
- DOI:
- Publication year: 2017
- Journal:
- Impact factor: 0
- Authors: Stefano Martiniani
- Corresponding author: Stefano Martiniani
Structural analysis of high-dimensional basins of attraction.
- DOI: 10.1103/physreve.94.031301
- Publication year: 2016
- Journal:
- Impact factor: 0
- Authors: Stefano Martiniani; K. J. Schrenk; J. Stevenson; D. Wales; D. Frenkel
- Corresponding author: D. Frenkel
Vicsek model by time-interlaced compression: A dynamical computable information density.
- DOI: 10.1103/physreve.103.062141
- Publication year: 2020
- Journal:
- Impact factor: 0
- Authors: A. Cavagna; P. Chaikin; D. Levine; Stefano Martiniani; A. Puglisi; M. Viale
- Corresponding author: M. Viale
Other Grants by Stefano Martiniani
GOALI: Frameworks: At-Scale Heterogeneous Data based Adaptive Development Platform for Machine-Learning Models for Material and Chemical Discovery
- Award number: 2311632
- Fiscal year: 2023
- Funding amount: $149,200
- Grant type: Standard Grant

EAGER: Quantifying the error landscape of deep neural networks
- Award number: 2226387
- Fiscal year: 2022
- Funding amount: $149,200
- Grant type: Standard Grant
Similar Overseas Grants
CAREER: User-Based Simulation Methods for Quantifying Sources of Error and Bias in Recommender Systems
- Award number: 2415042
- Fiscal year: 2023
- Funding amount: $149,200
- Grant type: Continuing Grant

EAGER: Quantifying the error landscape of deep neural networks
- Award number: 2226387
- Fiscal year: 2022
- Funding amount: $149,200
- Grant type: Standard Grant
Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
- Award number: 10623347
- Fiscal year: 2021
- Funding amount: $149,200
- Grant type:

Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
- Award number: 10424587
- Fiscal year: 2021
- Funding amount: $149,200
- Grant type:

Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
- Award number: 10278807
- Fiscal year: 2021
- Funding amount: $149,200
- Grant type:
CAREER: User-Based Simulation Methods for Quantifying Sources of Error and Bias in Recommender Systems
- Award number: 1751278
- Fiscal year: 2018
- Funding amount: $149,200
- Grant type: Continuing Grant

Quantifying the margin for error in Tennis
- Award number: 18K17846
- Fiscal year: 2018
- Funding amount: $149,200
- Grant type: Grant-in-Aid for Early-Career Scientists
AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
- Award number: 8116354
- Fiscal year: 2007
- Funding amount: $149,200
- Grant type:

AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
- Award number: 8115664
- Fiscal year: 2007
- Funding amount: $149,200
- Grant type:

AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
- Award number: 7613500
- Fiscal year: 2007
- Funding amount: $149,200
- Grant type: