EAGER: Quantifying the error landscape of deep neural networks

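Basic information

  • Award number:
    2226387
  • Principal investigator:
  • Amount:
    $149,200
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Period:
    2022-02-15 to 2024-08-31
  • Project status:
    Completed

Project abstract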

The remarkable success achieved by deep learning systems in a broad range of applications can be attributed to their ability to approximate complex functions well, their capacity to be trained efficiently, and their good performance in predicting the values of unseen inputs. This last property, known as generalization, is particularly puzzling. It is observed that deep neural networks (DNNs) trained by the optimization algorithm known as stochastic gradient descent produce models that generalize well, particularly when the number of model parameters greatly exceeds the number of samples on which the model is trained. Traditional theory fails to explain these observations, and new perspectives and means of investigation are necessary to elucidate these phenomena. To this end, statistical mechanics may provide methods and perspectives capable of addressing long-standing questions in deep learning. The energy landscape represents a common paradigm at the intersection of these fields: when training a DNN we descend the so-called “error landscape” towards a minimum corresponding to a particular choice of model parameters. Understanding generalization performance in DNNs amounts to understanding the interplay between the structure of the error landscape and the dynamics of the training algorithm that descends it. In particular, the concept of “flat minima” is gaining popularity as a possible explanation for these observations, but a rigorous approach to estimating flatness is lacking. We propose to employ a new class of methods developed within statistical mechanics to answer questions concerning the structure of the error landscapes of DNNs and to identify the relationship between the probability of finding a given solution, its flatness and its generalization performance. This line of investigation should have a significant impact on our understanding of generalization in deep learning systems, with implications for high-stakes applications such as transportation, security and medicine.
This proposal seeks to bring a new degree of rigor to the characterization of the error landscape of DNNs and of how the interplay between landscape structure and optimization dynamics yields generalizable solutions. As a result, we will be able to elucidate why DNNs are endowed with low estimation error (i.e., high generalization performance). Such an understanding will represent a significant step forward in the development of a theory of deep learning. We aim to do so by exploiting state-of-the-science numerical techniques to measure the volume of basins of attraction in high-dimensional parameter spaces. We will measure the basin volume distributions and the associated flatness as a function of the number of parameters and the generalization performance of the network.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Estimating random close packing in polydisperse and bidisperse hard spheres via an equilibrium model of crowding
  • DOI:
    10.1063/5.0137111
  • Publication date:
    2023-01-28
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Anzivino, Carmine; Casiulis, Mathias; Zaccone, Alessio
  • Corresponding author:
    Zaccone, Alessio

Other publications by Stefano Martiniani

Transport and Energetics of Bacterial Rectification
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Satyam Anand; Xiaolei Ma; Shuo Guo; Stefano Martiniani; Xiang Cheng
  • Corresponding author:
    Xiang Cheng
Monte Carlo sampling for stochastic weight functions
On the complexity of energy landscapes: algorithms and a direct test of the Edwards conjecture
  • DOI:
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Stefano Martiniani
  • Corresponding author:
    Stefano Martiniani
Structural analysis of high-dimensional basins of attraction.
  • DOI:
    10.1103/physreve.94.031301
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Stefano Martiniani; K. J. Schrenk; J. Stevenson; D. Wales; D. Frenkel
  • Corresponding author:
    D. Frenkel
Vicsek model by time-interlaced compression: A dynamical computable information density.
  • DOI:
    10.1103/physreve.103.062141
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. Cavagna; P. Chaikin; D. Levine; Stefano Martiniani; A. Puglisi; M. Viale
  • Corresponding author:
    M. Viale

Other grants held by Stefano Martiniani

GOALI: Frameworks: At-Scale Heterogeneous Data based Adaptive Development Platform for Machine-Learning Models for Material and Chemical Discovery
  • Award number:
    2311632
  • Fiscal year:
    2023
  • Award amount:
    $149,200
  • Project type:
    Standard Grant
EAGER: Quantifying the error landscape of deep neural networks
  • Award number:
    2132995
  • Fiscal year:
    2021
  • Award amount:
    $149,200
  • Project type:
    Standard Grant

Similar overseas grants

CAREER: User-Based Simulation Methods for Quantifying Sources of Error and Bias in Recommender Systems
  • Award number:
    2415042
  • Fiscal year:
    2023
  • Award amount:
    $149,200
  • Project type:
    Continuing Grant
Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
  • Award number:
    10623347
  • Fiscal year:
    2021
  • Award amount:
    $149,200
  • Project type:
Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
  • Award number:
    10424587
  • Fiscal year:
    2021
  • Award amount:
    $149,200
  • Project type:
Quantifying Error Growth to Improve Infectious Disease Forecast Accuracy
  • Award number:
    10278807
  • Fiscal year:
    2021
  • Award amount:
    $149,200
  • Project type:
EAGER: Quantifying the error landscape of deep neural networks
  • Award number:
    2132995
  • Fiscal year:
    2021
  • Award amount:
    $149,200
  • Project type:
    Standard Grant
CAREER: User-Based Simulation Methods for Quantifying Sources of Error and Bias in Recommender Systems
  • Award number:
    1751278
  • Fiscal year:
    2018
  • Award amount:
    $149,200
  • Project type:
    Continuing Grant
Quantifying the margin for error in Tennis
  • Award number:
    18K17846
  • Fiscal year:
    2018
  • Award amount:
    $149,200
  • Project type:
    Grant-in-Aid for Early-Career Scientists
AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
  • Award number:
    8116354
  • Fiscal year:
    2007
  • Award amount:
    $149,200
  • Project type:
AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
  • Award number:
    8115664
  • Fiscal year:
    2007
  • Award amount:
    $149,200
  • Project type:
AutoSense: Quantifying Exposures to Addictive Substances and Psychosocial Stress
  • Award number:
    7613500
  • Fiscal year:
    2007
  • Award amount:
    $149,200
  • Project type: