Statistical and Computational Foundations of Deep Generative Models
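Basic information
- Grant number: 2134216
- Principal investigator:
- Amount: $1.15 million
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract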
Complex data are generated daily across all areas of science and engineering, from photographs and news articles to biological and cosmological experiments. To extract meaningful information from this stream of material, it is necessary to build statistical models that faithfully represent each data modality. Such statistical models are critical for assessing the expected performance of data analysis methods on future events, and they form a key component of several data processing pipelines known as 'inverse problems'. For example, removing noise and defects from an image or predicting the most likely folding of a protein are inverse problems that at their core require a faithful statistical model of the desired output. The main goal of this project is to advance the theoretical foundations of statistical models based on neural networks. Such models provide greater flexibility than traditional statistical modeling, but as a result are harder to analyze and manipulate. The investigators bring a broad background in machine learning, probability, statistics, and mathematical physics; their combined expertise will yield guiding principles for incorporating neural networks into theoretically sound statistical modeling, as well as novel algorithms with statistical guarantees. The research outcomes will be directly applicable to a wide range of problems in science and engineering, including cosmology, climate modeling, chemistry, and signal processing, and they will be tightly integrated into educational courses.
The success of deep learning (DL) across science and engineering suggests that Deep Neural Networks (DNNs) are effective function approximation models for complex high-dimensional data, yet the reasons for this capability are still poorly understood. To make headway on this problem, this project focuses on generative probabilistic modeling. Understanding the inner workings of generative models is essential to explaining the success of DL on typical problem instances, as opposed to worst-case (too pessimistic) or unstructured (too simplistic) data distributions. Additionally, probabilistic models are at the core of computational tools used in many scientific disciplines, yet they often rely on domain expertise, which prevents them from scaling efficiently with dimension. This project puts forward a unified view of generative modeling that simultaneously addresses approximation, estimation, and optimization aspects. Specifically, it covers both explicit modeling, given by Boltzmann-Gibbs distributions, and implicit modeling, given by transport-based models (Generative Adversarial Networks, Normalizing Flows). It will establish guarantees for learning and sampling from these models when DNNs are used as function approximators. This project will rely on importance-sampling methods developed in the computational sciences (such as Replica Exchange and Thermodynamic Integration) and upgrade them to operate alongside DNNs. It will also derive novel algorithms that combine implicit with explicit generative modeling. Finally, it will exploit physical priors such as symmetries and multiscale structure, and assess their benefits in challenging domains such as molecular prediction, turbulence, statistical mechanics, and exploration for reinforcement learning. The investigators have combined expertise in all these areas, making them well qualified to carry out the project.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (16)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learning Optimal Flows for Non-Equilibrium Importance Sampling
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Cao, Yu; Vanden-Eijnden, Eric
- Corresponding author: Vanden-Eijnden, Eric
Lattice-Based Methods Surpass Sum-of-Squares in Clustering
- DOI:
- Published: 2021-12
- Journal:
- Impact factor: 0
- Authors: Ilias Zadik; M. Song; Alexander S. Wein; Joan Bruna
- Corresponding authors: Ilias Zadik; M. Song; Alexander S. Wein; Joan Bruna
Conditionally Strongly Log-Concave Generative Models
- DOI: 10.48550/arxiv.2306.00181
- Published: 2023-05
- Journal:
- Impact factor: 0
- Authors: Florentin Guth; Etienne Lempereur; Joan Bruna; S. Mallat
- Corresponding authors: Florentin Guth; Etienne Lempereur; Joan Bruna; S. Mallat
Depth Separation beyond Radial Functions
- DOI:
- Published: 2022
- Journal:
- Impact factor: 6
- Authors: Luca Venturi; Samy Jelassi; Tristan Ozuc; Joan Bruna
- Corresponding author: Joan Bruna
Building Normalizing Flows with Stochastic Interpolants
- DOI: 10.48550/arxiv.2209.15571
- Published: 2022-09
- Journal:
- Impact factor: 0
- Authors: M. S. Albergo; E. Vanden-Eijnden
- Corresponding authors: M. S. Albergo; E. Vanden-Eijnden
Other publications by Eric Vanden-Eijnden
Mapping CO Diffusion Paths in Myoglobin with the Single Sweep Method
- DOI: 10.1016/j.bpj.2009.12.3109
- Published: 2010-01-01
- Journal:
- Impact factor:
- Authors: Luca Maragliano; Grazia Cottone; Giovanni Ciccotti; Eric Vanden-Eijnden
- Corresponding author: Eric Vanden-Eijnden
Force-Clamp Analysis Techniques Give Highest Rank to Stretched Exponential Unfolding Kinetics in Ubiquitin
- DOI: 10.1016/j.bpj.2012.10.022
- Published: 2012-11-21
- Journal:
- Impact factor:
- Authors: Herbert Lannon; Eric Vanden-Eijnden; J. Brujic
- Corresponding author: J. Brujic
Kinetics of phase transitions in two dimensional Ising models studied with the string method
- DOI: 10.1007/s10910-008-9376-5
- Published: 2008-05-17
- Journal:
- Impact factor: 2.000
- Authors: Maddalena Venturoli; Eric Vanden-Eijnden; Giovanni Ciccotti
- Corresponding author: Giovanni Ciccotti
Metastability of the Nonlinear Wave Equation: Insights from Transition State Theory
- DOI: 10.1007/s00332-016-9358-x
- Published: 2017-01-03
- Journal:
- Impact factor: 2.600
- Authors: Katherine A. Newhall; Eric Vanden-Eijnden
- Corresponding author: Eric Vanden-Eijnden
Other grants by Eric Vanden-Eijnden
DMS-EPSRC Collaborative Research: Sharp Large Deviation Estimates of Fluctuations in Stochastic Hydrodynamic Systems
- Grant number: 2012510
- Fiscal year: 2020
- Funding amount: $1.15 million
- Project type: Standard Grant
Collaborative Research: Computation of instantons in complex nonlinear systems.
- Grant number: 1522767
- Fiscal year: 2016
- Funding amount: $1.15 million
- Project type: Standard Grant
Collaborative Research: On-the-fly free energy parameterization in molecular simulations
- Grant number: 1207432
- Fiscal year: 2012
- Funding amount: $1.15 million
- Project type: Standard Grant
Numerical methods for the moving contact line problem
- Grant number: 1114827
- Fiscal year: 2011
- Funding amount: $1.15 million
- Project type: Standard Grant
Workshop on Modern Perspectives in Applied Mathematics; New York City, NY
- Grant number: 0904087
- Fiscal year: 2009
- Funding amount: $1.15 million
- Project type: Standard Grant
Collaborative Research: Multiscale Methods for the Molecular Simulation of Sensory Mechanotransduction Channels
- Grant number: 0718172
- Fiscal year: 2007
- Funding amount: $1.15 million
- Project type: Standard Grant
AMC-SS: Theory and Modeling of Rare Events
- Grant number: 0708140
- Fiscal year: 2007
- Funding amount: $1.15 million
- Project type: Standard Grant
CAREER: Transition Pathways in Complex Systems. Theory and Numerical Methods.
- Grant number: 0239625
- Fiscal year: 2003
- Funding amount: $1.15 million
- Project type: Standard Grant
Statistical Description of Stochastic Dynamical Systems
- Grant number: 0209959
- Fiscal year: 2002
- Funding amount: $1.15 million
- Project type: Standard Grant
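Similar National Natural Science Foundation of China (NSFC) grants
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Approval year: 2006
- Funding amount: CNY 170,000
- Project type: Young Scientists Fund
Similar overseas grants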
ProbAI: A Hub for the Mathematical and Computational Foundations of Probabilistic AI
- Grant number: EP/Y028783/1
- Fiscal year: 2024
- Funding amount: $1.15 million
- Project type: Research Grant
Collaborative Research: Foundations of programmable living materials through synthetic biofilm engineering and quantitative computational modeling
- Grant number: 2214021
- Fiscal year: 2023
- Funding amount: $1.15 million
- Project type: Standard Grant
CRCNS: Computational Foundations for Externalizing/Internalizing Psychopathology
- Grant number: 10831117
- Fiscal year: 2023
- Funding amount: $1.15 million
- Project type:
CAREER: Computational Foundations of Modern Machine Learning
- Grant number: 2239265
- Fiscal year: 2023
- Funding amount: $1.15 million
- Project type: Continuing Grant
Collaborative Research: Foundations of programmable living materials through synthetic biofilm engineering and quantitative computational modeling
- Grant number: 2214020
- Fiscal year: 2023
- Funding amount: $1.15 million
- Project type: Standard Grant
Foundations of Computational Mathematics Conference – FoCM 2023
- Grant number: 2232812
- Fiscal year: 2022
- Funding amount: $1.15 million
- Project type: Standard Grant
Computational foundations of active visual sensing
- Grant number: 10431247
- Fiscal year: 2022
- Funding amount: $1.15 million
- Project type:
Foundations of quantum computational advantage (FoQaCiA)
- Grant number: 569582-2021
- Fiscal year: 2022
- Funding amount: $1.15 million
- Project type: Alliance Grants
The foundations of quantum computational advantage
- Grant number: RGPIN-2022-03103
- Fiscal year: 2022
- Funding amount: $1.15 million
- Project type: Discovery Grants Program - Individual
Computational Foundations of Machine Learning in the Era of Big Data
- Grant number: RGPIN-2017-05032
- Fiscal year: 2022
- Funding amount: $1.15 million
- Project type: Discovery Grants Program - Individual