CAREER: Random Neural Nets and Random Matrix Products


Basic Information

  • Award number:
    2143754
  • Principal investigator:
  • Amount:
    $577,200
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-06-01 to 2027-05-31
  • Project status:
    Ongoing

Project Abstract

We live in an era of big data and inexpensive computation. Vast stores of information can efficiently be analyzed for underlying patterns by machine learning algorithms, leading to remarkable progress in applications ranging from self-driving cars to automatic drug discovery and machine translation. Underpinning many of these exciting practical developments is a class of computational models called neural networks. Originally developed in the 1940's and 1950's, the neural nets used today are as complex as they are powerful. The purpose of this project is to develop a range of principled techniques for understanding key aspects of how neural networks work in practice and how to make them better. The approach taken by this project is probabilistic and statistical in nature. Just as the ideal gas law accurately describes the large-scale properties of a gas directly through pressure, volume, and temperature without the need to specify the state of each individual gas molecule, this project will explore and identify emergent statistical behaviors of large neural networks that provably explain many of their key properties observed in practice. The project will also provide research training and educational opportunities through the organization of summer schools in machine learning for graduate students.

At a high level, a neural network is a family of functions given by composing affine transformations with elementary non-linear operations. The simplest important kind of neural network is roughly described by two parameters called depth and width. The former is the number of compositions and the latter is the dimension of the spaces on which the affine transformations act. The technical heart of this project is to understand the statistical behavior of such networks when the affine transformations are chosen at random. The starting point is an analytically tractable regime in which the network width is sent to infinity at fixed depth. In this infinite-width limit, random networks converge to Gaussian processes, and optimization of the network parameters from their randomly chosen starting points reduces to a kernel method.

Unfortunately, this concise description cannot capture what is perhaps the most important empirical property of neural networks, namely their ability to learn data-dependent features. Understanding how feature learning occurs is at the core of this project and requires new probabilistic and analytic tools for studying random neural networks at finite width. The basic idea is to perform perturbation theory around the infinite-width limit, treating the reciprocal of the network width as a small parameter. The goal is then to obtain, to all orders in this reciprocal, expressions for the joint distribution of the values and derivatives (with respect to both model inputs and model parameters) of a random neural network. Such formulas have practical consequences for understanding the numerical stability of neural network training, suggesting principled settings for optimization hyper-parameters, and quantifying feature learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
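The infinite-width Gaussian behavior described in the abstract can be checked numerically. Below is a minimal sketch; all settings (the fixed input, the widths, the depth, and the He-style initialization with Var[W_ij] = 2/fan_in) are illustrative assumptions, not specifics of the project. It draws many independent random ReLU networks, evaluates each at the same input, and checks that the output distribution is close to a centered Gaussian with the predicted variance.

```python
import numpy as np

# Illustrative sketch of the infinite-width claim: the scalar output of a wide,
# randomly initialized ReLU network at a fixed input is approximately Gaussian.
# He-style scaling Var[W_ij] = 2 / fan_in keeps E[h_i^2] constant across layers.

rng = np.random.default_rng(0)

def random_relu_net(x, width, depth, rng):
    """One draw of a random network: `depth` affine maps with ReLU, scalar readout."""
    h = x
    for _ in range(depth):
        fan_in = h.shape[0]
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(width, fan_in))
        h = np.maximum(W @ h, 0.0)  # ReLU halves E[z^2], which the factor 2 undoes
    w_out = rng.normal(0.0, np.sqrt(1.0 / width), size=width)
    return w_out @ h

x = np.ones(10)  # fixed input with E[x_i^2] = 1, so the predicted output variance is 1
samples = np.array([random_relu_net(x, width=512, depth=3, rng=rng)
                    for _ in range(2000)])

print("sample mean (expect ~0):    ", samples.mean())
print("sample variance (expect ~1):", samples.var())
# A Gaussian has zero excess kurtosis; at finite width the excess is O(depth/width),
# which is the kind of 1/width correction the project's perturbation theory tracks.
excess_kurtosis = ((samples - samples.mean()) ** 4).mean() / samples.var() ** 2 - 3.0
print("excess kurtosis (expect ~0):", excess_kurtosis)
```

Increasing `width` shrinks the excess kurtosis toward zero, while shrinking it (say, to 8) makes the non-Gaussian finite-width corrections visible.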

Project Outcomes

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Deep ReLU networks preserve expected length
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hanin, B.;Jeong, R.;Rolnick, D.
  • Corresponding author:
    Rolnick, D.
Maximal Initial Learning Rates in Deep ReLU Networks
  • DOI:
    10.48550/arxiv.2212.07295
  • Publication date:
    2022-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gaurav M. Iyer;B. Hanin;D. Rolnick
  • Corresponding author:
    Gaurav M. Iyer;B. Hanin;D. Rolnick
Random Neural Networks in the Infinite Width Limit as Gaussian Processes
  • DOI:
    10.1214/23-aap1933
  • Publication date:
    2021-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    B. Hanin
  • Corresponding author:
    B. Hanin
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
  • DOI:
    10.48550/arxiv.2309.16620
  • Publication date:
    2023-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Blake Bordelon;Lorenzo Noci;Mufan Bill Li;Boris Hanin;C. Pehlevan
  • Corresponding author:
    Blake Bordelon;Lorenzo Noci;Mufan Bill Li;Boris Hanin;C. Pehlevan

Other Publications by Boris Hanin

Interface asymptotics of Wigner—Weyl distributions for the Harmonic Oscillator
  • DOI:
    10.1007/s11854-022-0209-4
  • Publication date:
    2022-07-11
  • Journal:
  • Impact factor:
    0.900
  • Authors:
    Boris Hanin;Steve Zelditch
  • Corresponding author:
    Steve Zelditch
Les Houches Lectures on Deep Learning at Large & Infinite Width
  • DOI:
    10.48550/arxiv.2309.01592
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yasaman Bahri;Boris Hanin;Antonin Brossollet;Vittorio Erba;Christian Keup;Rosalba Pacelli;James B. Simon
  • Corresponding author:
    James B. Simon
$C^\infty$ Scaling Asymptotics for the Spectral Projector of the Laplacian
  • DOI:
    10.1007/s12220-017-9812-5
  • Publication date:
    2017-05-27
  • Journal:
  • Impact factor:
    1.500
  • Authors:
    Yaiza Canzani;Boris Hanin
  • Corresponding author:
    Boris Hanin
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
  • DOI:
    10.48550/arxiv.2403.02419
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lingjiao Chen;Jared Quincy Davis;Boris Hanin;Peter D. Bailis;Ion Stoica;Matei Zaharia;James Zou
  • Corresponding author:
    James Zou
Quantitative CLTs in Deep Neural Networks
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Stefano Favaro;Boris Hanin;Domenico Marinucci;I. Nourdin;G. Peccati
  • Corresponding author:
    G. Peccati


Other Grants to Boris Hanin

Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications
  • Award number:
    2133806
  • Fiscal year:
    2022
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Random Neural Networks
  • Award number:
    2045167
  • Fiscal year:
    2020
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Random Neural Networks
  • Award number:
    1855684
  • Fiscal year:
    2019
  • Amount:
    $577,200
  • Project type:
    Standard Grant
PostDoctoral Research Fellowship
  • Award number:
    1400822
  • Fiscal year:
    2014
  • Amount:
    $577,200
  • Project type:
    Fellowship Award

Similar Grants

Random Matrices, Random Graphs, and Deep Neural Networks
  • Award number:
    2331096
  • Fiscal year:
    2023
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Characterization of network structure in random graph models and neural data
  • Award number:
    574654-2022
  • Fiscal year:
    2022
  • Amount:
    $577,200
  • Project type:
    University Undergraduate Student Research Awards
Random Matrix Limit Theorems for Deep Neural Networks
  • Award number:
    RGPIN-2021-02533
  • Fiscal year:
    2022
  • Amount:
    $577,200
  • Project type:
    Discovery Grants Program - Individual
Random Matrices, Random Graphs, and Deep Neural Networks
  • Award number:
    2054835
  • Fiscal year:
    2021
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Random Matrix Limit Theorems for Deep Neural Networks
  • Award number:
    DGECR-2021-00041
  • Fiscal year:
    2021
  • Amount:
    $577,200
  • Project type:
    Discovery Launch Supplement
Random Matrix Limit Theorems for Deep Neural Networks
  • Award number:
    RGPIN-2021-02533
  • Fiscal year:
    2021
  • Amount:
    $577,200
  • Project type:
    Discovery Grants Program - Individual
Random Neural Networks
  • Award number:
    2045167
  • Fiscal year:
    2020
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Random Neural Networks
  • Award number:
    1855684
  • Fiscal year:
    2019
  • Amount:
    $577,200
  • Project type:
    Standard Grant
Mathematical Foundations of Random Deep Neural Networks and their applications to machine-learning problems
  • Award number:
    19K20366
  • Fiscal year:
    2019
  • Amount:
    $577,200
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Feasibility of random neural networks as an intelligent self-learning platform for cost effective deployment of energy harvesting compatible wireless sensors applied to building management systems
  • Award number:
    131608
  • Fiscal year:
    2014
  • Amount:
    $577,200
  • Project type:
    Feasibility Studies