Collaborative Research: Probabilistic, Geometric, and Topological Analysis of Neural Networks, From Theory to Applications

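Basic Information

Award number:
2133806
Principal investigator:
Boris Hanin
Amount:
$500,000
Host institution:
Princeton University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Project period:
2022-01-01 to 2024-12-31
Project status:
Completed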

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2133806&HistoricalAwards=false
Keywords:
Collaborative Research Probabilistic Geometric Topological

Project Abstract

One of the most exciting technical developments of the last decade is the widespread adoption of a family of algorithms called neural networks, used in cutting-edge industrial applications ranging from self-driving cars to predicting the three-dimensional shapes of proteins from their amino acid sequences. The goals of this project are twofold. First, the investigators seek to use tools from mathematics (specifically probability and combinatorics) to better understand how neural networks behave and then to fashion this understanding into new, more efficient, and safer algorithms. This involves a collaborative effort between mathematicians, computer scientists, and electrical engineers. The project team seeks to unravel a fundamental mystery: why is it that neural networks appear to be incredibly complex, yet despite their seeming intricacy, still learn parsimonious and useful ways of making predictions? Put another way, the investigators aim to define and analyze different mathematical notions of neural network complexity and then to use them as theoretically grounded guides in the search for ever more efficient and interpretable algorithms related to neural networks. The second goal is to create a series of educational resources, ranging from videos to course notes, that will enable various segments of society at large (e.g., students, policy makers, and scientists) to engage with and gain a usable appreciation for the ideas, challenges, and opportunities surrounding modern neural networks.

The research in this project consists of three interconnected parts. The first is a probabilistic analysis of a variety of neural network complexity measures before, during, and after training. Relevant tools come from probability, functional analysis, information theory, and geometry. Key theoretical questions include quantifying implicit bias and bounding generalization error for learning structured functions. The second is a topological and geometric analysis of both individual ReLU network functions and spaces of ReLU networks. Relevant tools come from Morse theory and low-dimensional topology. Key theoretical questions hinge on understanding topological implicit bias and topological depth separation. Finally, the investigators seek theory-guided insights for applied deep learning via (i) principled, efficient neural architecture search using average-case complexity measures as surrogates for practical expressivity, trainability, and generalization and (ii) novel approaches to model compression and scaling via topological expressivity of ReLU networks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Maximal Initial Learning Rates in Deep ReLU Networks

DOI:
10.48550/arxiv.2212.07295
Published:
2022-12
Journal:
ArXiv
Impact factor:
0
Authors:
Gaurav M. Iyer; B. Hanin; D. Rolnick
Corresponding authors:
Gaurav M. Iyer; B. Hanin; D. Rolnick

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis

DOI:
10.48550/arxiv.2205.05662
Published:
2022-05
Journal:
ArXiv
Impact factor:
0
Authors:
Wuyang Chen; Wei Huang; Xinyu Gong; B. Hanin; Zhangyang Wang
Corresponding authors:
Wuyang Chen; Wei Huang; Xinyu Gong; B. Hanin; Zhangyang Wang

Other publications by Boris Hanin

Interface asymptotics of Wigner-Weyl distributions for the Harmonic Oscillator

DOI:
10.1007/s11854-022-0209-4
Published:
2022-07-11
Journal:
Journal d'Analyse Mathématique
Impact factor:
0.900
Authors:
Boris Hanin; Steve Zelditch
Corresponding author:
Steve Zelditch

Les Houches Lectures on Deep Learning at Large & Infinite Width

DOI:
10.48550/arxiv.2309.01592
Published:
2023
Journal:
ArXiv
Impact factor:
0
Authors:
Yasaman Bahri; Boris Hanin; Antonin Brossollet; Vittorio Erba; Christian Keup; Rosalba Pacelli; James B. Simon
Corresponding author:
James B. Simon

$C^\infty$ Scaling Asymptotics for the Spectral Projector of the Laplacian

DOI:
10.1007/s12220-017-9812-5
Published:
2017-05-27
Journal:
Journal of Geometric Analysis
Impact factor:
1.500
Authors:
Yaiza Canzani; Boris Hanin
Corresponding author:
Boris Hanin

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems

DOI:
10.48550/arxiv.2403.02419
Published:
2024
Journal:
ArXiv
Impact factor:
0
Authors:
Lingjiao Chen; Jared Quincy Davis; Boris Hanin; Peter D. Bailis; Ion Stoica; Matei Zaharia; James Zou
Corresponding author:
James Zou