Collaborative Research: Scalable Linear Algebra and Neural Network Theory

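Basic Information

Award number:
2134248
Principal investigator:
Mert Pilanci
Amount:
$350K
Institution:
Stanford University
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Period:
2021-09-01 to 2024-08-31
Status:
Completed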

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2134248&HistoricalAwards=false
Keywords:
Collaborative Research Scalable Linear Algebra

Project Abstract

These projects will use randomized numerical linear algebra building blocks to develop improved methods in stochastic optimization theory and statistical/machine learning theory. The motivation is that, while machine learning and deep learning methodology has transformed certain applications, such as computer vision and natural language processing, its promised impact on many other areas has yet to be seen. The reason for this is the flip side of why it has been successful where it has. In the applications where it has had the most remarkable successes, people have adopted the following strategy: get large quantities of data; train a neural network model using stochastic first order methods; and implement and apply the model in a user-facing industrial application. There are many well-known limitations with this general approach, ranging from the need for large quantities of data and daunting compute resources to interpretability and robustness issues. These limitations are particularly apparent when using neural networks for problems such as high-performance computing, fluid mechanics/dynamics, temporal supply chain forecasting problems, biotechnology, etc., where interpretability is paramount. This work aims to address central technical issues underlying this approach, namely: while linear algebraic techniques are central to the design and use of modern neural network models, current methodology uses linear algebra in relatively superficial ways. If we have stronger control over the linear algebraic methods, the community will have a more practical theory to guide neural network use in a broad range of applications beyond computer vision and natural language processing. These methods will enable qualitatively more refined scalable implementations and applications of neural network models in a range of scientific and engineering domains. 
Broader impacts of these projects include mentoring of grant-supported graduate students and postdoctoral researchers.

Technically, the work will focus on three general directions: optimization theory, including convex optimization based neural network and going beyond optimization; scalable linear algebra theory, including randomized linear algebra for neural networks, and sparse randomized linear algebra; and statistics and machine learning theory, including implicit regularization, and learning with limited non-iid data. More broadly, the goal is to provide a basis for practical theory that can guide practice, in a manner analogous to how linear algebraic and functional analytic methods underlie practical and useful theory in a broad range of scientific/engineering applications. We expect that such a challenging task is possible since many of the recent developments in machine learning theory and neural network practice have parallels in scientific computing, where there is a long history of what may be called scalable linear algebra for physical/engineering theory. Many of the methods to be developed may be viewed as bridging the interdisciplinary gap between these old ideas and the new challenges we face; and principal investigators have a history of developing interdisciplinary classes, summer schools, workshops related to the topics of the proposed work, and they will continue to do so.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (8)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Optimal sets and solution paths of ReLU networks

DOI:
Published:
2023
Venue:
ICML'23: Proceedings of the 40th International Conference on Machine Learning
Impact factor:
0
Authors:
Mishkin, Aaron
Corresponding author:
Mishkin, Aaron

The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program

DOI:
Published:
2021-10
Venue:
ArXiv
Impact factor:
0
Authors:
Yifei Wang;Mert Pilanci
Corresponding author:
Yifei Wang;Mert Pilanci

Sketching the Krylov subspace: faster computation of the entire ridge regularization path

DOI:
10.1007/s11227-023-05309-w
Published:
2023
Venue:
The Journal of Supercomputing
Impact factor:
0
Authors:
Wang, Yifei;Pilanci, Mert
Corresponding author:
Pilanci, Mert

Optimal Shrinkage for Distributed Second-Order Optimization

DOI:
Published:
2024-02
Venue:
Impact factor:
0
Authors:
Fangzhao Zhang;Mert Pilanci
Corresponding author:
Fangzhao Zhang;Mert Pilanci

Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers

DOI:
10.48550/arxiv.2205.08078
Published:
2022-05
Venue:
ArXiv
Impact factor:
0
Authors:
Arda Sahiner;Tolga Ergen;Batu Mehmet Ozturkler;J. Pauly;M. Mardani;Mert Pilanci
Corresponding author:
Arda Sahiner;Tolga Ergen;Batu Mehmet Ozturkler;J. Pauly;M. Mardani;Mert Pilanci

Other publications by Mert Pilanci

Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes

DOI:
Published:
2023
Venue:
arXiv.org
Impact factor:
0
Authors:
Yifei Wang;Mert Pilanci
Corresponding author:
Mert Pilanci

Training Convolutional ReLU Neural Networks in Polynomial Time: Exact Convex Optimization Formulations

DOI:
Published:
2020
Venue:
arXiv.org
Impact factor:
0
Authors:
Tolga Ergen;Mert Pilanci
Corresponding author:
Mert Pilanci

All Local Minima are Global for Two-Layer ReLU Neural Networks: The Hidden Convex Optimization Landscape

DOI:
Published:
2020
Venue:
arXiv.org
Impact factor:
0
Authors:
Jonathan Lacotte;Mert Pilanci
Corresponding author:
Mert Pilanci

Using a Novel COVID-19 Calculator to Measure Positive U.S. Socio-Economic Impact of a COVID-19 Pre-Screening Solution (AI/ML)

DOI:
Published:
2022
Venue:
arXiv.org
Impact factor:
0
Authors:
R. Swartzbaugh;Amil Khanzada;Praveen Govindan;Mert Pilanci;A. Owoyemi;Les E. Atlas;Hugo Estrada;Richard Nall;Michael Lotito;Rich Falcone;J. Ranjani
Corresponding author:
J. Ranjani