CRII: AF: Optimization and sampling algorithms with provable generalization and runtime guarantees, with applications to deep learning

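Basic information

Award number:
2104528
Principal investigator:
Oren Mangoubi
Amount:
approx. $174,200
Host institution:
Worcester Polytechnic Institute
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-06-01 to 2023-05-31
Project status:
Completed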

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2104528&HistoricalAwards=false
Keywords:
CRII AF Optimization sampling algorithms

Project abstract

When training deep-learning and other machine-learning models, one would like to train in such a way that the generalization error of the trained model, that is, the error of the trained model when it is used to make predictions on a “test” dataset which was not used to train the model, is as low as possible. Training algorithms with good generalization properties can lead to machine-learning models that are more robust to changes in the dataset, allow for robust predictions, and help mitigate algorithmic bias when the training dataset may not be fully representative of the diversity of the population dataset. Such algorithms can also lead to more stable training in settings such as distributed training and online learning. In practice, the choice of optimization algorithm that one uses to train the model can greatly affect both its training error and generalization error. Unfortunately, there is a lack of optimization algorithms with provable guarantees on the generalization error. This makes it difficult to design algorithms which provably achieve both a fast running time and low generalization error. The aim of this project is to design novel algorithms for training deep-learning and other machine-learning models, and to prove guarantees on the running time, generalization error and related robustness properties of these algorithms. To design and analyze such algorithms, this project brings together ideas from different areas of mathematics and computer science.

This project is designing novel optimization and Markov-chain sampling algorithms for training deep-learning models as well as other machine-learning models. It aims to prove guarantees on the generalization error and related robustness properties of these algorithms, and also to provide fast running-time guarantees. Guaranteeing a low generalization error is especially challenging in deep learning, since the number of trainable parameters is oftentimes much larger than the size of the dataset, and the loss function used to train the model is nonconvex. To prove stronger generalization and related robustness guarantees, the project team uses ideas from manifold learning and differential geometry to model the low-dimensional structure of datasets which arise in many machine learning applications. The project has three components. One component is to design and analyze novel optimization algorithms for training deep learning models. Another component is to design and analyze algorithms for multi-agent optimization problems, such as the min-max optimization problems which arise when training generative adversarial nets (GANs) as well as multi-agent optimization problems which arise when training meta-learning models. Finally, in addition to optimization algorithms, it is also designing and analyzing Markov-chain sampling algorithms and related algorithms which are used to train Bayesian machine learning models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal papers (7)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Sampling from Log-Concave Distributions with Infinity-Distance Guarantees

DOI:
Published:
2022
Journal:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Oren Mangoubi, Nisheeth Vishnoi
Corresponding authors:
Oren Mangoubi, Nisheeth Vishnoi

Private Matrix Approximation and Geometry of Unitary Orbits

DOI:
Published:
2022
Journal:
Conference on Learning Theory
Impact factor:
0
Authors:
Mangoubi, Oren; Wu, Yikai; Kale, Satyen; Thakurta, Abhradeep G; Vishnoi, Nisheeth K
Corresponding author:
Vishnoi, Nisheeth K

Data-Driven Soil Water Content Estimation at Multiple Depths Using SFCW GPR

DOI:
10.1109/orss58323.2023.10161940
Published:
2023-04
Journal:
2023 IEEE International Opportunity Research Scholars Symposium (ORSS)
Impact factor:
0
Authors:
Vincent Filardi; Allen Cheung; Ruba Khan; Oren Mangoubi; Majid Moradikia; S. Zekavat; B. Wilson; Radwin Askari; D. Petkie
Corresponding authors:
Vincent Filardi; Allen Cheung; Ruba Khan; Oren Mangoubi; Majid Moradikia; S. Zekavat; B. Wilson; Radwin Askari; D. Petkie

Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion
