The Heavy-Tailed Methods in Machine Learning

Basic Information

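Award Number:
2208303
Principal Investigator:
Lingjiong Zhu
Amount:
$163,500
Institution:
Florida State University
Institution Country:
United States
Award Type:
Standard Grant
Fiscal Year:
2022
Funding Country:
United States
Project Period:
2022-07-01 to 2025-06-30
Project Status:
Not yet completed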

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2208303&HistoricalAwards=false
Keywords:
Heavy Tailed Methods Machine Learning

Project Abstract

Stochastic gradient descent and its variants are core algorithms for solving machine learning problems and work remarkably well in practice. However, a general theory that explains their success is still lacking. One popular approach is to impose structure on the gradient noise, typically modeled by Gaussian or other light-tailed distributions. However, many empirical and some recent theoretical works challenge these assumptions, calling for an understanding of heavy-tailed distributions and the resulting phenomena in machine learning. In this project, a theoretical framework will be built towards understanding and explaining why and how heavy-tailed distributions arise in popular machine learning algorithms, and how heavy tails can better explain their success, bridging a gap between theory and practice. The results derived from this project are expected to impact the mathematics community as well as developers and practitioners in the data science and machine learning communities.

In this project, theoretical convergence properties and performance guarantees will be obtained for heavy-tailed stochastic gradient descent, its accelerated momentum-based variants, and their continuous-time approximations. Further theoretical properties such as metastability will be studied to gain a deeper understanding of these heavy-tailed methods. A novel heavy-tailed adaptive Langevin algorithm and its variants will be developed, and their theoretical guarantees will be studied for both sampling and non-convex stochastic optimization. Such an objective requires combining a broad set of ideas and mathematical tools from applied probability, continuous optimization, statistics, and numerical analysis. Based on these mathematical developments, the project will develop and study heavy-tailed algorithms with theoretical guarantees that can solve large-scale machine learning problems, ultimately building up a mathematical theory to explain the cause and implications of heavy-tailed distributions and other important phenomena that arise in machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
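
As a rough illustration of the contrast the abstract draws between light-tailed and heavy-tailed gradient noise, the minimal sketch below runs plain stochastic gradient descent on a one-dimensional quadratic under two noise models. It is not taken from the project: the objective f(x) = x^2 / 2, the step size, the symmetric Pareto noise with tail index 1.5, and all function names are assumptions chosen only for illustration.

```python
import random


def run_sgd(noise, steps=10_000, lr=0.1, x0=1.0):
    """Run SGD on f(x) = x**2 / 2 with additive gradient noise.

    `noise` is a zero-argument callable returning one noise sample.
    Returns the largest |x| visited along the trajectory.
    """
    x = x0
    max_abs = abs(x)
    for _ in range(steps):
        grad = x + noise()  # exact gradient of x**2 / 2 is x; noise is added to it
        x -= lr * grad
        max_abs = max(max_abs, abs(x))
    return max_abs


def make_gaussian(rng):
    # Light-tailed baseline: standard normal noise.
    return lambda: rng.gauss(0.0, 1.0)


def make_symmetric_pareto(rng, alpha=1.5):
    # Heavy-tailed noise (illustrative assumption): paretovariate(alpha) has
    # P(X > t) = t**(-alpha) for t >= 1; a random sign makes it symmetric.
    # For alpha < 2 the variance is infinite.
    return lambda: rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha)


light = run_sgd(make_gaussian(random.Random(0)))
heavy = run_sgd(make_symmetric_pareto(random.Random(0)))

print(f"largest |x| with Gaussian noise:         {light:.2f}")
print(f"largest |x| with symmetric Pareto noise: {heavy:.2f}")
```

Under Gaussian noise the iterates stay in a narrow band around the minimizer, while the infinite-variance noise produces occasional large jumps. Jump-driven behavior of this kind is what metastability analyses of heavy-tailed methods are concerned with.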
