
Concentrated Optimization for Machine Learning: Complexity in High-Dimensions, Average-case Analysis, and Exact Dynamics


Basic information

Grant number:
RGPIN-2022-04034
Principal investigator:
Paquette, Courtney
Amount:
$21,100
Host institution:
McGill University
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Start and end dates:
2022-01-01 to 2023-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=759088
Keywords:
Concentrated Optimization Machine Learning Complexity

Project abstract

Learning algorithms play an integral role in machine learning and have had widespread empirical success on high-dimensional problems (e.g., problems in which both the number of features and the number of samples are large). Despite their popularity, there is a gap between their real-world performance and the best known theoretical bounds. Traditionally, the complexity theory of learning algorithms focuses on worst-case analysis, which guarantees convergence for all inputs under general assumptions on the objective function, such as convexity and smoothness. High dimensionality is often not explicitly assumed. High-dimensional data implies more possibilities for the inputs to an algorithm, so the inputs that generate the worst-case complexity could be far from typical. Average-case analysis places a probability distribution on the inputs and computes the expected complexity. Compared to worst-case analysis, it is more representative of the typical behavior of an algorithm, but it remains largely unexplored in optimization. A challenge is finding a probability distribution on the input (the data set) that matches real-world successes and is amenable to analysis.

This proposal addresses a series of research questions relating the average-case complexity of first-order methods to high dimensionality. Its central aim is to develop a general framework for the average-case complexity of learning algorithms and to analyze their exact dynamics in order to gain insight into step-size selection and convergence properties.

The proposal has two components. First, the PI explores questions concerning the high-dimensional dynamics of stochastic optimization algorithms on a random least squares problem. In particular, the PI plans to investigate the relationships between average-case complexity, step-size and momentum parameter selection strategies, batch size, and modeling assumptions on the data set and targets. Indeed, much computational time is wasted searching for step sizes that (1) yield good optimizers and (2) do so in a reasonable amount of time. Average-case analysis can illuminate parameter selections that work for typical high-dimensional problems. Second, for various minimization problems (e.g., generalized linear models (GLMs) and non-smooth objective functions), one lacks good models for the behavior of real data as inputs to the objective function and, in particular, a model for the Hessian. The spectrum of the Hessian provides a picture of the loss landscape for complicated objective functions. The PI plans to approximate this spectrum with random matrices. Using Hessians generated by random matrices, one can extend the average-case analysis beyond quadratic models to more complicated objective functions.
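
The idea that typical behavior becomes predictable in high dimensions can be illustrated with a small numerical experiment. The sketch below is not taken from the proposal. It assumes an i.i.d. Gaussian data matrix, a planted signal with noiseless targets, and plain full-batch gradient descent with step size 1/λ_max (a simplification of the stochastic methods the abstract mentions). The function names `gd_loss_curve` and `relative_spread` are hypothetical. It measures how much the loss after a fixed number of steps varies across independent random instances as the dimension grows.

```python
import numpy as np


def gd_loss_curve(n, d, steps, seed):
    """Loss values of gradient descent on one random least-squares instance.

    Assumed model (illustrative only): Gaussian data A, planted signal x_star,
    noiseless targets b = A @ x_star, objective 0.5 * ||A x - b||^2.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d)) / np.sqrt(n)
    x_star = rng.standard_normal(d) / np.sqrt(d)
    b = A @ x_star

    # Step size 1/L, where L is the largest eigenvalue of the Hessian A^T A.
    lam_max = np.linalg.eigvalsh(A.T @ A)[-1]
    step = 1.0 / lam_max

    x = np.zeros(d)
    losses = np.empty(steps)
    for k in range(steps):
        residual = A @ x - b
        losses[k] = 0.5 * residual @ residual   # loss before the k-th update
        x = x - step * (A.T @ residual)         # gradient step
    return losses


def relative_spread(n, d, steps=10, trials=20):
    """Standard deviation divided by mean of the last recorded loss over independent instances."""
    finals = np.array(
        [gd_loss_curve(n, d, steps, seed)[-1] for seed in range(trials)]
    )
    return finals.std() / finals.mean()


if __name__ == "__main__":
    # Same aspect ratio d/n = 0.5 at increasing dimension.
    for n, d in [(20, 10), (200, 100), (800, 400)]:
        spread = relative_spread(n, d)
        print(f"n={n:4d} d={d:4d} relative spread of final loss: {spread:.3f}")
```

Under these assumptions, the relative spread should shrink as the dimension grows at a fixed ratio d/n, so the loss curve of a single large instance is close to the curve of any other. This concentration is what makes an average-case description informative about typical runs.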
