Fundamentals of Modern Machine Learning: A Precise High-dimensional Approach

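Basic information

Grant number:
RGPIN-2021-03677
Principal investigator:
Thrampoulidis, Christos
Amount:
about $32,100
Host institution:
University of British Columbia
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Period:
2022-01-01 to 2023-12-31
Project status:
Completed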

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=752290
Keywords:
Fundamentals Modern Machine Learning Precise

Project abstract

As we aspire to use data-driven machine-learning (ML) algorithms to create automated decision rules in more aspects of everyday life, we need to make sure that they meet a number of complex system requirements. ML algorithms used for perception in self-driving cars need to be safe against disturbances caused by adversaries. In applications that directly involve data about people, such as decisions on who is granted a loan or who gets hired, we need to ensure fairness against demographic imbalances that exist in our society and translate to data. To effectively use modern deep-learning models (which are increasingly complex and thus computationally expensive) on resource-constrained platforms such as mobile health devices, we need to carefully balance accuracy and resource efficiency. The goal of my research program is to advance the expanded use of ML by developing a modern theory that can guide the design of algorithms that fulfill these requirements. A prime challenge in developing such a theory is the high dimensionality of data, which renders classical statistical tools inadequate. But even where recent theories have captured certain aspects of high dimensionality, they have often failed to capture newly discovered ML phenomena, because they produce statistical characterizations that are not precise. To address these challenges, I will develop a new "precise high-dimensional (HD) statistics" approach to modern ML theory. I will establish a mathematical framework that will lead to a precise characterization of the accuracy of classification algorithms as a function of the distribution and size of the data, the model complexity, and the algorithms' parameters. This effort builds on my previous work, which introduced a method of precise estimation-error analysis in HD signal processing.
Now I will apply the new framework to guide the design of improved ML algorithms with three objectives in mind: robustness to adversarial perturbations (aka safety), robustness to imbalances (aka fairness), and reduced model complexity (aka resource efficiency). To this end, I will also develop theory-driven statistical models that are rich enough to resemble the intricacies of data-driven ones. This program will provide students with the essential tools in mathematical data science: optimization, probability, statistical signal processing, and learning theory. Just as awareness of biases in data and algorithms is a key concern of my research, I am also committed to building a diverse research group through inclusive recruitment, training environment, and teaching. The focus of my research program aligns with Canada's national strategy for AI, with a particular emphasis on robustness and equitable algorithms for protecting the rights of marginalized groups. The outcomes of the proposed program have the potential to lead to tech-industry collaborations towards integrating the new provably robust and resource-efficient algorithms into existing data-driven products.

Project outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Thrampoulidis, Christos

Sharp Guarantees for Solving Random Equations with One-Bit Information

DOI:
10.1109/allerton.2019.8919905
Published:
2019
Venue:
and Computing (Allerton
Impact factor:
0
Authors:
Taheri, Hossein;Pedarsani, Ramtin;Thrampoulidis, Christos
Corresponding author:
Thrampoulidis, Christos

Sharp Asymptotics and Optimal Performance for Inference in Binary Models

DOI:
Published:
2020
Venue:
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Taheri, Hossein;Pedarsani, Ramtin;Thrampoulidis, Christos
Corresponding author:
Thrampoulidis, Christos

Near-optimal Coded Apertures for Imaging via Nazarov’s Theorem

DOI:
10.1109/icassp.2019.8682254
Published:
2019
Venue:
and Signal Processing
Impact factor:
0
Authors:
Ajjanagadde, Ganesh;Thrampoulidis, Christos;Yedidia, Adam;Wornell, Gregory
Corresponding author:
Wornell, Gregory

A Simple Bound on the BER of the Map Decoder for Massive MIMO Systems

DOI:
10.1109/icassp.2019.8682440
Published:
2019
Venue:
A Simple Bound on the BER of the Map Decoder for Massive MIMO Systems
Impact factor:
0
Authors:
Thrampoulidis, Christos;Zadik, Ilias;Polyanskiy, Yury
Corresponding author:
Polyanskiy, Yury

Multi-Environment Meta-Learning in Stochastic Linear Bandits

DOI:
10.1109/isit50566.2022.9834636
Published:
2022
Venue:
2022 IEEE International Symposium on Information Theory (ISIT
Impact factor:
0
Authors:
Moradipari, Ahmadreza;Ghavamzadeh, Mohammad;Rajabzadeh, Taha;Thrampoulidis, Christos;Alizadeh, Mahnoosh
Corresponding author:
Alizadeh, Mahnoosh