Fundamentals of Modern Machine Learning: A Precise High-dimensional Approach
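Basic information
- Grant number: RGPIN-2021-03677
- Principal investigator:
- Amount: $32.1K
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords: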
Project abstract
As we aspire to use data-driven machine-learning (ML) algorithms to create automated decision rules in more aspects of everyday life, we need to make sure that they meet a number of complex system requirements. ML algorithms used for perception in self-driving cars need to be safe against disturbances caused by adversaries. In applications that directly involve data about people, such as decisions on who is granted a loan or who gets hired, we need to ensure fairness against demographic imbalances that exist in our society and translate to data. To effectively use modern deep-learning models (which are increasingly complex and thus computationally expensive) on resource-constrained platforms such as mobile health devices, we need to carefully balance accuracy and resource efficiency. The goal of my research program is to advance the expanded use of ML by developing a modern theory that can guide the design of algorithms that fulfill these requirements. A prime challenge in developing such a theory is the high dimensionality of data, which renders classical statistical tools inadequate. But even where recent theories have captured certain aspects of high dimensionality, they have often failed to capture newly discovered ML phenomena, because they produce statistical characterizations that are not precise. To address these challenges, I will develop a new 'precise high-dimensional (HD) statistics' approach to modern ML theory. I will establish a mathematical framework that will lead to precise characterization of the accuracy of classification algorithms as a function of the distribution and size of data, the model complexity, and the algorithms' parameters. This effort builds on my previous work, which introduced a method of precise estimation-error analysis in HD signal processing.
Now I will apply the new framework to guide the design of improved ML algorithms with three objectives in mind: robustness to adversarial perturbations (aka safety), robustness to imbalances (aka fairness), and reduced model complexity (aka resource efficiency). To this end, I will also develop theory-driven statistical models that are rich enough to resemble the intricacies of data-driven ones. This program will provide students with the essential tools in mathematical data science: optimization, probability, statistical signal processing, and learning theories. Just as awareness of biases in data and algorithms is a key concern of my research, I am also committed to building a diverse research group through inclusive recruitment, training environment, and teaching. The focus of my research program aligns with Canada's national strategy for AI, with a particular emphasis on robust and equitable algorithms for protecting the rights of marginalized groups. The outcomes of the proposed program have the potential to lead to tech-industry collaborations towards integrating the new provably robust and resource-efficient algorithms into existing data-driven products.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Thrampoulidis, Christos
Near-optimal Coded Apertures for Imaging via Nazarov’s Theorem
- DOI: 10.1109/icassp.2019.8682254
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Ajjanagadde, Ganesh; Thrampoulidis, Christos; Yedidia, Adam; Wornell, Gregory
- Corresponding author: Wornell, Gregory
Sharp Asymptotics and Optimal Performance for Inference in Binary Models
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Taheri, Hossein; Pedarsani, Ramtin; Thrampoulidis, Christos
- Corresponding author: Thrampoulidis, Christos
Sharp Guarantees for Solving Random Equations with One-Bit Information
- DOI: 10.1109/allerton.2019.8919905
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Taheri, Hossein; Pedarsani, Ramtin; Thrampoulidis, Christos
- Corresponding author: Thrampoulidis, Christos
Simple Error Bounds for Regularized Noisy Linear Inverse Problems
- DOI: 10.1109/isit.2014.6875386
- Published: 2014-01-01
- Journal:
- Impact factor: 0
- Authors: Thrampoulidis, Christos; Oymak, Samet; Hassibi, Babak
- Corresponding author: Hassibi, Babak
A Simple Bound on the BER of the MAP Decoder for Massive MIMO Systems
- DOI: 10.1109/icassp.2019.8682440
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Thrampoulidis, Christos; Zadik, Ilias; Polyanskiy, Yury
- Corresponding author: Polyanskiy, Yury
Other grants by Thrampoulidis, Christos
Fundamentals of Modern Machine Learning: A Precise High-dimensional Approach
- Grant number: RGPIN-2021-03677
- Fiscal year: 2021
- Funding amount: $32.1K
- Project category: Discovery Grants Program - Individual
Fundamentals of Modern Machine Learning: A Precise High-dimensional Approach
- Grant number: DGECR-2021-00482
- Fiscal year: 2021
- Funding amount: $32.1K
- Project category: Discovery Launch Supplement
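Similar National Natural Science Foundation of China (NSFC) grants
Design and Construction of New Sequences in Modern Communication Engineering
- Grant number: 12301429
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Research on Government Public-Opinion Information Extraction, Value Discovery, and Decision Support for the Modernization of Government Governance
- Grant number: 72374191
- Approval year: 2023
- Funding amount: 410,000 CNY
- Project category: General Program
A Design System for Modern Timber Modular Integrated Construction (T-MiC) Based on Traditional Bridge-Building Wisdom
- Grant number: 52378023
- Approval year: 2023
- Funding amount: 500,000 CNY
- Project category: General Program
Mechanism by Which Qingyi Decoction Treats Acute Pancreatitis by Inhibiting ROS/TXNIP/NLRP3 Mediated by the Gut Microbiota Metabolite TMAO: Interpreting the Modern Meaning of the "Clearing and Purging" (Qingxia) Method
- Grant number: 82304943
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Mechanisms and Processes of Modification of Banded Iron Formations by Modern Surface Weathering
- Grant number: 42302217
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund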
Similar overseas grants
Interactions of Human and Machine Intelligence in Modern Economic Systems
- Grant number: DP240100506
- Fiscal year: 2024
- Funding amount: $32.1K
- Project category: Discovery Projects
Modern Statistics and Statistical Machine Learning
- Grant number: 2886365
- Fiscal year: 2023
- Funding amount: $32.1K
- Project category: Studentship
Modern Statistics and Statistical Machine Learning
- Grant number: 2886852
- Fiscal year: 2023
- Funding amount: $32.1K
- Project category: Studentship
Next-Generation Algorithms in Statistical Genetics Based on Modern Machine Learning
- Grant number: 10714930
- Fiscal year: 2023
- Funding amount: $32.1K
- Project category:
CNS: CORE: Small: Scaling Graph Machine Learning Workloads on Modern Storage
- Grant number: 2237193
- Fiscal year: 2023
- Funding amount: $32.1K
- Project category: Standard Grant