CAREER: Coding Theory for Robust Large-Scale Machine Learning

Basic information

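Award number:
1844951
Principal investigator:
Dimitrios Papailiopoulos
Amount:
$508,300 (approx.)
Awardee institution:
University of Wisconsin-Madison
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Start and end dates:
2019-05-01 to 2024-04-30
Project status:
Completed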

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1844951&HistoricalAwards=false
Keywords:
CAREER Coding Theory Robust Large

Project abstract

Coding theory has played a critical role in modern information technology by supporting robustness of information against a backdrop of multifaceted uncertainty. Following recent successes in machine learning, robustness has emerged as a desired principle, but now in the context of large-scale computation. Challenges related to robustness are prevalent when deploying machine learning solutions in real applications and non-curated settings, which are often non-ideal environments. This project aims to address these challenges by developing novel solutions based on coding theory for computation. These solutions offer provable robustness guarantees, can outperform more traditional solutions in practice, and extend to machine learning systems the gains that have transformed communication and storage systems. Existing and new collaborations of the investigator will facilitate industry cooperation and increase the transition to practice for the frameworks and algorithms generated from this project. The research will be strongly coupled with educational developments guided by recent advances in education science, alongside an outreach program within the Wisconsin Institute for Discovery.

This project aims to develop novel coding-theoretic solutions and fundamental trade-offs for robust large-scale machine learning. The research program is centered around three thrusts. The first thrust focuses on robustness during distributed optimization in the presence of delays and straggler nodes, where the speed of convergence is affected by nodes in the system that are significantly slower than average. The second thrust focuses on robustness during distributed optimization in the presence of Byzantine nodes and worst-case failures. Recent studies proposed robust aggregation rules to filter out the effect of worst-case or adversarial failures. This project develops coding-theoretic solutions that can be orders of magnitude faster, and give rise to unexplored trade-offs between computation and Byzantine tolerance. The third thrust focuses on adversarial perturbations during prediction that can force state-of-the-art models to consistently mis-classify events/data. The coding-theoretic approach of this project pursues provable defense mechanisms against adversarial attacks through ensembles of models with inherent redundancy and through data augmentation. The proposed theoretical and algorithmic solutions are afforded by an interdisciplinary mix of tools from information and coding theory, distributed optimization, and machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal papers (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient

DOI:
Publication date:
2020-06
Journal:
ArXiv
Impact factor:
0
Authors:
Ankit Pensia;Shashank Rajput;Alliot Nagle;Harit Vishwakarma;Dimitris Papailiopoulos
Corresponding authors:
Ankit Pensia;Shashank Rajput;Alliot Nagle;Harit Vishwakarma;Dimitris Papailiopoulos

Bad Global Minima Exist and SGD Can Reach Them

DOI:
Publication date:
2019-05
Journal:
ArXiv
Impact factor:
0
Authors:
Shengchao Liu;Dimitris Papailiopoulos;D. Achlioptas
Corresponding authors:
Shengchao Liu;Dimitris Papailiopoulos;D. Achlioptas

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

DOI:
Publication date:
2020-07
Journal:
ArXiv
Impact factor:
0
Authors:
Hongyi Wang;Kartik K. Sreenivasan;Shashank Rajput;Harit Vishwakarma;Saurabh Agarwal;Jy-yong Sohn;
Corresponding authors:
Hongyi Wang;Kartik K. Sreenivasan;Shashank Rajput;Harit Vishwakarma;Saurabh Agarwal;Jy-yong Sohn;

An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks
