CAREER: Leveraging Combinatorial Structures for Robust and Scalable Learning

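Basic Information

Award number:
1845032
Principal investigator:
Amin Karbasi
Amount:
$550,000
Host institution:
Yale University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Period:
2019-05-01 to 2025-04-30
Status:
Ongoing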

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1845032&HistoricalAwards=false
Keywords:
CAREER Leveraging Combinatorial Structures Robust

Project Abstract

The difficulty of searching through a massive amount of data in order to quickly make an informed decision is one of today's most ubiquitous challenges. Many scientific and engineering models feature data with inherently discrete characteristics, where discrete means that the data takes on a finite set of possible values. Examples of such data range from phrases in text to objects in an image. Similarly, nearly all aspects of data science involve discrete tasks such as data summarization and model explanation. As computational methods pervade all aspects of science and engineering, it is of great importance to understand which discrete formulations can be solved efficiently and how to do so. Many of these problems are notoriously hard, and even those that are theoretically solvable may be tractable only for small amounts of data. However, the problems of practical interest are often much better behaved and possess inherent structure that allows them to be solved more efficiently. This CAREER award aims to substantially advance the frontiers of large-scale discrete optimization in data science and machine learning by developing fundamentally new algorithms. This project will also provide a number of educational opportunities, such as outreach to local high school and middle school students through Yale's Pathways to Science program.

Just as convexity has been a celebrated and well-studied condition under which continuous optimization is tractable, submodularity is a condition under which discrete objectives may be optimized. While current research in submodular optimization has led to fundamental breakthroughs in discrete mathematical programming, there is still a large gap between the theory and the limitations of the existing algorithms used by practitioners in the real world. In particular, most of the existing submodular optimization methods fail miserably when faced with the numerous sources of uncertainty inherent in machine learning tasks, from noise in the data to variability of the true objective. Moreover, submodularity is too strong an assumption for a variety of novel machine learning applications, necessitating the development of completely new algorithms. In order to lift current provable methods out of the sterile lab environment and scale them into the messy real world, it is important to carefully reexamine their limitations, consider more realistic but less perfect conditions, and develop correspondingly robust yet scalable algorithms. This CAREER project presents a research plan towards designing, analyzing, and evaluating new approaches for robust submodular optimization at a massive scale, leading to solutions for a broad array of optimization problems of significant practical importance. Furthermore, it addresses generalizations of submodular functions that widely broaden the applicability of these methods, moving to a realm beyond submodularity. The research directions in this project have deep and far-reaching societal benefits, as robust and scalable computational methods play a central role in nearly every scientific and industrial venture in today's information age. Such advances are expected to play crucial roles in enabling data-driven scientific discoveries, promoting fairness in machine learning, and supporting STEM education by helping these communities handle the computational challenges associated with big data. The results of this project will be broadly disseminated to the greater scientific community through tutorials, workshops, and open-source software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
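
As background for the abstract's comparison of convexity and submodularity: a set function f is submodular when adding an item to a smaller set helps at least as much as adding it to a larger one, i.e. f(A ∪ {x}) - f(A) >= f(B ∪ {x}) - f(B) whenever A ⊆ B and x is not in B. The sketch below is a textbook illustration only, not an algorithm from this project. It uses a set-coverage objective (monotone and submodular) and the classical greedy rule, which is known to achieve at least a 1 - 1/e fraction of the optimum for monotone submodular functions under a cardinality constraint. The names `coverage`, `greedy_max`, and the toy `documents` data are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Set


def coverage(sets: Dict[Hashable, Set[int]], chosen: Sequence[Hashable]) -> int:
    """Count distinct elements covered by the chosen sets (monotone, submodular)."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max(
    ground: Sequence[Hashable],
    value: Callable[[List[Hashable]], int],
    k: int,
) -> List[Hashable]:
    """Select up to k items, each round adding the item with the largest marginal gain."""
    selected: List[Hashable] = []
    for _ in range(k):
        base = value(selected)
        best_item, best_gain = None, 0
        for item in ground:
            if item in selected:
                continue
            gain = value(selected + [item]) - base
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item adds value
            break
        selected.append(best_item)
    return selected


# Hypothetical toy data: each "document" covers a set of topic ids.
documents = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}

summary = greedy_max(list(documents), lambda s: coverage(documents, s), k=2)
print(summary, coverage(documents, summary))  # ['c', 'a'] 7

# Diminishing returns: "b" adds 2 topics to the empty set but only 1 once "c" is chosen.
gain_empty = coverage(documents, ["b"]) - coverage(documents, [])
gain_after_c = coverage(documents, ["c", "b"]) - coverage(documents, ["c"])
```

This plain greedy rule assumes exact, noise-free evaluations of the objective, which is the kind of idealized setting the abstract says breaks down in practice.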

Project Outcomes

Journal papers (34)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Learning and Certification under Instance-targeted Poisoning

DOI:
Published:
2021-05
Journal:
Impact factor:
0
Authors:
Ji Gao; Amin Karbasi; Mohammad Mahmoody
Corresponding authors:
Ji Gao; Amin Karbasi; Mohammad Mahmoody

The Curious Case of Adversarially Robust Models: More Data Can Help, Double Descend, or Hurt Generalization

DOI:
Published:
2020-02
Journal:
Impact factor:
0
Authors:
Yifei Min; Lin Chen; Amin Karbasi
Corresponding authors:
Yifei Min; Lin Chen; Amin Karbasi

Meta Learning in the Continuous Time Limit

DOI:
Published:
2020-06
Journal:
ArXiv
Impact factor:
0
Authors:
Ruitu Xu; Lin Chen; Amin Karbasi
Corresponding authors:
Ruitu Xu; Lin Chen; Amin Karbasi

Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes

DOI:
10.48550/arxiv.2207.00486
Published:
2022-07
Journal:
Impact factor:
0
Authors:
Insu Han; Mike Gartrell; Elvis Dohmatob; Amin Karbasi
Corresponding authors:
Insu Han; Mike Gartrell; Elvis Dohmatob; Amin Karbasi

Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization

DOI:
Published:
2018-04
Journal:
J. Mach. Learn. Res.
Impact factor:
0
Authors:
Aryan Mokhtari; Hamed Hassani; Amin Karbasi
Corresponding authors:
Aryan Mokhtari; Hamed Hassani; Amin Karbasi

Other publications by Amin Karbasi

Near-Optimal Active Learning of Halfspaces via Query Synthesis in the Noisy Setting

DOI:
10.1609/aaai.v31i1.10783
Published:
2016
Journal:
ArXiv
Impact factor:
0
Authors:
Lin Chen; Seyed Hamed Hassani; Amin Karbasi
Corresponding author:
Amin Karbasi

Learning network structures from firing patterns

DOI:
10.1109/icassp.2016.7471765
Published:
2016
Journal:
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Impact factor:
0
Authors:
Amin Karbasi; A. Salavati; M. Vetterli
Corresponding author:
M. Vetterli