CAREER: Leveraging Combinatorial Structures for Robust and Scalable Learning

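Basic Information

  • Award number:
    1845032
  • Principal investigator:
  • Amount:
    approx. $550,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Period:
    2019-05-01 to 2025-04-30
  • Project status:
    Not yet concluded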

Project Abstract

The difficulty of searching through a massive amount of data in order to quickly make an informed decision is one of today's most ubiquitous challenges. Many scientific and engineering models feature data with inherently discrete characteristics, where discrete means that the data takes on a finite set of possible values. Examples of such data range from phrases in text to objects in an image. Similarly, nearly all aspects of data science involve discrete tasks such as data summarization and model explanation. As computational methods pervade all aspects of science and engineering, it is of great importance to understand which discrete formulations can be solved efficiently and how to do so. Many of these problems are notoriously hard, and even those that are theoretically solvable may be tractable only for small amounts of data. However, the problems of practical interest are often much better behaved and possess inherent structure that allows them to be solved more efficiently. This CAREER award aims to substantially advance the frontiers of large-scale discrete optimization in data science and machine learning by developing fundamentally new algorithms. This project will also provide a number of educational opportunities, such as outreach to local high school and middle school students through Yale's Pathways to Science program.

Just as convexity has been a celebrated and well-studied condition under which continuous optimization is tractable, submodularity is a condition under which discrete objectives may be optimized. While current research in submodular optimization has led to fundamental breakthroughs in discrete mathematical programming, there is still a large gap between the theory and the limitations of the existing algorithms used by practitioners in the real world. In particular, most of the existing submodular optimization methods fail miserably when faced with the numerous sources of uncertainty inherent in machine learning tasks, from noise in the data to variability of the true objective. Moreover, submodularity is too strong an assumption for a variety of novel machine learning applications, necessitating the development of completely new algorithms. In order to lift current provable methods out of the sterile lab environment and scale them into the messy real world, it is important to carefully reexamine their limitations, consider more realistic but less perfect conditions, and develop correspondingly robust yet scalable algorithms. This CAREER project presents a research plan for designing, analyzing, and evaluating new approaches to robust submodular optimization at a massive scale, with the goal of solving a broad array of optimization problems of significant practical importance. Furthermore, it addresses generalizations of submodular functions that widely broaden the applicability of these methods, moving to a realm beyond submodularity. The research directions in this project have deep and far-reaching societal benefits, as robust and scalable computational methods play a central role in nearly every scientific and industrial venture in today's information age. Such advances are expected to play crucial roles in enabling data-driven scientific discoveries, promoting fairness in machine learning, and supporting STEM education by helping these communities handle the computational challenges associated with big data. The results of this project will be broadly disseminated to the greater scientific community through tutorials, workshops, and open-source software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
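As a rough illustration of the submodularity condition mentioned in the abstract (and not of the project's own algorithms), the sketch below uses set coverage, a standard example of a monotone submodular function, together with the classical greedy heuristic, which is known to achieve a (1 - 1/e) approximation for monotone submodular maximization under a cardinality constraint. The function names `coverage` and `greedy_max_coverage` and the toy `docs` data are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set


def coverage(sets: Dict[str, Set[int]], chosen: List[str]) -> int:
    """Number of distinct elements covered by the chosen sets."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k sets, each time adding the one with the largest marginal gain."""
    chosen: List[str] = []
    for _ in range(k):
        base = coverage(sets, chosen)
        best_name, best_gain = None, 0
        for name in sorted(sets):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = coverage(sets, chosen + [name]) - base
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining set covers anything new
            break
        chosen.append(best_name)
    return chosen


# Hypothetical toy data: each "document" covers a set of topic ids.
docs = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}

# Diminishing returns: "b" adds less once more sets are already selected.
gain_in_small_context = coverage(docs, ["a", "b"]) - coverage(docs, ["a"])
gain_in_large_context = coverage(docs, ["a", "c", "b"]) - coverage(docs, ["a", "c"])

# A two-document "summary" chosen greedily.
summary = greedy_max_coverage(docs, k=2)
```

The marginal gain of adding `"b"` shrinks as the already-selected collection grows, which is the diminishing-returns property that defines submodularity. The abstract's point is that this plain greedy scheme assumes an exact, noise-free objective, which is the limitation the project sets out to address.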

Project Outcomes

Journal articles (34)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The Curious Case of Adversarially Robust Models: More Data Can Help, Double Descend, or Hurt Generalization
  • DOI:
  • Published:
    2020-02
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yifei Min;Lin Chen;Amin Karbasi
  • Corresponding authors:
    Yifei Min;Lin Chen;Amin Karbasi
Learning and Certification under Instance-targeted Poisoning
  • DOI:
  • Published:
    2021-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ji Gao;Amin Karbasi;Mohammad Mahmoody
  • Corresponding authors:
    Ji Gao;Amin Karbasi;Mohammad Mahmoody
Meta Learning in the Continuous Time Limit
  • DOI:
  • Published:
    2020-06
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ruitu Xu;Lin Chen;Amin Karbasi
  • Corresponding authors:
    Ruitu Xu;Lin Chen;Amin Karbasi
Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization
  • DOI:
  • Published:
    2018-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Aryan Mokhtari;Hamed Hassani;Amin Karbasi
  • Corresponding authors:
    Aryan Mokhtari;Hamed Hassani;Amin Karbasi
Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes
  • DOI:
    10.48550/arxiv.2207.00486
  • Published:
    2022-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    Insu Han;Mike Gartrell;Elvis Dohmatob;Amin Karbasi
  • Corresponding authors:
    Insu Han;Mike Gartrell;Elvis Dohmatob;Amin Karbasi

Other publications by Amin Karbasi

Learning network structures from firing patterns
Near-Optimal Active Learning of Halfspaces via Query Synthesis in the Noisy Setting
  • DOI:
    10.1609/aaai.v31i1.10783
  • Published:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Lin Chen;Seyed Hamed Hassani;Amin Karbasi
  • Corresponding author:
    Amin Karbasi
Asynchronous decoding of LDPC codes over BEC
Batched Multi-Armed Bandits with Optimal Regret
  • DOI:
  • Published:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hossein Esfandiari;Amin Karbasi;Abbas Mehrabian;V. Mirrokni
  • Corresponding author:
    V. Mirrokni
Unconstrained submodular maximization with constant adaptive complexity

Similar NSFC (National Natural Science Foundation of China) Grants

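Research and design of prefabricated, demountable, and reusable composite structures in a circular construction economy
  • Award number:
    52311530084
  • Year approved:
    2023
  • Funding amount:
    CNY 100,000
  • Project type:
    International (Regional) Cooperation and Exchange Program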
Mechanisms by which different combinations of interlayers and flue-gas desulfurization gypsum affect microbial utilization of straw carbon in coastal saline soils
  • Award number:
    42207410
  • Year approved:
    2022
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
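Mechanical performance and design methods of reused steel-concrete composite beams considering strain aging
  • Award number:
    52108116
  • Year approved:
    2021
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund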

Similar Overseas Grants

Leveraging high-throughput continuous-flow synthesis of Charge-Altering Releasable Transporter gene delivery vectors to establish structure-function relationships for mRNA delivery
  • Award number:
    10007583
  • Fiscal year:
    2019
  • Funding amount:
    approx. $550,000
  • Project type:
Leveraging high-throughput continuous-flow synthesis of Charge-Altering Releasable Transporter gene delivery vectors to establish structure-function relationships for mRNA delivery
  • Award number:
    9758810
  • Fiscal year:
    2019
  • Funding amount:
    approx. $550,000
  • Project type:
Leveraging Temozolomide to Improve Treatment Efficacy of Immune Checkpoint Blockade in Glioblastoma
  • Award number:
    10458232
  • Fiscal year:
    2016
  • Funding amount:
    approx. $550,000
  • Project type:
Leveraging Temozolomide to Improve Treatment Efficacy of Immune Checkpoint Blockade in Glioblastoma
  • Award number:
    9224807
  • Fiscal year:
    2016
  • Funding amount:
    approx. $550,000
  • Project type:
Leveraging Temozolomide to Improve Treatment Efficacy of Immune Checkpoint Blockade in Glioblastoma
  • Award number:
    10462415
  • Fiscal year:
    2016
  • Funding amount:
    approx. $550,000
  • Project type: