Sublinear Algorithms for Approximating Probability Distributions

用于近似概率分布的次线性算法

基本信息

  • 批准号:
    EP/L021749/1
  • 负责人:
  • 金额:
    $ 12.59万
  • 依托单位:
  • 依托单位国家:
    英国
  • 项目类别:
    Research Grant
  • 财政年份:
    2014
  • 资助国家:
    英国
  • 起止时间:
    2014 至 无数据
  • 项目状态:
    已结题

项目摘要

The goal of this proposal is to advance a research program of developing sublinear-time algorithms for estimating a wide range of natural and important classes of probability distributions.We live in an era of "big data," where the amount of data that can be brought to bearon questions of biology, climate, economics, etc, is vast and expanding rapidly.Much of this raw data frequently consists of example points without corresponding labels.The challenge of how to make sense of this unlabeled data has immediate relevanceand has rapidly become a bottleneck in scientific understanding across many disciplines.An important class of big data is most naturally modeled as samples from a probability distribution over a very large domain. The challenge of big data is that the sizes of the domains of the distributions are immense, typically resulting in unacceptably slow algorithms. Scaling up a computational framework to comfortably deal with ever-larger data presents a series of challenges in algorithms. This prompts the basic question: Given samples from some unknown distribution, what can we infer?While this question has been studied for several decades by various different communities of researchers,both the number of samples and running time required for such estimation tasksare not yet well understood, even for some surprisingly simple types of discrete distributions.The proposed research focuses on sublinear-time algorithms, that is,algorithms that run in time that is significantly less than the domain of the underlying distributions.In this project we will develop sublinear-time algorithms for estimating various classes of discrete distributions over very large domains. Specific problems we will address include:(1) Developing sublinear algorithms to estimate probability distributions that satisfy variousnatural types of "shape restrictions" on the underlying probability density function.(2) Developing sublinear algorithms for estimating complex distributions that result from the aggregation of many independent simple sources of randomness.We believe that highly efficient algorithms for these estimation tasks may play an important role for the next generation of large-scale machine learning applications.
该提案的目标是推进一项研究计划,开发用于估计各种自然和重要类别的概率分布的次线性时间算法。我们生活在一个“大数据”时代,可以用来解决生物、气候、经济等问题的数据量,是巨大的,并迅速扩大。这些原始数据中的大部分经常由没有相应标签的示例点组成。如何理解这些未标记数据的挑战具有直接的相关性,并迅速成为许多学科科学理解的瓶颈。一类重要的大数据最自然地被建模为来自非常大的域上的概率分布的样本。大数据的挑战在于,分布域的大小是巨大的,通常会导致算法的速度慢得令人无法接受。扩展计算框架以舒适地处理越来越大的数据在算法中提出了一系列挑战。这就引出了一个基本的问题:给定一些未知分布的样本,我们能推断出什么?虽然这个问题已经研究了几十年的各种不同社区的研究人员,无论是样本的数量和运行时间所需的估计tasksare还没有很好地理解,即使是一些令人惊讶的简单类型的离散distributions.The拟议的研究重点是次线性时间算法,即,算法的运行时间明显小于底层分布的域。在这个项目中,我们将开发次线性-时间算法,用于估计非常大的域上的各种离散分布。我们将解决的具体问题包括:(1)开发次线性算法来估计概率分布,满足各种自然类型的“形状限制”的潜在概率密度函数。(2)开发用于估计复杂分布的次线性算法,这些复杂分布是由许多独立的简单随机性源聚合而成的。我们相信,用于这些估计任务的高效算法可能会在下一代大规模机器学习应用中发挥重要作用。

项目成果

期刊论文数量(10)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Testing Shape Restrictions of Discrete Distributions
测试离散分布的形状限制
  • DOI:
  • 发表时间:
    2016
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Canonne, C.L.
  • 通讯作者:
    Canonne, C.L.
On the Complexity of Optimal Lottery Pricing and Randomized Mechanisms for a Unit-Demand Buyer
关于单位需求购买者的最优彩票定价和随机机制的复杂性
  • DOI:
    10.1137/17m1136481
  • 发表时间:
    2022
  • 期刊:
  • 影响因子:
    1.6
  • 作者:
    Chen, Xi;Diakonikolas, Ilias;Orfanou, Anthi;Paparas, Dimitris;Sun, Xiaorui;Yannakakis, Mihalis
  • 通讯作者:
    Yannakakis, Mihalis
Fast Algorithms for Segmented Regression
  • DOI:
  • 发表时间:
    2016-06
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Jayadev Acharya;Ilias Diakonikolas;Jerry Li;Ludwig Schmidt
  • 通讯作者:
    Jayadev Acharya;Ilias Diakonikolas;Jerry Li;Ludwig Schmidt
Fourier-Based Testing for Families of Distributions
基于傅立叶的分布族测试
  • DOI:
    10.48550/arxiv.1706.05738
  • 发表时间:
    2017
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Canonne C
  • 通讯作者:
    Canonne C
Sample-Optimal Density Estimation in Nearly-Linear Time
近线性时间内的样本最优密度估计
  • DOI:
    10.48550/arxiv.1506.00671
  • 发表时间:
    2015
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Acharya J
  • 通讯作者:
    Acharya J
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Ilias Diakonikolas其他文献

The Sample Complexity of Robust Covariance Testing
鲁棒协方差检验的样本复杂性
  • DOI:
  • 发表时间:
    2021
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Ilias Diakonikolas;Daniel M. Kane
  • 通讯作者:
    Daniel M. Kane
Online Learning of Halfspaces with Massart Noise
使用 Massart 噪声在线学习半空间
  • DOI:
    10.48550/arxiv.2405.12958
  • 发表时间:
    2024
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Ilias Diakonikolas;Vasilis Kontonis;Christos Tzamos;Nikos Zarifis
  • 通讯作者:
    Nikos Zarifis
A Regularity Lemma, and Low-Weight Approximators, for Low-Degree Polynomial Threshold Functions
低次多项式阈值函数的正则引理和低权重近似器
Super Non-singular Decompositions of Polynomials and Their Application to Robustly Learning Low-Degree PTFs
多项式的超非奇异分解及其在鲁棒学习低次 PTF 中的应用
  • DOI:
  • 发表时间:
    2024
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Ilias Diakonikolas;Daniel Kane;Vasilis Kontonis;Sihan Liu;Nikos Zarifis
  • 通讯作者:
    Nikos Zarifis
Near-Optimal Closeness Testing of Discrete Histogram Distributions
离散直方图分布的近最优紧密度测试

Ilias Diakonikolas的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Ilias Diakonikolas', 18)}}的其他基金

CAREER: Learning Algorithms with Robustness and Efficiency Guarantees
职业:学习具有鲁棒性和效率保证的算法
  • 批准号:
    2144298
  • 财政年份:
    2022
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
Collaborative Research: AF: Medium: Algorithmic High-Dimensional Robust Statistics
合作研究:AF:中:算法高维稳健统计
  • 批准号:
    2107079
  • 财政年份:
    2021
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
AitF: Collaborative Research: Fast, Accurate, and Practical: Adaptive Sublinear Algorithms for Scalable Visualization
AitF:协作研究:快速、准确和实用:用于可扩展可视化的自适应次线性算法
  • 批准号:
    2006206
  • 财政年份:
    2019
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant
CAREER: Efficient Algorithms for Learning and Testing Structured Probabilistic Models
职业:学习和测试结构化概率模型的有效算法
  • 批准号:
    2011255
  • 财政年份:
    2019
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
CAREER: Efficient Algorithms for Learning and Testing Structured Probabilistic Models
职业:学习和测试结构化概率模型的有效算法
  • 批准号:
    1652862
  • 财政年份:
    2017
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
AitF: Collaborative Research: Fast, Accurate, and Practical: Adaptive Sublinear Algorithms for Scalable Visualization
AitF:协作研究:快速、准确和实用:用于可扩展可视化的自适应次线性算法
  • 批准号:
    1733796
  • 财政年份:
    2017
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant

相似海外基金

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
职业:机器学习中非凸性的祝福 - 景观分析和高效算法
  • 批准号:
    2337776
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
职业:从动态算法到快速优化并返回
  • 批准号:
    2338816
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
职业:结构化极小极大优化:稳健学习中的理论、算法和应用
  • 批准号:
    2338846
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
CRII:SaTC:针对后量子密码算法的侧通道攻击的可靠硬件架构
  • 批准号:
    2348261
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
CRII:AF:知识对分布式算法性能的影响
  • 批准号:
    2348346
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
CRII:CSR:从布隆过滤器到降噪流算法
  • 批准号:
    2348457
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
EAGER:贝叶斯神经网络和万亿维问题的搜索加速马尔可夫链蒙特卡罗算法
  • 批准号:
    2404989
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
职业:现代计算机架构的高效算法
  • 批准号:
    2339310
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
职业:提高人工智能生物信号算法的实际性能
  • 批准号:
    2339669
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
DMS-EPSRC:机器学习中在线训练算法的渐近分析:循环、图形和深度神经网络
  • 批准号:
    EP/Y029089/1
  • 财政年份:
    2024
  • 资助金额:
    $ 12.59万
  • 项目类别:
    Research Grant
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了