
Sublinear Algorithms for Approximating Probability Distributions

Basic information

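Grant number:
EP/L021749/1
Principal investigator:
Ilias Diakonikolas
Amount:
$125,900
Host institution:
University of Edinburgh
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2014
Funding country:
United Kingdom
Start and end dates:
2014 to (no data)
Project status:
Completed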

Source:
https://gtr.ukri.org/projects?ref=EP%2FL021749%2F1
Keywords:
Sublinear Algorithms Approximating Probability Distributions

Project abstract

The goal of this proposal is to advance a research program of developing sublinear-time algorithms for estimating a wide range of natural and important classes of probability distributions. We live in an era of "big data," where the amount of data that can be brought to bear on questions of biology, climate, economics, etc., is vast and expanding rapidly. Much of this raw data frequently consists of example points without corresponding labels. The challenge of how to make sense of this unlabeled data has immediate relevance and has rapidly become a bottleneck in scientific understanding across many disciplines. An important class of big data is most naturally modeled as samples from a probability distribution over a very large domain. The challenge of big data is that the sizes of the domains of the distributions are immense, typically resulting in unacceptably slow algorithms. Scaling up a computational framework to comfortably deal with ever-larger data presents a series of challenges in algorithms. This prompts the basic question: Given samples from some unknown distribution, what can we infer? While this question has been studied for several decades by various different communities of researchers, both the number of samples and running time required for such estimation tasks are not yet well understood, even for some surprisingly simple types of discrete distributions. The proposed research focuses on sublinear-time algorithms, that is, algorithms that run in time that is significantly less than the domain of the underlying distributions. In this project we will develop sublinear-time algorithms for estimating various classes of discrete distributions over very large domains.
Specific problems we will address include: (1) Developing sublinear algorithms to estimate probability distributions that satisfy various natural types of "shape restrictions" on the underlying probability density function. (2) Developing sublinear algorithms for estimating complex distributions that result from the aggregation of many independent simple sources of randomness. We believe that highly efficient algorithms for these estimation tasks may play an important role for the next generation of large-scale machine learning applications.

Project outputs

Journal articles (10)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Testing Shape Restrictions of Discrete Distributions

DOI:
Publication year:
2016
Journal:
Impact factor:
0
Authors:
Canonne, C.L.
Corresponding author:
Canonne, C.L.

On the Complexity of Optimal Lottery Pricing and Randomized Mechanisms for a Unit-Demand Buyer

DOI:
10.1137/17m1136481
Publication year:
2022
Journal:
SIAM Journal on Computing
Impact factor:
1.6
Authors:
Chen, Xi; Diakonikolas, Ilias; Orfanou, Anthi; Paparas, Dimitris; Sun, Xiaorui; Yannakakis, Mihalis
Corresponding author:
Yannakakis, Mihalis

Fast Algorithms for Segmented Regression