The Complexity of Testing Distributions

测试分布的复杂性

基本信息

  • 批准号:
    0514771
  • 负责人:
  • 金额:
    $ 20万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Standard Grant
  • 财政年份:
    2005
  • 资助国家:
    美国
  • 起止时间:
    2005-07-15 至 2008-06-30
  • 项目状态:
    已结题

项目摘要

In a wide variety of computational settings, where the data is most naturally viewed as coming from a distribution, it is often crucial to determine whether the underlying distribution satisfies various properties. Examples of such properties include whether two distributions are close or far in statistical distance, whether a joint distribution is independent, and whether a distribution has high entropy. For most such properties, standard statistical techniques which approximate the distribution lead to algorithms which use a number of samples that is nearly linear in the domain size. Until very recently, distributions over large domains, for which linear sample complexity can be daunting, have received surprisingly little attention. However, new interest in these questions comes from many directions, including data mining, Physics and machine learning applications in Neurobiology. Recent results have shown that one can achieve results which are significantly more efficient than the standard techniques for the case of large domains.The intellectual merit of this research will lead to an understanding of the sample, time and space complexity required to identify various natural properties of a probability distribution. The proposed research will focus on determining which properties can be understood with a number of samples that is sublinear in the domain size, and will lead to an understanding of the aspects of algorithm design that are specific to these constraints. The questions that will be considered range from considering the complexity of testing previously unstudied properties, finding general techniques which apply to classes of distribution testing problems, investigating new models of distribution testing, understanding structural aspects of the testing problems that can be solved in sublinear time, and further understanding the relationship between the computational complexity and sample complexity.The broader impact of this proposal includes educational and workforce development components. The educational component of this proposal involves designing course material on algorithms for testing distributions that would be appropriate for students at all levels. Enough material has been collected to develop a graduate course that highlights the body of techniques that have emerged in this field. Some of the recent advances are a perfect fit for conveying fundamental ideas behind randomized algorithms to undergraduates. The PI has been awarded two teaching awards for her efforts at undergraduate education. The PI will continue her efforts as an advisor and mentor to undergraduates, graduate students and postdoctoral researchers, which in the past have included several women who have gone on to successful research careers. The PI is currently co-organizing a Dagstuhl workshop on Sublinear Algorithms. A priority has been placed on inviting and supporting graduate students interested in working in the area. Finally, the ideas will also be disseminated through the use of the web, seminar, workshop and conference presentations, journal articles and survey articles aimed at a wider audience.
在各种各样的计算设置中,数据最自然地被视为来自分布,确定底层分布是否满足各种属性通常是至关重要的。这些性质的例子包括两个分布在统计距离上是近还是远,联合分布是否独立,以及分布是否具有高熵。对于大多数这样的性质,近似分布的标准统计技术导致算法使用的样本数量在域大小上接近线性。直到最近,对于线性样本复杂度令人生畏的大域上的分布,很少受到关注。然而,对这些问题的新兴趣来自许多方向,包括数据挖掘、物理和神经生物学中的机器学习应用。最近的结果表明,在大域的情况下,可以获得比标准技术更有效的结果。这项研究的智力价值将导致对识别概率分布的各种自然属性所需的样本,时间和空间复杂性的理解。拟议的研究将侧重于确定哪些属性可以用域大小的亚线性样本来理解,并将导致对特定于这些约束的算法设计方面的理解。将考虑的问题范围从考虑测试以前未研究的性质的复杂性,找到适用于分布测试问题类别的一般技术,研究分布测试的新模型,理解可以在次线性时间内解决的测试问题的结构方面,以及进一步理解计算复杂性和样本复杂性之间的关系。该提案的更广泛影响包括教育和劳动力发展部分。该提案的教育部分包括设计适合所有层次学生的测试分布算法的课程材料。已经收集了足够的材料来开发一门研究生课程,以突出该领域中出现的技术主体。最近的一些进展非常适合向大学生传达随机算法背后的基本思想。由于她在本科教育方面的努力,她曾两次获得教学奖。她将继续担任本科生、研究生和博士后研究人员的顾问和导师,过去曾有几位女性在研究事业上取得了成功。PI目前正在共同组织一个关于次线性算法的Dagstuhl研讨会。优先考虑的是邀请和支持有兴趣在该地区工作的研究生。最后,还将通过利用网络、研讨会、讲习班和会议发言、期刊文章和针对更广泛受众的调查文章来传播这些想法。

项目成果

期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Ronitt Rubinfeld其他文献

A Self-Tester for Linear Functions over the Integers with an Elementary Proof of Correctness
  • DOI:
    10.1007/s00224-015-9639-z
  • 发表时间:
    2015-06-20
  • 期刊:
  • 影响因子:
    0.400
  • 作者:
    Sheela Devadas;Ronitt Rubinfeld
  • 通讯作者:
    Ronitt Rubinfeld
On the time and space complexity of computation using write-once memory or is pen really much worse than pencil?
  • DOI:
    10.1007/bf02835833
  • 发表时间:
    1992-06-01
  • 期刊:
  • 影响因子:
    0.400
  • 作者:
    Sandy Irani;Moni Naor;Ronitt Rubinfeld
  • 通讯作者:
    Ronitt Rubinfeld
Learning fallible Deterministic Finite Automata
  • DOI:
    10.1007/bf00993409
  • 发表时间:
    1995-02-01
  • 期刊:
  • 影响因子:
    2.900
  • 作者:
    Dana Ron;Ronitt Rubinfeld
  • 通讯作者:
    Ronitt Rubinfeld
Exactly Learning Automata of Small Cover Time
  • DOI:
    10.1023/a:1007348927491
  • 发表时间:
    1997-04-01
  • 期刊:
  • 影响因子:
    2.900
  • 作者:
    Dana Ron;Ronitt Rubinfeld
  • 通讯作者:
    Ronitt Rubinfeld

Ronitt Rubinfeld的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Ronitt Rubinfeld', 18)}}的其他基金

AF: SMALL: Extending the Reach of Distribution Testing via Structure
AF:小:通过结构扩展分布测试的范围
  • 批准号:
    2310818
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AF: Small: Sparsity in Local Computation
AF:小:局部计算的稀疏性
  • 批准号:
    2006664
  • 财政年份:
    2020
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AitF: Collaborative Research: Fast, Accurate, and Practical: Adaptive Sublinear Algorithms for Scalable Visualization
AitF:协作研究:快速、准确和实用:用于可扩展可视化的自适应次线性算法
  • 批准号:
    1733808
  • 财政年份:
    2017
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
BIGDATA: F: Testing High Dimensional Distributions without the Curse of Dimensionality
BIGDATA:F:在没有维数灾难的情况下测试高维分布
  • 批准号:
    1741137
  • 财政年份:
    2017
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
EAGER: Testing Pseudorandom Distributions
EAGER:测试伪随机分布
  • 批准号:
    1650733
  • 财政年份:
    2016
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AF: Small: New directions in the design of local computation algorithms
AF:小:局部计算算法设计的新方向
  • 批准号:
    1420692
  • 财政年份:
    2014
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AF: Small: Local Computation Algorithms
AF:小:本地计算算法
  • 批准号:
    1217423
  • 财政年份:
    2012
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AF: Medium: Taming Masssive Data with Sub-Linear Algorithms
AF:中:用次线性算法驯服海量数据
  • 批准号:
    1065125
  • 财政年份:
    2011
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
MSPA-MCS: Learning to Rank
MSPA-MCS:学习排名
  • 批准号:
    0732334
  • 财政年份:
    2007
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
CAREER: Algorithms for Self-testing/Correcting Program and Learning
职业:自我测试/纠正程序和学习的算法
  • 批准号:
    9624552
  • 财政年份:
    1996
  • 资助金额:
    $ 20万
  • 项目类别:
    Continuing Grant

相似海外基金

Subcube Conditional Samples And Testing Properties Of Probability Distributions
子立方条件样本和概率分布的测试属性
  • 批准号:
    EP/Y001680/1
  • 财政年份:
    2023
  • 资助金额:
    $ 20万
  • 项目类别:
    Research Grant
NSF Postdoctoral Fellowship in Biology FY 2020: From genomes to geographic distributions: testing the eco-evolutionary mechanisms of species range limits
2020 财年 NSF 生物学博士后奖学金:从基因组到地理分布:测试物种范围限制的生态进化机制
  • 批准号:
    2010892
  • 财政年份:
    2020
  • 资助金额:
    $ 20万
  • 项目类别:
    Fellowship Award
BIGDATA: F: Testing High Dimensional Distributions without the Curse of Dimensionality
BIGDATA:F:在没有维数灾难的情况下测试高维分布
  • 批准号:
    1741137
  • 财政年份:
    2017
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
EAGER: Testing Pseudorandom Distributions
EAGER:测试伪随机分布
  • 批准号:
    1650733
  • 财政年份:
    2016
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Collaborative Research: Testing a microbial-association-distribution hypothesis to explain spatial distributions and species co-existence in a community of epiphytic plants
合作研究:测试微生物关联分布假说,以解释附生植物群落中的空间分布和物种共存
  • 批准号:
    1355155
  • 财政年份:
    2014
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Collaborative Research: Testing a microbial-association-distribution hypothesis to explain spatial distributions and species co-existence in a community of epiphytic plants
合作研究:测试微生物关联分布假说,以解释附生植物群落中的空间分布和物种共存
  • 批准号:
    1354674
  • 财政年份:
    2014
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Collaborative Research: Testing a microbial-association-distribution hypothesis to explain spatial distributions and species co-existence in a community of epiphytic plants
合作研究:测试微生物关联分布假说,以解释附生植物群落中的空间分布和物种共存
  • 批准号:
    1352892
  • 财政年份:
    2014
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
AF: Small: Learning and Testing Classes of Distributions
AF:小:学习和测试分布类
  • 批准号:
    1319788
  • 财政年份:
    2013
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
DISSERTATION RESEARCH: Testing the limits: effects of climate and competition on conifer distributions at Mount Rainier
论文研究:测试极限:气候和竞争对雷尼尔山针叶树分布的影响
  • 批准号:
    1010787
  • 财政年份:
    2010
  • 资助金额:
    $ 20万
  • 项目类别:
    Standard Grant
Testing the physics of stellar atmospheres: optical interferometry and spectral energy distributions
测试恒星大气的物理特性:光学干涉测量和光谱能量分布
  • 批准号:
    609-2006
  • 财政年份:
    2010
  • 资助金额:
    $ 20万
  • 项目类别:
    Discovery Grants Program - Individual
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了