CAREER: STATISTICAL INFERENCE FOR TOPOLOGICAL AND GEOMETRIC DATA ANALYSIS
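Basic information
- Award number: 1149677
- Principal investigator:
- Amount: $400,000
- Institution:
- Institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2012
- Funding country: United States
- Period: 2012-06-01 to 2018-05-31
- Status: Completed
- Source:
- Keywords:
Project abstract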
The research objective of this proposal is to develop new theories and methods for estimating topological and geometric features of lower-dimensional sets based on noisy high-dimensional data. To this end, the investigator has formulated two separate but highly interdependent sets of research goals. The first set of research goals is the integration of statistical theory with methods of topological data analysis. Recent breakthroughs in computational topology have made it possible to compute topological invariants of sets from a collection of points in Euclidean spaces. Though the potential for high-dimensional statistical inference of these new types of data summaries is significant, their statistical properties are still largely unexplored. The investigator proposes 1) to develop a comprehensive theory of minimax (and adaptive) estimation of topological properties of sets and 2) to create statistical procedures for non-parametric testing and de-noising based on topological invariants. The second set of research goals pertains to the traditional geometric data-analytic task of clustering in high dimensions, and it is aimed at advancing the theory and practice of high-density clustering. Recent progress in the theory of clustering has demonstrated that clustering using density estimation can perform well in high-dimensional settings, and that the notion of high-density clustering provides a natural probabilistic framework for describing and analyzing clustering problems in great generality. Thus, the investigator intends 1) to generalize and refine the high-density clustering problem under weak conditions on the data-generating mechanism and 2) to investigate the theory and use of data resampling techniques for parameter tuning in high-density clustering and density estimation. A common thread in the proposed research is the reliance on density estimation as a tool for both accurate high-dimensional clustering and smoothing/de-noising of topological features.
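As a rough illustration of what clustering by density estimation can look like in practice, the sketch below estimates a density at each sample point, discards points whose estimated density falls below a level, and treats the connected components of the remaining points as clusters. It is only a minimal sketch under stated assumptions: the Gaussian kernel, the fixed bandwidth, the radius-based linkage, and the names `gaussian_kde` and `high_density_clusters` are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def gaussian_kde(points, bandwidth):
    """Gaussian kernel density estimate evaluated at each sample point."""
    n, d = points.shape
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2)).sum(axis=1) / norm


def high_density_clusters(points, bandwidth, level, radius):
    """Label points by connected component of the estimated upper level set.

    Points with estimated density below `level` receive the label -1.
    Two retained points are linked when they lie within `radius` of each
    other (an assumed linkage rule, chosen for simplicity).
    """
    density = gaussian_kde(points, bandwidth)
    keep = np.flatnonzero(density >= level)
    kept = points[keep]
    sq_dists = np.sum((kept[:, None, :] - kept[None, :, :]) ** 2, axis=-1)
    adjacent = sq_dists <= radius ** 2

    labels = np.full(len(points), -1)
    current = 0
    for start in range(len(keep)):
        if labels[keep[start]] != -1:
            continue  # already assigned to an earlier component
        labels[keep[start]] = current
        stack = [start]
        while stack:  # depth-first search over the linkage graph
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i]):
                if labels[keep[j]] == -1:
                    labels[keep[j]] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data: two dense 5x5 grids and one isolated point between them.
grid = np.linspace(-0.4, 0.4, 5)
blob = np.array([(x, y) for x in grid for y in grid])
points = np.vstack([blob, blob + 5.0, [[2.5, 2.5]]])
labels = high_density_clusters(points, bandwidth=0.5, level=0.05, radius=0.3)
```

On this toy data the two grids come out as two separate clusters and the isolated point is labeled -1 as low-density noise. The bandwidth and level are set by hand here; choosing such tuning parameters from the data is the kind of question the abstract raises under data resampling techniques.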
In the last few decades, advances in data acquisition technologies have led to an explosion in the collection and diffusion of large-scale datasets across a variety of scientific fields. The unprecedented magnitude and complexity of modern databases pose formidable theoretical and methodological challenges to statisticians and have required the development of new statistical tools for data analysis. Modern high-dimensional statistics is predicated on the key assumption that, while the data are observed in a high-dimensional space, the intrinsic complexity of the data-generating mechanism is in fact significantly smaller and, therefore, learnable in computationally efficient ways. This research proposal capitalizes on this premise and describes an array of methods for summarizing, discriminating, visualizing and clustering high-dimensional noisy data and for extracting salient low-dimensional features. The proposed research encompasses several novel and open research problems at the interface of mathematics, computer science, statistics and machine learning. The procedures studied in the proposal are of broad applicability and promise to be used in a multitude of scientific areas, such as medical imaging, neuroscience, astrophysics, biology, genetics, geophysics and sensor networks, to name a few. The broader impact of this project also includes interdisciplinary training of students in statistics, mathematics and computer science.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Alessandro Rinaldo
On Least Square Estimation in Softmax Gating Mixture of Experts
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Huy Nguyen; Nhat Ho; Alessandro Rinaldo
- Corresponding author: Alessandro Rinaldo
Cost-saving simultaneous stapling of the right superior vein and the anterior trunk of the right pulmonary artery during right upper lobectomy
- DOI: 10.1007/s12055-014-0332-7
- Published: 2014-12-09
- Journal:
- Impact factor: 0.600
- Authors: Massimo Torre; Sava Durkovic; Serena Conforti; Alessandro Rinaldo
- Corresponding author: Alessandro Rinaldo
On Least Squares Estimation in Softmax Gating Mixture of Experts
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Huy Nguyen; Nhat Ho; Alessandro Rinaldo
- Corresponding author: Alessandro Rinaldo
Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
- DOI: 10.48550/arxiv.2405.13997
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Huy Nguyen; Nhat Ho; Alessandro Rinaldo
- Corresponding author: Alessandro Rinaldo
Detecting Abrupt Changes in Sequential Pairwise Comparison Data
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Wanshan Li; Alessandro Rinaldo; Daren Wang
- Corresponding author: Daren Wang
Other grants by Alessandro Rinaldo
DMS-EPSRC: Change-Point Detection and Localization in High Dimensions: Theory and Methods
- Award number: 2015489
- Fiscal year: 2020
- Amount: $400,000
- Award type: Standard Grant
Similar overseas grants
CAREER: Statistical foundations of particle tracking and trajectory inference
- Award number: 2339829
- Fiscal year: 2024
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Statistical Inference in Observational Studies -- Theory, Methods, and Beyond
- Award number: 2338760
- Fiscal year: 2024
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Distribution-Free and Adaptive Statistical Inference
- Award number: 2338464
- Fiscal year: 2024
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Statistical Inference in High Dimensions using Variational Approximations
- Award number: 2239234
- Fiscal year: 2023
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Towards Tight Guarantees of Markov Chain Sampling Algorithms in High Dimensional Statistical Inference
- Award number: 2237322
- Fiscal year: 2023
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Computer-Intensive Statistical Inference on High-Dimensional and Massive Data: From Theoretical Foundations to Practical Computations
- Award number: 2347760
- Fiscal year: 2023
- Amount: $400,000
- Award type: Continuing Grant
CAREER: New Statistical Approaches for Studying Evolutionary Processes: Inference, Attribution and Computation
- Award number: 2143242
- Fiscal year: 2022
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Fast and Accurate Statistical Learning and Inference from Large-Scale Data: Theory, Methods, and Algorithms
- Award number: 2046874
- Fiscal year: 2021
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Robust and Efficient Algorithms for Statistical Estimation and Inference
- Award number: 2045068
- Fiscal year: 2021
- Amount: $400,000
- Award type: Continuing Grant
CAREER: Statistical Inference of Tail Dependent Time Series
- Award number: 2131821
- Fiscal year: 2021
- Amount: $400,000
- Award type: Continuing Grant