New Frontiers of Robust Statistics in the Era of Big Data
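Basic information
- Award number: 2113568
- Principal investigator:
- Amount: $235.9K
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-07-01 to 2024-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract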
Modern technologies have facilitated the collection of an unprecedented number of features with complex structures. Although extensive progress has been made towards extracting useful information from massive data, statistical analysis typically assumes that the data are drawn without any contamination. In reality, however, the data sets arising in applications such as genomics and medical imaging are usually more inhomogeneous, owing either to the data collection process or to the intrinsic nature of the data in the era of big data. For instance, in gene expression data analysis, outliers frequently arise in microarray experiments because of array chip artifacts such as uneven spray of reagents within arrays. Compared with other recent advances in the era of big data, research on modeling and theoretical foundations for robust procedures under contamination models has fallen behind. To bridge this gap, this project seeks to develop new robust estimation and inference procedures that are rate-optimal for various contamination models, as building blocks to address the modeling, theoretical and computational challenges. Upon completion, this work will lead to a comprehensive understanding of contamination models and have an immediate impact on various disciplines such as biology, genomics, astronomy and finance. The project also provides training opportunities for undergraduate and graduate students, and is used to enrich courses and outreach educational materials in statistics and data science.

This project aims to address some of the most pressing challenges faced by robust procedures in high-dimensional and nonparametric contamination models. Specifically, (I) the research begins with statistical inference of low-dimensional parameters in both increasing-dimensional and high-dimensional regressions under contamination models. The PI will study the influence of the contamination proportion on obtaining root-n consistency results. Robust large-scale simultaneous inference under contamination models is also considered. (II) Next, the PI will revisit some classical nonparametric density estimation problems under both arbitrary and structured contamination distributions. The PI plans to propose rate-optimal procedures and carefully study the effect of contamination on estimation through various model indices, including the contamination proportion, the structure of the contamination and the choice of loss function. (III) The PI will develop a U-type robust covariance estimator under structured contamination models and provide rigorous theoretical guarantees on its rate optimality. This general robust estimator can serve as a building block for establishing many rate-optimal procedures for structured large covariance/precision matrix estimation problems. User-friendly R packages will be developed to implement the proposed methods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
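The contamination setting that the abstract refers to can be illustrated with a small simulation. The sketch below is not taken from the project or its R packages; it assumes Huber's contamination model in its simplest form, where each observation comes from a target distribution P with probability 1 - eps and from a contaminating distribution Q with probability eps. The choices P = N(0, 1), Q = N(50, 1), eps = 0.1, and the function name `sample_huber_contaminated` are hypothetical and used only for illustration.

```python
import random
import statistics


def sample_huber_contaminated(n, eps, seed=0):
    """Draw n points from the mixture (1 - eps) * P + eps * Q.

    P = N(0, 1) is the target distribution; Q = N(50, 1) is an
    illustrative (hypothetical) contaminating distribution.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        if rng.random() < eps:
            data.append(rng.gauss(50.0, 1.0))  # outlier drawn from Q
        else:
            data.append(rng.gauss(0.0, 1.0))   # clean draw from P
    return data


data = sample_huber_contaminated(n=2000, eps=0.1)

# The sample mean is pulled towards the outliers, while the sample
# median stays close to the centre of P (which is 0 here).
mean_estimate = statistics.fmean(data)
median_estimate = statistics.median(data)

print(f"sample mean:   {mean_estimate:.3f}")
print(f"sample median: {median_estimate:.3f}")
```

In this toy setting the mean is shifted by roughly eps times the distance between the centres of P and Q, whereas the median moves only slightly. This is the kind of dependence on the contamination proportion that the abstract proposes to study in much more general models.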
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Adaptive minimax density estimation on ℝ^d for Huber’s contamination model
- DOI: 10.1093/imaiai/iaad045
- Publication year: 2023
- Journal:
- Impact factor: 0
- Authors: Zhang, Peiliang; Ren, Zhao
- Corresponding author: Ren, Zhao
High-dimension to high-dimension screening for detecting genome-wide epigenetic and noncoding RNA regulators of gene expression
- DOI: 10.1093/bioinformatics/btac518
- Publication year: 2022
- Journal:
- Impact factor: 5.8
- Authors: Ke, Hongjie; Ren, Zhao; Qi, Jianfei; Chen, Shuo; Tseng, George C.; Ye, Zhenyao; Ma, Tianzhou; Alkan, Can (ed.)
- Corresponding author: Alkan, Can (ed.)
Other publications by Zhao Ren
Tunneling mechanism in higher-dimensional rotating black hole with a cosmological constant in the approach of dimensional reduction
- DOI: 10.1007/s10509-011-0660-7
- Publication date: 2011-03
- Journal:
- Impact factor: 1.9
- Authors: Zhang Li-Chun; Li Huai-Fan; Zhao Ren
- Corresponding author: Zhao Ren
A new explanation for statistical entropy of charged black hole
- DOI: 10.1007/s11433-013-5167-5
- Publication date: 2013-07
- Journal:
- Impact factor: 0
- Authors: Zhao Ren; Zhang LiChun
- Corresponding author: Zhang LiChun
Clapeyron equation and phase equilibrium properties in higher dimensional charged topological dilaton AdS black holes with a nonlinear source
- DOI: 10.1140/epjc/s10052-017-4831-8
- Publication date: 2016-09
- Journal:
- Impact factor: 4.4
- Authors: Li Huai-Fan; Zhao Hui-Hua; Zhang Li-Chun; Zhao Ren
- Corresponding author: Zhao Ren
Quantum Statistical Entropy of Black Hole
- DOI: 10.1023/a:1021179316964
- Publication date: 2002
- Journal:
- Impact factor: 0
- Authors: Zhao Ren; Zhang Junfang; Zhang Lichun
- Corresponding author: Zhang Lichun
The EIHW-GLAM Deep Attentive Multi-model Fusion System for Cough-based COVID-19 Recognition in the DiCOVA 2021 Challenge
- DOI:
- Publication date: 2021
- Journal:
- Impact factor: 0
- Authors: Zhao Ren; Yi Chang; Björn Schuller
- Corresponding author: Björn Schuller
Other grants by Zhao Ren
New Methods and Theory of Statistical Inference for Non-Gaussian Graphical Models
- Award number: 1812030
- Fiscal year: 2018
- Funding amount: $235.9K
- Project type: Standard Grant
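Similar NSFC (National Natural Science Foundation of China) grants
Frontiers of Environmental Science & Engineering
- Award number: 51224004
- Approval year: 2012
- Funding amount: CNY 200K
- Project type: Special Fund Project
Frontiers of Physics publication funding
- Award number: 11224805
- Approval year: 2012
- Funding amount: CNY 200K
- Project type: Special Fund Project
Frontiers of Mathematics in China
- Award number: 11024802
- Approval year: 2010
- Funding amount: CNY 160K
- Project type: Special Fund Project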
Similar overseas grants
New Frontiers for Anonymous Authentication
- Award number: DE240100282
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Discovery Early Career Researcher Award
Conference: 2024 NanoFlorida Conference: New Frontiers in Nanoscale interactions
- Award number: 2415310
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Standard Grant
Collaborative Research: AF: Small: Exploring the Frontiers of Adversarial Robustness
- Award number: 2335411
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Standard Grant
New Frontiers in Large-Scale Polynomial Optimisation
- Award number: DE240100674
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Discovery Early Career Researcher Award
RTG: Frontiers in Applied Analysis
- Award number: 2342349
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Continuing Grant
Conference: Frontiers of Geometric Analysis
- Award number: 2347894
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Standard Grant
Conference: USA-UK-China-Israel Workshop on Frontiers in Ecology and Evolution of Infectious Diseases
- Award number: 2406564
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Standard Grant
Mapping the Frontiers of Private Property in Australia
- Award number: DP240100395
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Discovery Projects
Conference: FRONTIERS OF ENGINEERING (2024 US FOE, 2024 China-America FOE, and 2025 German-American FOE)
- Award number: 2405026
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Standard Grant
Frontiers in gravitational wave astronomy (FRoGW)
- Award number: EP/Y023706/1
- Fiscal year: 2024
- Funding amount: $235.9K
- Project type: Fellowship


















