Complexity of High-Dimensional Statistical Models: An Information-Based Approach

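Basic information

  • Award number:
    2015285
  • Principal investigator:
  • Amount:
    $300,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-09-01 to 2024-08-31
  • Project status:
    Completed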

Project abstract

With big data come bigger goals – many modern statistical applications not only involve large datasets that may offer new insights but also require deciphering highly intricate relationships among a large number of variables, often characterized by very complex and high-dimensional functions. As more data are acquired, these functions inevitably become more complex. Oftentimes the most fundamental challenge for these applications is how to quantify the complexity of such tasks and learn these functions from data in an efficient way, both statistically and computationally. Despite impressive progress made in recent years, the current approach towards this goal is limited by the discrete nature of the classical notion of computational complexity and is not suitable for statistical problems that are continuous. This project aims to develop an information-based approach that better accounts for both statistical and computational efficiencies. This new framework of complexity is expected to offer insights into the potential trade-off between statistical and computational efficiencies and to reveal the role of experimental design in alleviating computational burden. The project provides training for graduate students through involvement in the research.

Traditional nonparametric techniques based solely on smoothness are known to suffer from the so-called "curse of dimensionality." But in many scientific and engineering applications, the underlying high-dimensional object may have additional structures which, if appropriately accounted for, could help lift this barrier and allow for efficient methods to handle it. This project aims to develop a coherent framework to quantify the complexity of high dimensional models that appropriately accounts for both statistical accuracy and computational cost and helps better understand the role of these additional structures. The project will use this notion of complexity to examine several common yet notoriously difficult high-dimensional nonparametric regression problems: one based solely on smoothness, another based on smoothness and sparsity, and finally, one based on low-rank tensors. The exercise is designed to reveal interesting relationships between statistical and computational aspects of these problems and lead to the development of novel and optimal sampling and estimation strategies. The research will develop the new framework of complexity in more general statistical contexts as well and investigate its role in characterizing statistically and computationally optimal inference schemes. This will be achieved by developing new statistical methods and computational algorithms, theoretical study of their performance and fundamental limits, and the development of related mathematical tools and computational software.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Higher-Order Least Squares: Assessing Partial Goodness of Fit of Linear Causal Models
On the Optimality of Kernel-Embedding Based Goodness-of-Fit Tests
  • DOI:
  • Publication date:
    2017-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    K. Balasubramanian;Tong Li;M. Yuan
  • Corresponding authors:
    K. Balasubramanian;Tong Li;M. Yuan
Comments on “Factor Models for High-Dimensional Tensor Time Series”
Effective Tensor Sketching via Sparsification
On Estimating Rank-One Spiked Tensors in the Presence of Heavy Tailed Errors

Other publications by Ming Yuan

Transition-Metal-Free Synthesis of Phenanthridinones from Biaryl-2-oxamic Acid under Radical Conditions
  • DOI:
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    5.2
  • Authors:
    Ming Yuan;Li Chen;Junwei Wang;Shenjie Chen;Kongchao Wang;Yongbo Xue;Guangmin Yao;Zengwei Luo;Yonghui Zhang
  • Corresponding author:
    Yonghui Zhang
Breast Cancer Risk Prediction Using Electronic Health Records
Geochemical distortion on shale oil maturity caused by oil migration: Insights from the non-hydrocarbons revealed by FT-ICR MS
  • DOI:
    10.1016/j.coal.2022.104142
  • Publication date:
    2022-11
  • Journal:
  • Impact factor:
    5.6
  • Authors:
    Ming Yuan;Songqi Pan;Zhenhua Jing;Stefanie Poetz;Quan Shi;Yuanjia Han;Caineng Zou
  • Corresponding author:
    Caineng Zou
A Novel Red Electroluminescent Polymers Derived from Carbazole and 4,7-Bis(2-thienyl)-2,1,3-benzothiadiazole
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jian Huang;Yishe Xu;Qiong Hou;Wei Yang;Ming Yuan;Yong Cao
  • Corresponding author:
    Yong Cao
Genome-wide association mapping and candidate gene analysis for water-soluble protein concentration in soybean (Glycine max) based on high-throughput single nucleotide polymorphism markers
  • DOI:
    10.1071/cp19425
  • Publication date:
    2020-04
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Meinan Sui;Yue Wang;Zhihui Cui;Weili Teng;Ming Yuan;Wenbin Li;Xi Wang;Ruiqiong Li;Yan Lv;Ming Yan;Chao Quan;Xue Zhao;Yingpeng Han
  • Corresponding author:
    Yingpeng Han

Other grants awarded to Ming Yuan

FRG: Collaborative Research: Dynamic Tensors: Statistical Methods, Theory, and Applications
  • Award number:
    2052955
  • Fiscal year:
    2021
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Collaborative Research: Statistical Methods, Algorithms, and Theory for Large Tensors
  • Award number:
    1721584
  • Fiscal year:
    2017
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Collaborative Research: Statistical Methods, Algorithms, and Theory for Large Tensors
  • Award number:
    1803450
  • Fiscal year:
    2017
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
CAREER: Sparse Modeling and Estimation with High-dimensional Data
  • Award number:
    1321692
  • Fiscal year:
    2013
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Statistical Modeling and Inference of Vast Matrices for Complex Problems
  • Award number:
    1265202
  • Fiscal year:
    2013
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
CAREER: Sparse Modeling and Estimation with High-dimensional Data
  • Award number:
    0846234
  • Fiscal year:
    2009
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Statistical Modeling with High-dimensional Data: Variable Selection and Regularization
  • Award number:
    0706724
  • Fiscal year:
    2007
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant

Similar NSFC grants

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Award number:
  • Approval year:
    2024
  • Funding amount:
  • Project type:
    Collaborative Innovation Research Team

Similar overseas grants

CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources
  • Award number:
    2422478
  • Fiscal year:
    2024
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
CAREER: Practical algorithms and high dimensional statistical methods for multimodal haplotype modelling
  • Award number:
    2239870
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Deepening and Expanding Research for Efficient Methods of Function Estimation in High Dimensional Statistical Analysis
  • Award number:
    23H03353
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
CAREER: Towards Tight Guarantees of Markov Chain Sampling Algorithms in High Dimensional Statistical Inference
  • Award number:
    2237322
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Statistical methods for analysis of high-dimensional mediation pathways
  • Award number:
    10582932
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
CAREER: Computer-Intensive Statistical Inference on High-Dimensional and Massive Data: From Theoretical Foundations to Practical Computations
  • Award number:
    2347760
  • Fiscal year:
    2023
  • Funding amount:
    $300,000
  • Project type:
    Continuing Grant
Collaborative Research: Statistical Optimal Transport in High Dimensional Mixtures
  • Award number:
    2210563
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project type:
    Standard Grant
Statistical learning and causal inference in high-dimensional genomics data across multiple information layers
  • Award number:
    RGPIN-2022-03708
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project type:
    Discovery Grants Program - Individual
Statistical learning and causal inference in high-dimensional genomics data across multiple information layers
  • Award number:
    DGECR-2022-00445
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project type:
    Discovery Launch Supplement
Statistical learning algorithms for high-dimensional non-normally distributed data
  • Award number:
    RGPIN-2018-06787
  • Fiscal year:
    2022
  • Funding amount:
    $300,000
  • Project type:
    Discovery Grants Program - Individual