Dense and Sparse Methods in High-Dimensional Data Analysis

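Basic Information

  • Award number:
    1208785
  • Principal investigator:
  • Amount:
    $160,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2012
  • Funding country:
    United States
  • Project period:
    2012-08-01 to 2016-07-31
  • Project status:
    Completed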

Project Abstract

Many methods for high-dimensional data analysis begin with the assumption that the parameter of interest is, in some sense, sparse. Furthermore, the performance of many of these methods depends on the sparsity of the underlying parameters. However, statistical methods for checking sparsity assumptions and determining the implications of the absence or near-absence of sparsity are lacking. The driving goal of this project is to develop practical statistical tools for identifying situations where the relevant parameters are in fact sparse, or where sparse methods for high-dimensional data analysis may be applied effectively. Problems considered in this project will primarily be studied within the context of the linear model and the Gaussian location model. Methods will be assessed by decision-theoretic-like criteria (e.g., asymptotic minimaxity). A null model based on dense (non-sparse) signals and dense estimation and prediction methods will be developed and thoroughly studied. This will provide a rich framework for sparsity testing, where the aim is to identify settings in which sparse methods are likely to be successful. Specific sparsity testing procedures will be proposed and analyzed.

High-dimensional data analysis is one of the most active areas of current statistical research. Much of this research has been driven by technological advances that have enabled researchers to collect vast datasets with relative ease in a variety of scientific disciplines, including astrophysics, geological sciences, molecular biology, and genomics. In high-dimensional datasets, many features are measured for each unit of observation (e.g., thousands of gene expression levels may be measured for each individual in a genomic study). Sparsity plays a major role in much of the research on high-dimensional data analysis. Broadly speaking, sparsity measures the degree to which a specified outcome may be described by relatively few features. Sparse methods for high-dimensional data analysis attempt to leverage sparsity in the underlying dataset and have proven to be very effective in many applications, especially in engineering and signal processing. On the other hand, the performance of sparse methods has been more mixed in other important applications where high-dimensional data are abundant, such as genomics. In this project, the investigator will develop statistical methods for characterizing and identifying situations where sparse methods can be successfully applied. This will be achieved by developing tools for determining the level of sparsity in high-dimensional datasets. These methods, when applied to a given dataset, will help researchers determine the validity of subsequent statistical analyses and the potential benefits of using sparse methods for these analyses. This research is likely to have significant implications for understanding reproducibility in high-dimensional data analysis and broad applications in the analysis of genomic data. The methods developed during the course of this project will be utilized in collaborative work with highly experienced researchers in genomics.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Lee Dicker

IDENTIFICATION OF NOVEL URINARY BIOMARKERS OF RENAL OBSTRUCTION USING TEMPORAL QUANTITATIVE PROTEOMICS
  • DOI:
    10.1016/s0022-5347(09)60717-5
  • Publication date:
    2009-04-01
  • Journal:
  • Impact factor:
  • Authors:
    Alireza Vaezzadeh; Andrew C Briscoe; Lee Dicker; Oliver Hofman; Winston Hide; Hanno Steen; Richard S Lee
  • Corresponding author:
    Richard S Lee

Other grants held by Lee Dicker

CAREER: Maximum likelihood and nonparametric empirical Bayes methods in high dimensions
  • Award number:
    1454817
  • Fiscal year:
    2015
  • Amount:
    $160,000
  • Project type:
    Continuing Grant

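Similar NSFC (National Natural Science Foundation of China) grants

SAR image noise suppression and segmentation based on the Sparse-Land model
  • Award number:
    60971128
  • Approval year:
    2009
  • Amount:
    CNY 300,000
  • Project type:
    General Program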

Similar overseas grants

Development of Adaptive Sparse Grid Discontinuous Galerkin Methods for Multiscale Kinetic Simulations in Plasmas
  • Award number:
    2404521
  • Fiscal year:
    2023
  • Amount:
    $160,000
  • Project type:
    Standard Grant

Acceleration, Complexity and Implementation of Active Set Methods for Large-scale Sparse Nonlinear Optimization
  • Award number:
    2309549
  • Fiscal year:
    2023
  • Amount:
    $160,000
  • Project type:
    Standard Grant

CAREER: Statistical Models and Parallel-computing Methods for Analyzing Sparse and Large Single-cell Chromatin Interaction Datasets
  • Award number:
    2239350
  • Fiscal year:
    2023
  • Amount:
    $160,000
  • Project type:
    Continuing Grant

4D Imaging of Spatially and Temporally Dynamic Biophysical Processes using Sparse Data Methods
  • Award number:
    RGPIN-2017-04293
  • Fiscal year:
    2021
  • Amount:
    $160,000
  • Project type:
    Discovery Grants Program - Individual

Sparse methods on manifolds of images and shapes for significant anatomy detection linked to disease
  • Award number:
    RGPIN-2016-04671
  • Fiscal year:
    2021
  • Amount:
    $160,000
  • Project type:
    Discovery Grants Program - Individual

4D Imaging of Spatially and Temporally Dynamic Biophysical Processes using Sparse Data Methods
  • Award number:
    RGPIN-2017-04293
  • Fiscal year:
    2020
  • Amount:
    $160,000
  • Project type:
    Discovery Grants Program - Individual

Efficient regularised estimation methods for sparse time series models
  • Award number:
    2445108
  • Fiscal year:
    2020
  • Amount:
    $160,000
  • Project type:
    Studentship

Development of Adaptive Sparse Grid Discontinuous Galerkin Methods for Multiscale Kinetic Simulations in Plasmas
  • Award number:
    2011838
  • Fiscal year:
    2020
  • Amount:
    $160,000
  • Project type:
    Standard Grant

Advances in Robust Multilevel Preconditioning Methods for Sparse Linear Systems
  • Award number:
    1912048
  • Fiscal year:
    2019
  • Amount:
    $160,000
  • Project type:
    Standard Grant

Sparse methods on manifolds of images and shapes for significant anatomy detection linked to disease
  • Award number:
    RGPIN-2016-04671
  • Fiscal year:
    2019
  • Amount:
    $160,000
  • Project type:
    Discovery Grants Program - Individual