Development of Statistical Methods for High-dimensional and Complex Data

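Basic information

  • Award number:
    0905561
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Project period:
    2009-07-01 to 2013-06-30
  • Project status:
    Completed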

Project abstract

As technology advances, scientists face increasingly high-dimensional and complex data. For example, genetic data from microarray experiments are very large, and new techniques are needed to identify the specific genes associated with various diseases. Longitudinal data that track many variables over time for millions of individuals pose further challenges. Such data call for new statistical techniques. One challenge in high-dimensional data is variable selection from a large pool of candidate variables, and some existing methods suffer from a high false discovery rate. In quantile regression, the phenomenon of quantile crossing, in which separately estimated quantile curves cross one another, needs to be addressed. Finally, spatial and longitudinal studies require efficient methods for estimating covariance patterns. Motivated by these features of high-dimensional and complex data, the PI proposes: (1) new techniques for variable selection in high-dimensional data and new methods to reduce the false discovery rate; (2) new techniques to handle quantile crossing, with application to class probability estimation; (3) new methods to estimate covariance structure for spatial and longitudinal data; and (4) parametrically guided nonparametric estimation for the quasi-likelihood method. The proposed methods will be studied theoretically for their asymptotic behavior and compared with existing methods both theoretically and through simulations.

Many scientists need high-dimensional variable selection techniques to analyze large-scale, complex financial, environmental, and biomedical data efficiently, such as gene expression, proteomics, metabolomics, and brain imaging data. These types of data require techniques that identify important features. To achieve this goal, the PI proposes a screening method for selecting appropriate statistical models. This screening method can be applied to biomedical data to locate important genes responsible for diseases of interest such as breast cancer and leukemia. Spatial and longitudinal data in environmental and clinical studies are also often sparsely and irregularly observed, and much effort has been devoted to studying their covariance structures. The PI proposes a flexible convolution-based method to estimate covariance structures nonparametrically. This method can be applied to environmental data such as precipitation and wind to improve our understanding of environmental changes, including the well-known "climate change" issue. This research has many societal applications. In addition, the PI takes advantage of the department's mentoring program to work with US doctoral students, especially women and minorities. The PI also works with undergraduates from the NSF-CSUMS program in the department, as it is important to train computationally strong critical thinkers for the future.
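The abstract does not specify how the proposed screening method works. As a generic illustration only, the sketch below ranks predictors by their absolute marginal correlation with the response and keeps the top few, which is one common way to screen variables when the number of predictors far exceeds the sample size. The function name `marginal_screen`, the simulated data, and every parameter value are assumptions made for this example and are not taken from the grant.

```python
import numpy as np


def marginal_screen(X, y, keep):
    """Return the sorted indices of the `keep` predictors with the largest
    absolute marginal correlation with y (hypothetical helper)."""
    Xc = X - X.mean(axis=0)          # center each predictor column
    yc = y - y.mean()                # center the response
    # Sample correlation of every column with the response
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = (Xc.T @ yc) / denom
    order = np.argsort(-np.abs(corr))  # strongest correlations first
    return np.sort(order[:keep])


# Simulated data (assumed): 200 samples, 1000 predictors, 3 true signals
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
true_idx = [3, 50, 700]
beta = np.zeros(p)
beta[true_idx] = [3.0, -3.0, 3.0]
y = X @ beta + rng.standard_normal(n)

selected = marginal_screen(X, y, keep=20)
print("retained predictors:", selected)
print("all true signals retained:", set(true_idx) <= set(selected.tolist()))
```

Screening of this kind only reduces the candidate set; a second-stage method (for example, a penalized regression) would normally be applied to the retained predictors to choose the final model.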

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Yichao Wu

Soil phyllosilicate and iron oxide inhibit the quorum sensing of Chromobacterium violaceum
  • DOI:
    10.1007/s42832-020-0051-5
  • Publication date:
    2020-07
  • Journal:
  • Impact factor:
    4
  • Authors:
    Shanshan Yang;Chenchen Qu;Manisha Mukherjee;Yichao Wu;Qiaoyun Huang;Peng Cai
  • Corresponding author:
    Peng Cai
Research on damage and stress monitoring analysis of cement-based materials based on integrated sensing element (ISE)
  • DOI:
    10.1016/j.cscm.2025.e04789
  • Publication date:
    2025-07-01
  • Journal:
  • Impact factor:
    6.600
  • Authors:
    Ming Sun;Weiwei Xu;Kaifeng Zheng;Yuanxing Wang;Weijian Ding;Jie Yao;Jianbin Zheng;Yichao Wu;Fengxia Xu
  • Corresponding author:
    Fengxia Xu
Extraction of extracellular polymeric substances (EPS) from red soils (Ultisols)
  • DOI:
    10.1016/j.soilbio.2019.05.014
  • Publication date:
    2019-08
  • Journal:
  • Impact factor:
    9.7
  • Authors:
    Shuang Wang;Marc Redmile-Gordon;Monika Mortimer;Peng Cai;Yichao Wu;Caroline L. Peacock;Chunhui Gao;Qiaoyun Huang
  • Corresponding author:
    Qiaoyun Huang
Estimation and Prediction of a Class of Convolution-Based Spatial Nonstationary Models for Large Spatial Data
Probability approximations with applications in computational finance and computational biology
  • DOI:
  • Publication date:
    2006
  • Journal:
  • Impact factor:
    0
  • Authors:
    C. Ji;H. Hurd;Yichao Wu
  • Corresponding author:
    Yichao Wu

Other grants held by Yichao Wu

FRG: Collaborative Research: Mathematical and Statistical Analysis of Compressible Data on Compressive Networks
  • Award number:
    2152070
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Continuing Grant
Collaborative Research: A Fast Hierarchical Algorithm for Computing High Dimensional Truncated Multivariate Gaussian Probabilities and Expectations
  • Award number:
    1821171
  • Fiscal year:
    2018
  • Award amount:
    --
  • Project type:
    Continuing Grant
CAREER: New Statistical Methods for Classification and Analysis of High Dimensional and Functional Data
  • Award number:
    1812354
  • Fiscal year:
    2017
  • Award amount:
    --
  • Project type:
    Continuing Grant
CAREER: New Statistical Methods for Classification and Analysis of High Dimensional and Functional Data
  • Award number:
    1055210
  • Fiscal year:
    2011
  • Award amount:
    --
  • Project type:
    Continuing Grant

Similar overseas grants

Use of deep learning and development of statistical prediction result evaluation methods for the acceleration of personalized medicine
  • Award number:
    23K11014
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
NSF FDA/SiR: Development of eeDAP microscopy platform software, validation data, and statistical methods to assess performance of candidate Software as a Medical Device (SaMD)
  • Award number:
    2326317
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Standard Grant
Development of Efficient and Practical Privacy-Preserving Methods for Large-Scale Genomic Statistical Analysis
  • Award number:
    23KJ0649
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for JSPS Fellows
Development of general statistical methods for estimating gust wind speed in different urban morphology and weather conditions based on the Weibull distribution
  • Award number:
    23K13454
  • Fiscal year:
    2023
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Accelerating biomarker development through novel statistical methods for analyzing phase III/IV studies
  • Award number:
    10568744
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
Development of statistical methods that help to interpret randomized clinical trials
  • Award number:
    22K17301
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Prize 202203PJT - The test-negative design for the estimation of COVID-19 vaccine effectiveness: design evaluation and development of statistical methods in the evolving context
  • Award number:
    467985
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Operating Grants
Novel Statistical Methods for Development of Polygenic Scores in Multi-Ancestry Cohorts
  • Award number:
    10794931
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
The test-negative design for the estimation of COVID-19 vaccine effectiveness: design evaluation and development of statistical methods in the evolving context
  • Award number:
    462230
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type:
    Operating Grants
Novel Statistical Methods for Development of Polygenic Scores in Multi-Ancestry Cohorts
  • Award number:
    10464189
  • Fiscal year:
    2022
  • Award amount:
    --
  • Project type: