Statistical Modeling with High-dimensional Data: Variable Selection and Regularization

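Basic information

  • Award number:
    0706733
  • Principal investigator:
  • Amount:
    $118,500
  • Host institution:
  • Host institution country:
    United States
  • Project category:
    Standard Grant
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Project period:
    2007-06-01 to 2010-05-31
  • Project status:
    Completed

Project abstract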

With high-dimensional data, parsimonious models are preferred because they are more interpretable and at the same time reduce prediction error. Regularization is an essential component of most modern data-analysis methods, particularly when the number of predictors is large: non-regularized fitting is guaranteed to give badly over-fitted and useless models. The investigators take a regularization approach to the variable selection problem in high-dimensional statistical modeling, so that the resulting model enjoys excellent prediction accuracy and at the same time has a sparse representation. In particular, the investigators develop: (1) new fused variable selection methods for proteomics data analysis, which has been a revolutionary cancer diagnostic tool; (2) a novel kernel logistic regression model that automatically adopts a support-vector representation; and (3) several new techniques for performing simultaneous variable selection in estimating multiple quantile regression functions. The investigators also study the theory of these new variable selection techniques. Efficient algorithms and software are developed for public use.

Modern scientific innovations allow scientists to collect massive, high-dimensional data. It is critical in scientific investigations to extract useful information from such large amounts of data. For this reason, variable selection and dimension reduction play a fundamental role in high-dimensional statistical modeling. Variable selection problems arise in a wide range of fields: machine learning, drug discovery, biomarker finding, genetics, proteomics, brain imaging analysis, financial modeling, and environmental sciences, to name a few. The research project aims to develop state-of-the-art statistical tools that help researchers in various fields analyze their data.
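
The idea of regularized variable selection in the abstract can be illustrated with a generic L1-penalized least-squares fit (the lasso) solved by cyclic coordinate descent. This is only a minimal sketch of the general technique, not the fused, kernel logistic, or quantile regression methods developed in the project; the function names, the synthetic data, and the penalty level `lam` are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values in [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def make_data(n=40, p=10, seed=0):
    """Hypothetical synthetic data: only the first three predictors carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[1] + 1.5 * row[2] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    beta = lasso_coordinate_descent(X, y, lam=0.1)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("selected predictors:", selected)
```

The soft-thresholding step is what sets some coefficients exactly to zero, which is how a penalized fit performs variable selection and estimation at the same time. Larger values of `lam` give sparser models, and a sufficiently large `lam` removes every predictor.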

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Hui Zou

Effect of amino acids on formation of pigment precursors in garlic discoloration using UPLC–ESI-Q-TOF-MS analysis
  • DOI:
    10.1016/j.jfca.2021.104231
  • Publication year:
    2021
  • Journal:
  • Impact factor:
    4.3
  • Authors:
    Ruixuan Zhao; Hui Zou; Renjie Zhao; Ningyang Li; Zhenjia Zheng; X. Qiao
  • Corresponding author:
    X. Qiao
The Oxidation and Combustion Properties of Gas Atomized Aluminum−Boron−Europium Alloy Powders
Coordinatewise Gaussianization: Theories and Applications
Dietary inulin alleviated constipation induced depression and anxiety-like behaviors: Involvement of gut microbiota and microbial metabolite short-chain fatty acid
  • DOI:
    https://doi.org/10.1016/j.ijbiomac.2024.129420
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    8.2
  • Authors:
    Hui Zou; Huajing Gao; Yanhong Liu; Zhiwo Zhang; Jia Zhao; Wenxuan Wang; Bo Ren; Xintong Tan
  • Corresponding author:
    Xintong Tan
TRIM 9 is up-regulated in human lung cancer and involved in cell proliferation and apoptosis
  • DOI:
  • Publication year:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiaolin Wang; Y. Shu; Hongcan Shi; Shichun Lu; Kang Wang; Chao Sun; Jiansheng He; Weiguo Jin; X. Lv; Hui Zou; Weiping Shi
  • Corresponding author:
    Weiping Shi

Other grants by Hui Zou

IMR: MM-1A: Evolutionary Modeling and Acquisition of Multidimensional 5G Internet Measurements
  • Award number:
    2220286
  • Fiscal year:
    2022
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
Novel Inference Procedures for Non-Standard High-Dimensional Regression Models
  • Award number:
    2015120
  • Fiscal year:
    2020
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
Flexible Statistical Modelling for High Dimensional Data
  • Award number:
    1915842
  • Fiscal year:
    2019
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
Collaborative Research: New Statistical Methods and Theory for High-Dimensional Data
  • Award number:
    1505111
  • Fiscal year:
    2015
  • Award amount:
    $118,500
  • Project category:
    Continuing Grant
CAREER: New Statistical Methodology and Theory for Mining High-Dimensional Data
  • Award number:
    0846068
  • Fiscal year:
    2009
  • Award amount:
    $118,500
  • Project category:
    Continuing Grant

Similar NSFC grants

Galaxy Analytical Modeling Evolution (GAME) and cosmological hydrodynamic simulations.
  • Award number:
  • Approval year:
    2025
  • Award amount:
    CNY 100,000
  • Project category:
    Provincial/municipal project

Similar overseas grants

Classification of Ankle Osteoarthritis Severity from Weightbearing Computed Tomography Using Statistical Shape Modeling and Machine Learning
  • Award number:
    10525301
  • Fiscal year:
    2022
  • Award amount:
    $118,500
  • Project category:
Classification of Ankle Osteoarthritis Severity from Weightbearing Computed Tomography Using Statistical Shape Modeling and Machine Learning
  • Award number:
    10669281
  • Fiscal year:
    2022
  • Award amount:
    $118,500
  • Project category:
Multimodal Integrative Dimension Reduction and Statistical Modeling with Applications to Temporomandibular Joint (TMJ) Morphometry and Biomechanics
  • Award number:
    10196077
  • Fiscal year:
    2021
  • Award amount:
    $118,500
  • Project category:
Multimodal Integrative Dimension Reduction and Statistical Modeling with Applications to Temporomandibular Joint (TMJ) Morphometry and Biomechanics
  • Award number:
    10366073
  • Fiscal year:
    2021
  • Award amount:
    $118,500
  • Project category:
Development of novel statistical modeling based on functional data analysis for high-dimensional data and its application
  • Award number:
    20K11707
  • Fiscal year:
    2020
  • Award amount:
    $118,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Statistical modeling of long-range chromatin interactions on gene regulation and underlying molecular
  • Award number:
    10172932
  • Fiscal year:
    2018
  • Award amount:
    $118,500
  • Project category:
On Statistical Modeling and Parameter Estimation for High Dimensional Systems
  • Award number:
    1818674
  • Fiscal year:
    2017
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
On Statistical Modeling and Parameter Estimation for High Dimensional Systems
  • Award number:
    1612924
  • Fiscal year:
    2016
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
Collaborative Research: Statistical Modeling and Inference for High-dimensional Multi-Subject Neuroimaging Data
  • Award number:
    1209118
  • Fiscal year:
    2012
  • Award amount:
    $118,500
  • Project category:
    Standard Grant
Collaborative Research: Statistical Modeling and Inference for High-dimensional Multi-Subject Neuroimaging Data
  • Award number:
    1208983
  • Fiscal year:
    2012
  • Award amount:
    $118,500
  • Project category:
    Standard Grant