Ensemble subspace, penalty, pretest, and shrinkage strategies for high dimensional data

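Basic information

  • Grant number:
    RGPIN-2017-05228
  • Principal investigator:
  • Amount:
    $31,300
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Period:
    2020-01-01 to 2021-12-31
  • Status:
    Completed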

Project abstract

There are a host of buzzwords in today's data-centric world. We encounter data in all walks of life, and for analytically and objectively minded people, data is crucial to their goals. Making sense of the data and extracting meaningful information from it may not be an easy task. The growth in the size and scope of data sets in a host of disciplines has created a need for innovative statistical strategies for understanding and analyzing such data. A variety of statistical and computational tools are needed to reveal the story that is contained in the data.

We define high dimensional data (HDD) as data sets for which the number of predictors is larger than the sample size. The analysis of HDD is an important feature in a host of research fields such as social media, engineering networks, bioinformatics, environmental science, and others. The buzzword “Big Data” is nebulously defined, but its problems are real and statisticians play a vital role in this data world. Undoubtedly, overcoming the challenges of HDD is key to successful research in a host of fields. Many organizations are using sophisticated number-crunching, data mining, or Big Data analytics to reveal patterns based on collected information. Clearly, there is an increasing demand for efficient prediction strategies for analyzing HDD. Some examples of HDD that have prompted this demand are gene expression arrays, social network modeling, and clinical, genetic, and phenotypic data.

Most of the existing methods for dealing with HDD begin with model selection for further investigation. Penalized methods are unstable unless very stringent conditions are imposed. This research proposal in HDD focuses on post-selection strategies to combat some of the issues inherent in penalized methods. We also propose to investigate ensemble and tuning-parameter-free strategies to analyze HDD. Further, I will consider model misspecification problems in HDD and provide a systematic analysis of pretest procedures via divergence theory. Finally, we will develop Bayesian methodology for brain imaging and genetic data. The overarching objective is to provide answers to the question “what are the tools and tricks, pitfalls, applications, challenges and opportunities in HDD analysis”. This proposal emphasizes that statisticians can play a dominant role in solving Big Data problems, and will move statisticians from the cellar of scientific discovery to the penthouse.

The proposed research will provide opportunities for training highly qualified personnel at all levels. The training will be three-fold: methodological, coding/computational, and analysis of data from real-life problems. More public and private sector organizations are now acknowledging the importance of statistical tools and their critical role in analyzing Big Data. According to one study, 4 million jobs may be available globally for Big Data analysis. The proposed research will train individuals for these jobs.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Ahmed, Ejaz

INTERNET OF THINGS ARCHITECTURE: RECENT ADVANCES, TAXONOMY, REQUIREMENTS, AND OPEN CHALLENGES
  • DOI:
    10.1109/mwc.2017.1600421
  • Published:
    2017-06-01
  • Journal:
  • Impact factor:
    12.9
  • Authors:
    Yaqoob, Ibrar;Ahmed, Ejaz;Guizani, Mohsen
  • Corresponding author:
    Guizani, Mohsen
Analysis of Tinto's student integration theory in first-year undergraduate computing students of a UK higher education institution
Antioxidant activity with flavonoidal constituents from Aerva persica
  • DOI:
    10.1007/bf02968582
  • Published:
    2006-05-01
  • Journal:
  • Impact factor:
    6.7
  • Authors:
    Ahmed, Ejaz;Imran, Muhammad;Ashraf, Muhammad
  • Corresponding author:
    Ashraf, Muhammad
Synthesis of pyrite thin films and transition metal doped pyrite thin films by aerosol-assisted chemical vapour deposition
  • DOI:
    10.1039/c4nj01461h
  • Published:
    2015-01-01
  • Journal:
  • Impact factor:
    3.3
  • Authors:
    Khalid, Sadia;Ahmed, Ejaz;O'Brien, Paul
  • Corresponding author:
    O'Brien, Paul
Room-Temperature Synthesis of the Highly Polar Cluster Compound Sn[SnCl][W3Cl13]
  • DOI:
    10.1002/ejic.201000706
  • Published:
    2010-11-01
  • Journal:
  • Impact factor:
    2.3
  • Authors:
    Ahmed, Ejaz;Groh, Matthias;Ruck, Michael
  • Corresponding author:
    Ruck, Michael

Other grants held by Ahmed, Ejaz

Ensemble subspace, penalty, pretest, and shrinkage strategies for high dimensional data
  • Grant number:
    RGPIN-2017-05228
  • Fiscal year:
    2021
  • Award amount:
    $31,300
  • Program:
    Discovery Grants Program - Individual

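Similar grants from the National Natural Science Foundation of China

Projective nonlinear non-negative tensor factorization based on individual analysis for pattern analysis of high-dimensional unstructured data
  • Grant number:
    61502059
  • Year approved:
    2015
  • Award amount:
    CNY 190,000
  • Program:
    Young Scientists Fund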

Similar overseas grants

DMS/NIGMS 1: Multilevel stochastic orthogonal subspace transformations for robust machine learning with applications to biomedical data and Alzheimer's disease subtyping
  • Grant number:
    2347698
  • Fiscal year:
    2024
  • Award amount:
    $31,300
  • Program:
    Continuing Grant
EPSRC-SFI: Krylov subspace methods for non-symmetric PDE problems: a deeper understanding and faster convergence
  • Grant number:
    EP/W035561/1
  • Fiscal year:
    2023
  • Award amount:
    $31,300
  • Program:
    Research Grant
AF: Small: RUI: Toward High-Performance Block Krylov Subspace Algorithms for Solving Large-Scale Linear Systems
  • Grant number:
    2327619
  • Fiscal year:
    2023
  • Award amount:
    $31,300
  • Program:
    Standard Grant
Developing subspace methods for constrained optimization problems and their application to machine learning
  • Grant number:
    23H03351
  • Fiscal year:
    2023
  • Award amount:
    $31,300
  • Program:
    Grant-in-Aid for Scientific Research (B)
Development of learning subspace-based methods for pattern recognition
  • Grant number:
    22K17960
  • Fiscal year:
    2022
  • Award amount:
    $31,300
  • Program:
    Grant-in-Aid for Early-Career Scientists
Ensemble subspace, penalty, pretest, and shrinkage strategies for high dimensional data
  • Grant number:
    RGPIN-2017-05228
  • Fiscal year:
    2022
  • Award amount:
    $31,300
  • Program:
    Discovery Grants Program - Individual
Improvement of prediction accuracy of advanced reactors: Precise and robust cross section adjustment method based on active subspace approach
  • Grant number:
    21K04940
  • Fiscal year:
    2021
  • Award amount:
    $31,300
  • Program:
    Grant-in-Aid for Scientific Research (C)
CAREER: Inference for High-Dimensional Structures via Subspace Learning: Statistics, Computation, and Beyond
  • Grant number:
    2203741
  • Fiscal year:
    2021
  • Award amount:
    $31,300
  • Program:
    Continuing Grant
Development of a closed-loop subspace identification method with accuracy assurance
  • Grant number:
    21K04124
  • Fiscal year:
    2021
  • Award amount:
    $31,300
  • Program:
    Grant-in-Aid for Scientific Research (C)
Tensor and Subspace Learning Methods with Applications to Medical Imaging
  • Grant number:
    2053697
  • Fiscal year:
    2021
  • Award amount:
    $31,300
  • Program:
    Continuing Grant