Dimension Reduction and Complex High-Dimensional Data

Basic Information

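  • Grant number:
    RGPIN-2021-04073
  • Principal investigator:
  • Amount:
    $13,100
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed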

Project Abstract

Recent technological advances in many fields of science have led to the routine collection of vast amounts of data. The collected data is typically high-dimensional and highly correlated. My research interests lie in adapting and extending dimension reduction methods to accommodate the complexities of these high-throughput biological data. As part of my research program, the challenges that my students and I will tackle over the next five years can be grouped into three objectives.

First, missing data is common in any applied science, but it is particularly problematic with high-throughput data. Moreover, the nature of multivariate data implies that complete-case analysis can discard a significant amount of information. To increase efficiency, I will look at how multiple imputation procedures can be used with multivariate methods. To this end, I will describe the distribution of aggregates of test statistics common in multivariate analysis, and I will reformulate the optimisation problem of these methods to accommodate missing data.

Second, building on some of my previous work, I will incorporate a priori knowledge through correlation structures. For example, the linear correlation patterns observed in DNA methylation data can be accounted for in dimension reduction methods by using autoregressive covariance structures. I will then look at extending this work to incorporate spatial correlation, which is common in neuroimaging data. I will also study hierarchical probabilistic multivariate models in order to model correlation between observations.

Third, complex correlation also manifests through the complex geometry of the sample space, such as with image and text data. Topological data analysis provides tools to study the geometric structure of our data. I will develop simulation frameworks to study the impact of this geometry on multivariate methods. I will use this knowledge to develop methods to generate synthetic data to oversample imbalanced data. I will also combine tools from multivariate analysis and topological data analysis to extend PCEV, CCA and PLS to nonlinear dimension reduction methods.

The research program proposed here will have an impact across multiple disciplines. All proposed methodologies will be implemented in software packages, so that applied researchers can more easily use them in their work and methodological researchers can more rapidly build upon them.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Turgeon, Maxime

Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies
  • DOI:
    10.1177/0962280216660128
  • Publication date:
    2018-05-01
  • Journal:
  • Impact factor:
    2.3
  • Authors:
    Turgeon, Maxime;Oualkacha, Karim;Labbe, Aurelie
  • Corresponding author:
    Labbe, Aurelie
A Mendelian randomization study of the effect of type-2 diabetes on coronary heart disease.
  • DOI:
    10.1038/ncomms8060
  • Publication date:
    2015-05-28
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Ahmad, Omar S.;Morris, John A.;Mujammami, Muhammad;Forgetta, Vincenzo;Leong, Aaron;Li, Rui;Turgeon, Maxime;Greenwood, Celia M. T.;Thanassoulis, George;Meigs, James B.;Sladek, Robert;Richards, J. Brent
  • Corresponding author:
    Richards, J. Brent

Other grants held by Turgeon, Maxime

Dimension Reduction and Complex High-Dimensional Data
  • Grant number:
    DGECR-2021-00296
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Discovery Launch Supplement
Dimension Reduction and Complex High-Dimensional Data
  • Grant number:
    RGPIN-2021-04073
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Discovery Grants Program - Individual

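Similar grants from the National Natural Science Foundation of China

Regulatory mechanisms of bycatch reduction devices (BRD) on the hydrodynamics and catch performance of trawl codend systems
  • Grant number:
    32373187
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program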

Similar overseas grants

Model Reduction for Control-based Continuation of Complex Nonlinear Structures
  • Grant number:
    EP/X026027/1
  • Fiscal year:
    2023
  • Funding amount:
    $13,100
  • Project category:
    Fellowship
Asymmetric Gait Generation for Legged Locomotion in Complex Environments via Off-Line Model Reduction and Real-Time Optimal Control
  • Grant number:
    2128568
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Standard Grant
Dimension Reduction and Complex High-Dimensional Data
  • Grant number:
    DGECR-2021-00296
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Discovery Launch Supplement
Research on seismic risk reduction through diversification of components of large and complex systems
  • Grant number:
    21K04568
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Dimension Reduction and Complex High-Dimensional Data
  • Grant number:
    RGPIN-2021-04073
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Discovery Grants Program - Individual
Development of a non-noble metal complex as a molecular photocatalyst for photocatalytic carbon dioxide reduction
  • Grant number:
    21K14642
  • Fiscal year:
    2021
  • Funding amount:
    $13,100
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Nonlinear model reduction for bifurcation analysis of complex aero-elastic models
  • Grant number:
    2434237
  • Fiscal year:
    2020
  • Funding amount:
    $13,100
  • Project category:
    Studentship
Electrocatalytic Nitrite Reduction by a Biomimetic Copper Complex
  • Grant number:
    532919-2019
  • Fiscal year:
    2020
  • Funding amount:
    $13,100
  • Project category:
    Postdoctoral Fellowships
Elucidation of CO2 Reduction Reaction Mechanism of Metal Complex Catalysts Using Metal Complex-Semiconductor Hybrid Photoelectrode
  • Grant number:
    19K05516
  • Fiscal year:
    2019
  • Funding amount:
    $13,100
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Data-driven model reduction of nonlinear complex systems
  • Grant number:
    19K23517
  • Fiscal year:
    2019
  • Funding amount:
    $13,100
  • Project category:
    Grant-in-Aid for Research Activity Start-up