Predictive Methods for Analyzing High-throughput Data and Spatial-Temporal Data

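Basic information

  • Grant number:
    RGPIN-2019-07020
  • Principal investigator:
  • Amount:
    $14,600
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Duration:
    2020-01-01 to 2021-12-31
  • Status:
    Completed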

Project abstract

The accelerated development of high-throughput sequencing biotechnologies has made it affordable to collect high-dimensional molecular-level profiles, such as gene expression; these are called features in this proposal. It is of great interest to identify relevant features associated with a phenotype (e.g., cancer status or a health disorder). Many researchers have advocated applying statistical learning methods to perform predictive analysis of high-throughput data. Predictive analysis results can be used in many ways: to diagnose human diseases; to predict response to a medicine (personalized medicine); to help plant and animal breeders choose an optimal gene subset for further experiments; and, through the subset of features extracted from good predictive models, to help uncover the biological mechanism behind a phenotype. Unfortunately, high dimensionality causes enormous overfitting in predictive analysis, even with very simple models, and the chance of finding false predictive features or patterns is extremely high. It is therefore challenging to guard against false discovery when searching for more predictive features. My research outcomes will include new tools for honestly measuring the predictivity (such as error rate and AUC) of selected features, and new tools for identifying truly predictive features and for building sharper predictive models for phenotypes. I will also apply predictive analysis to specific high-throughput datasets in a variety of scientific problems related to human health and food security, which will lead to new scientific discoveries and new solutions in these areas.

In science, a theory is tested by making predictions for future observations. Significant discrepancies between observations and predictions suggest that the theory is incorrect or flawed. Similarly, examining out-of-sample predictions is a straightforward method for comparing statistical models and checking their goodness-of-fit (GOF). Today, increasingly complex models are being proposed for a variety of correlated data, such as temporal, spatial, and repeated-measurements data. More widely applicable predictive methods for comparing and checking such complex models are needed. I will work to improve predictive model comparison and checking methods for generalized linear mixed models (GLMMs) for datasets with correlation structure, and to release R add-on packages that facilitate the comparison and checking of GLMMs. My research outcomes will include new tools for evaluating complex Bayesian and non-Bayesian models with correlated random effects. These new model evaluation tools will be essential for researchers in epidemiology, ecology, and environmental sciences. Improved modelling of datasets from these areas will lead to more solid data analysis conclusions, which have an essential impact on policy making for economic and social problems.
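
The overfitting risk that the abstract describes for high-dimensional features can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the project's method: the data are pure noise (no feature is truly predictive), features are ranked by the absolute difference in class means, and the classifier is a simple nearest-centroid rule. All function names and parameter values are hypothetical. It compares a cross-validated error rate where features are selected once on all samples (optimistically biased) with one where selection is repeated inside each training fold (an honest estimate).

```python
import random

def select_top_features(X, y, rows, k):
    """Rank features by |difference in class means|, using only `rows`."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]
    scores = []
    for j in range(len(X[0])):
        mean_pos = sum(X[i][j] for i in pos) / len(pos)
        mean_neg = sum(X[i][j] for i in neg) / len(neg)
        scores.append((abs(mean_pos - mean_neg), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def predict_nearest_centroid(X, y, train, features, i):
    """Predict the label of sample i from class centroids fitted on `train`."""
    dist = {}
    for label in (0, 1):
        members = [r for r in train if y[r] == label]
        d = 0.0
        for j in features:
            centroid = sum(X[r][j] for r in members) / len(members)
            d += (X[i][j] - centroid) ** 2
        dist[label] = d
    return 1 if dist[1] < dist[0] else 0

def cv_error(X, y, k, n_folds, select_inside_fold):
    """K-fold error rate; feature selection is done inside or outside the folds."""
    n = len(y)
    all_rows = list(range(n))
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    global_features = None
    if not select_inside_fold:
        # Biased variant: the held-out samples also influence the selection.
        global_features = select_top_features(X, y, all_rows, k)
    errors = 0
    for test in folds:
        train = [i for i in all_rows if i not in test]
        if select_inside_fold:
            features = select_top_features(X, y, train, k)
        else:
            features = global_features
        for i in test:
            errors += predict_nearest_centroid(X, y, train, features, i) != y[i]
    return errors / n

def simulate(seed=0, n=40, p=2000, k=10, n_folds=5):
    """Pure-noise data: no feature carries any information about the labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]
    biased = cv_error(X, y, k, n_folds, select_inside_fold=False)
    honest = cv_error(X, y, k, n_folds, select_inside_fold=True)
    return biased, honest

biased, honest = simulate()
print(f"selection outside CV: error = {biased:.2f}")
print(f"selection inside CV:  error = {honest:.2f}")
```

Because the labels are unrelated to the features, an honest error estimate should sit near chance level, while the estimate that selects features on all samples first looks far better than the data justify.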
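
Comparing models by their out-of-sample predictions, as the abstract proposes, can be sketched on a toy example. This is an illustration under simplifying assumptions, not the project's tools: the counts are simulated, independent Poisson data from two groups, each candidate model is fitted by a plug-in rate (the mean of the remaining observations), and the models are scored by the sum of leave-one-out log predictive densities. It does not address the correlated random effects that the proposal targets, and all names are hypothetical.

```python
import math
import random

def sample_poisson(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def log_poisson_pmf(y, rate):
    """Log probability of count y under a Poisson distribution with this rate."""
    return y * math.log(rate) - rate - math.lgamma(y + 1)

def loo_log_score(counts, groups, use_groups):
    """Sum of leave-one-out log predictive densities with a plug-in rate."""
    total = 0.0
    for i, y in enumerate(counts):
        # Fit on every observation except i (restricted to i's group if requested).
        rest = [counts[j] for j in range(len(counts))
                if j != i and (not use_groups or groups[j] == groups[i])]
        rate = max(sum(rest) / len(rest), 1e-9)  # guard against log(0)
        total += log_poisson_pmf(y, rate)
    return total

rng = random.Random(1)
groups = [0] * 30 + [1] * 30
counts = [sample_poisson(rng, 2.0 if g == 0 else 10.0) for g in groups]

score_common = loo_log_score(counts, groups, use_groups=False)
score_grouped = loo_log_score(counts, groups, use_groups=True)
print(f"common-rate model:  {score_common:.1f}")
print(f"group-rate model:   {score_grouped:.1f}")
```

A higher score means the held-out observations were better predicted; here the model with group-specific rates should score higher because the data were simulated with different rates per group.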

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Li, Longhai

Study on Erosion Behavior of Laser Wire Feeding Cladding High-Manganese Steel Coatings.
  • DOI:
    10.3390/ma16175733
  • Published:
    2023-08-22
  • Journal:
  • Impact factor:
    3.4
  • Authors:
    Guo, Huafeng;Zhang, Chenglin;He, Yibo;Yang, Haifeng;Zhao, Enlan;Li, Longhai;He, Shaohua;Liu, Lei
  • Corresponding author:
    Liu, Lei
Serum Chemokine CXCL7 as a Diagnostic Biomarker for Colorectal Cancer
  • DOI:
    10.3389/fonc.2019.00921
  • Published:
    2019-10-09
  • Journal:
  • Impact factor:
    4.7
  • Authors:
    Li, Longhai;Zhang, Lihua;Mao, Yong
  • Corresponding author:
    Mao, Yong
Bias-Corrected Hierarchical Bayesian Classification With a Selected Subset of High-Dimensional Features
Morphology and nanoindentation properties of mouthparts in Cyrtotrachelus longimanus (Coleoptera: Curculionidae)
  • DOI:
    10.1002/jemt.22855
  • Published:
    2017-07-01
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    Li, Longhai;Guo, Ce;Han, Cheng
  • Corresponding author:
    Han, Cheng
The formation mechanism of titania nanotube arrays in hydrofluoric acid electrolyte
  • DOI:
    10.1007/s10853-007-2418-8
  • Published:
    2008-01
  • Journal:
  • Impact factor:
    4.5
  • Authors:
    Zou, Lexi;Zheng, Qing;Shao, Jiahui;Liao, Junsheng;Bai, Jing;Li, Longhai;Cai, Weimin;Zhou, Baoxue;Liu, Yanbiao;Zhu, Xinyuan
  • Corresponding author:
    Zhu, Xinyuan

Other grants held by Li, Longhai

Predictive Methods for Analyzing High-throughput Data and Spatial-Temporal Data
  • Grant number:
    RGPIN-2019-07020
  • Fiscal year:
    2022
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Predictive Methods for Analyzing High-throughput Data and Spatial-Temporal Data
  • Grant number:
    RGPIN-2019-07020
  • Fiscal year:
    2021
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Predictive Methods for Analyzing High-throughput Data and Spatial-Temporal Data
  • Grant number:
    RGPIN-2019-07020
  • Fiscal year:
    2019
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Bayesian Methods for High-dimensional and Correlated Data
  • Grant number:
    RGPIN-2014-05010
  • Fiscal year:
    2018
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Bayesian Methods for High-dimensional and Correlated Data
  • Grant number:
    RGPIN-2014-05010
  • Fiscal year:
    2017
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Bayesian Methods for High-dimensional and Correlated Data
  • Grant number:
    RGPIN-2014-05010
  • Fiscal year:
    2016
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Bayesian Methods for High-dimensional and Correlated Data
  • Grant number:
    RGPIN-2014-05010
  • Fiscal year:
    2015
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Bayesian Methods for High-dimensional and Correlated Data
  • Grant number:
    RGPIN-2014-05010
  • Fiscal year:
    2014
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Efficient Bayesian analysis for complex models
  • Grant number:
    356014-2009
  • Fiscal year:
    2013
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Efficient Bayesian analysis for complex models
  • Grant number:
    356014-2009
  • Fiscal year:
    2012
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual

Similar NSFC grants

Computational Methods for Analyzing Toponome Data
  • Grant number:
    60601030
  • Approval year:
    2006
  • Funding amount:
    CNY 170,000
  • Program:
    Young Scientists Fund

Similar overseas grants

Collaborative Research: RUI: Topological methods for analyzing shifting patterns and population collapse
  • Grant number:
    2327892
  • Fiscal year:
    2024
  • Funding amount:
    $14,600
  • Program:
    Standard Grant
Collaborative Research: RUI: Topological methods for analyzing shifting patterns and population collapse
  • Grant number:
    2327893
  • Fiscal year:
    2024
  • Funding amount:
    $14,600
  • Program:
    Standard Grant
CAREER: Super-Quantile Based Methods for Analyzing Large-Scale Heterogenous Data
  • Grant number:
    2238428
  • Fiscal year:
    2023
  • Funding amount:
    $14,600
  • Program:
    Continuing Grant
Methods for inferring and analyzing gene regulatory networks using single-cell multiomics and spatial genomics data
  • Grant number:
    10712174
  • Fiscal year:
    2023
  • Funding amount:
    $14,600
  • Program:
CAREER: Statistical Models and Parallel-computing Methods for Analyzing Sparse and Large Single-cell Chromatin Interaction Datasets
  • Grant number:
    2239350
  • Fiscal year:
    2023
  • Funding amount:
    $14,600
  • Program:
    Continuing Grant
Development of Deep Learning Methods for Generating and Analyzing Neural Network Microscopy Images
  • Grant number:
    23KF0296
  • Fiscal year:
    2023
  • Funding amount:
    $14,600
  • Program:
    Grant-in-Aid for JSPS Fellows
Computational Methods for Analyzing Immunoglobulin Allelic Diversity in B cells
  • Grant number:
    10751541
  • Fiscal year:
    2023
  • Funding amount:
    $14,600
  • Program:
Accelerating biomarker development through novel statistical methods for analyzing phase III/IV studies
  • Grant number:
    10568744
  • Fiscal year:
    2022
  • Funding amount:
    $14,600
  • Program:
Robust and efficient methods for analyzing complex longitudinal and survival data
  • Grant number:
    RGPIN-2022-04899
  • Fiscal year:
    2022
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual
Adaptive Data Processing, Modeling, and Quantification Methods for Analyzing Cardiac Fibrillation
  • Grant number:
    RGPIN-2020-04933
  • Fiscal year:
    2022
  • Funding amount:
    $14,600
  • Program:
    Discovery Grants Program - Individual