Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
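Basic information
- Grant number: RGPIN-2019-06727
- Principal investigator:
- Amount: $14,600
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Duration: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract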
Statistical genetics is undergoing the same transition to big data that several branches of applied statistics are experiencing, and this transition is accelerating with the advent of high-throughput genomic experiments. My research activities have evolved accordingly throughout the years to address the new challenges brought on by the emerging genomic technologies. This research program consists of three innovative research themes dealing with such challenges, with a focus on the development of multivariate statistical tools for dependent genomics data.
Next-generation sequencing (NGS) technology now provides an exhaustive catalog of DNA sequence variation, and the challenge becomes understanding the phenotypic consequences of these variants. The first objective deals with the optimal use of multiple phenotypes in NGS association studies. Multiple correlated phenotypes often measure the same underlying trait and can bear a more direct relationship with the disease diagnosis. By providing flexible modeling of the phenotypes' dependence structure via copula models, and by exploiting the kernel trick (i.e., machine learning methods for exploring the phenotype-genotype relationship), this research axis lays out a broad model of the underlying relationship between correlated phenotypes and genetic variants. The framework will increase power to identify novel genetic variants responsible for complex human diseases, which may help to better understand disease etiology.
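As a purely illustrative aside, the idea of modeling phenotype dependence separately from the phenotype margins can be sketched with a bivariate Gaussian copula. This is a minimal sketch under stated assumptions, not the method of the research program: the choice of a Gaussian copula, the exponential and normal margins, and the names `sample_gaussian_copula` and `to_phenotypes` are all hypothetical and do not come from the abstract.

```python
import math
import random
from statistics import NormalDist

EPS = 1e-12  # keeps uniforms strictly inside (0, 1)

def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u1, u2) on the unit square from a bivariate Gaussian copula.

    rho is the correlation of the latent normal pair (an assumed parameter).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF turns each latent value into a uniform margin.
        u1 = min(max(std_normal.cdf(z1), EPS), 1.0 - EPS)
        u2 = min(max(std_normal.cdf(z2), EPS), 1.0 - EPS)
        pairs.append((u1, u2))
    return pairs

def to_phenotypes(pairs, rate=1.0, mean=120.0, sd=15.0):
    """Map copula uniforms to two hypothetical phenotype margins.

    Phenotype 1 is exponential (skewed), phenotype 2 is normal; the
    dependence between them comes only from the copula.
    """
    normal_margin = NormalDist(mean, sd)
    phenotypes = []
    for u1, u2 in pairs:
        y1 = -math.log(1.0 - u1) / rate   # inverse CDF of Exponential(rate)
        y2 = normal_margin.inv_cdf(u2)    # inverse CDF of Normal(mean, sd)
        phenotypes.append((y1, y2))
    return phenotypes

if __name__ == "__main__":
    simulated = to_phenotypes(sample_gaussian_copula(5, rho=0.8, seed=1))
    for y1, y2 in simulated:
        print(round(y1, 3), round(y2, 1))
```

The point of the sketch is the separation of concerns: the margins can be changed freely without touching the dependence structure, which is what makes copulas attractive for phenotypes measured on different scales.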
Data regularization is appealing for high-throughput genomics data because it detects/selects a smaller subset of predictors relevant to an outcome. Such data also display heterogeneity, which is of interest to many researchers but tends to be overlooked by existing predictive models. The second objective implements pillar algorithms of modern computational statistics within penalized robust regression models to capture within-subject dependence and select/detect relevant heterogeneous predictors for dependent genomics data. Such prediction models will help build genetic risk scores that can be very useful for risk stratification and clinical decision-making.
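To make the notion of penalized robust regression concrete, the following is a minimal sketch that minimizes an average Huber loss plus an L1 (lasso) penalty by proximal gradient descent. The abstract does not name a loss, a penalty, or an algorithm, so the Huber loss, the lasso penalty, the proximal gradient solver, and the names `huber_lasso`, `lam`, and `delta` are all assumptions made for illustration; the sketch also ignores within-subject dependence.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def huber_psi(r, delta):
    """Derivative of the Huber loss: linear near zero, clipped at +/- delta."""
    if abs(r) <= delta:
        return r
    return delta if r > 0 else -delta

def huber_lasso(X, y, lam, delta=1.0, n_iter=500):
    """Minimize (1/n) * sum huber(y_i - x_i . beta) + lam * ||beta||_1.

    X is a list of rows, y a list of outcomes. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    # Step size 1/L, with L bounded by the squared Frobenius norm of X over n.
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)]
        # Gradient of the smooth (Huber) part with respect to beta.
        grad = [-sum(huber_psi(resid[i], delta) * X[i][j]
                     for i in range(n)) / n
                for j in range(p)]
        # Gradient step followed by soft-thresholding (the lasso proximal step).
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam)
                for j in range(p)]
    return beta
```

The clipped derivative limits the influence of outlying observations, while the soft-thresholding step shrinks small coefficients toward zero and can set them exactly to zero, which is how such a model selects a subset of predictors.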
The third objective is a long-term goal that couples statistical tools from the first and second research axes to build a unified copula-based association framework capable of identifying heterogeneous genetic variants while providing flexible modeling of the phenotypes' dependence. It will provide more insight into how genetic variation explains phenotypic variation; this is known as the “missing heritability” problem and is encountered by most existing genetic studies.
The lack of efficient statistical methods to analyze modern genomics data is a major bottleneck for the genomics research community in understanding the related biology. I strongly believe that the strategies I propose here will be very useful for analyzing and integrating such complex data, and will help maximize their utility.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Oualkacha, Karim
A method for analyzing multiple continuous phenotypes in rare variant association studies allowing for flexible correlations in variant effects
- DOI: 10.1038/ejhg.2016.8
- Published: 2016-09-01
- Journal:
- Impact factor: 5.2
- Authors: Sun, Jianping; Oualkacha, Karim; Greenwood, Celia M. T.
- Corresponding author: Greenwood, Celia M. T.
A rare variant association test in family-based designs and non-normal quantitative traits
- DOI: 10.1002/sim.6750
- Published: 2016-03-15
- Journal:
- Impact factor: 2
- Authors: Lakhal-Chaieb, Lajmi; Oualkacha, Karim; Greenwood, Celia M. T.
- Corresponding author: Greenwood, Celia M. T.
A smoothed EM-algorithm for DNA methylation profiles from sequencing-based methods in cell lines or for a single cell type
- DOI: 10.1515/sagmb-2016-0062
- Published: 2017-12-01
- Journal:
- Impact factor: 0.9
- Authors: Lakhal-Chaieb, Lajmi; Greenwood, Celia M. T.; Oualkacha, Karim
- Corresponding author: Oualkacha, Karim
A coordinate descent algorithm for computing penalized smooth quantile regression
- DOI: 10.1007/s11222-016-9659-9
- Published: 2017-07-01
- Journal:
- Impact factor: 2.2
- Authors: Mkhadri, Abdallah; Ouhourane, Mohamed; Oualkacha, Karim
- Corresponding author: Oualkacha, Karim
Adjusted Sequence Kernel Association Test for Rare Variants Controlling for Cryptic and Family Relatedness
- DOI: 10.1002/gepi.21725
- Published: 2013-05-01
- Journal:
- Impact factor: 2.1
- Authors: Oualkacha, Karim; Dastani, Zari; Greenwood, Celia M. T.
- Corresponding author: Greenwood, Celia M. T.
Other grants held by Oualkacha, Karim
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2022
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2021
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2019
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2018
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2017
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2016
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2015
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2014
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2013
- Amount: $14,600
- Program: Discovery Grants Program - Individual
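Similar grants from the National Natural Science Foundation of China
Improving modelling of compact binary evolution.
- Grant number: 10903001
- Approval year: 2009
- Amount: CNY 200,000
- Program: Young Scientists Fund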
Similar overseas grants
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2022
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2021
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference of Dependent and High-Dimensional Data in Recent Genetic Studies
- Grant number: RGPIN-2019-06727
- Fiscal year: 2019
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2018
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2017
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2016
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2015
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2014
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Multivariate Modelling and Inference in Genetic Studies
- Grant number: 433266-2013
- Fiscal year: 2013
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Bayesian Inference for Flexible Parametric Multivariate Econometric Modelling
- Grant number: DP0985505
- Fiscal year: 2009
- Amount: $14,600
- Program: Discovery Projects