Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics

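Basic information

  • Grant number:
    RGPIN-2018-05147
  • Principal investigator:
  • Amount:
    $22,600
  • Host institution:
  • Host institution country:
    Canada
  • Project type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Duration:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed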

Project abstract

Understanding the genetic basis of phenotypic changes and predicting phenotypes from genotypes are long-standing goals of genetics. In the era of genomic big data, scalable tools, i.e., software that allows rapid analysis of very large datasets using the limited memory of a personal computer, are an emerging requirement. The long-term goal of my research program is to develop novel statistical models and their scalable implementations to facilitate genotype-phenotype mapping and prediction.

Background. Recent advances in high-throughput sequencing technologies, including whole-exome sequencing, RNA-Seq (for the transcriptome), and Bisulfite-Seq (for the methylome), have generated excitement in genetics-related areas. However, there is a lack of tools that seamlessly integrate multi-scale omics datasets for the precise prediction of phenotypes in a biologically relevant and meaningful context. In particular, gene-gene interactions have not been fully characterized or utilized in predictors. Moreover, there is a lack of scalable tools that permit effective statistical analysis of large datasets without requiring a machine with very large memory.

Objectives. Building upon my previous work on identifying gene-gene interactions and implementing scalable software, I will focus on three short-term objectives: (1) Identify gene interactions from genotype-phenotype data by integrating multi-scale omics data; Bayesian networks and frequent itemset mining will be combined to achieve this goal. (2) Build a polygenic phenotype predictor that integrates gene interactions; an extension of Group LASSO will be implemented for this. (3) Create scalable software implementing the aforementioned statistical models using disk-based solutions, i.e., memory virtualization and huge-page techniques from computer science. The software will store large data on disk while allowing rapid calculation as if the data resided in main memory. The multi-scale omics data from the 1,001 Arabidopsis Genomes Project and the multi-scale omics plant data generated in Alberta will be used.

Impact. The proposed research will not only provide a new theoretical framework for statistical genetics, but will also furnish novel computational tools to assist experimental scientists carrying out gene mapping projects. Moreover, it will enable practitioners in agriculture and health to improve phenotype predictions, benefiting Canadian food production and the health system. Practically, my software represents a scalable solution for big-data analyses with minimal memory usage. This will be particularly relevant to the many Canadian research groups that do not have immediate access to high-performance computing facilities. This program will train highly qualified personnel (HQP) to carry out bioinformatics and biostatistics analyses that fully utilize future genomic big data.
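The disk-based idea in objective (3) can be illustrated with a minimal sketch. It is not the project's software: it assumes Python with NumPy and uses `numpy.memmap` as a stand-in for the memory-virtualization approach the abstract describes. The file name, the matrix layout (rows as samples, columns as markers), and both function names are hypothetical.

```python
import os
import tempfile

import numpy as np


def column_means_out_of_core(path, n_rows, n_cols, chunk_rows=1024):
    """Compute per-column means of a float32 matrix stored on disk.

    Only `chunk_rows` rows are copied into RAM at a time; the operating
    system pages the rest in from disk on demand.
    """
    data = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, n_cols))
    totals = np.zeros(n_cols, dtype=np.float64)
    for start in range(0, n_rows, chunk_rows):
        # Converting to float64 copies just this chunk into memory.
        block = np.asarray(data[start:start + chunk_rows], dtype=np.float64)
        totals += block.sum(axis=0)
    return totals / n_rows


def write_demo_matrix(path, n_rows, n_cols, seed=0):
    """Write a small random genotype-like matrix (values 0, 1, 2) to disk.

    Hypothetical helper for the demo; returns an in-memory copy that is
    used only to check the out-of-core result.
    """
    rng = np.random.default_rng(seed)
    out = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_rows, n_cols))
    out[:] = rng.integers(0, 3, size=(n_rows, n_cols))
    out.flush()
    return np.array(out)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "genotypes.f32")  # hypothetical file name
        reference = write_demo_matrix(path, n_rows=5000, n_cols=20)
        means = column_means_out_of_core(path, 5000, 20, chunk_rows=512)
        print(np.allclose(means, reference.mean(axis=0)))
```

The chunk size trades memory for the number of disk reads. A real implementation would presumably tune it, and the on-disk layout, to the access pattern of the statistical model.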

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Zhang, Qingrun

Stabilized COre gene and Pathway Election uncovers pan-cancer shared pathways and a cancer-specific driver.
  • DOI:
    10.1126/sciadv.abo2846
  • Published:
    2022-12-21
  • Journal:
  • Impact factor:
    13.6
  • Authors:
    Kossinna, Pathum;Cai, Weijia;Lu, Xuewen;Shemanko, Carrie S.;Zhang, Qingrun
  • Corresponding author:
    Zhang, Qingrun
JAWAMix5: an out-of-core HDF5-based Java implementation of whole-genome association studies using mixed models
  • DOI:
    10.1093/bioinformatics/btt122
  • Published:
    2013-05-01
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Long, Quan;Zhang, Qingrun;Nordborg, Magnus
  • Corresponding author:
    Nordborg, Magnus
Universal primers for HBV genome DNA amplification across subtypes: a case study for designing more effective viral primers (Retracted article. See vol 4, pg Nil_1, 2007)
  • DOI:
    10.1186/1743-422x-4-92
  • Published:
    2007-09-24
  • Journal:
  • Impact factor:
    4.8
  • Authors:
    Zhang, Qingrun;Wu, Guanghua;Zeng, Changqing
  • Corresponding author:
    Zeng, Changqing
Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden.
  • DOI:
    10.1038/ng.2678
  • Published:
    2013-08
  • Journal:
  • Impact factor:
    30.8
  • Authors:
    Long, Quan;Rabanal, Fernando A.;Meng, Dazhe;Huber, Christian D.;Farlow, Ashley;Platzer, Alexander;Zhang, Qingrun;Vilhjalmsson, Bjarni J.;Korte, Arthur;Nizhynska, Viktoria;Voronin, Viktor;Korte, Pamela;Sedman, Laura;Mandakova, Terezie;Lysak, Martin A.;Seren, Uemit;Hellmann, Ines;Nordborg, Magnus
  • Corresponding author:
    Nordborg, Magnus
A second generation human haplotype map of over 3.1 million SNPs.
  • DOI:
    10.1038/nature06258
  • Published:
    2007-10-18
  • Journal:
  • Impact factor:
    64.8
  • Authors:
    Frazer, Kelly A.;Ballinger, Dennis G.;Cox, David R.;Hinds, David A.;Stuve, Laura L.;Gibbs, Richard A.;Belmont, John W.;Boudreau, Andrew;Hardenbol, Paul;Leal, Suzanne M.;Pasternak, Shiran;Wheeler, David A.;Willis, Thomas D.;Yu, Fuli;Yang, Huanming;Zeng, Changqing;Gao, Yang;Hu, Haoran;Hu, Weitao;Li, Chaohua;Lin, Wei;Liu, Siqi;Pan, Hao;Tang, Xiaoli;Wang, Jian;Wang, Wei;Yu, Jun;Zhang, Bo;Zhang, Qingrun;Zhao, Hongbin;Zhao, Hui;Zhou, Jun;Gabriel, Stacey B.;Barry, Rachel;Blumenstiel, Brendan;Camargo, Amy;Defelice, Matthew;Faggart, Maura;Goyette, Mary;Gupta, Supriya;Moore, Jamie;Nguyen, Huy;Onofrio, Robert C.;Parkin, Melissa;Roy, Jessica;Stahl, Erich;Winchester, Ellen;Ziaugra, Liuda;Altshuler, David;Shen, Yan;Yao, Zhijian;Huang, Wei;Chu, Xun;He, Yungang;Jin, Li;Liu, Yangfan;Shen, Yayun;Sun, Weiwei;Wang, Haifeng;Wang, Yi;Wang, Ying;Xiong, Xiaoyan;Xu, Liang;Waye, Mary M. Y.;Tsui, Stephen K. W.;Wong, J. Tze-Fei;Galver, Luana M.;Fan, Jian-Bing;Gunderson, Kevin;Murray, Sarah S.;Oliphant, Arnold R.;Chee, Mark S.;Montpetit, Alexandre;Chagnon, Fanny;Ferretti, Vincent;Leboeuf, Martin;Olivier, Jean-François;Phillips, Michael S.;Roumy, Stephanie;Sallee, Clementine;Verner, Andrei;Hudson, Thomas J.;Kwok, Pui-Yan;Cai, Dongmei;Koboldt, Daniel C.;Miller, Raymond D.;Pawlikowska, Ludmila;Taillon-Miller, Patricia;Xiao, Ming;Tsui, Lap-Chee;Mak, William;Song, You Qiang;Tam, Paul K. H.;Nakamura, Yusuke;Kawaguchi, Takahisa;Kitamoto, Takuya;Morizono, Takashi;Nagashima, Atsushi;Ohnishi, Yozo;Sekine, Akihiro;Tanaka, Toshihiro;Tsunoda, Tatsuhiko;Deloukas, Panos;Bird, Christine P.;Delgado, Marcos;Dermitzakis, Emmanouil T.;Gwilliam, Rhian;Hunt, Sarah;Morrison, Jonathan;Powell, Don;Stranger, Barbara E.;Whittaker, Pamela;Bentley, David R.;Daly, Mark J.;de Bakker, Paul I. W.;Barrett, Jeff;Chretien, Yves R.;Maller, Julian;McCarroll, Steve;Patterson, Nick;Pe'er, Itsik;Price, Alkes;Purcell, Shaun;Richter, Daniel J.;Sabeti, Pardis;Saxena, Richa;Schaffner, Stephen F.;Sham, Pak C.;Varilly, Patrick;Altshuler, David;Stein, Lincoln D.;Krishnan, Lalitha;Smith, Albert Vernon;Tello-Ruiz, Marcela K.;Thorisson, Gudmundur A.;Chakravarti, Aravinda;Chen, Peter E.;Cutler, David J.;Kashuk, Carl S.;Lin, Shin;Abecasis, Goncalo R.;Guan, Weihua;Li, Yun;Munro, Heather M.;Qin, Zhaohui Steve;Thomas, Daryl J.;McVean, Gilean;Auton, Adam;Bottolo, Leonardo;Cardin, Niall;Eyheramendy, Susana;Freeman, Colin;Marchini, Jonathan;Myers, Simon;Spencer, Chris;Stephens, Matthew;Donnelly, Peter;Cardon, Lon R.;Clarke, Geraldine;Evans, David M.;Morris, Andrew P.;Weir, Bruce S.;Tsunoda, Tatsuhiko;Johnson, Todd A.;Mullikin, James C.;Sherry, Stephen T.;Feolo, Michael;Skol, Andrew
  • Corresponding author:
    Skol, Andrew

Other grants held by Zhang, Qingrun

Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics
  • Grant number:
    RGPIN-2018-05147
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
    Discovery Grants Program - Individual
Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics
  • Grant number:
    RGPIN-2018-05147
  • Fiscal year:
    2021
  • Funding amount:
    $22,600
  • Project type:
    Discovery Grants Program - Individual
A GPU Server for Integration of Machine Learning in Mathematics and Statistics Research and Training
  • Grant number:
    RTI-2021-00675
  • Fiscal year:
    2020
  • Funding amount:
    $22,600
  • Project type:
    Research Tools and Instruments
Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics
  • Grant number:
    RGPIN-2018-05147
  • Fiscal year:
    2020
  • Funding amount:
    $22,600
  • Project type:
    Discovery Grants Program - Individual
Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics
  • Grant number:
    RGPIN-2018-05147
  • Fiscal year:
    2019
  • Funding amount:
    $22,600
  • Project type:
    Discovery Grants Program - Individual
Statistical models and computational tools for gene-gene interaction analyses by utilizing multi-scale omics
  • Grant number:
    DGECR-2018-00061
  • Fiscal year:
    2018
  • Funding amount:
    $22,600
  • Project type:
    Discovery Launch Supplement

Similar National Natural Science Foundation of China (NSFC) grants

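Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Grant number:
  • Year approved:
    2024
  • Funding amount:
  • Project type:
    Collaborative Innovation Research Team
Sources and formation mechanisms of haze in southern Hebei
  • Grant number:
    41105105
  • Year approved:
    2011
  • Funding amount:
    CNY 250,000
  • Project type:
    Young Scientists Fund
Insurance risk models, investment portfolios, and related topics
  • Grant number:
    10971157
  • Year approved:
    2009
  • Funding amount:
    CNY 240,000
  • Project type:
    General Program
Regulation of the ERK signaling pathway by RKTG and its effect on tumorigenesis
  • Grant number:
    30830037
  • Year approved:
    2008
  • Funding amount:
    CNY 1,900,000
  • Project type:
    Key Program
Synthesis and biochemical mimicry of novel chiral NAD(P)H models
  • Grant number:
    20472090
  • Year approved:
    2004
  • Funding amount:
    CNY 230,000
  • Project type:
    General Program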

Similar overseas grants

SCH: Novel and Interpretable Statistical Learning for Brain Images in AD/ADRDs
  • Grant number:
    10816764
  • Fiscal year:
    2023
  • Funding amount:
    $22,600
  • Project type:
Statistical methods to characterize causal mechanisms by which air pollution affects the recurrence of cardiovascular events
  • Grant number:
    10660281
  • Fiscal year:
    2023
  • Funding amount:
    $22,600
  • Project type:
Statistical methods for population-level cell-type-specific analyses of tissue omics data for Alzheimer's disease
  • Grant number:
    10589254
  • Fiscal year:
    2023
  • Funding amount:
    $22,600
  • Project type:
Statistical models for the integrative analysis of complex biomedical images with manifold structure
  • Grant number:
    10590469
  • Fiscal year:
    2023
  • Funding amount:
    $22,600
  • Project type:
Statistical methods for longitudinal integrated mechanistic modeling of multiview data
  • Grant number:
    10445698
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
CORE 1/2: INIA Stress and Chronic Alcohol Interactions: Computational and Statistical Analysis Core (CSAC)
  • Grant number:
    10411629
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
Statistical methods for longitudinal integrated mechanistic modeling of multiview data
  • Grant number:
    10685565
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
Core B: Statistical and Computational Analysis Core
  • Grant number:
    10698077
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
Efficient Statistical and Computational Methods for Genetics and Dynamical Models
  • Grant number:
    RGPIN-2019-06131
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
    Discovery Grants Program - Individual
Statistical and Computational Optimisation In Pricing Models (Insurance)
  • Grant number:
    2669153
  • Fiscal year:
    2022
  • Funding amount:
    $22,600
  • Project type:
    Studentship