Probabilistic methods towards understanding complex human phenotypes using genomic and healthcare data

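Basic information

  • Grant number:
    RGPIN-2019-06216
  • Principal investigator:
  • Amount:
    $28,400
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Duration:
    2019-01-01 to 2020-12-31
  • Status:
    Completed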

Project abstract

The advent of massive biological datasets challenges existing analytic frameworks. Large genomic profiling data provide the molecular basis for linking mutations to gene expression changes in specific tissues. The broad adoption of electronic health record (EHR) systems creates rich phenotypic data, including diagnostic codes, lab tests, and questionnaires. These data offer promising avenues for developing novel machine learning methods to elucidate the biological mechanisms that give rise to phenotypic diversity and interdependence. However, due to the lack of scalable inference methods, existing research is often limited to analyzing only a small snapshot of the entire dataset and is unable to account for the sparse, multimodal, longitudinal, irregularly sampled, and missing-not-at-random nature of the data.

Our long-term vision is to develop novel machine learning methods to decipher, in a human-understandable manner, the etiology of diverse phenotypes based on genetic variants, cell-type specificities, genomic regulatory elements, gene and pathway functions, and their interactions with environments. In pursuing this vision, we propose four short-term objectives. We will develop:

1. a Bayesian model that accounts for the multi-modality of the heterogeneous data distributions and predicts composite biomarkers by associating genes, tissues, lab results, and diagnosis codes via latent phenotypic topics;
2. a generative model that imputes correlated, non-randomly missing lab results and answers to self-reported questionnaires in patients' EHRs, as well as gene expression in inaccessible tissue samples of new patients;
3. an unsupervised model that infers the latent trajectories of diverse patients' health states from their longitudinal and irregularly sampled outpatient and inpatient medical records;
4. a hierarchical Bayesian network that leverages the functional impacts of sequence mutations inferred from genomic data and jointly infers the directed paths from driver genetic variants, through causal genes and pathways, to phenotypes.

The key innovation of our proposed methods is that, in contrast to existing ad hoc methods, we learn all components of our proposed models simultaneously (despite their complexity) and therefore harmonize diverse datasets with complementary information. We achieve this with scalable variational inference algorithms that leverage probability theory and deep learning techniques.

The proposed research will advance Bayesian learning for mining massive heterogeneous data, with impactful applications in medicine including composite biomarker discovery, imputation-based clinical recommendations, forecasting of health trajectories, personalized risk prediction, and deep interpretable models for inferring causal mutations and disease risks. Together, these efforts are a step towards bridging the gap between the genome and the phenome through efficient Bayesian integration of massive data, thereby improving our understanding of the cascading events from genetic mutations to a broad phenotypic spectrum.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Li, Yue

A nanoscale Fe(II) metal-organic framework with a bipyridinedicarboxylate ligand as a high performance heterogeneous Fenton catalyst
  • DOI:
    10.1039/c5ra22779h
  • Published:
    2016-01-01
  • Journal:
  • Impact factor:
    3.9
  • Authors:
    Li, Yue;Liu, Huan;Ruan, Wen-Juan
  • Corresponding author:
    Ruan, Wen-Juan
Formation and stability of W/O microemulsion formed by food grade ingredients and its oral delivery of insulin in mice
  • DOI:
    10.1016/j.jff.2017.01.006
  • Published:
    2017-03-01
  • Journal:
  • Impact factor:
    5.6
  • Authors:
    Li, Yue;Yokoyama, Wallace;Zhong, Fang
  • Corresponding author:
    Zhong, Fang
Prediction of the Old-Age Dependency Ratio in Chinese Cities Using DMSP/OLS Nighttime Light Data.
Negative capacitors and inductors enabling wideband waveguide metatronics.
  • DOI:
    10.1038/s41467-023-42808-z
  • Published:
    2023-11-03
  • Journal:
  • Impact factor:
    16.6
  • Authors:
    Qin, Xu;Fu, Pengyu;Yan, Wendi;Wang, Shuyu;Lv, Qihao;Li, Yue
  • Corresponding author:
    Li, Yue
Did Massachusetts COVID-19 vaccine lottery increase vaccine uptake?
  • DOI:
    10.1371/journal.pone.0279283
  • Published:
    2023
  • Journal:
  • Impact factor:
    3.7
  • Authors:
    Kim, Yeunkyung;Kim, Jihye;Li, Yue
  • Corresponding author:
    Li, Yue

Other grants held by Li, Yue

Probabilistic methods towards understanding complex human phenotypes using genomic and healthcare data
  • Grant number:
    RGPIN-2019-06216
  • Fiscal year:
    2022
  • Amount:
    $28,400
  • Program:
    Discovery Grants Program - Individual
Probabilistic methods towards understanding complex human phenotypes using genomic and healthcare data
  • Grant number:
    RGPIN-2019-06216
  • Fiscal year:
    2021
  • Amount:
    $28,400
  • Program:
    Discovery Grants Program - Individual
Probabilistic methods towards understanding complex human phenotypes using genomic and healthcare data
  • Grant number:
    RGPIN-2019-06216
  • Fiscal year:
    2020
  • Amount:
    $28,400
  • Program:
    Discovery Grants Program - Individual
Probabilistic methods towards understanding complex human phenotypes using genomic and healthcare data
  • Grant number:
    DGECR-2019-00253
  • Fiscal year:
    2019
  • Amount:
    $28,400
  • Program:
    Discovery Launch Supplement
Multisensory integration training to improve working memory
  • Grant number:
    515134-2017
  • Fiscal year:
    2017
  • Amount:
    $28,400
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Master's
Discovery of RNA Biomarkers for Prostate Cancer using High-Throughput Sequencing Data
  • Grant number:
    426531-2012
  • Fiscal year:
    2014
  • Amount:
    $28,400
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Discovery of RNA Biomarkers for Prostate Cancer using High-Throughput Sequencing Data
  • Grant number:
    426531-2012
  • Fiscal year:
    2013
  • Amount:
    $28,400
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Discovery of RNA Biomarkers for Prostate Cancer using High-Throughput Sequencing Data
  • Grant number:
    426531-2012
  • Fiscal year:
    2012
  • Amount:
    $28,400
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Machine-learning in finding correlates of immunity against pertussis
  • Grant number:
    393825-2010
  • Fiscal year:
    2010
  • Amount:
    $28,400
  • Program:
    Alexander Graham Bell Canada Graduate Scholarships - Master's
A probabilistic model for whole-proteome analyses
  • Grant number:
    384849-2009
  • Fiscal year:
    2009
  • Amount:
    $28,400
  • Program:
    University Undergraduate Student Research Awards

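Similar NSFC grants

Free discontinuity problems in complex image processing and their level set methods
  • Grant number:
    60872130
  • Approval year:
    2008
  • Amount:
    CNY 280,000
  • Program:
    General Program
Computational Methods for Analyzing Toponome Data
  • Grant number:
    60601030
  • Approval year:
    2006
  • Amount:
    CNY 170,000
  • Program:
    Young Scientists Fund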

Similar overseas grants

MCA: Towards a Theory of Engineering Identity Development & Persistence of Minoritized Students with Imposter Feelings: A Longitudinal Mixed-methods Study of Developmental Networks
  • Grant number:
    2421846
  • Fiscal year:
    2024
  • Amount:
    $28,400
  • Program:
    Standard Grant
State Transformation in Historical Syria: Towards New Middle Eastern Area Studies with Mixed Methods
  • Grant number:
    23H00043
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
    Grant-in-Aid for Scientific Research (A)
Towards treatment for the complex patient: investigations of low-intensity focused ultrasound.
  • Grant number:
    10775216
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
Towards the understanding of how chaperones function and prevent amyloidogenic diseases
  • Grant number:
    10734397
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
Towards equitable early identification of autism spectrum disorders in females
  • Grant number:
    10722011
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
PTSD and Autoimmune Disease: Towards Causal Effects, Risk Factors, and Mitigators
  • Grant number:
    10696671
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
Towards a neurobiology of "oromanual" motor control: behavioral analysis and neural mechanisms
  • Grant number:
    10819032
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
Towards personalized medicine: pathophysiologic contributions to post-stroke sleep apnea
  • Grant number:
    10654941
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
On Demand Dissoluble Supramolecular Hydrogels: Towards Pain Free Burn Dressings
  • Grant number:
    10658220
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program:
Towards an integrated analytics solution to creating a spatially-resolved single-cell multi-omics brain atlas
  • Grant number:
    10724843
  • Fiscal year:
    2023
  • Amount:
    $28,400
  • Program: