Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles

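Basic information

  • Grant number:
    10218766
  • Principal investigator:
  • Amount:
    $540,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-06-01 to 2025-03-31
  • Project status:
    Ongoing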

Project summary

Genomic and other “omic” profiles hold immense potential for advancing personalized/precision medicine by enabling the accurate prediction of disease phenotypes or outcomes for individual patients, which can be used by a clinician to design an appropriate plan of care. However, despite this potential, the actual impact of these omic profiles on disease phenotype prediction may be limited by the fact that even large cohorts collecting these data do not cover large enough numbers of individuals. In contrast, a variety of clinical data types, such as laboratory tests and physician notes, are routinely collected and studied for a much larger number of patients undergoing treatment for such diseases at medical centers. The abundance of these clinical data, and their complementarity with multi-omic data, offer an opportunity to advance personalized medicine by integrating these disparate types of data. However, this disparity in data formats, namely several omic profiles being structured, and several clinical data types, such as physician notes, being unstructured, poses challenges for this integration. An associated challenge due to this disparity is that different classes of computational methods are likely to be the most effective for predicting disease phenotypes from these clinical and omics datasets. These challenges pose barriers for current data integration methods to address this problem. Here, we propose an innovative approach to this integration by assimilating diverse base phenotype predictors inferred from individual clinical and omics datasets into heterogeneous ensembles. These ensembles, which have shown promise for several other computational genomics problems, can aggregate an unrestricted number and variety of base predictors, which is ideal for this integration problem.
Specifically, we describe how existing heterogeneous ensemble methods for single datasets can be transformed and advanced to address the multiple clinical and omic dataset integration problem. In particular, we detail novel algorithms for improving these integrative ensembles by modeling and incorporating the inherent patient and dataset heterogeneity in these datasets. We also propose novel algorithms for leveraging the inherent complementarity among clinical and omic datasets, as well as an innovative approach for handling expected missing data, both with the goal of making ensemble phenotype predictors more accurate and applicable to patient cohorts. To assess the performance of this novel suite of data integration-oriented heterogeneous ensembles, we will validate their effectiveness for predicting asthma and Inflammatory Bowel Disease phenotypes in substantial patient cohorts with diverse omics and clinical datasets. We will publicly release efficient software implementations of the methods developed in this project to enable others to carry out similar analyses with other diverse data collections. Successful accomplishment of the proposed work will contribute to the advancement of personalized medicine through accurate individualized prediction of disease phenotypes.
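
As a rough illustration of the idea in the summary (one base predictor per clinical or omics dataset, with the predictions aggregated afterwards), the following minimal sketch may help. It is not the project's method: the weighted-mean aggregation, the toy threshold predictors, and every name in it (threshold_predictor, ensemble_predict, the modality keys) are assumptions chosen for illustration. Missing datasets are simply skipped, which is only one of many possible ways to handle missing data.

```python
from statistics import mean
from typing import Callable, Dict, Optional, Sequence

# A base predictor maps one dataset's features to a score in [0, 1].
BasePredictor = Callable[[Sequence[float]], float]


def threshold_predictor(cutoff: float) -> BasePredictor:
    """Toy stand-in for a trained model: 1.0 if the mean feature exceeds cutoff."""
    def predict(features: Sequence[float]) -> float:
        return 1.0 if mean(features) > cutoff else 0.0
    return predict


def ensemble_predict(
    predictors: Dict[str, BasePredictor],
    patient: Dict[str, Optional[Sequence[float]]],
    weights: Optional[Dict[str, float]] = None,
) -> Optional[float]:
    """Weighted mean of base predictions over the datasets available for a patient.

    Datasets that are absent (or None) for this patient are skipped.
    Returns None when no dataset is available at all.
    """
    total = 0.0
    weight_sum = 0.0
    for name, predictor in predictors.items():
        features = patient.get(name)
        if features is None:
            continue  # this dataset was not collected for this patient
        w = 1.0 if weights is None else weights[name]
        total += w * predictor(features)
        weight_sum += w
    if weight_sum == 0:
        return None
    return total / weight_sum


# Hypothetical modalities and cutoffs, for illustration only.
predictors = {
    "gene_expression": threshold_predictor(0.5),
    "lab_tests": threshold_predictor(2.0),
}
patient_a = {"gene_expression": [0.9, 0.7], "lab_tests": [1.0, 1.5]}
patient_b = {"gene_expression": None, "lab_tests": [3.0, 2.5]}

score_a = ensemble_predict(predictors, patient_a)
score_b = ensemble_predict(predictors, patient_b)
```

In practice each base predictor would be a model trained on its own dataset, and the aggregation step could itself be learned (for example by stacking) rather than fixed as a mean.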

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Gaurav Pandey

Multi-modal data integration to identify kinase substrates
  • Grant number:
    10659156
  • Fiscal year:
    2022
  • Funding amount:
    $540,000
  • Project category:

Multi-modal data integration to identify kinase substrates
  • Grant number:
    10451941
  • Fiscal year:
    2022
  • Funding amount:
    $540,000
  • Project category:

Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles
  • Grant number:
    10589827
  • Fiscal year:
    2021
  • Funding amount:
    $540,000
  • Project category:

Integrating genomic and clinical data to predict disease phenotypes using heterogeneous ensembles
  • Grant number:
    10409755
  • Fiscal year:
    2021
  • Funding amount:
    $540,000
  • Project category:

Boosting the Translational Impact of Scientific Competitions by Ensemble Learning
  • Grant number:
    8864679
  • Fiscal year:
    2015
  • Funding amount:
    $540,000
  • Project category:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Continuing Grant

CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Continuing Grant

CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Continuing Grant

CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Standard Grant

CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Standard Grant

CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Standard Grant

EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Standard Grant

CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Continuing Grant

CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Continuing Grant

DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $540,000
  • Project category:
    Research Grant