Transfer learning leveraging large-scale transcriptomics to map disrupted gene networks in cardiovascular disease

Project Summary

Mapping the gene regulatory networks that drive human disease enables the design of network-correcting treatments that target the core disease mechanism rather than merely managing symptoms. I previously developed a framework for mapping disease-dependent gene networks that combines machine learning with human induced pluripotent stem cell modeling to enable network-based screening. That work identified a promising network-correcting therapy for cardiac valve disease, now progressing towards clinical trial, and was reported in Cell¹ and Science². However, computationally inferring the network map requires large amounts of transcriptomic data to learn the connections between genes, which impedes network-correcting drug discovery in settings with limited data, including rare diseases and diseases affecting clinically inaccessible tissues. Although data remain limited in these settings, recent advances in sequencing technologies have driven a rapid expansion in the amount of transcriptomic data available from human tissues more broadly. Recently, transfer learning has revolutionized fields such as natural language understanding and computer vision: deep learning models pretrained on large-scale general datasets can be fine-tuned towards a vast array of downstream tasks using application-specific data that would be too limited to yield meaningful predictions in isolation. To test whether an analogous approach could enable gene network predictions with limited data, I developed a novel deep learning model, Geneformer, and pretrained it on a large-scale corpus I assembled of ~30 million human single cell transcriptomes. The resulting pretrained checkpoint can be fine-tuned towards a broad range of downstream applications to accelerate discovery of key network regulators and candidate network-correcting therapies.
Geneformer consistently boosted predictive accuracy in a diverse panel of downstream tasks using only a limited set of task-specific training examples. I now propose to leverage Geneformer's learned understanding of contextual gene network dynamics to address two major challenges in cardiac biology. In Aim 1, I will determine novel dosage-sensitive gene combinations and their context dependency in cardiac cell types, generating a map of contextual dosage sensitivity for genes individually or in combination that could substantially improve the interpretation of copy number variants in the genetic diagnosis of cardiac disease. In Aim 2, I will map the dysregulated gene network and discover candidate network-correcting therapeutics in hypertrophic cardiomyopathy, a prototypical rare disease affecting clinically inaccessible tissue where progress has been impeded by limited data, to accelerate the discovery of a much-needed targeted therapeutic for this life-threatening progressive disease. Overall, Geneformer, my novel deep learning model pretrained on large-scale single cell transcriptomic data, has the potential to revolutionize network biology by using transfer learning to accelerate discovery in settings with limited data.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Christina Vicky Theodoris

Similar Overseas Grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number: MR/S03398X/2
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number: EP/Y001486/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number: 2338423
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number: MR/X03657X/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number: 2348066
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number: AH/Z505481/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10107647
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number: 2341402
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10106221
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number: AH/Z505341/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project category: Research Grant