Transfer learning leveraging large-scale transcriptomics to map disrupted gene networks in cardiovascular disease

Project Summary

Mapping the gene regulatory networks that drive human disease enables the design of network-correcting treatments that target the core disease mechanism rather than merely managing symptoms. I previously developed a framework for mapping disease-dependent gene networks that enables network-based screening by leveraging machine learning and human induced pluripotent stem cell modeling. This framework identified a promising network-correcting therapy for cardiac valve disease that is currently progressing towards clinical trial, as reported in Cell (ref. 1) and Science (ref. 2). However, computationally inferring the network map requires large amounts of transcriptomic data to learn the connections between genes, which impedes network-correcting drug discovery in settings with limited data, including rare diseases and diseases affecting clinically inaccessible tissues.

Although data remain limited in these settings, recent advances in sequencing technologies have driven a rapid expansion in the amount of transcriptomic data available from human tissues more broadly. Recently, transfer learning has revolutionized fields such as natural language understanding and computer vision: deep learning models pretrained on large-scale general datasets can be fine-tuned towards a vast array of downstream tasks using application-specific data that would be too limited to yield meaningful predictions in isolation. To test whether an analogous approach could enable gene network predictions with limited data, I developed a novel deep learning model, Geneformer, and pretrained it on a large-scale corpus I assembled from ~30 million human single cell transcriptomes. The resulting pretrained checkpoint can be fine-tuned towards a broad range of downstream applications to accelerate discovery of key network regulators and candidate network-correcting therapies. Geneformer consistently boosted predictive accuracy in a diverse panel of downstream tasks using only a limited set of task-specific training examples.

I now propose to leverage Geneformer's learned understanding of contextual gene network dynamics to address two major challenges in cardiac biology. In Aim 1, I will determine novel dosage-sensitive gene combinations and their context dependency in cardiac cell types, thereby generating a map of contextual dosage sensitivity for genes individually or in combination that could dramatically improve our interpretation of copy number variants in the genetic diagnosis of cardiac disease. In Aim 2, I will map the dysregulated gene network and discover candidate network-correcting therapeutics in hypertrophic cardiomyopathy, a prototypical rare disease affecting clinically inaccessible tissue where progress has been impeded by limited data, to accelerate the discovery of a much-needed targeted therapeutic for this life-threatening progressive disease. Overall, Geneformer, pretrained with large-scale single cell transcriptomic data, has the potential to revolutionize the field of network biology through transfer learning, accelerating discovery in settings with limited data.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Christina Vicky Theodoris

Similar Overseas Grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number: MR/S03398X/2
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number: EP/Y001486/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number: 2338423
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number: MR/X03657X/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number: 2348066
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number: AH/Z505481/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10107647
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number: 2341402
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number: 10106221
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number: AH/Z505341/1
  • Fiscal year: 2024
  • Funding amount: $472,500
  • Project type: Research Grant