Developing robust and scalable genomics tools and databases to analyze immune receptor repertoires across diverse populations

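Basic information

  • Grant number:
    10656981
  • Principal investigator:
  • Amount:
    $823,800
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Project period:
    2023-02-10 to 2028-01-31
  • Project status:
    Ongoing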

Project abstract

Recent advances in high-throughput sequencing technologies enable cost-effective characterization of the immune system and provide novel opportunities to study the adaptive immune receptor repertoire (AIRR) at the population scale. In particular, AIRR analysis provides essential insight into the complexity of the immune system across a wide variety of human diseases, including infectious diseases, cancer, autoimmune conditions, and neurodegenerative diseases. A commonly used assay-based approach (i.e., AIRR-Seq) provides a detailed view of the adaptive immune system through deep sequencing of amplified DNA or RNA from the variable regions of the T and B cell receptor (TCR and BCR) loci. However, the limited number of samples probed by AIRR-Seq restricts the ability to detect novel population-specific V(D)J gene alleles across ethnically diverse and admixed populations. Non-targeted next-generation sequencing (NGS) (e.g., WGS) promises to fill the existing data gap by providing hundreds of thousands of NGS datasets across various ancestry groups. However, reliable and scalable bioinformatics algorithms have yet to be developed to utilize non-targeted NGS technologies to assemble novel population-specific alleles that would support effect-size heterogeneity across ancestries. Comprehensive population-specific allelic immunogenomics reference databases are lacking. This void exacerbates existing health disparities, as discoveries in medical immunogenomics continue to be a privilege and benefit for populations of European ancestry. The current state-of-the-art databases were built on the genetic architecture of individuals of European ancestry and thus fail to capture allelic variation across diverse populations. Ongoing initiatives by the Adaptive Immune Receptor Repertoire Community (AIRR-C) to improve the representation of diverse populations in reference databases (e.g., OGRDB and VDJbase) ignore individuals of non-European ancestry and only incorporate an extremely small number of individuals of European descent.
We propose to use a data science approach to study variation in the human adaptive immune system at a truly global scale, improving studies of immunological health and disease and reducing health disparities. In this study, we will develop robust and scalable bioinformatics tools and databases able to leverage the largest datasets covering individuals of various ancestries, comprising over half a million NGS samples spanning the AIRR-Seq, RNA-Seq, and WGS technologies. We will rigorously benchmark the developed bioinformatics methods on both simulated and real data to demonstrate the feasibility of using NGS-based approaches to assemble novel V(D)J alleles. The availability of large and ethnically diverse sets of samples will allow us to discover novel population-specific V(D)J alleles, which will enrich existing immunogenomics databases. To promote dissemination of the results, the novel alleles and assembled receptor sequences will be shared as an easy-to-use database with a rich set of functionalities.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by SERGHEI MANGUL

Developing robust and scalable genomics tools and databases to analyze immune receptor repertoires across diverse populations
  • Grant number:
    10910354
  • Fiscal year:
    2023
  • Funding amount:
    $823,800
  • Project category:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $823,800
  • Project category:
    Research Grant