Scalable detection and interpretation of structural variation in human genomes

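Basic information

  • Grant number:
    10576268
  • Principal investigator:
  • Amount:
    $692,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-05-01 to 2025-02-28
  • Project status:
    Ongoing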

Project summary

Structural variation (SV) is a diverse class of genome variation that includes copy number variants (CNVs), such as deletions and duplications, as well as balanced rearrangements, such as inversions and reciprocal translocations. A typical human genome harbors >4,000 SVs larger than 300 bp, and their large size increases the potential to delete or duplicate genes, disrupt chromatin structure, and alter expression. Despite their prevalence and potential for phenotypic consequence, SVs remain notoriously difficult to detect and genotype with high accuracy. Much of this difficulty arises because the DNA sequence alignment “signals” indicating SVs are far more complex than those for single-nucleotide and insertion-deletion (INDEL) variants. Unlike SNP alignments, which vary only in allele state, alignments supporting SVs vary in state (whether or not they support an alternate structure), alignment location, and type. Consequently, the accuracy of SV discovery is much lower than that of SNP and INDEL discovery. Furthermore, SV pipelines scale poorly and are difficult to run. These challenges are a barrier to single-genome analysis, and studies of families must invest substantial effort in eliminating a sea of false positives. These problems become exponentially more acute for large-scale sequencing efforts such as TOPMed, the Centers for Common Disease Genomics, and the All of Us program. Software efficiency is key to scalability for such projects. However, comprehensive, accurate discovery is equally important. Building on more than a decade of experience developing software and analyzing SVs in diverse disease contexts, we have invested significant effort in understanding the causes of insufficient accuracy in SV discovery. These efforts, together with our research and development experience in this area, give us unique insight into improving the accuracy and scalability of SV discovery.
Our goal is to narrow the accuracy gap between SNP/INDEL discovery and structural variation discovery. These developments will empower studies of human genomes in diverse contexts and will therefore have broad impact. Our aims are to:
1. Develop a deep learning model to correct systematic variation in sequence depth. This new machine learning model will correct systematic biases in DNA sequence depth and dramatically improve the discovery of deletions and duplications.
2. Improve the speed, scalability, and accuracy of SV detection and genotyping. Using new algorithms, we will bring the accuracy of SV detection much closer to that of SNP and INDEL discovery and allow accurate SV discovery to be deployed at scale.
3. Create a map of genomic constraint for SV from population-scale genome analysis. We will deploy our new methods to detect and genotype structural variation among tens of thousands of human genomes. The resulting SV map will empower the creation of a model of genomic constraint for SV and enable new software to predict deleterious SVs, especially in the noncoding genome.

Project outputs

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
trfermikit: a tool to discover VNTR-associated deletions.
  • DOI:
    10.1093/bioinformatics/btab805
  • Publication date:
    2022-02-07
  • Journal:
  • Impact factor:
    0
  • Authors:
    McHale P;Quinlan AR
  • Corresponding author:
    Quinlan AR
Annotation of structural variants with reported allele frequencies and related metrics from multiple datasets using SVAFotate.
  • DOI:
    10.1186/s12859-022-05008-y
  • Publication date:
    2022-11-16
  • Journal:
  • Impact factor:
    3
  • Authors:
    Nicholas, Thomas J.;Cormier, Michael J.;Quinlan, Aaron R.
  • Corresponding author:
    Quinlan, Aaron R.

Other publications by Aaron R Quinlan

Extending reference assembly models
  • DOI:
    10.1186/s13059-015-0587-3
  • Publication date:
    2015-01-24
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Deanna M Church;Valerie A Schneider;Karyn Meltz Steinberg;Michael C Schatz;Aaron R Quinlan;Chen-Shan Chin;Paul A Kitts;Bronwen Aken;Gabor T Marth;Michael M Hoffman;Javier Herrero;M Lisandra Zepeda Mendoza;Richard Durbin;Paul Flicek
  • Corresponding author:
    Paul Flicek
Erratum: A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer
  • DOI:
    10.1186/s13742-015-0043-z
  • Publication date:
    2015-02-13
  • Journal:
  • Impact factor:
    3.900
  • Authors:
    Joshua Quick;Aaron R Quinlan;Nicholas J Loman
  • Corresponding author:
    Nicholas J Loman

Other grants held by Aaron R Quinlan

New algorithms and tools for large-scale genomic analyses
  • Grant number:
    10357060
  • Fiscal year:
    2022
  • Funding amount:
    $692,000
  • Project category:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    10560502
  • Fiscal year:
    2022
  • Funding amount:
    $692,000
  • Project category:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    9973582
  • Fiscal year:
    2020
  • Funding amount:
    $692,000
  • Project category:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    10341175
  • Fiscal year:
    2020
  • Funding amount:
    $692,000
  • Project category:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    10153847
  • Fiscal year:
    2020
  • Funding amount:
    $692,000
  • Project category:
Software for exploring all forms of genetic variation in any species
  • Grant number:
    9749979
  • Fiscal year:
    2017
  • Funding amount:
    $692,000
  • Project category:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8273206
  • Fiscal year:
    2012
  • Funding amount:
    $692,000
  • Project category:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    9272425
  • Fiscal year:
    2012
  • Funding amount:
    $692,000
  • Project category:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8661785
  • Fiscal year:
    2012
  • Funding amount:
    $692,000
  • Project category:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8460819
  • Fiscal year:
    2012
  • Funding amount:
    $692,000
  • Project category:

Similar overseas grants

RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
  • Grant number:
    2327346
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Standard Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
  • Grant number:
    2312555
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Standard Grant
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
  • Grant number:
    BB/Z514391/1
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Training Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
  • Grant number:
    ES/Z502595/1
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Fellowship
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
  • Grant number:
    ES/Z000149/1
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Research Grant
感性個人差指標 Affect-X の構築とビスポークAIサービスの基盤確立
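Construction of Affect-X, an index of individual differences in sensibility, and establishment of a foundation for bespoke AI services
  • Grant number:
    23K24936
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Grant-in-Aid for Scientific Research (B)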
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
  • Grant number:
    2901648
  • Fiscal year:
    2024
  • Funding amount:
    $692,000
  • Project category:
    Studentship
ERI: Developing a Trust-supporting Design Framework with Affect for Human-AI Collaboration
  • Grant number:
    2301846
  • Fiscal year:
    2023
  • Funding amount:
    $692,000
  • Project category:
    Standard Grant
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
  • Grant number:
    488039
  • Fiscal year:
    2023
  • Funding amount:
    $692,000
  • Project category:
    Operating Grants
How motor impairments due to neurodegenerative diseases affect masticatory movements
  • Grant number:
    23K16076
  • Fiscal year:
    2023
  • Funding amount:
    $692,000
  • Project category:
    Grant-in-Aid for Early-Career Scientists