Scalable detection and interpretation of structural variation in human genomes
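Basic information
- Grant number: 10341175
- Principal investigator:
- Funding amount: $692,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-05-01 to 2024-02-29
- Project status: Completed
- Source: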
- Keywords: Acute; Affect; Algorithmic Software; Algorithms; All of Us Research Program; Alleles; Area; Automobile Driving; Biological Assay; Chromatin Structure; Chromosome Structures; Clip; Cloud Computing; Code; Communities; Complex; Computer software; Copy Number Polymorphism; DNA; DNA Sequence; Data; Data Reporting; Detection; Development; Disease; Environment; Error Sources; Exhibits; Family Study; Funding; Future; Gene Duplication; Gene Expression; Gene Fusion; Gene Structure; Genetic; Genetic Diseases; Genetic Variation; Genome; Genomics; Genotype; Goals; Human; Human Genome; Individual; Laboratories; Large-Scale Sequencing; Location; Maps; Methods; Modeling; Noise; Nucleotides; Paint; Pathogenicity; Performance; Phenotype; Population; Positioning Attribute; Prevalence; Process; Reciprocal Translocation; Research; Running; Sampling; Sea; Sensitivity and Specificity; Sequence Alignment; Series; Signal Transduction; Software Tools; Source; Speed; Structure; Systematic Bias; Techniques; Technology; Training; Trans-Omics for Precision Medicine; United States National Institutes of Health; Untranslated RNA; Variant; algorithm development; base; convolutional neural network; deep learning; deep learning model; developmental disease; dosage; exome; experience; genome analysis; genome sequencing; genome-wide; human disease; improved; innovation; insertion/deletion mutation; insight; large datasets; machine learning model; method development; nanopore; novel; prevent; research and development; software development; success; tool; variant detection; whole genome
Project abstract
PROJECT SUMMARY
Structural variation (SV) is a diverse class of genome variation that includes copy number variants (CNVs),
such as deletions and duplications, as well as balanced rearrangements, such as inversions and reciprocal
translocations. A typical human genome harbors >4,000 SVs larger than 300 bp, and their large size increases
the potential to delete or duplicate genes, disrupt chromatin structure, and alter expression. Despite their
prevalence and potential for phenotypic consequence, SVs remain notoriously difficult to detect and genotype
with high accuracy. Much of this difficulty arises because the DNA sequence alignment “signals” indicating
SVs are far more complex than those for single-nucleotide and insertion/deletion (INDEL) variants. Unlike SNP
alignments, which vary only in allele state, alignments supporting SVs vary in state (supporting an alternate
structure or not), alignment location, and type. Consequently, the accuracy of SV discovery is much lower than
that of SNPs and INDELs. Furthermore, SV pipelines scale poorly and are difficult to run. These challenges are
a barrier for single-genome analysis, and studies of families must invest substantial effort into eliminating
a sea of false positives. These problems become far more acute for large-scale sequencing efforts such as
TOPMed, the Centers for Common Disease Genetics, and the All of Us program. Software efficiency is key to
scalability for such projects. However, comprehensive, accurate discovery is equally important.
Building upon more than a decade of experience developing software and analyzing SV in diverse
disease contexts, we have invested significant effort into understanding why SV discovery remains
insufficiently accurate. These efforts, together with our research and development experience in this area,
give us unique insight into improving the accuracy and scalability of SV discovery. Our overall goal is to
narrow the accuracy gap between SNP/INDEL discovery and structural variation discovery. These developments
will empower studies of human genomes in diverse contexts and will therefore have broad impact. Our specific goals are to:
1. Develop a deep learning model to correct systematic variation in sequence depth. This new machine
learning model will correct systematic biases in DNA sequence depth and dramatically improve the
discovery of deletions and duplications.
2. Improve the speed, scalability, and accuracy of SV detection and genotyping. Using new algorithms,
we will bring the accuracy of SV detection much closer to that of SNP and INDEL discovery and allow
accurate SV discovery to be deployed at scale.
3. Create a map of genomic constraint for SV from population-scale genome analysis. We will deploy
our new methods to detect and genotype structural variation among tens of thousands of human genomes.
The resulting SV map will empower the creation of a model of genomic constraint for SV and enable new
software to predict deleterious SVs, especially in the noncoding genome.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Aaron R Quinlan
Extending reference assembly models
- DOI: 10.1186/s13059-015-0587-3
- Published: 2015-01-24
- Journal:
- Impact factor: 9.400
- Authors: Deanna M Church; Valerie A Schneider; Karyn Meltz Steinberg; Michael C Schatz; Aaron R Quinlan; Chen-Shan Chin; Paul A Kitts; Bronwen Aken; Gabor T Marth; Michael M Hoffman; Javier Herrero; M Lisandra Zepeda Mendoza; Richard Durbin; Paul Flicek
- Corresponding author: Paul Flicek
Erratum: A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer
- DOI: 10.1186/s13742-015-0043-z
- Published: 2015-02-13
- Journal:
- Impact factor: 3.900
- Authors: Joshua Quick; Aaron R Quinlan; Nicholas J Loman
- Corresponding author: Nicholas J Loman
Other grants held by Aaron R Quinlan
New algorithms and tools for large-scale genomic analyses
- Grant number: 10357060
- Fiscal year: 2022
- Funding amount: $692,000
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 10560502
- Fiscal year: 2022
- Funding amount: $692,000
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10576268
- Fiscal year: 2020
- Funding amount: $692,000
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 9973582
- Fiscal year: 2020
- Funding amount: $692,000
- Project category:
Scalable detection and interpretation of structural variation in human genomes
- Grant number: 10153847
- Fiscal year: 2020
- Funding amount: $692,000
- Project category:
Software for exploring all forms of genetic variation in any species
- Grant number: 9749979
- Fiscal year: 2017
- Funding amount: $692,000
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8273206
- Fiscal year: 2012
- Funding amount: $692,000
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 9272425
- Fiscal year: 2012
- Funding amount: $692,000
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8661785
- Fiscal year: 2012
- Funding amount: $692,000
- Project category:
New algorithms and tools for large-scale genomic analyses
- Grant number: 8460819
- Fiscal year: 2012
- Funding amount: $692,000
- Project category:
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Fellowship
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Research Grant
感性個人差指標 Affect-X の構築とビスポークAIサービスの基盤確立 (Building Affect-X, an index of individual differences in sensibility, and establishing a foundation for bespoke AI services)
- Grant number: 23K24936
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Grant-in-Aid for Scientific Research (B)
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
- Grant number: 2901648
- Fiscal year: 2024
- Funding amount: $692,000
- Project category: Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
- Grant number: 488039
- Fiscal year: 2023
- Funding amount: $692,000
- Project category: Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
- Grant number: 23K00129
- Fiscal year: 2023
- Funding amount: $692,000
- Project category: Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
- Grant number: 2883985
- Fiscal year: 2023
- Funding amount: $692,000
- Project category: Studentship