The construction and utility of reference pan-genome graphs
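Basic information
- Grant number: 10777673
- Principal investigator:
- Amount: $802,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-02-04 to 2024-02-29
- Project status: Completed
- Source: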
- Keywords: Address; Algorithms; Base Sequence; Biomedical Research; Cactaceae; Communities; Complex; Data; Data Analyses; Data Set; Development; Diagnostic; Disease; Genes; Genetic Research; Genetic Variation; Genome; Genotype; Germ-Line Mutation; Graph; Haplotypes; Heart; Human; Human Genome; Human Resources; Individual; Libraries; Maps; Medical; Methods; Modeling; Morphologic artifacts; Performance; Phenotype; Play; Population; Population Genetics; Research; Resolution; Resources; Role; Structure; System; Update; Variant; Work; cancer genomics; design; experience; human pangenome; human population genetics; human reference genome; improved; mosaic; pan-genome; practical application; programs; reference genome; segregation; success; support tools; tool; transcriptome sequencing; working group
Project abstract
PROJECT SUMMARY
The current human reference genome, GRCh38, plays a central role in medical and population human genetics.
It primarily models a single human individual and is missing hundreds of thousands of large structural variations
segregating in the population. This underrepresentation of genetic diversity leads to various artifacts in data
analysis and significantly hampers our understanding of the functional and medical relevance of these large
human variations, which may collectively have pervasive impact. To address this issue, we will extend our
previous work on sequence graphs and alignment algorithms and construct a pan-genome reference graph from
hundreds of long-read human assemblies that more completely represent genetic diversity. Specifically, we will
(1) design a reference graph model with a stable coordinate system compatible with GRCh38 and develop
toolkits and libraries to interact with this model; (2) develop minimizer-based sequence-to-graph alignment
algorithms for short and long sequences; (3) incrementally construct a reference graph by mapping assemblies
to an existing graph and updating the graph; and (4) develop a graph-based genotyping algorithm and apply it
to short-read based projects to call structural variations missed by the current pipelines. Upon completion, the
proposed project could replace the current practices based on a linear genome and will enable the profiling and
study of complex human variations missed in most current research.
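
Aim (2) above refers to minimizer-based alignment. As a rough illustration of the general idea only, and not the project's implementation, the sketch below picks the smallest k-mer in each window of `w` consecutive k-mers and uses shared minimizers as seed anchors between two sequences. Several simplifications are assumptions made here: k-mers are ordered lexicographically (production tools typically order by a hash), reverse complements are ignored, and the target is a plain linear string rather than a graph. The function names `minimizers` and `seed_anchors` are hypothetical.

```python
def minimizers(seq: str, k: int = 5, w: int = 4) -> list[tuple[int, str]]:
    """Return (position, k-mer) for the smallest k-mer in each window of w k-mers.

    Ordering is lexicographic (an assumption for readability); ties are
    broken by taking the leftmost k-mer. Consecutive windows that select
    the same position are reported once.
    """
    n_kmers = len(seq) - k + 1
    if n_kmers < w:
        return []  # sequence too short to form a single full window
    kmers = [seq[i:i + k] for i in range(n_kmers)]
    picked: list[tuple[int, str]] = []
    for start in range(n_kmers - w + 1):
        # Leftmost smallest k-mer among positions start .. start + w - 1.
        pos = min(range(start, start + w), key=lambda i: (kmers[i], i))
        if not picked or picked[-1][0] != pos:
            picked.append((pos, kmers[pos]))
    return picked


def seed_anchors(query: str, target: str, k: int = 5, w: int = 4) -> list[tuple[int, int]]:
    """Return (query_pos, target_pos) pairs where both sequences share a minimizer."""
    index: dict[str, list[int]] = {}
    for pos, kmer in minimizers(target, k, w):
        index.setdefault(kmer, []).append(pos)
    return [
        (qpos, tpos)
        for qpos, kmer in minimizers(query, k, w)
        for tpos in index.get(kmer, [])
    ]


if __name__ == "__main__":
    print(minimizers("ACGTACGT", k=3, w=2))
    print(seed_anchors("ACGTACGT", "ACGTACGT", k=3, w=2))
```

In a full aligner, anchors like these would then be chained and extended into base-level alignments; that step is omitted here.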
Project outputs
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
KmerKeys: a web resource for searching indexed genome assemblies and variants.
- DOI:10.1093/nar/gkac266
- Published: 2022-07-05
- Journal:
- Impact factor: 14.9
- Authors: Pavlichin, Dmitri S.; Lee, HoJoon; Greer, Stephanie U.; Grimes, Susan M.; Weissman, Tsachy; Ji, Hanlee P.
- Corresponding author: Ji, Hanlee P.
Other grants by Heng Li
The construction and utility of reference pan-genome graphs
- Grant number: 10112282
- Fiscal year: 2020
- Amount: $802,000
- Project category:
The construction and utility of reference pan-genome graphs
- Grant number: 9904877
- Fiscal year: 2020
- Amount: $802,000
- Project category:
The construction and utility of reference pan-genome graphs
- Grant number: 10379369
- Fiscal year: 2020
- Amount: $802,000
- Project category:
Advanced computational methods in analyzing high-throughput sequencing data
- Grant number: 10559560
- Fiscal year: 2018
- Amount: $802,000
- Project category:
Bioinformatics Technology to Characterize Tumor Infiltrating Immune Repertoires
- Grant number: 9888343
- Fiscal year: 2018
- Amount: $802,000
- Project category:
Advanced computational methods in analyzing high-throughput sequencing data
- Grant number: 10367263
- Fiscal year: 2018
- Amount: $802,000
- Project category:
Similar overseas grants
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Amount: $802,000
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Amount: $802,000
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Amount: $802,000
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Amount: $802,000
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Amount: $802,000
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Amount: $802,000
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Amount: $802,000
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Amount: $802,000
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Amount: $802,000
- Project category: Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Amount: $802,000
- Project category: Research Grant