The Architecture of Missing and Archaic Variation in Human Population Genomic Data
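Basic Information
- Grant number: 10292375
- Principal investigator:
- Amount: $441,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-07-01 to 2024-06-30
- Project status: Completed
- Source: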
- Keywords: Accounting; Address; Admixture; Africa; African; Algorithms; Alleles; Architecture; Asia; Asians; Benchmarking; Bioinformatics; Chinese People; Clinical; Computer software; Computing Methodologies; Data; Development; Disease; Environment; European; Event; Evolution; Gene Frequency; Genes; Genome; Genomics; Genotype; Health; Human; Human Genome; Individual; Joints; Lead; Link; Linkage Disequilibrium; Markov chain Monte Carlo methodology; Memory; Methods; Modeling; Modern 1601-history; Modernization; Morphologic artifacts; Mosaicism; Muscle; Natural Selections; Oceania; Pattern; Phenotype; Population; Population Sizes; Process; Recording of previous events; Relaxation; Sampling; Series; Shapes; Source; Statistical Methods; Structure; Techniques; Testing; Time; Training; Utah; Variant; Work; base; biomedical data science; career; climate change; diverse data; expectation; functional genomics; genetic variant; genome-wide; genomic data; genomic locus; genomic variation; graduate student; human model; migration; models and simulation; novel; population genetic structure; programs; software development; structural genomics; tool; trait; undergraduate student
Project Summary
Modern human genomes are mosaics of variation from numerous archaic non-human hominins, often termed
“ghost” populations. However, our understanding of the evolutionary history of “ghost” variation is still developing.
Importantly, computational methods to address missing “ghost” variation are still nascent, and failing to account for
the presence of “ghosts” often leads to erroneous inference. Here I propose a series of programmatic
developments to address inference of evolutionary history from modern human genomes, while
accounting for gene flow from archaic “ghosts”. In AIM 1, I propose to develop a parallelized statistical
framework for estimating population genetic structure from multi-allelic, multi-locus genomic data that
incorporates sequencing and imputation errors of data considered missing due to gene flow from archaic “ghost”
populations into a maximum-likelihood-based statistical framework. This method will be incorporated into a
computationally efficient program called p-MULTICLUST, a multi-threaded, parallelized tool which extends the
popular “admixture” model incorporated in tools like STRUCTURE and ADMIXTURE to account for missing multi-
allelic human genomic data. AIM 2 will involve a two-pronged approach to estimate evolutionary history and
population structure in the presence of gene flow from an archaic “ghost” under the Isolation with Migration (IM)
model. We will (a) develop extensions to the IMa3/IMa2p suite of tools to incorporate joint estimation of
population structure and demographic history from genomic data, and (b) train undergraduate students in
developing simulation models for the stdpopsim consortium under two important models of human history – (1)
archaic “ghost” gene flow in native Africans, and (2) multiple epochs of admixture into Asians/Oceanians. In AIM
3, I propose to quantify the selection landscape of “ghost” variation across diverse human genomes due to
ancestral gene flow from now-extinct “ghost” populations. In this aim, we will focus on (a) improvements to the
MigSelect program to quantify linked selection effects due to gene flow from “ghost” populations under the IM
model, and (b) a larger, more encompassing study of functional genomic variation across diverse human
populations, including high-quality genomes from Africa, supplemented with more complete Neanderthal and
other non-human hominin genomes, which will help us delineate patterns of human evolutionary history and
understand the functional consequences of archaic gene flow. These discoveries also have direct consequences
for understanding modern human ancestry and disease allele evolution. Importantly, this R15 will train numerous
underrepresented undergraduate and graduate students in genomics and bioinformatics, towards careers in
the biomedical and data sciences.
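
As an illustration of the “admixture” model that Aim 1 proposes to extend, the following is a minimal sketch of its log-likelihood for multi-allelic, multi-locus data, assuming missing allele copies are simply skipped (treated as missing at random). It is not the p-MULTICLUST implementation; the function name, the data layout, and the toy values are all hypothetical.

```python
import math

def admixture_log_likelihood(genotypes, q, p):
    """Log-likelihood under the standard admixture model (illustrative sketch).

    genotypes[i][l] : tuple of allele indices for individual i at locus l;
                      None marks a missing allele copy (assumption: skipped).
    q[i][k]         : ancestry proportion of individual i from population k.
    p[k][l][a]      : frequency of allele a at locus l in population k.
    """
    total = 0.0
    for i, individual in enumerate(genotypes):
        for l, copies in enumerate(individual):
            for allele in copies:
                if allele is None:
                    continue  # missing copy contributes nothing to the likelihood
                # Probability of this allele copy, mixing over source populations
                prob = sum(q[i][k] * p[k][l][allele] for k in range(len(p)))
                total += math.log(prob)
    return total

# Toy data: 2 diploid individuals, 2 loci (3 alleles at locus 0, 2 at locus 1)
genotypes = [
    [(0, 2), (1, None)],        # one missing copy at locus 1
    [(2, 2), (None, None)],     # locus 1 entirely missing
]
q = [[0.5, 0.5], [0.0, 1.0]]    # ancestry proportions over K = 2 populations
p = [
    [[0.8, 0.1, 0.1], [0.5, 0.5]],  # population 0
    [[0.2, 0.2, 0.6], [0.1, 0.9]],  # population 1
]

log_lik = admixture_log_likelihood(genotypes, q, p)
```

A maximum-likelihood method would search over `q` and `p` to maximize this quantity. How missing data are actually modeled in the proposed framework is not specified in the summary, so the skip-missing rule here is only one possible choice.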
Project Outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Kimberly Ayers
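Similar NSFC (National Natural Science Foundation of China) Grants
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Year approved: 2019
- Funding amount: 240,000 yuan
- Project category: Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Year approved: 2019
- Funding amount: 220,000 yuan
- Project category: Young Scientists Fund
Optimized design of address mapping tables and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Year approved: 2018
- Funding amount: 230,000 yuan
- Project category: Young Scientists Fund
IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Year approved: 2018
- Funding amount: 640,000 yuan
- Project category: General Program
Memory safety defense techniques targeting objects of memory attacks
- Grant number: 61802432
- Year approved: 2018
- Funding amount: 250,000 yuan
- Project category: Young Scientists Fund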
Similar Overseas Grants
Uncovering sources of human gene expression variation in a globally diverse cohort
- Grant number: 10607411
- Fiscal year: 2023
- Funding amount: $441,000
- Project category:
BridgePRS: bridging the gap in polygenic risk scores between ancestries.
- Grant number: 10737057
- Fiscal year: 2023
- Funding amount: $441,000
- Project category:
Empowering gene discovery and accelerating clinical translation for diverse admixed populations
- Grant number: 10584936
- Fiscal year: 2023
- Funding amount: $441,000
- Project category:
Understanding the Increased Risk of Childhood Acute Lymphoblastic Leukemia in Latinos
- Grant number: 10629825
- Fiscal year: 2022
- Funding amount: $441,000
- Project category:
Understanding Alzheimer disease heterogeneity in Hispanic populations.
- Grant number: 10449014
- Fiscal year: 2022
- Funding amount: $441,000
- Project category: