Mathematical foundations of non-reversible MCMC for genome-scale inference
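Basic information
- Grant number: EP/V049208/1
- Principal investigator:
- Amount: $97,100
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2021
- Funding country: United Kingdom
- Duration: 2021 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract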
The world is undergoing an explosion of DNA sequence data. Patterns within data sets carry information about unobservable biological and demographic histories of populations, which in turn are fueling discoveries in areas such as medicine, demography, and conservation. A central tool connecting observed patterns to testable predictions and inference is the Ancestral Recombination Graph, which models the patterns of common ancestry along sampled DNA sequences. Since common ancestry is typically not observable directly, inferences are made by averaging over possible ancestries. In very simple cases the averaging can be carried out exactly, but in biologically relevant settings it typically has to be approximated. Typical approximation methods create an ensemble of candidate ancestries, and use the ensemble average as a proxy for the true average. The accuracy of this procedure depends on the degree to which the ensemble is representative of the set of all possible ancestries. The computational time required to guarantee a representative ensemble grows rapidly as the size of a data set increases, so such ensemble-based methods can only be applied to data sets that are small by modern standards. In practice, researchers resort to computationally faster methods, the theoretical performance of which is less well understood. The lack of theoretical foundations can complicate the interpretability of findings, and makes it difficult to accurately quantify their associated uncertainty.
A new class of methods for building representative ensembles, called zig-zag algorithms, has been developed and has become increasingly widespread over the last several years. These methods have also shown promise in pilot applications in genetics, but an effective data structure for applying them to genome-scale data is an essential ingredient, and remains unknown. This project aims to develop and test suitable data structures, making possible the engineering of software packages which combine feasible run times with efficient use of statistical signal in data sets.
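To make the idea of an ensemble average concrete, the following is a minimal sketch of a zig-zag sampler on a toy problem. It is not the project's method and involves no ancestries or genetic data: the target is assumed to be a one-dimensional standard normal distribution, and the function name `zigzag_standard_normal` and its parameters are hypothetical. The process moves at velocity +1 or -1, flips direction at random event times with rate max(0, v·x), and averages along its path approximate averages under the target.

```python
import math
import random


def zigzag_standard_normal(n_events, seed=0):
    """Toy 1D zig-zag process targeting N(0, 1).

    Returns the time averages of x and x^2 along the simulated path.
    """
    rng = random.Random(seed)
    x, v = 0.0, 1.0      # position and velocity (v is always +1 or -1)
    total_time = 0.0
    int_x = 0.0          # integral of x(t) dt along the path
    int_x2 = 0.0         # integral of x(t)^2 dt along the path
    for _ in range(n_events):
        # For U(x) = x^2 / 2 the flip rate at time s ahead is max(0, v*x + s).
        # Invert the integrated rate against an Exp(1) draw to get the event time.
        a = v * x
        e = rng.expovariate(1.0)
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        x_new = x + v * tau
        # Exact integrals of x and x^2 over the straight segment from x to x_new.
        int_x += (x_new ** 2 - x ** 2) / (2.0 * v)
        int_x2 += (x_new ** 3 - x ** 3) / (3.0 * v)
        total_time += tau
        x, v = x_new, -v  # flip direction at the event
    return int_x / total_time, int_x2 / total_time


if __name__ == "__main__":
    mean_x, mean_x2 = zigzag_standard_normal(50_000, seed=1)
    print(f"time-average of x:   {mean_x:.3f}")
    print(f"time-average of x^2: {mean_x2:.3f}")
```

For this toy target the two averages should settle near 0 and 1, the mean and second moment of a standard normal. The genome-scale setting described in the abstract replaces the real line with a space of ancestries, which is where the open data-structure question arises.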
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
EWF: simulating exact paths of the Wright-Fisher diffusion.
- DOI: 10.1093/bioinformatics/btad017
- Published: 2023-01-01
- Journal:
- Impact factor: 0
- Authors:
- Corresponding author:
Weak convergence of non-neutral genealogies to Kingman's coalescent
- DOI: 10.1016/j.spa.2023.04.016
- Published: 2023
- Journal:
- Impact factor: 1.4
- Authors: Brown S
- Corresponding author: Brown S
Sweepstakes reproductive success via pervasive and recurrent selective sweeps.
- DOI: 10.7554/elife.80781
- Published: 2023-02-20
- Journal:
- Impact factor: 7.7
- Authors: Árnason E; Koskela J; Halldórsdóttir K; Eldon B
- Corresponding author: Eldon B
Sweepstakes reproductive success via pervasive and recurrent selective sweeps
- DOI: 10.1101/2022.05.29.493887
- Published: 2022-12
- Journal:
- Impact factor: 7.7
- Authors: E. Árnason; Jere Koskela; Katrín Halldórsdóttir; Bjarki Eldon
- Corresponding author: E. Árnason; Jere Koskela; Katrín Halldórsdóttir; Bjarki Eldon
Bernoulli factories and duality in Wright-Fisher and Allen-Cahn models of population genetics
- DOI: 10.1016/j.tpb.2024.01.002
- Published: 2024
- Journal:
- Impact factor: 1.4
- Authors: Koskela J
- Corresponding author: Koskela J
Other publications by Jere Koskela
Consistency of Bayesian nonparametric inference for discretely observed jump diffusions
- DOI:
- Published: 2015
- Journal:
- Impact factor: 1.5
- Authors: Jere Koskela; Dario Spanò; P. A. Jenkins
- Corresponding author: P. A. Jenkins
Zig-Zag Sampling for Discrete Structures and Nonreversible Phylogenetic MCMC
- DOI: 10.1080/10618600.2022.2032722
- Published: 2020
- Journal:
- Impact factor: 2.4
- Authors: Jere Koskela
- Corresponding author: Jere Koskela
Bayesian non-parametric inference for $\Lambda$-coalescents: Posterior consistency and a parametric method
- DOI: 10.3150/16-bej923
- Published: 2015
- Journal:
- Impact factor: 1.5
- Authors: Jere Koskela; P. A. Jenkins; Dario Spanò
- Corresponding author: Dario Spanò
Multi-locus data distinguishes between population growth and multiple merger coalescents
- DOI: 10.1515/sagmb-2017-0011
- Published: 2017-01
- Journal:
- Impact factor: 0.9
- Authors: Jere Koskela
- Corresponding author: Jere Koskela
Genealogical processes of non-neutral population models under rapid mutation
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Jere Koskela; Paul A. Jenkins; A. M. Johansen; Dario Spanò
- Corresponding author: Dario Spanò
Other grants held by Jere Koskela
Exact scalable inference for coalescent processes
- Grant number: EP/R044732/1
- Fiscal year: 2018
- Funding amount: $97,100
- Project type: Research Grant
Similar overseas grants
The Epigenetic Regulator Prdm16 Controls Smooth Muscle Phenotypic Modulation and Atherosclerosis Risk
- Grant number: 10537602
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Developing a robust native extracellular matrix to improve islet function with attenuated immunogenicity for transplantation
- Grant number: 10596047
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
The role of GPR84 signaling during skin repair
- Grant number: 10637039
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
The impact of auditory access on the development of speech perception
- Grant number: 10677429
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Mechanistic dissection of allosteric modulation and nonproteolytic chaperone activity of human insulin-degrading enzyme
- Grant number: 10667987
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Non-invasive vagus nerve stimulation to mitigate subarachnoid hemorrhage induced inflammation
- Grant number: 10665166
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
The Natural History of Overall Mortality with Diagnosed Symptomatic Gallstone Disease in the United States: A Sequential Mixed-methods Study Evaluating Emergency, Non-emergency, and No Cholecystectomy
- Grant number: 10664339
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Interrogating the role of m6A mRNA methylation in the aging of the β-cell and diabetes
- Grant number: 10644215
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Establishing patient-derived iPSCs as a platform for discovery research in NAFLD
- Grant number: 10647450
- Fiscal year: 2023
- Funding amount: $97,100
- Project type:
Harnessing iron acquisition to hinder enterobacterial pathogenesis
- Grant number: 10651432
- Fiscal year: 2023
- Funding amount: $97,100
- Project type: