Scalable Coalescent Inference for Large Data Sets
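Basic Information
- Grant number: 10192760
- Principal investigator:
- Amount: $304,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2018
- Funding country: United States
- Project period: 2018-09-05 to 2022-06-30
- Project status: Completed
- Source: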
- Keywords: Address; Algorithms; Area; Bayesian Analysis; Biological; Biology; Bison; Cessation of life; Communicable Diseases; Computer software; Computing Methodologies; DNA; DNA Sequence; Data; Data Set; Development; Dimensions; Encapsulated; Ensure; Evolution; Explosion; Frequencies; Genealogical Tree; Genealogy; Genes; Genetic; Genetic Phenomena; Genetic Variation; Genetic study; Goals; Human; Investigation; Label; Mathematics; Methodology; Methods; Modeling; Modernization; Molecular; Molecular Analysis; North America; Phylogenetic Analysis; Population; Population Genetics; Process; Processed Genes; Property; Public Health; Recording of previous events; Research; Research Personnel; Resolution; Sample Size; Sampling; Shapes; Site; Statistical Methods; Statistical Models; Stochastic Processes; Structure; Time; Trees; Uncertainty; ZIKA; base; cancer genomics; genomic data; improved; innovation; large datasets; mathematical model; next generation sequencing; novel; open source; simulation; statistics; theories; tool
Project Abstract
Mathematical and statistical modeling of gene genealogies (trees that reflect ancestral relationships among sampled
molecular sequences) is central to many biological fields, including population genetics, phylodynamics of infectious
disease, paleogenomics, phylogenetics, and cancer genomics. Kingman's n-coalescent is a stochastic process of gene
genealogies whose parameters depend on an evolutionary model. Inference of model parameters then contributes to an
understanding of the phenomena that have given rise to the sequences. Though many sophisticated methods have been
developed to date, major statistical and computational challenges remain because the state space of genealogies grows
superexponentially with the number of samples. We are no longer data-limited; instead, we lack computational and
statistical methods for analysis of large-scale emerging genomic data sets. The long-term goal of the researchers is to
develop statistically consistent and computationally efficient coalescent methods for exact inference of evolutionary
parameters from next-generation sequencing datasets. The objective of this research is to apply the notion of
lumpability of Kingman's n-coalescent to address the state-space explosion problem of coalescent methods. The basic
idea is to model a coarser resolution of the underlying genealogy and reduce the cardinality of the hidden state space.
These coarser coalescent models include Tajima's coalescent and the pure-death process coalescent. The specific aims
are to (1) prove theorems for coalescent models and provide theoretical and practical tools for addressing
computational challenges when modeling different resolutions or "lumpings" of Kingman's coalescent; (2) develop
scalable methods for inference of evolutionary parameters using different coalescent models; (3) theoretically and
empirically validate the inference methods, applying them in simulations and in molecular sequences from infectious
diseases such as Zika, as well as ancient DNA samples from bison in North America and ancient and modern human
samples; (4) implement the novel methods in open source software, ensuring fast dissemination of the methodology
among researchers. The research is innovative in many distinct ways. First, Tajima's coalescent has not yet been
exploited for inference despite the potential based on the smaller state space. Second, the methods developed here will
allow inference from data sets that have not been exploited before because of computational limitations. Third, we
will not only provide a suite of tools ready for application but we will also provide statistical results supporting our
implementations. Our proposed research on scalable modeling of genealogical trees will be significant in a number of
fields, including the theory of evolutionary trees, statistical inference in population genetics and phylogenetics, and
the analysis of molecular sequences from infectious disease and ancient DNA.
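
To make the state-space reduction described in the abstract concrete, the sketch below compares two standard combinatorial counts for n samples. The first is the number of ranked, labeled binary tree topologies, n!(n-1)!/2^(n-1), the space usually associated with Kingman's n-coalescent. The second is the number of ranked, unlabeled tree shapes, given by the Euler zigzag number E_(n-1), the coarser space usually associated with Tajima's coalescent. These formulas and all function names are assumptions brought in for illustration; they are not taken from the abstract and do not represent the project's software.

```python
from math import comb, factorial


def kingman_labeled_histories(n: int) -> int:
    """Ranked, labeled binary tree topologies on n leaves: n!(n-1)!/2^(n-1).

    Assumed here to be the topology space of Kingman's n-coalescent.
    """
    if n < 2:
        raise ValueError("need at least two samples")
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


def euler_zigzag(m: int) -> int:
    """Euler zigzag number E_m, from 2*E_(k+1) = sum_j C(k, j) * E_j * E_(k-j)."""
    e = [1, 1]  # E_0 and E_1
    for k in range(1, m):
        total = sum(comb(k, j) * e[j] * e[k - j] for j in range(k + 1))
        e.append(total // 2)
    return e[m]


def tajima_ranked_shapes(n: int) -> int:
    """Ranked, unlabeled binary tree shapes on n leaves: E_(n-1).

    Assumed here to be the topology space of Tajima's coalescent.
    """
    if n < 2:
        raise ValueError("need at least two samples")
    return euler_zigzag(n - 1)


if __name__ == "__main__":
    # Print both counts side by side for a few illustrative sample sizes.
    for n in (4, 10, 20):
        print(n, kingman_labeled_histories(n), tajima_ranked_shapes(n))
```

For n = 4 the two counts are 18 and 2, and for n = 10 they are 2,571,912,000 and 7,936. Both grow faster than exponentially, but dropping the leaf labels shrinks the space by many orders of magnitude, which is the kind of coarsening the abstract calls lumping.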
Project Outcomes
Journal articles (12)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Statistical Challenges in Tracking the Evolution of SARS-CoV-2.
- DOI: 10.1214/22-sts853
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors:
- Corresponding author:
Discussion on "Horseshoe-based Bayesian nonparametric estimation of effective population size trajectories" by James R. Faulkner, Andrew F. Magee, Beth Shapiro, and Vladimir N. Minin.
- DOI: 10.1111/biom.13275
- Published: 2020
- Journal:
- Impact factor: 1.9
- Authors: Cappello, Lorenzo; Ghosh, Swarnadip; Palacios, Julia A
- Corresponding author: Palacios, Julia A
Distance metrics for ranked evolutionary trees.
- DOI: 10.1073/pnas.1922851117
- Published: 2020-11-17
- Journal:
- Impact factor: 11.1
- Authors: Kim J; Rosenberg NA; Palacios JA
- Corresponding author: Palacios JA
Adaptive Preferential Sampling in Phylodynamics With an Application to SARS-CoV-2.
- DOI: 10.1080/10618600.2021.1987256
- Published: 2022
- Journal:
- Impact factor: 0
- Authors:
- Corresponding author:
Exact limits of inference in coalescent models.
- DOI: 10.1016/j.tpb.2018.11.004
- Published: 2019
- Journal:
- Impact factor: 1.4
- Authors: Johndrow, James E; Palacios, Julia A
- Corresponding author: Palacios, Julia A
Other grants by Julia Palacios
Novel Coalescent Approaches for Studying Evolutionary Processes
- Grant number: 10552480
- Fiscal year: 2023
- Funding amount: $304,800
- Project category:
Similar Overseas Grants
Approximate algorithms and architectures for area efficient system design
- Grant number: LP170100311
- Fiscal year: 2018
- Funding amount: $304,800
- Project category: Linkage Projects
AMPS: Rank Minimization Algorithms for Wide-Area Phasor Measurement Data Processing
- Grant number: 1736326
- Fiscal year: 2017
- Funding amount: $304,800
- Project category: Standard Grant
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2017
- Funding amount: $304,800
- Project category: Discovery Grants Program - Individual
Rigorous simulation of speckle fields caused by large area rough surfaces using fast algorithms based on higher order boundary element methods
- Grant number: 375876714
- Fiscal year: 2017
- Funding amount: $304,800
- Project category: Research Grants
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2016
- Funding amount: $304,800
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2015
- Funding amount: $304,800
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2014
- Funding amount: $304,800
- Project category: Discovery Grants Program - Individual
AREA: Optimizing gene expression with mRNA free energy modeling and algorithms
- Grant number: 8689532
- Fiscal year: 2014
- Funding amount: $304,800
- Project category:
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Monitoring of Power Systems
- Grant number: 1329780
- Fiscal year: 2013
- Funding amount: $304,800
- Project category: Standard Grant
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Monitoring of Power Systems
- Grant number: 1329745
- Fiscal year: 2013
- Funding amount: $304,800
- Project category: Standard Grant