CRII: AF: Novel evolutionary models and algorithms to connect genomic sequence and phenotypic data
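Basic information
- Award number: 1565719
- Principal investigator:
- Amount: $175,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Period: 2016-08-01 to 2019-07-31
- Status: Completed
- Source:
- Keywords:
Project abstract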
An organism's genome is the collection of all of its DNA, which can be written as a single string. All of the biological complexity of an organism is encoded within its genome. One of the greatest challenges in science today is understanding how the "syntax" of the genome gives rise to the "semantics" of biological function and, indeed, all life on Earth. The theory of evolution offers a way forward. These seemingly heterogeneous biological features are merely facets of the common evolutionary process by which they arose. By querying their phylogeny, or evolutionary history, we can begin to decipher the living language in which genomes are written and, ultimately, master it for our own purposes.
Today, phylogenies are primarily reconstructed by computational analysis of biomolecular sequence data. Crucially, state-of-the-art algorithms treat genomes as an unordered bag of observations, not as an ordered sequence of observations. This assumption is made for pure mathematical convenience. In contrast, DNA is a linear molecule, the information encoded in the genome is understood to be sequential, and its order matters greatly. Recombination is one of the major evolutionary processes that rearranges genomes over time, ultimately shaping the sequential ordering of information. Sequence dependence due to recombination (or the lack thereof) is an essential aspect of the computational problem of phylogenetic inference, and yet the common assumption that loci (positions in the genome) are independent and identically distributed remains a major methodological gap.
To address this critical need, this project will create new evolutionary models and algorithms for inferring species phylogenies from genomes while accounting for point mutations, genetic drift, and recombination. A connection is then forged to systems biology by building the new evolutionary models into a new computational method for mapping the genomic architecture of complex phenotypes (observable traits). The new methods will be validated using an extensive performance study incorporating empirical and synthetic data. Analyses of the empirical data are anticipated to result in new biological discoveries such as understanding the genetic basis of adaptive traits in house mouse, the most widely used laboratory organism.
This project incorporates significant educational and outreach components. The new mathematical models, algorithms, and tools proposed in the research objectives will be the basis for two workshop series: one targeted to evolutionary computation researchers and the other to evolutionary biologists. Interdisciplinary training at the undergraduate and graduate level includes underrepresented minority students. Open implementations of all methods and data will be publicly available through a collaborative online community.
This project entails three integrated research objectives. First, algorithms for inferring species phylogenies under a new combined model of point mutations, genetic drift, and recombination will be developed. The combined model unites the coalescent model of population genetics with a hidden Markov model to capture varying degrees of sequence dependence among neighboring loci due to recombination. A key challenge is scalability, which is addressed using new approximation algorithms. Second, the new evolutionary models will be fused with a linear mixed model to capture dependence between genomic loci and a trait encoded by causal loci within the genome. The new models will be the basis for new algorithms that address several related problems in functional genomics. One application is association mapping, which seeks to infer causal loci based upon significant correlation between allele frequencies and observed trait values. Third, a performance study will be conducted to validate the new computational methodologies.
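The contrast the abstract draws between independent loci and sequence-dependent loci can be illustrated with a toy hidden Markov model. The sketch below is not the project's method: it assumes a fixed, small set of candidate local genealogies as hidden states, a single hypothetical `switch_prob` parameter standing in for recombination between neighboring loci, and made-up emission probabilities. The function names `forward_log_likelihood` and `iid_log_likelihood` are illustrative only.

```python
import math


def forward_log_likelihood(emissions, switch_prob):
    """Log-likelihood of a sequence of loci under a toy HMM.

    emissions[t][k] is an assumed value for
    P(observed site pattern at locus t | local genealogy k).
    switch_prob is a hypothetical per-step probability that the local
    genealogy changes between neighboring loci (a stand-in for recombination).
    """
    num_states = len(emissions[0])
    if num_states < 2:
        raise ValueError("need at least two candidate genealogies")
    stay = 1.0 - switch_prob
    move = switch_prob / (num_states - 1)  # spread evenly over other states

    # Uniform prior over genealogies at the first locus.
    alpha = [e / num_states for e in emissions[0]]
    log_lik = 0.0
    for t in range(len(emissions)):
        if t > 0:
            total = sum(alpha)
            # Predict the next hidden state, then weight by the emission.
            alpha = [
                (stay * alpha[k] + move * (total - alpha[k])) * emissions[t][k]
                for k in range(num_states)
            ]
        # Rescale at every locus to avoid numerical underflow.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def iid_log_likelihood(emissions):
    """Baseline that treats loci as independent and identically distributed."""
    num_states = len(emissions[0])
    return sum(math.log(sum(row) / num_states) for row in emissions)


if __name__ == "__main__":
    # Made-up data: the first three loci favor genealogy 0, the last three favor genealogy 2.
    toy = [[0.8, 0.1, 0.1]] * 3 + [[0.1, 0.1, 0.8]] * 3
    print("HMM (switch_prob=0.1):", forward_log_likelihood(toy, 0.1))
    print("i.i.d. baseline:      ", iid_log_likelihood(toy))
```

When `switch_prob` is set so that every transition is equally likely (here, 2/3 for three states), the HMM carries no information from one locus to the next and its likelihood coincides with the i.i.d. baseline. Smaller values reward runs of neighboring loci that share a genealogy.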
Project outputs
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Coal-Miner: A Statistical Method for GWA Studies of Quantitative Traits with Complex Evolutionary Origins
- DOI: 10.1145/3107411.3107490
- Publication date: 2017-08
- Journal:
- Impact factor: 0
- Authors: Hussein A. Hejase; N. V. Pol; G. Bonito; P. Edger; Kevin J. Liu
- Corresponding authors: Hussein A. Hejase; N. V. Pol; G. Bonito; P. Edger; Kevin J. Liu
Other publications by Kevin Liu
Medulloblastoma. Treatment results.
- DOI:
- Publication date: 1978
- Journal:
- Impact factor: 0
- Authors: Alexander E. Black; Kevin Liu; A. McDonough; Garrett Nelson; Michael C. Wigal; Mei Yin; Youngho Yoo
- Corresponding author: Youngho Yoo
Characterizing Planar Tanglegram Layouts and Applications to Edge Insertion Problems
- DOI: 10.37236/11299
- Publication date: 2022
- Journal:
- Impact factor: 0.7
- Authors: Kevin Liu
- Corresponding author: Kevin Liu
Permutation Statistics in Conjugacy Classes of the Symmetric Group
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Michael Levet; Kevin Liu; Jesse Campion Loth; E. Stucky; S. Sundaram; Mei Yin
- Corresponding author: Mei Yin
632: The FLASH Effect is dependent on Dose per Pulse and not Mean Dose Rate for Abdominal Irradiations
- DOI: 10.1016/s0167-8140(24)01200-3
- Publication date: 2024-05-01
- Journal:
- Impact factor: 5.300
- Authors: Kevin Liu; Trey Waldrop; Edgardo Aguilar; Nefititi Mims; Denae Neill; Abagail Delahoussaye; Cullen Taniguchi; Devarati Mitra; Emil Schueler
- Corresponding author: Emil Schueler
Other grants by Kevin Liu
CAREER: Future phylogenies: novel computational frameworks for biomolecular sequence analysis involving complex evolutionary origins
- Award number: 2144121
- Fiscal year: 2022
- Amount: $175,000
- Award type: Continuing Grant
AF: Small: Fast and accurate computational tools for large-scale evolutionary inference: a phylogenetic network approach
- Award number: 1714417
- Fiscal year: 2017
- Amount: $175,000
- Award type: Standard Grant
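Similar NSFC (National Natural Science Foundation of China) grants
Targeted metabolomics study, based on a prospective cohort, of bisphenol AF combined with fructose aggravating metabolic injury
- Award number: 2025JJ30049
- Approval year: 2025
- Amount: CNY 0
- Award type: Provincial/municipal project
Molecular mechanism by which the U2AF2-circMMP1 signaling axis promotes colorectal cancer progression
- Award number: 2025JJ80723
- Approval year: 2025
- Amount: CNY 0
- Award type: Provincial/municipal project
Mechanism by which U2AF2 arginine methylation regulates RNA transcription and synthesis in T-cell exhaustion in MTAP-deficient osteosarcoma
- Award number:
- Approval year: 2024
- Amount: CNY 0
- Award type: Young Scientists Fund project
Mechanism by which BDA-366 kills KMT2A/AF9 AML cells via the MYD88/NF-κB/PGC1β pathway
- Award number:
- Approval year: 2024
- Amount: CNY 150,000
- Award type: Provincial/municipal project
Mechanism by which Lu AF21934 reduces neural injury caused by ischemic stroke
- Award number:
- Approval year: 2024
- Amount: CNY 0
- Award type: Provincial/municipal project
Molecular mechanism by which H2S-mediated S-sulfhydration of the splicing factor BraU2AF65a promotes flowering in Chinese cabbage
- Award number: 32372727
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program
AF9 promotes MOF in severe acute pancreatitis by mediating activation of intestinal resident mast cells via ARRB2-MRGPRB2
- Award number: 82300739
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund project
Mechanism of splicing factor U2AF1 mutations in primary drug resistance in acute myeloid leukemia
- Award number: 82370157
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program
Role of mitochondrial reactive oxygen species-mediated premature placental aging in infant neurodevelopmental delay caused by gestational bisphenol AF exposure
- Award number: 82304160
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund project
Molecular mechanism by which U2AF2-circMMP1 regulates energy metabolism to promote liver metastasis of colorectal cancer
- Award number: 82303789
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund project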
Similar overseas grants
AF: Small: Problems in Algorithmic Game Theory for Online Markets
- Award number: 2332922
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
CRII: AF: Efficiently Computing and Updating Topological Descriptors for Data Analysis
- Award number: 2348238
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Award number: 2348346
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
CRII: AF: Streaming Approximability of Maximum Directed Cut and other Constraint Satisfaction Problems
- Award number: 2348475
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
- Award number: 2402836
- Fiscal year: 2024
- Amount: $175,000
- Award type: Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
- Award number: 2402851
- Fiscal year: 2024
- Amount: $175,000
- Award type: Continuing Grant
Collaborative Research: AF: Small: New Directions in Algorithmic Replicability
- Award number: 2342244
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
Collaborative Research: AF: Small: Exploring the Frontiers of Adversarial Robustness
- Award number: 2335411
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
NSF-BSF: Collaborative Research: AF: Small: Algorithmic Performance through History Independence
- Award number: 2420942
- Fiscal year: 2024
- Amount: $175,000
- Award type: Standard Grant
Collaborative Research: AF: Medium: Algorithms Meet Machine Learning: Mitigating Uncertainty in Optimization
- Award number: 2422926
- Fiscal year: 2024
- Amount: $175,000
- Award type: Continuing Grant