Algebraic Invariants for Phylogenetic Network Inference


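Basic information

  • Grant number:
    EP/W007134/1
  • Principal investigator:
  • Amount:
    $81,600
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Start and end dates:
    2022 to (no data)
  • Project status:
    Completed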

Project abstract

The key goal in phylogenetics is to infer the evolutionary histories of species from DNA sequence data of their living relatives. This has applications in many fields, such as tracing the mutations of viral outbreaks, understanding speciation events to aid conservation, and even tracing the histories of ancient manuscripts that were copied by hand through generations.
Most evolutionary histories can be described with a phylogenetic tree, where the "leaves" of the tree represent species that are alive today, and the vertices higher up the tree represent common ancestor species. However, for many biological problems, a tree cannot properly represent the evolutionary history of the species involved. Such problems are said to involve "horizontal evolution". One example occurs in microbiomes, where different microbial species are able to share portions of their DNA in a process called horizontal gene transfer. This is one mechanism by which antibiotic resistance can spread between bacteria, so being able to describe when such events have occurred has important implications for human health. To describe horizontal evolution, biologists use a phylogenetic network. Here, one can use a tree structure as a backbone, onto which further edges are drawn to represent horizontal evolution events.
The problem of inferring the evolutionary histories of species where horizontal evolution has occurred is particularly challenging, and is the focus of much of the research in phylogenetics today. One method of phylogenetic inference is to use algebraic invariants. These have seen significant development for inferring evolution along a tree, and in some cases have been shown to outperform other methods. For phylogenetic networks, however, very little research on algebraic invariants has been done. This project will develop and test the method of using algebraic invariants for phylogenetic network inference.
For a particular phylogenetic network, the process of evolution along it can be modelled using a type of probabilistic model called a Markov model. Under this model, one can calculate the probability of observing particular patterns of DNA at the leaves of the network, and these probabilities can be expressed as polynomials in the numerical parameters of the model. By allowing the numerical parameters to vary freely (i.e. treating them as variables), we can represent the network as the set of solutions to the equations describing the probabilities. Such a set of solutions forms an object that algebraists call an algebraic variety. Using this model lets us apply the powerful machinery of algebraic geometry to determine whether observed DNA sequence data is a good fit for the network. In particular, we can describe the variety corresponding to a network by using expressions called algebraic invariants. To determine whether a particular network is a good fit for observed DNA sequence data, the idea is to calculate the frequencies of patterns in the data, and then apply the network's algebraic invariants to these frequencies. The resulting quantities indicate how closely the data matches the network.
This project will examine how effective this method is at inferring phylogenetic networks from DNA sequence data. To do this, we will use the most recent developments in the field to calculate the invariants for a small class of phylogenetic networks. Next, we will develop a computational tool that uses the invariants we have calculated to infer the network that best describes the evolutionary history underlying a set of DNA sequence data. We will then test our tool on both simulated and real DNA sequence data, and compare the results to state-of-the-art methods.
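The idea of computing pattern frequencies and then applying invariants to them can be illustrated with a minimal sketch. This is not the project's tool and it does not use network invariants. It assumes the simpler, well-studied tree case: for a four-leaf tree with split 12|34 under a four-state general Markov model, the 16×16 "flattening" of the pattern-probability tensor has rank at most 4, so the singular values beyond the fourth measure how far the data is from satisfying the tree's invariants. All function names (`pattern_frequencies`, `flattening`, `rank4_score`, `best_split`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

STATES = "ACGT"

# The three possible leaf splits of a four-leaf tree (0-based leaf indices).
SPLITS = {
    "12|34": ((0, 1), (2, 3)),
    "13|24": ((0, 2), (1, 3)),
    "14|23": ((0, 3), (1, 2)),
}


def pattern_frequencies(seqs):
    """Return a 4x4x4x4 tensor of site-pattern frequencies for four aligned sequences.

    Columns containing anything other than A, C, G, T (gaps, ambiguity codes) are skipped.
    """
    if len(seqs) != 4:
        raise ValueError("expected exactly four aligned sequences")
    index = {s: i for i, s in enumerate(STATES)}
    counts = np.zeros((4, 4, 4, 4))
    for column in zip(*seqs):
        if all(c in index for c in column):
            counts[tuple(index[c] for c in column)] += 1
    total = counts.sum()
    if total == 0:
        raise ValueError("no usable alignment columns")
    return counts / total


def flattening(p, split):
    """Reshape the tensor p into a 16x16 matrix: rows index one leaf pair, columns the other."""
    (a, b), (c, d) = split
    return np.transpose(p, (a, b, c, d)).reshape(16, 16)


def rank4_score(p, split):
    """Frobenius distance from the flattening to the nearest matrix of rank at most 4."""
    singular_values = np.linalg.svd(flattening(p, split), compute_uv=False)
    return float(np.sqrt(np.sum(singular_values[4:] ** 2)))


def best_split(p):
    """Score all three splits and return the name of the lowest-scoring one with all scores."""
    scores = {name: rank4_score(p, split) for name, split in SPLITS.items()}
    return min(scores, key=scores.get), scores
```

With real data, `pattern_frequencies` would be called on four aligned sequences and the result passed to `best_split`. A score near zero suggests the observed frequencies are close to the variety for that split. How to turn such scores into a statistically sound choice between candidate networks is the kind of question the project itself addresses, and is not settled by this sketch.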

Project outputs

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Algebraic Invariants for Inferring 4-leaf Semi-directed Phylogenetic networks
  • DOI:
    10.1101/2023.09.11.557152
  • Published:
    2023-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Samuel Martin; Vincent Moulton; R. Leggett
  • Corresponding authors:
    Samuel Martin; Vincent Moulton; R. Leggett
Dimensions of Level-1 Group-Based Phylogenetic Networks
  • DOI:
    10.48550/arxiv.2307.15166
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gross E
  • Corresponding author:
    Gross E

Other publications by Richard Leggett

Remarks on set-contractions and condensing maps
  • DOI:
    10.1007/bf01179741
  • Published:
    1973-12-01
  • Journal:
  • Impact factor:
    1.000
  • Authors:
    Richard Leggett
  • Corresponding author:
    Richard Leggett

Other grants held by Richard Leggett

Algorithms for Phylogenetic Network Inference from DNA Sequence Data
  • Grant number:
    BB/X005186/1
  • Fiscal year:
    2022
  • Amount:
    $81,600
  • Project type:
    Research Grant
New software for nanopore based diagnostics and surveillance
  • Grant number:
    BB/R022445/1
  • Fiscal year:
    2018
  • Amount:
    $81,600
  • Project type:
    Research Grant
Rapid in-field Nanopore-based identification of plant and animal pathogens
  • Grant number:
    BB/N023196/1
  • Fiscal year:
    2017
  • Amount:
    $81,600
  • Project type:
    Research Grant
Development of computational strategies for identification and characterisation of viruses in metagenomic samples
  • Grant number:
    BB/M004805/1
  • Fiscal year:
    2014
  • Amount:
    $81,600
  • Project type:
    Research Grant

Similar overseas grants

Structure vs Invariants in Proofs (StrIP)
  • Grant number:
    MR/Y011716/1
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Fellowship
CAREER: Gauge-theoretic Floer invariants, C* algebras, and applications of analysis to topology
  • Grant number:
    2340465
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Continuing Grant
Motivic invariants and birational geometry of simple normal crossing degenerations
  • Grant number:
    EP/Z000955/1
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Research Grant
Conference: Tensor Invariants in Geometry and Complexity Theory
  • Grant number:
    2344680
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Standard Grant
Rational GAGA and Applications to Field Invariants
  • Grant number:
    2402367
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Continuing Grant
Categorical Invariants of Matroids
  • Grant number:
    2344861
  • Fiscal year:
    2024
  • Amount:
    $81,600
  • Project type:
    Continuing Grant
FRG: Collaborative Research: New birational invariants
  • Grant number:
    2244978
  • Fiscal year:
    2023
  • Amount:
    $81,600
  • Project type:
    Continuing Grant
Non-semisimple quantum invariants of three and four manifolds
  • Grant number:
    2304990
  • Fiscal year:
    2023
  • Amount:
    $81,600
  • Project type:
    Standard Grant
D-modules and invariants of singularities
  • Grant number:
    2301463
  • Fiscal year:
    2023
  • Amount:
    $81,600
  • Project type:
    Standard Grant
Research on finite type invariants and local moves for welded links
  • Grant number:
    23K12973
  • Fiscal year:
    2023
  • Amount:
    $81,600
  • Project type:
    Grant-in-Aid for Early-Career Scientists