An algorithmic framework for gene family evolution
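Basic information
- Grant number: RGPIN-2018-05049
- Principal investigator:
- Amount: $35,000
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Period: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract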
This is a computational biology project aiming at developing the appropriate algorithmic tools for inferring the evolution of genes.
Genes are the basic molecular units of heredity and are key to understanding biological mechanisms. A first step of most genetic studies is to group genes into families according to sequence similarity, the underlying idea being that similar sequences reflect divergence from a common ancestor. Within a gene family, «orthologs», genes originating from a speciation event, are more likely to preserve the ancestral function than «paralogs» or «xenologs», which originate from duplications or Horizontal Gene Transfer (HGT), respectively. This is a major motivation for inferring gene evolution, as it is a prerequisite for functional prediction.
Tree-based methods for gene relation prediction consist of reconstructing a phylogenetic tree for the gene family, and then inferring the nature of internal nodes (speciation, duplication or HGT) from a «reconciliation», i.e. an embedding of the gene tree into the species tree. The accuracy of these methods strongly depends on the accuracy of the considered gene tree. However, for various reasons, classical phylogenetic methods are error-prone. This motivates the gene tree correction part of this project.
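As an illustration of the reconciliation idea described above, here is a minimal sketch of the classical lowest-common-ancestor (LCA) mapping. It is not the project's own method: it assumes binary trees written as nested tuples, it handles only speciation and duplication (HGT is ignored), and all function names and the toy data are hypothetical.

```python
def clusters(tree):
    """Return the leaf set (clade) of every node of a nested-tuple tree, in post-order."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    out, leaves = [], frozenset()
    for child in tree:
        sub = clusters(child)
        out.extend(sub)
        leaves |= sub[-1]          # last entry is the child's own clade
    out.append(leaves)
    return out


def lca(species_clusters, species):
    """Smallest species clade containing all the given species."""
    return min((c for c in species_clusters if species <= c), key=len)


def reconcile(gene_tree, species_of, species_clusters, events):
    """Map a gene node to a species clade; record the event at each internal node."""
    if isinstance(gene_tree, str):
        return frozenset([species_of[gene_tree]])
    child_maps = [reconcile(c, species_of, species_clusters, events) for c in gene_tree]
    here = lca(species_clusters, frozenset().union(*child_maps))
    # A node mapped to the same clade as one of its children is a duplication.
    event = "duplication" if here in child_maps else "speciation"
    events.append((gene_tree, event))
    return here


# Toy data (hypothetical): species tree ((A,B),C) and a four-gene family.
species_tree = (("A", "B"), "C")
gene_tree = (("a1", "b1"), ("a2", "c1"))
species_of = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}

events = []
reconcile(gene_tree, species_of, clusters(species_tree), events)
labels = [event for _, event in events]
```

In this toy case the two inner gene nodes are labeled as speciations and the root as a duplication, because the root maps to the same species clade as one of its children. An error in the gene tree changes these labels, which is why the accuracy of the gene tree matters so much.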
Reconciliation is based on the assumption that each gene family evolves independently through single gain and loss events. Although this hypothesis holds for genes that are far apart in the genome, it is not appropriate for genes grouped into blocks of co-linear genes, which are more plausibly the result of concerted evolution. Methods for inferring segmental duplication, loss and HGT, combining both tree and order information, are required in this case.
Tree-free methods also exist for gene relation prediction. They are mainly based on hierarchical clustering according to sequence similarity. The results of these methods are pairwise gene relations that can be represented by a coloured graph (one colour for each type of relation). While a gene tree induces a set of relations between genes, the converse is not always true, as a set of relations may not represent a valid history for a gene family. Determining whether a set of relations is «satisfiable» and «consistent» with a species tree, and correcting it when it is not, are two important problems that we address in this project.
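To make the satisfiability question concrete, the sketch below uses a characterization known from the literature rather than from this abstract: when every pair of genes is labeled either ortholog or paralog (no xenologs, no missing pairs), the relations are displayed by some event-labeled gene tree exactly when the orthology graph contains no induced path on four vertices, i.e. when it is a cograph. The brute-force check and all names are assumptions for illustration only.

```python
from itertools import permutations


def has_induced_p4(genes, orthologs):
    """True if the orthology graph has an induced path a-b-c-d on four genes."""
    def edge(x, y):
        return frozenset((x, y)) in orthologs

    for a, b, c, d in permutations(genes, 4):
        path = edge(a, b) and edge(b, c) and edge(c, d)
        chords = edge(a, c) or edge(a, d) or edge(b, d)
        if path and not chords:
            return True
    return False


def is_satisfiable(genes, orthologs):
    """Assumes a full ortholog/paralog labeling; pairs absent from `orthologs` are paralogs."""
    return not has_induced_p4(genes, orthologs)


# Hypothetical relation sets: the first is displayed by a gene tree, the second is not.
ok = {frozenset(("a1", "b1")), frozenset(("a2", "c1"))}
bad = {frozenset("ab"), frozenset("bc"), frozenset("cd")}

ok_result = is_satisfiable(["a1", "b1", "a2", "c1"], ok)
bad_result = is_satisfiable(list("abcd"), bad)
```

The check enumerates all ordered quadruples, so it is only practical for small families. Consistency with a species tree is a separate question that this sketch does not cover.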
In summary, our goal is to produce gold-standard gene trees, infer accurate gene relations and predict the actual evolutionary events that have led to the observed gene diversity, as well as ancestral gene contents and orders. We will explore optimization problems on strings, trees and graphs, study their theoretical complexity, develop exact, approximation and heuristic algorithms, test them on simulated datasets and apply them to biological datasets of interest. The developed algorithms will be implemented in freely available, user-friendly and well-documented software.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by ElMabrouk, Nadia
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2022
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2021
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2019
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2018
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2017
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2016
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2015
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2014
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2013
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Algorithms for the inference of gene family evolution
- Grant number: 217222-2008
- Fiscal year: 2012
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Similar overseas grants
Interrogation of the Impact of Selection on the Evolution of Human Pancreatic Cancer Precursor Lesions
- Grant number: 10703414
- Fiscal year: 2022
- Amount: $35,000
- Project category:
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2022
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Interrogation of the Impact of Selection on the Evolution of Human Pancreatic Cancer Precursor Lesions
- Grant number: 10556018
- Fiscal year: 2022
- Amount: $35,000
- Project category:
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2021
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2019
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2018
- Amount: $35,000
- Project category: Discovery Grants Program - Individual
Reconstruction and Modeling of Dynamical Molecular Networks
- Grant number: 10189695
- Fiscal year: 2018
- Amount: $35,000
- Project category:
Reconstruction and Modeling of Dynamical Molecular Networks
- Grant number: 9756474
- Fiscal year: 2018
- Amount: $35,000
- Project category:
Integrative genomic framework for dissecting regulatory mechanisms underlying hepatocellular carcinoma
- Grant number: 9296968
- Fiscal year: 2017
- Amount: $35,000
- Project category:
Connecting transposable elements and regulatory innovation using ENCODE data
- Grant number: 9247278
- Fiscal year: 2017
- Amount: $35,000
- Project category: