An algorithmic framework for gene family evolution
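Basic information
- Grant number: RGPIN-2018-05049
- Principal investigator:
- Amount: $69,900
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Status: Completed
- Source:
- Keywords: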
Project abstract
This is a computational biology project aimed at developing appropriate algorithmic tools for inferring the evolution of genes.
Genes are the basic molecular units of heredity and are key to understanding biological mechanisms. A first step of most genetic studies is to group genes into families according to sequence similarity, the underlying idea being that similar sequences reflect divergence from a common ancestor. Within a gene family, «orthologs», genes originating from a speciation event, are more likely to preserve the ancestral function than «paralogs» or «xenologs», which originate from duplications or Horizontal Gene Transfer (HGT). This is a major motivation for inferring gene evolution, as it is a prerequisite for functional prediction.
Tree-based methods for gene relation prediction consist in reconstructing a phylogenetic tree for the gene family, and then inferring the nature of internal nodes (speciation, duplication or HGT) from a «reconciliation», i.e. an embedding of the gene tree into the species tree. The accuracy of these methods strongly depends on the accuracy of the considered gene tree. However, for various reasons, classical phylogenetic methods are error-prone. This motivates the gene tree correction part of this project.
Reconciliation is based on the assumption that each gene family evolves independently through single gain and loss events. Although this hypothesis holds for genes that are far apart in the genome, it is not appropriate for genes grouped into blocks of co-linear genes, which are more plausibly the result of concerted evolution. Methods for inferring segmental duplication, loss and HGT, combining both tree and order information, are required in this case.
Tree-free methods also exist for gene relation prediction. They are mainly based on hierarchical clustering according to sequence similarity. The results of these methods are pairwise gene relations that can be represented by means of a coloured graph (one colour for each type of relation). While a gene tree induces a set of relations between genes, the converse is not always true, as a set of relations may not represent a valid history for a gene family. Determining whether a set of relations is «satisfiable» and «consistent» with a species tree, and correcting it when it is not, are two important problems that we address in this project.
In summary, our goal is to produce gold-standard gene trees, infer accurate gene relations and predict the actual evolutionary events that have led to the observed gene diversity, as well as ancestral gene contents and orders. We will explore optimization problems on strings, trees and graphs, study their theoretical complexity, develop exact, approximation and heuristic algorithms, test them on simulated datasets and apply them to biological datasets of interest. The algorithms developed will be implemented in freely available, user-friendly and well-documented software.
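The tree-based approach described in the abstract can be illustrated with the classical LCA (last common ancestor) reconciliation. The sketch below is a minimal illustration, not the project's own method. It assumes binary gene trees written as nested tuples, a species tree given as a child-to-parent dictionary, and only speciation and duplication events (no HGT, no explicit losses). The function name `lca_reconcile` and all input names are hypothetical.

```python
def lca_reconcile(gene_tree, species_parent, gene_species):
    """Label internal gene-tree nodes as 'speciation' or 'duplication'.

    gene_tree: nested 2-tuples of leaf names, e.g. (("a1", "b1"), "a2")
    species_parent: dict mapping each species-tree node to its parent
                    (the root maps to None)
    gene_species: dict mapping each gene leaf to the species it belongs to
    """
    def ancestors(node):
        # Path from a species node up to the root, inclusive.
        path = []
        while node is not None:
            path.append(node)
            node = species_parent[node]
        return path

    def lca(x, y):
        # Lowest species-tree node that is an ancestor of both x and y.
        seen = set(ancestors(x))
        for node in ancestors(y):
            if node in seen:
                return node
        raise ValueError("nodes are not in the same species tree")

    events = []

    def visit(node):
        # Returns the species-tree node that this gene-tree node maps to.
        if isinstance(node, str):
            return gene_species[node]
        left, right = node
        m_left, m_right = visit(left), visit(right)
        m = lca(m_left, m_right)
        # A node mapped to the same species node as one of its children
        # is labelled a duplication; otherwise it is a speciation.
        kind = "duplication" if m in (m_left, m_right) else "speciation"
        events.append((node, m, kind))
        return m

    visit(gene_tree)
    return events  # post-order list of (gene node, species node, event)


# Hypothetical example: species tree ((A, B)AB, C)ABC.
species_parent = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC", "ABC": None}
gene_species = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}
gene_tree = ((("a1", "b1"), "a2"), "c1")

for node, species, kind in lca_reconcile(gene_tree, species_parent, gene_species):
    print(species, kind)
```

Because every label is derived from the gene tree's topology, a single misplaced leaf can turn a speciation into a duplication, which is why the abstract stresses gene tree accuracy.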
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by ElMabrouk, Nadia
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2021
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2020
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2019
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2018
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2017
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2016
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2015
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2014
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Evolution of Genome Organization
- Grant number: 217222-2013
- Fiscal year: 2013
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Algorithms for the inference of gene family evolution
- Grant number: 217222-2008
- Fiscal year: 2012
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Similar overseas grants
Interrogation of the Impact of Selection on the Evolution of Human Pancreatic Cancer Precursor Lesions
- Grant number: 10703414
- Fiscal year: 2022
- Amount: $69,900
- Program:
Interrogation of the Impact of Selection on the Evolution of Human Pancreatic Cancer Precursor Lesions
- Grant number: 10556018
- Fiscal year: 2022
- Amount: $69,900
- Program:
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2021
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2020
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2019
- Amount: $69,900
- Program: Discovery Grants Program - Individual
An algorithmic framework for gene family evolution
- Grant number: RGPIN-2018-05049
- Fiscal year: 2018
- Amount: $69,900
- Program: Discovery Grants Program - Individual
Reconstruction and Modeling of Dynamical Molecular Networks
- Grant number: 10189695
- Fiscal year: 2018
- Amount: $69,900
- Program:
Reconstruction and Modeling of Dynamical Molecular Networks
- Grant number: 9756474
- Fiscal year: 2018
- Amount: $69,900
- Program:
Integrative genomic framework for dissecting regulatory mechanisms underlying hepatocellular carcinoma
- Grant number: 9296968
- Fiscal year: 2017
- Amount: $69,900
- Program:
Connecting transposable elements and regulatory innovation using ENCODE data
- Grant number: 9247278
- Fiscal year: 2017
- Amount: $69,900
- Program: