CAREER: Algorithms for Gene Family Evolution with Gene Duplication, Loss, and Coalescence

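Basic information

  • Award number:
    1751399
  • Principal investigator:
  • Amount:
    $505,500
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Duration:
    2018-05-15 to 2024-04-30
  • Project status:
    Completed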

Project abstract

Evolution of genes and genomes is responsible for the immense biological diversity on our planet. However, despite its central role as the most fundamental property of life, the process of evolution remains poorly understood, and current models have typically been unable to span the diversity of scales at which evolution can act. This work addresses this fundamental shortcoming by developing new models and algorithms that simultaneously account for the most prevalent processes in eukaryotic gene family evolution: gene duplication, loss, and coalescence. The new computational framework and methods will enable researchers to systematically interpret data sets, make substantially more reliable and robust inferences, and improve our understanding of genome evolution. Because these histories form the basis of many genomic studies, these results will, in turn, benefit many areas of biology. Additionally, the PI is committed to educating the next generation of scientists. As part of this project, the PI will provide compelling research experiences for a substantial number of undergraduates, develop undergraduate courses in Data Science, and continue engaging in several college-wide and external initiatives aimed at broadening participation in STEM.

This research develops models and algorithms in the field of phylogenetic reconciliation, which compares a gene tree with its species tree to infer the evolutionary events that link them. For eukaryotic organisms, the most popular reconciliation methods allow for gene duplications and gene losses, which is appropriate only for species sampled at large evolutionary distances, or allow for coalescences, which is appropriate only for species sampled at close evolutionary distances. That is, each model provides only a partial view of evolution, limiting their applicability and accuracy. By bridging these two models, this research will impact how gene family evolution is represented and how reconciliations are inferred and analyzed. Specifically, this work addresses three key problems in the field: algorithmic challenges of scaling to large datasets, statistical challenges of distinguishing biological signal from noise, and modeling challenges of generalizing across genomes. Expected contributions include novel algorithms and heuristics for the reconciliation problem, methods for resolving multifurcating trees, and models that can account for multiple samples per species and for species hybridization. In addition, the joint evolutionary models and inference algorithms developed here may motivate further unified approaches in the field.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
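
As a rough illustration of what phylogenetic reconciliation computes, the sketch below implements the classic lowest-common-ancestor (LCA) mapping for the simpler duplication-loss model only. It is not the project's duplication-loss-coalescence method, and it assumes binary trees. The nested-tuple tree encoding and the names `parent_and_depth`, `lca`, `reconcile`, and `leaf_species` are assumptions made for this example.

```python
def parent_and_depth(tree):
    """Return ({node: parent}, {node: depth}) for a nested-tuple species tree."""
    parent, depth = {tree: None}, {tree: 0}
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):  # internal node: (left, right)
            for child in node:
                parent[child] = node
                depth[child] = depth[node] + 1
                stack.append(child)
    return parent, depth


def lca(u, v, parent, depth):
    """Lowest common ancestor of species-tree nodes u and v."""
    while u != v:
        if depth[u] >= depth[v]:
            u = parent[u]
        else:
            v = parent[v]
    return u


def reconcile(gene_tree, species_tree, leaf_species):
    """Count duplications and losses implied by the LCA mapping.

    gene_tree and species_tree are binary nested tuples with string leaves;
    leaf_species maps each gene-tree leaf to its species (hypothetical input
    format chosen for this sketch).
    """
    parent, depth = parent_and_depth(species_tree)
    counts = {"duplications": 0, "losses": 0}

    def visit(g):
        if not isinstance(g, tuple):
            return leaf_species[g]
        left, right = (visit(child) for child in g)
        m = lca(left, right, parent, depth)
        # A gene node is a duplication if it maps to the same species
        # node as at least one of its children.
        is_dup = m in (left, right)
        counts["duplications"] += int(is_dup)
        for child_map in (left, right):
            gap = depth[child_map] - depth[m]
            # After a speciation the first species edge is "free";
            # after a duplication every traversed edge implies a loss.
            counts["losses"] += gap if is_dup else gap - 1
        return m

    visit(gene_tree)
    return counts


species = (("A", "B"), "C")
genes = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}
print(reconcile((("a1", "b1"), ("a2", "c1")), species, genes))
```

In this toy input the gene tree places a second copy of species A's gene next to C's gene, so the mapping infers one duplication at the species-tree root followed by two losses. A coalescence-aware model could instead explain the same discordance through incomplete lineage sorting, which is the gap the abstract describes.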

Project outcomes

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multiple Optimal Reconciliations Under the Duplication-Loss-Coalescence Model
The Most Parsimonious Reconciliation Problem in the Presence of Incomplete Lineage Sorting and Hybridization Is NP-Hard
A Polynomial-Time Algorithm for Minimizing the Deep Coalescence Cost for Level-1 Species Networks
An Integer Linear Programming Solution for the Most Parsimonious Reconciliation Problem under the Duplication-Loss-Coalescence Model
  • DOI:
    10.1145/3388440.3412474
  • Publication year:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    Carothers, Morgan;Gardi, Joseph;Gross, Gianluca;Kuze, Tatsuki;Liu, Nuo;Plunkett, Fiona;Qian, Julia;Wu, Yi-Chieh
  • Corresponding author:
    Wu, Yi-Chieh
Inferring Pareto-optimal reconciliations across multiple event costs under the duplication-loss-coalescence model
  • DOI:
    10.1186/s12859-019-3206-6
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    3
  • Authors:
    Mawhorter, Ross;Liu, Nuo;Libeskind-Hadas, Ran;Wu, Yi-Chieh
  • Corresponding author:
    Wu, Yi-Chieh

Other publications by Yi-Chieh Wu

Computational evolutionary genomics: phylogenomic models spanning domains, genes, individuals, and species
  • DOI:
  • Publication year:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yi-Chieh Wu
  • Corresponding author:
    Yi-Chieh Wu

Similar overseas grants

CAREER: Scalable algorithms for regularized and non-linear genetic models of gene expression
  • Award number:
    2336469
  • Fiscal year:
    2024
  • Award amount:
    $505,500
  • Project type:
    Continuing Grant
Developing Algorithms for Identifying Gene Modules in Single-Cell RNA-Seq Using Signed Graphs
  • Award number:
    24K18100
  • Fiscal year:
    2024
  • Award amount:
    $505,500
  • Project type:
    Grant-in-Aid for Early-Career Scientists
MCA: Genomic algorithms and statistical models for gene transfer in naturally transformable bacteria
  • Award number:
    2221039
  • Fiscal year:
    2022
  • Award amount:
    $505,500
  • Project type:
    Standard Grant
Algorithms and analyses for the evolution of plant chromosomes and gene orders
  • Award number:
    RGPIN-2022-05212
  • Fiscal year:
    2022
  • Award amount:
    $505,500
  • Project type:
    Discovery Grants Program - Individual
Development and application of integrative learning algorithms for gene function prediction in plants
  • Award number:
    20K06043
  • Fiscal year:
    2020
  • Award amount:
    $505,500
  • Project type:
    Grant-in-Aid for Scientific Research (C)
EAGER: A Framework for Learning Graph Algorithms with Applications to Social and Gene Networks
  • Award number:
    1841351
  • Fiscal year:
    2018
  • Award amount:
    $505,500
  • Project type:
    Standard Grant
Machine-learning algorithms for gene score prediction of cardio-metabolic traits
  • Award number:
    364941
  • Fiscal year:
    2017
  • Award amount:
    $505,500
  • Project type:
    Operating Grants
Gene Golub SIAM Summer School: Data Sparse Approximations and Algorithms
  • Award number:
    1712970
  • Fiscal year:
    2017
  • Award amount:
    $505,500
  • Project type:
    Standard Grant
CAREER: Algorithms for Domain-Level Analysis of Gene Family Evolution
  • Award number:
    1553421
  • Fiscal year:
    2016
  • Award amount:
    $505,500
  • Project type:
    Continuing Grant
ABI Innovation: Scalable kmer-based algorithms and software for gene expression and regulation
  • Award number:
    1564785
  • Fiscal year:
    2016
  • Award amount:
    $505,500
  • Project type:
    Standard Grant