Detecting Evolution of Amino-Acid Fitness in Vertebrate Genomes

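Basic information

  • Grant number:
    RGPIN-2014-03651
  • Principal investigator:
  • Amount:
    $21,100
  • Host institution:
  • Host institution country:
    Canada
  • Project type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Period:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed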

Project abstract

The genomes of hundreds of vertebrate species have now been sequenced to near-completion, and nearly 10,000 more are slated to be sequenced over the next several years ("Genome10k"). A major motivation for this work is to increase the power of comparative analysis to illuminate how gene and genome function evolves (and how it has evolved). Although dense taxonomic sampling across the major lineages of vertebrate biodiversity is expected to substantially increase the ability to make statistical inferences about the genetic changes that "mattered" during evolution, substantial computational and analytic barriers to progress exist.

**Objectives**
The primary objective of this project is to exploit computational and modeling innovations to reliably detect coordinated changes in amino-acid fitness ("fitness shifts") across multiple positions in the protein-coding regions of hundreds to thousands of vertebrate genomes. Such shifts are an expected outcome of changes in the functional requirements of a protein by directional selection, and thus may imply functional divergence or adaptation. To characterize the limits of reliable detection, detailed calculations of statistical information under alternative experimental designs (numbers of species, divergence levels, etc.) will be performed to determine how well comparative data can distinguish fitness shifts from other phenomena (e.g., reductions in population size). Fast Markov chain Monte Carlo methods for inferring fitness shifts in large comparative datasets will be developed and evaluated.

**Scientific approach**
We recently developed several general approaches for rapid Bayesian analysis of large phylogenomic datasets, which can help eliminate computational bottlenecks and in some cases reduce data analysis times from months to minutes. In this project, we will integrate these techniques, along with unpublished improvements, with algorithms that exploit the massive parallelism of inexpensive, many-core coprocessors of emerging importance in scientific computing. We will use these approaches to implement models of discrete spatial and temporal heterogeneity in selective constraints and population size, and will examine their performance on a large set of vertebrate single-copy genes. Throughout, scalability of computations and reductions in time complexity (even at the expense of demonstrably mild approximations) will be prioritized to maximize utility in large datasets. Asymptotic power analysis methods that we have recently begun developing (unpublished) will be elaborated and used to characterize the impact of experimental design on power and to evaluate the limits of inference.

**Expected significance**
With increasingly large numbers of vertebrate genomes now available, tremendous opportunities are emerging to advance knowledge of the genetic basis for fundamental evolutionary processes, including functional divergence and adaptation. Statistical methods for detecting functional divergence are among the most widely used means of making functional inferences from genomic data. Although it is widely appreciated that such approaches are oversimplified and flawed in important ways, they are tolerated because of their computational convenience. By focusing here on algorithms for fast and scalable inference, it is hoped that recent progress in molecular evolution can be "scaled up" to enable more principled approaches in comparative genomics. Through the development of a quantitative framework for experimental design, more reasoned methods will emerge for selecting which species, and how many, are needed to address a particular question. Thus, the tools developed here will facilitate advances in both the rational design and execution of large-scale comparative genomic studies.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by DeKoning, APJason

Similar NSFC grants

Galaxy Analytical Modeling Evolution (GAME) and cosmological hydrodynamic simulations.
  • Grant number:
  • Year approved:
    2025
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project
Understanding structural evolution of galaxies with machine learning
  • Grant number:
  • Year approved:
    2022
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project
The formation and evolution of planetary systems in dense star clusters
  • Grant number:
    11043007
  • Year approved:
    2010
  • Funding amount:
    CNY 100,000
  • Project type:
    Special Fund Project
Improving modelling of compact binary evolution.
  • Grant number:
    10903001
  • Year approved:
    2009
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund

Similar overseas grants

Ultra-high throughput evolution of designer enzymes with extended amino acid alphabets
  • Grant number:
    BB/X010724/1
  • Fiscal year:
    2023
  • Funding amount:
    $21,100
  • Project type:
    Fellowship
Development of amino-acid coordination polymers as hydrogen evolution catalysts
  • Grant number:
    22K05286
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Chemical evolution of synthetic bacterial cells by reprogramming protein translation with non-canonical amino acids
  • Grant number:
    RGPIN-2020-05669
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Project type:
    Discovery Grants Program - Individual
Directed evolution of protein functional structures based on non-natural amino acid repertoire
  • Grant number:
    22H02591
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Challenge to reveal ecosystem evolution: Compound specific isotope analysis on fossilized amino acids
  • Grant number:
    21K18650
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Project type:
    Grant-in-Aid for Challenging Research (Exploratory)
Chemical evolution of synthetic bacterial cells by reprogramming protein translation with non-canonical amino acids
  • Grant number:
    RGPIN-2020-05669
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Project type:
    Discovery Grants Program - Individual
Chemical evolution of synthetic bacterial cells by reprogramming protein translation with non-canonical amino acids
  • Grant number:
    RGPIN-2020-05669
  • Fiscal year:
    2020
  • Funding amount:
    $21,100
  • Project type:
    Discovery Grants Program - Individual
Design and Evolution of Enzymes with Non-Canonical Amino Acids
  • Grant number:
    2465805
  • Fiscal year:
    2020
  • Funding amount:
    $21,100
  • Project type:
    Studentship
Detecting Evolution of Amino-Acid Fitness in Vertebrate Genomes
  • Grant number:
    RGPIN-2014-03651
  • Fiscal year:
    2018
  • Funding amount:
    $21,100
  • Project type:
    Discovery Grants Program - Individual
Study on the evolution of amino acid usage in early translation system
  • Grant number:
    17H03716
  • Fiscal year:
    2017
  • Funding amount:
    $21,100
  • Project type:
    Grant-in-Aid for Scientific Research (B)