Comparative Analysis Of Completely Sequenced Genomes

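Basic information

  • Grant number:
    10927035
  • Principal investigator:
  • Amount:
    $3,513,700
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
  • Funding country:
    United States
  • Start and end dates:
  • Project status:
    Not yet concluded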

Project abstract

The rapidly growing database of completely and nearly completely sequenced genomes of bacteria, archaea, eukaryotes and viruses (millions of genomes already available and many more in progress) creates both extensive new opportunities and major new challenges for genome research. During the year in review, we performed a variety of studies that took advantage of the genomic information to establish fundamental principles of genome evolution and to investigate the evolution of biologically and medically important groups of organisms.

All extant eukaryotes descend from the last eukaryotic common ancestor (LECA), which is thought to have featured complex cellular organization. To gain insight into LECA biology and eukaryogenesis (the origin of the eukaryotic cell, which remains poorly understood), in collaboration with Drs. Valerian Dolja and Mart Krupovic, we reconstructed the LECA virus repertoire. We compiled an inventory of eukaryotic hosts of all major virus taxa and reconstructed the LECA virome by inferring the origins of these groups of viruses. The origin of the LECA virome can be traced back to a small set of bacterial, not archaeal, viruses. This provenance of the LECA virome is probably due to the bacterial origin of eukaryotic membranes, which is most compatible with two endosymbiosis events in a syntrophic model of eukaryogenesis. In the first endosymbiosis, a bacterial host engulfed an Asgard archaeon, preventing archaeal viruses from entry owing to a lack of archaeal virus receptors on the external membranes.

In another major project, we mined metatranscriptomes for viroid-like, circular RNA replicons. Viroids and viroid-like covalently closed circular (ccc) RNAs are minimal replicators that typically encode no proteins and hijack cellular enzymes for replication. The extent and diversity of viroid-like agents are poorly understood. We developed a computational pipeline to identify viroid-like cccRNAs and applied it to 5,131 metatranscriptomes and 1,344 plant transcriptomes. The search yielded 11,378 viroid-like cccRNAs spanning 4,409 species-level clusters, a 5-fold increase over the previously identified viroid-like elements. Within this diverse collection, we discovered numerous putative viroids, satellite RNAs, retrozymes, and ribozy-like viruses. Diverse ribozyme combinations and unusual ribozymes within the cccRNAs were identified. Self-cleaving ribozymes were identified in ambiviruses, some mito-like viruses and capsid-encoding satellite virus-like cccRNAs. The broad presence of viroid-like cccRNAs in diverse transcriptomes and ecosystems implies that their host range is far broader than currently known, and matches to CRISPR spacers suggest that some cccRNAs replicate in prokaryotes.

We also continued our long-term investigation of the comparative genomics of CRISPR loci in bacteria and archaea. CRISPR-cas loci typically contain CRISPR arrays with unique spacers separating direct repeats. Spacers along with portions of adjacent repeats are transcribed and processed into CRISPR (cr) RNAs that target complementary sequences (protospacers) in mobile genetic elements, resulting in cleavage of the target DNA or RNA. Additional, standalone repeats in some CRISPR-cas loci produce distinct cr-like RNAs implicated in regulatory or other functions. We developed a computational pipeline to systematically predict crRNA-like elements by scanning for standalone repeat sequences that are conserved in closely related CRISPR-cas loci. Numerous crRNA-like elements were detected in diverse CRISPR-Cas systems, mostly of type I, but also of subtype V-A.
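
The scanning step described above can be illustrated with a minimal sketch. This is not the pipeline used in the project: it only looks for approximate copies of a known repeat outside the annotated array, on both strands, and leaves out the conservation check across closely related loci. The function names, the mismatch threshold and the toy sequences are all assumptions made for illustration.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_standalone_repeats(
    locus: str,
    repeat: str,
    array_span: Tuple[int, int],
    max_mismatches: int = 3,  # hypothetical threshold, not taken from the source
) -> List[Tuple[int, str, int]]:
    """Return (start, strand, mismatches) for repeat-like windows that do not
    overlap the annotated CRISPR array, given as a half-open [start, end) span."""
    n = len(repeat)
    queries = (("+", repeat), ("-", reverse_complement(repeat)))
    hits = []
    for start in range(len(locus) - n + 1):
        end = start + n
        # Skip any window that overlaps the annotated array.
        if start < array_span[1] and end > array_span[0]:
            continue
        window = locus[start:end]
        for strand, query in queries:
            distance = hamming(window, query)
            if distance <= max_mismatches:
                hits.append((start, strand, distance))
    return hits


# Toy locus (made-up sequences): one degenerate repeat upstream of a two-repeat array.
repeat = "GTTCACTGCCGTATAGGCAG"
degenerate = "A" + repeat[1:-1] + "T"  # two mismatches relative to the repeat
locus = "ACACACACAC" + degenerate + "T" * 10 + repeat + "C" * 8 + repeat + "A" * 10
hits = find_standalone_repeats(locus, repeat, array_span=(40, 88))
print(hits)
```

On this toy locus the only reported hit should be the degenerate copy at offset 10 on the plus strand. A fuller treatment would presumably compare candidate hits across related loci before calling them conserved.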
Standalone repeats often form mini-arrays containing two repeat-like sequences separated by a spacer that is partially complementary to promoter regions of cas genes, in particular cas8, or cargo genes located within CRISPR-Cas loci, such as toxins-antitoxins. We show experimentally that a mini-array from a type I-F1 CRISPR-Cas system functions as a regulatory guide. We also identified mini-arrays in bacteriophages that could abrogate CRISPR immunity by inhibiting effector expression. Thus, recruitment of CRISPR effectors for regulatory functions via spacers with partial complementarity to the target is a common feature of diverse CRISPR-Cas systems.

In collaboration with the laboratory of Dr. Ping Xu of Guangzhou University, we explored the evolution of the genomes of spore-forming archaea. Several groups of bacteria have complex life cycles involving cellular differentiation and multicellular structures. For example, actinobacteria of the genus Streptomyces form multicellular vegetative hyphae, aerial hyphae, and spores. However, similar life cycles have not yet been described for archaea. Here, we show that several haloarchaea of the family Halobacteriaceae display a life cycle resembling that of Streptomyces bacteria. Strain YIM 93972 (isolated from a salt marsh) undergoes cellular differentiation into mycelia and spores. Other closely related strains are also able to form mycelia, and comparative genomic analyses point to gene signatures (apparent gain or loss of certain genes) that are shared by members of this clade within the Halobacteriaceae. Genomic, transcriptomic and proteomic analyses of non-differentiating mutants suggest that a Cdc48-family ATPase might be involved in cellular differentiation in strain YIM 93972. Additionally, a gene encoding a putative oligopeptide transporter from YIM 93972 can restore the ability to form hyphae in a Streptomyces coelicolor mutant that carries a deletion in a homologous gene cluster (bldKA-bldKE), suggesting functional equivalence. We propose that strain YIM 93972 represents a new species in a new genus within the family Halobacteriaceae, with the name Actinoarchaeum halophilum gen. nov., sp. nov. Our demonstration of a complex life cycle in a group of haloarchaea adds a new dimension to our understanding of the biological diversity and environmental adaptation of archaea.

In collaboration with Drs. Allahverdyan and Khachatryan of the Yerevan Physics Institute and Dr. Lopez-Garcia of Universite Paris-Saclay, we performed a theoretical study of the coevolution of reproducers and replicators. There are two fundamentally distinct but inextricably linked types of biological evolutionary units, reproducers and replicators. Reproducers are cells and organelles that reproduce via various forms of division and maintain the physical continuity of compartments and their content. Replicators are genetic elements (GE), including genomes of cellular organisms and various autonomous elements, that both cooperate with reproducers and rely on the latter for replication. All known cells and organisms comprise a union between replicators and reproducers. We explored a model in which cells emerged via symbiosis between primordial "metabolic" reproducers (protocells), which evolved, on short time scales, via a primitive form of selection and random drift, and mutualist replicators. Mathematical modeling identifies the conditions under which GE-carrying protocells can outcompete GE-less ones, taking into account that, from the earliest stages of evolution, replicators split into mutualists and parasites. Analysis of the model shows that, for the GE-containing protocells to win the competition and to be fixed in evolution, it is essential that the birth-death process of the GE is coordinated with the rate of protocell division. At the early stages of evolution, random, high-variance cell division is advantageous compared with symmetrical division because the former provides for the emergence of protocells containing only mutualists, preventing takeover by parasites. These findings illuminate the likely order of key events on the evolutionary route from protocells to cells.
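
The partitioning argument in the last paragraph can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions and not the mathematical model from the study: it has no selection, no birth-death dynamics and no competition with GE-less protocells. Every genetic element is assumed to replicate exactly once per division, and "random" division is assumed to mean independent assortment of each element into either daughter with probability 1/2. All names and parameter values are hypothetical.

```python
import random
from typing import List, Tuple

Cell = Tuple[int, int]  # (number of mutualists, number of parasites)


def divide(cell: Cell, mode: str, rng: random.Random) -> Tuple[Cell, Cell]:
    """Replicate every element once, then split the contents between two daughters."""
    m, p = cell[0] * 2, cell[1] * 2  # assumption: one replication per division
    if mode == "symmetric":
        first = (m // 2, p // 2)  # each daughter receives exactly half of each type
    elif mode == "random":
        # Assumption: each element independently goes to the first daughter
        # with probability 1/2.
        first = (
            sum(rng.random() < 0.5 for _ in range(m)),
            sum(rng.random() < 0.5 for _ in range(p)),
        )
    else:
        raise ValueError(f"unknown division mode: {mode}")
    return first, (m - first[0], p - first[1])


def mutualist_only_fraction(
    mode: str, generations: int = 8, start: Cell = (4, 2), seed: int = 0
) -> float:
    """Fraction of protocells that carry mutualists but no parasites after a
    fixed number of synchronous divisions starting from a single protocell."""
    rng = random.Random(seed)
    cells: List[Cell] = [start]
    for _ in range(generations):
        cells = [daughter for cell in cells for daughter in divide(cell, mode, rng)]
    clean = sum(1 for m, p in cells if m > 0 and p == 0)
    return clean / len(cells)


if __name__ == "__main__":
    for mode in ("symmetric", "random"):
        print(mode, mutualist_only_fraction(mode))
```

In this sketch symmetric division passes parasites to every daughter, so no mutualist-only protocell can ever arise, whereas random assortment occasionally produces a parasite-free daughter whose descendants all remain parasite-free. Whether such protocells then take over depends on the selection and growth terms that the sketch leaves out.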

Project outputs

Journal articles (212)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
On the origin of cells and viruses: primordial virus world scenario.
Single domain response regulators: molecular switches with emerging roles in cell organization and dynamics.
  • DOI:
    10.1016/j.mib.2009.01.010
  • Published:
    2009-04
  • Journal:
  • Impact factor:
    5.4
  • Authors:
    Jenal, Urs;Galperin, Michael Y.
  • Corresponding author:
    Galperin, Michael Y.
Just how Lamarckian is CRISPR-Cas immunity: the continuum of evolvability mechanisms.
  • DOI:
    10.1186/s13062-016-0111-z
  • Published:
    2016-02-24
  • Journal:
  • Impact factor:
    5.5
  • Authors:
    Koonin EV;Wolf YI
  • Corresponding author:
    Wolf YI
TPV1, the first virus isolated from the hyperthermophilic genus Thermococcus.
  • DOI:
    10.1111/j.1462-2920.2011.02662.x
  • Published:
    2012-02
  • Journal:
  • Impact factor:
    5.1
  • Authors:
    Gorlas A;Koonin EV;Bienvenu N;Prieur D;Geslin C
  • Corresponding author:
    Geslin C
From complete genome sequence to 'complete' understanding?
  • DOI:
    10.1016/j.tibtech.2010.05.006
  • Published:
    2010-08
  • Journal:
  • Impact factor:
    17.3
  • Authors:
    Galperin, Michael Y.;Koonin, Eugene V.
  • Corresponding author:
    Koonin, Eugene V.

Other publications by Eugene V Koonin

The common ancestry of life
  • DOI:
    10.1186/1745-6150-5-64
  • Published:
    2010-01-01
  • Journal:
  • Impact factor:
    4.900
  • Authors:
    Eugene V Koonin;Yuri I Wolf
  • Corresponding author:
    Yuri I Wolf
Identification of dephospho-CoA kinase in Thermococcus kodakarensis and the complete CoA biosynthesis pathway
  • DOI:
  • Published:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Takahiro Shimosaka;Kira S Makarova;Eugene V Koonin;Haruyuki Atomi
  • Corresponding author:
    Haruyuki Atomi
Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins
  • DOI:
    10.1038/ncomms13570
  • Published:
    2016-11-18
  • Journal:
  • Impact factor:
    15.700
  • Authors:
    Erez Persi;Yuri I. Wolf;Eugene V Koonin
  • Corresponding author:
    Eugene V Koonin
Evolutionary primacy of sodium bioenergetics
  • DOI:
    10.1186/1745-6150-3-13
  • Published:
    2008-04-01
  • Journal:
  • Impact factor:
    4.900
  • Authors:
    Armen Y Mulkidjanian;Michael Y Galperin;Kira S Makarova;Yuri I Wolf;Eugene V Koonin
  • Corresponding author:
    Eugene V Koonin
Classification and evolutionary history of the single-strand annealing proteins, RecT, Redβ, ERF and RAD52
  • DOI:
    10.1186/1471-2164-3-8
  • Published:
    2002-03-21
  • Journal:
  • Impact factor:
    3.700
  • Authors:
    Lakshminarayan M Iyer;Eugene V Koonin;L Aravind
  • Corresponding author:
    L Aravind

Other grants by Eugene V Koonin

Finding Protein Sequence Motifs--Methods and Application
  • Grant number:
    6988455
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Finding Protein Sequence Motifs--methods And Application
  • Grant number:
    6681337
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Comparative Analysis Of Completely Sequenced Genomes
  • Grant number:
    7969213
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Finding Protein Sequence Motifs--methods And Applications
  • Grant number:
    8943217
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Comparative Analysis Of Completely Sequenced Genomes
  • Grant number:
    9160910
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Finding Protein Sequence Motifs--methods And Applications
  • Grant number:
    9555730
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Finding Protein Sequence Motifs--methods And Applications
  • Grant number:
    7594460
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Finding Protein Sequence Motifs--methods And Applications
  • Grant number:
    7735068
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
COMPARATIVE ANALYSIS OF COMPLETELY SEQUENCED GENOMES
  • Grant number:
    6111075
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type:
Comparative Analysis Of Completely Sequenced Genomes
  • Grant number:
    6988458
  • Fiscal year:
  • Funding amount:
    $3,513,700
  • Project type: