Comparative Analysis Of Completely Sequenced Genomes
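Basic information
- Grant number: 8149599
- Principal investigator:
- Amount: $2.1546 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Not concluded
- Source:
- Keywords:
Project abstract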
The rapidly growing database of completely sequenced genomes of bacteria, archaea and eukaryotes (over 1100 genomes available by the beginning of 2010 and many more in progress) creates both new opportunities and new challenges for genome research. During the last year, we performed several studies that took advantage of the genomic information to establish fundamental principles of genome evolution and function. In particular, we performed a comprehensive comparative analysis of eukaryotic nucleo-cytoplasmic large DNA viruses (NCLDV), including the construction of clusters of orthologous genes and the reconstruction of viral genome evolution. The NCLDV comprise an apparently monophyletic class of viruses that infect a broad variety of eukaryotic hosts. Recent progress in the isolation of new viruses and in genome sequencing has substantially expanded the known diversity of NCLDV, creating additional opportunities for comparative genomic analysis and a demand for a comprehensive classification of viral genes. A comprehensive comparison of the protein sequences encoded in the genomes of 45 NCLDV belonging to 6 families was performed in order to delineate clusters of orthologous viral genes. Using previously developed computational methods for orthology identification, 1445 Nucleo-Cytoplasmic Virus Orthologous Groups (NCVOGs) were identified, of which 177 are represented in more than one NCLDV family. The NCVOGs were manually curated and annotated and can be used as a computational platform for functional annotation and evolutionary analysis of new NCLDV genomes. A maximum-likelihood reconstruction of NCLDV evolution yielded a set of 47 conserved genes that were probably present in the genome of the common ancestor of this class of eukaryotic viruses. This reconstructed ancestral gene set is robust to the parameters of the reconstruction procedure and so is likely to accurately reflect the gene core of the ancestral NCLDV, indicating that this virus encoded a complex machinery of replication, expression and morphogenesis that made it relatively independent of host cell functions. The NCVOGs are a flexible and expandable platform for genome analysis and functional annotation of newly characterized NCLDV. Evolutionary reconstructions employing NCVOGs point to complex ancestral viruses.
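The count of orthologous groups "represented in more than one NCLDV family" can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout of (group, family) pairs, and the toy memberships are all assumptions made for illustration, and the group IDs are invented.

```python
from collections import defaultdict


def families_per_group(memberships):
    """Map each orthologous group to the set of virus families it occurs in.

    `memberships` is an iterable of (group_id, family) pairs, one per gene.
    This input layout is a hypothetical one chosen for the sketch.
    """
    families = defaultdict(set)
    for group_id, family in memberships:
        families[group_id].add(family)
    return dict(families)


def multi_family_groups(memberships, min_families=2):
    """Return the sorted IDs of groups found in at least `min_families` families."""
    return sorted(
        group_id
        for group_id, fams in families_per_group(memberships).items()
        if len(fams) >= min_families
    )


# Toy memberships: group IDs and family assignments are made up.
toy = [
    ("G1", "Poxviridae"), ("G1", "Mimiviridae"), ("G1", "Poxviridae"),
    ("G2", "Iridoviridae"),
    ("G3", "Phycodnaviridae"), ("G3", "Asfarviridae"),
]
print(multi_family_groups(toy))
```

Applied to a real group-membership table, the length of the returned list would correspond to the kind of figure quoted above (177 of 1445 groups), provided each gene is labeled with the family of its genome.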
In another project, we investigated the abundance and distribution of type I toxin-antitoxin systems in bacteria, with the goals of finding new candidates and discovering novel families. Type I toxin-antitoxin modules, in which the synthesis of a small hydrophobic protein is repressed by a small RNA (sRNA), were first discovered on plasmids, where they regulate plasmid stability, and were subsequently found on a few bacterial chromosomes. We used exhaustive PSI-BLAST and TBLASTN searches across 774 bacterial genomes to identify homologs of known type I toxins. These searches substantially expanded the collection of predicted type I toxins, revealed homology of the Ldr and Fst toxins, and suggested that type I toxin-antitoxin loci are not spread by horizontal gene transfer. To discover novel type I toxin-antitoxin systems, we developed a set of search parameters based on characteristics of known loci, including the presence of tandem repeats and of clusters of charged and bulky amino acids at the C-termini of short proteins containing predicted transmembrane regions. We detected sRNAs for three predicted toxins from enterohemorrhagic Escherichia coli and Bacillus subtilis, and showed that two of the respective proteins are indeed toxic when overexpressed. We also demonstrated that the local free-energy minima of RNA folding can be used to detect the positions of the sRNA genes. Our results suggest that type I toxin-antitoxin modules are much more widely distributed among bacteria than previously appreciated.
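The protein-level part of the search criteria described above (short proteins with a predicted transmembrane region and charged or bulky residues clustered at the C-terminus) can be sketched as a simple filter. This is an illustration under stated assumptions, not the parameters used in the study: the length limit, window size, hydropathy cutoff, tail length and residue classes are all hypothetical choices, a Kyte-Doolittle sliding window stands in for a real transmembrane predictor, and the tandem-repeat criterion is not covered.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: "charged" = D, E, K, R and "bulky" = aromatic F, W, Y.
CHARGED_OR_BULKY = set("DEKRFWY")


def max_window_hydropathy(seq, window=19):
    """Highest mean Kyte-Doolittle score over all windows of the given size."""
    if len(seq) < window:
        return float("-inf")
    scores = [KD[aa] for aa in seq]
    return max(
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    )


def looks_like_candidate(seq, max_len=60, window=19, tm_cutoff=1.6,
                         tail=10, min_tail_hits=3):
    """Crude screen for a short, membrane-like protein with a charged/bulky tail.

    All thresholds are illustrative defaults, not values from the study.
    """
    seq = seq.upper()
    if len(seq) > max_len or any(aa not in KD for aa in seq):
        return False
    if max_window_hydropathy(seq, window) < tm_cutoff:
        return False
    tail_hits = sum(aa in CHARGED_OR_BULKY for aa in seq[-tail:])
    return tail_hits >= min_tail_hits


# Made-up sequences, used only to exercise the filter.
hydrophobic_core = "LIVAL" * 4
print(looks_like_candidate("MKT" + hydrophobic_core + "GSKRWEKDYK"))
print(looks_like_candidate("MKT" + hydrophobic_core + "GSGSGSGSGS"))
```

A filter like this would only narrow a list of short open reading frames. Candidates would still need homology searches and experimental confirmation of the kind the abstract reports.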
In a separate comparative genomic study, we explored distinct patterns of expression and evolution of intronless and intron-containing mammalian genes. Comparison of the expression level, expression breadth, and evolutionary rate of intronless and intron-containing mammalian genes shows that intronless genes are expressed at lower levels, tend to be tissue specific, and evolve significantly faster than spliced genes. By contrast, monomorphic spliced genes that are not subject to detectable alternative splicing and polymorphic alternatively spliced genes show similar, statistically indistinguishable patterns of expression and evolution. Alternative splicing is most common in ancient genes, whereas intronless genes appear to have relatively recent origins. These results imply tight coupling between different stages of gene expression, in particular transcription, splicing, and nucleocytosolic transport of transcripts, and suggest that formation of intronless genes is an important route of evolution of novel tissue-specific functions in animals.
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Eugene V Koonin
The common ancestry of life
- DOI: 10.1186/1745-6150-5-64
- Published: 2010-01-01
- Journal:
- Impact factor: 4.900
- Authors: Eugene V Koonin; Yuri I Wolf
- Corresponding author: Yuri I Wolf
Identification of dephospho-CoA kinase in Thermococcus kodakarensis and the complete CoA biosynthesis pathway
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Takahiro Shimosaka; Kira S Makarova; Eugene V Koonin; Haruyuki Atomi
- Corresponding author: Haruyuki Atomi
Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins
- DOI: 10.1038/ncomms13570
- Published: 2016-11-18
- Journal:
- Impact factor: 15.700
- Authors: Erez Persi; Yuri I. Wolf; Eugene V Koonin
- Corresponding author: Eugene V Koonin
Evolutionary primacy of sodium bioenergetics
- DOI: 10.1186/1745-6150-3-13
- Published: 2008-04-01
- Journal:
- Impact factor: 4.900
- Authors: Armen Y Mulkidjanian; Michael Y Galperin; Kira S Makarova; Yuri I Wolf; Eugene V Koonin
- Corresponding author: Eugene V Koonin
Classification and evolutionary history of the single-strand annealing proteins, RecT, Redβ, ERF and RAD52
- DOI: 10.1186/1471-2164-3-8
- Published: 2002-03-21
- Journal:
- Impact factor: 3.700
- Authors: Lakshminarayan M Iyer; Eugene V Koonin; L Aravind
- Corresponding author: L Aravind
Other grants of Eugene V Koonin
Finding Protein Sequence Motifs--methods And Application
- Grant number: 6681337
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Finding Protein Sequence Motifs--Methods and Application
- Grant number: 6988455
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 7969213
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 8943217
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 9160910
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7735068
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7594460
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 9555730
- Fiscal year:
- Amount: $2.1546 million
- Project category:
COMPARATIVE ANALYSIS OF COMPLETELY SEQUENCED GENOMES
- Grant number: 6111075
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 6988458
- Fiscal year:
- Amount: $2.1546 million
- Project category:
Similar overseas grants
HIGH THROUGHPUT GENOTYPING AND DNA SEQUENCING FOR STUDYING THE GENETIC CONTRIBUTIONS TO HUMAN HEALTH AND DISEASE: Whole Genome Sequencing of Bilateral Cleft Lip and Palate families from Africa
- Grant number: 10498645
- Fiscal year: 2021
- Amount: $2.1546 million
- Project category:
Barriers and enablers to quality implementation of clinical genome sequencing in Canada: A comparative case study
- Grant number: 440131
- Fiscal year: 2020
- Amount: $2.1546 million
- Project category: Studentship Programs
Genome sequencing and comparative analysis of brewing yeast cultures
- Grant number: 543995-2019
- Fiscal year: 2019
- Amount: $2.1546 million
- Project category: Engage Grants Program
Whole genome sequencing and comparative genomics of different F. graminearum isolates
- Grant number: 444521-2013
- Fiscal year: 2015
- Amount: $2.1546 million
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Whole genome sequencing and comparative genomics of different F. graminearum isolates
- Grant number: 444521-2013
- Fiscal year: 2014
- Amount: $2.1546 million
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Whole genome sequencing and comparative genomics of different F. graminearum isolates
- Grant number: 444521-2013
- Fiscal year: 2013
- Amount: $2.1546 million
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
WHOLE GENOME SEQUENCING AND ANNOTATION OF THE MARINE FISH PATHOGEN VIBRIO ANGUIL
- Grant number: 8173255
- Fiscal year: 2010
- Amount: $2.1546 million
- Project category:
WHOLE GENOME SEQUENCING AND ANNOTATION OF THE MARINE FISH PATHOGEN VIBRIO ANGUIL
- Grant number: 7958526
- Fiscal year: 2009
- Amount: $2.1546 million
- Project category:
WHOLE GENOME SEQUENCING AND ANNOTATION OF THE MARINE FISH PATHOGEN VIBRIO ANGUIL
- Grant number: 7716027
- Fiscal year: 2008
- Amount: $2.1546 million
- Project category:
Force spectroscopy platform for label free genome sequencing
- Grant number: 7295822
- Fiscal year: 2006
- Amount: $2.1546 million
- Project category: