Comparative Analysis Of Completely Sequenced Genomes
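Basic Information
- Grant number: 9160910
- Principal investigator:
- Amount: $304,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Start and end dates:
- Project status: Not concluded
- Source: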
- Keywords: Accounting; Adopted; Affect; Animals; Archaea; Bacteria; Bacteriophages; Beryllium; Clustered Regularly Interspaced Short Palindromic Repeats; Collection; Computing Methodologies; Data; Databases; Double Stranded DNA Virus; Double Stranded RNA Virus; Double-Stranded RNA; Elements; Eukaryota; Evolution; Family; Gene Frequency; Genes; Genetic Variation; Genome; Genome engineering; Genomics; Goals; Horizontal Gene Transfer; Immune system; Immunoglobulin Genes; Individual; Life; Methods; Minority; Mobile Genetic Elements; Modeling; Modification; Nature; Organism; Orthologous Gene; Phylogenetic Analysis; Phylogeny; Plant Roots; Plasmids; Probability; Process; Prokaryotic Cells; RNA Viruses; Recording of previous events; Relative (related person); Research; Retroelements; Role; Sampling; Single Stranded DNA Virus; Source; Surveys; System; Technology; Tectiviruses; Theoretical Studies; Trees; V(D)J Recombination; Vertebrates; Viral; Virion; Virus; adaptive immunity; comparative; ds-DNA; experience; fusion gene; genetic element; genome editing; genome sequencing; genome-wide; insight; long term memory; mathematical methods; mathematical model; paralogous gene; plasmid DNA; recombinase; tool; trend; virome
Project Abstract
The rapidly growing database of completely and nearly completely sequenced genomes of bacteria, archaea, eukaryotes and viruses (several thousand genomes already available and many more in progress) creates both new opportunities and new challenges for genome research. During the last year, we performed a variety of studies that took advantage of the genomic information to establish fundamental principles of genome evolution.
A comprehensive phylogenomic analysis of viruses infecting eukaryotic hosts and the related mobile elements was performed. Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues to the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria, although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, which are also predicted to form virus particles, were most likely the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses, including the proposed order "Megavirales" that unites diverse families of large and giant viruses.
Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources along with additional acquisitions of diverse genes.
We developed a general scenario of evolution of adaptive immunity systems and possibly other genome manipulation machineries from mobile genetic elements. Adaptive immune systems in prokaryotes and animals give rise to long-term memory through modification of specific genomic loci, such as by insertion of foreign (viral or plasmid) DNA fragments into clustered regularly interspaced short palindromic repeat (CRISPR) loci in prokaryotes and by V(D)J recombination of immunoglobulin genes in vertebrates. Strikingly, recombinases derived from unrelated mobile genetic elements have essential roles in both prokaryotic and vertebrate adaptive immune systems. Mobile elements, which are ubiquitous in cellular life forms, provide the only known, naturally evolved tools for genome engineering that are successfully adopted by both innate immune systems and genome-editing technologies.
We performed a theoretical study of the evolution of bacterial and archaeal supergenomes. Because prokaryotic genomes experience a rapid flux of genes, selection may act at a higher level than an individual genome. We explore a quantitative model of the distributed genome whereby groups of genomes evolve by acquiring genes from a fixed reservoir, which we call the supergenome. Previous attempts to understand the nature of the supergenome treated genomes as random, independent collections of genes and assumed that the supergenome consists of a small number of homogeneous sub-reservoirs. Here we explore the consequences of relaxing both assumptions.
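To make the baseline assumptions concrete, the following Python sketch simulates genomes as random, independent collections of genes drawn from a supergenome made of a few homogeneous sub-reservoirs, which is the simplified picture that the study sets out to relax. It is an illustration only and not the model used in the study: the function names, the per-gene presence probability `p_present`, and the example numbers are all assumptions introduced here.

```python
import random
from collections import Counter


def simulate_genomes(sub_reservoirs, n_genomes, seed=0):
    """Draw genomes as independent random samples from a supergenome.

    sub_reservoirs: list of (n_genes, p_present) pairs (a hypothetical
    parameterization); every gene in a sub-reservoir is present in any
    given genome with the same probability p_present.
    Returns a list of sets of integer gene IDs.
    """
    rng = random.Random(seed)
    # Gene IDs are assigned consecutively across sub-reservoirs.
    gene_probs = []
    for n_genes, p_present in sub_reservoirs:
        gene_probs.extend([p_present] * n_genes)
    return [
        {gene for gene, p in enumerate(gene_probs) if rng.random() < p}
        for _ in range(n_genomes)
    ]


def gene_frequency_spectrum(genomes):
    """Map k -> number of genes found in exactly k genomes (k >= 1)."""
    occurrences = Counter(gene for genome in genomes for gene in genome)
    return Counter(occurrences.values())


if __name__ == "__main__":
    # Illustrative numbers only: a small, nearly universal sub-reservoir
    # plus a large sub-reservoir of rarely present genes.
    genomes = simulate_genomes([(100, 0.95), (5000, 0.02)], n_genomes=20)
    spectrum = gene_frequency_spectrum(genomes)
    observed = sum(spectrum.values())
    print(f"{observed} of 5100 supergenome genes observed in 20 genomes")
```

Genes that happen to be absent from every sampled genome never appear in the frequency spectrum, which is why the size of the reservoir has to be inferred rather than counted.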
We surveyed several methods for estimating the size and composition of the supergenome. The methods assumed either that genomes were random, independent samples of the supergenome or that they evolved from a common ancestor along a known tree via stochastic sampling from the reservoir. The reservoir was assumed to be either a collection of homogeneous sub-reservoirs or, alternatively, composed of genes with gamma-distributed gain probabilities. Empirical gene frequencies were used either to compute the likelihood of the data directly or first to reconstruct the history of gene gains and then to compute the likelihood of the reconstructed numbers of gains. Supergenome size estimates that use the empirical gene frequencies directly are not robust with respect to the choice of the model. By contrast, using the gene frequencies and the phylogenetic tree to reconstruct multiple gene gains produces reliable estimates of the supergenome size and indicates that a homogeneous supergenome is more consistent with the data than a supergenome with gamma-distributed gain probabilities.
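As a rough illustration of the simplest direct-frequency approach, the Python sketch below estimates supergenome size under one homogeneous reservoir with genomes treated as independent random samples. It is not the method used in the study. It assumes each of N genes is present in each of G genomes independently with a single probability p, fits p by matching the mean of a zero-truncated binomial to the observed mean occupancy (which is also the conditional maximum-likelihood estimate for that distribution), and then scales the observed gene count by the probability of being seen at all. The function name and the input format are hypothetical.

```python
import math


def estimate_supergenome_size(spectrum, n_genomes, iterations=200):
    """Estimate (p, N) for a single homogeneous reservoir.

    spectrum: dict mapping k -> number of genes observed in exactly k
    genomes, for k >= 1 (genes seen in zero genomes are unobservable).
    n_genomes: number of sampled genomes G.
    """
    n_observed = sum(spectrum.values())
    if n_observed == 0:
        raise ValueError("no genes observed")
    mean_k = sum(k * n for k, n in spectrum.items()) / n_observed
    if mean_k <= 1.0:
        # Only singletons: p -> 0 and the size estimate diverges.
        raise ValueError("size is unbounded under this model")
    if mean_k >= n_genomes:
        # Every observed gene is in every genome: nothing is hidden.
        return 1.0, float(n_observed)

    def prob_seen(p):
        # P(gene is in at least one genome) = 1 - (1 - p)^G
        if p >= 1.0:
            return 1.0
        return -math.expm1(n_genomes * math.log1p(-p))

    def truncated_mean(p):
        # Mean occupancy of a gene, given that it was seen at least once.
        return n_genomes * p / prob_seen(p)

    # truncated_mean increases with p, so bisection finds the match.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < mean_k:
            lo = mid
        else:
            hi = mid
    p_hat = 0.5 * (lo + hi)
    return p_hat, n_observed / prob_seen(p_hat)
```

For example, with two genomes, 200 genes seen once and 100 genes seen twice, the fit gives p = 0.5 and N = 400, so 100 genes are inferred to be missing from both genomes. As the text above notes, estimates of this direct kind are sensitive to the assumed model, which is the motivation for the tree-based reconstruction of gene gains.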
Taken together, these studies advance the existing understanding of genome evolution in diverse life forms, in particular viruses and mobile elements, and provide new insights into general principles of genome evolution.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Eugene V Koonin
Identification of dephospho-CoA kinase in Thermococcus kodakarensis and the complete CoA biosynthesis pathway
- DOI:
- Year published: 2018
- Journal:
- Impact factor: 0
- Authors: Takahiro Shimosaka; Kira S Makarova; Eugene V Koonin; Haruyuki Atomi
- Corresponding author: Haruyuki Atomi
超好熱性アーキアThermococcus kodakarensisにおける新規dephospho-CoA kinaseの同定および解析
Identification and analysis of a novel dephospho-CoA kinase in the hyperthermophilic archaeon Thermococcus kodakarensis
- DOI:
- Year published: 2018
- Journal:
- Impact factor: 0
- Authors: Takahiro Shimosaka; Kira S Makarova; Eugene V Koonin; Haruyuki Atomi
- Corresponding author: Haruyuki Atomi
超好熱性アーキアThermococcus kodakarensisにおけるアーキア特異的な新規 dephospho-CoA kinaseの同定および解析
Identification and analysis of a novel archaea-specific dephospho-CoA kinase in the hyperthermophilic archaeon Thermococcus kodakarensis
- DOI:
- Year published: 2019
- Journal:
- Impact factor: 0
- Authors: Takahiro Shimosaka; Kira S Makarova; Eugene V Koonin; Haruyuki Atomi
- Corresponding author: Haruyuki Atomi
Other grants by Eugene V Koonin
Finding Protein Sequence Motifs--methods And Application
- Grant number: 6681337
- Fiscal year:
- Funding amount: $304,700
- Project category:
Finding Protein Sequence Motifs--Methods and Application
- Grant number: 6988455
- Fiscal year:
- Funding amount: $304,700
- Project category:
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 7969213
- Fiscal year:
- Funding amount: $304,700
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 8943217
- Fiscal year:
- Funding amount: $304,700
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7735068
- Fiscal year:
- Funding amount: $304,700
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7594460
- Fiscal year:
- Funding amount: $304,700
- Project category:
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 9555730
- Fiscal year:
- Funding amount: $304,700
- Project category:
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 6988458
- Fiscal year:
- Funding amount: $304,700
- Project category:
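Similar grants from the National Natural Science Foundation of China
Using integral projection models to analyze the effect of clonal growth on the population dynamics of Canada goldenrod
- Grant number: 32301322
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Effects of livelihood differentiation among farm households in hilly areas on the adoption of soil and water conservation measures, and strategies for its regulation
- Grant number: 42377321
- Year approved: 2023
- Funding amount: 490,000 CNY
- Project category: General Program
Influencing factors and mechanisms of preference reversal in intertemporal decision-making: a comprehensive study using an experience-based experimental paradigm
- Grant number: 72271190
- Year approved: 2022
- Funding amount: 430,000 CNY
- Project category: General Program
Mechanisms by which organizational support and personal norms influence farm households' adoption of technologies for reducing fertilizer and pesticide use while improving efficiency, from the perspective of farmer cooperatives
- Grant number: 72103054
- Year approved: 2021
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Using magnetic resonance techniques to study neurodegeneration of the locus coeruleus and substantia nigra in Parkinson's disease and its effects on brain structure and function
- Grant number:
- Year approved: 2021
- Funding amount: 550,000 CNY
- Project category: General Program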
Similar overseas grants
Spatially Resolved CRISPR Genomics for Dissecting Testicular Gene Functions at Scale
- Grant number: 10573701
- Fiscal year: 2023
- Funding amount: $304,700
- Project category:
Impact of Structural Racism on Racial Disparities in Cognitive Impairment
- Grant number: 10572864
- Fiscal year: 2023
- Funding amount: $304,700
- Project category:
Developing causal inference methods to evaluate and leverage spillover effects through social Interactions for designing improved HIV prevention interventions
- Grant number: 10762679
- Fiscal year: 2023
- Funding amount: $304,700
- Project category:
Glycemic Observation Using A1C for Gestational Diabetes Diagnosis
- Grant number: 10364803
- Fiscal year: 2022
- Funding amount: $304,700
- Project category:
Preventing Peanut Allergy in Young Children: Identifying Barriers to Protocol Engagement and Adherence
- Grant number: 10641769
- Fiscal year: 2022
- Funding amount: $304,700
- Project category: