Finding Protein Sequence Motifs--methods And Applications
Basic Information
- Grant number: 9555730
- Amount: $319,100
- Host institution country: United States
- Funding country: United States
- Project status: Not yet concluded
- Keywords: Actinomyces Infections; Amino Acid Motifs; Amino Acid Sequence; Animals; Antitumor Response; Apoptosis; Archaeal Genome; Architecture; Bacteria; Bacterial Genome; Biological; Capsid Proteins; Censuses; Classification; Clustered Regularly Interspaced Short Palindromic Repeats; Collection; Complex; Custom; DNA; Death Domain; Development; Disease; Dissection; Eukaryota; Evolution; Family; Family member; Generations; Genes; Genome; Genome engineering; Genomics; Goals; Homology Modeling; Human; Individual; Investigation; Lead; Libraries; Life; Methodology; Methods; Mobile Genetic Elements; Nomenclature; Organism; Pattern; Periodicity; Phenotype; Planet Earth; Positioning Attribute; Process; Prokaryotic Cells; Property; Protein Analysis; Protein Family; Protein Structure Initiative; Proteins; RNA Binding; Recruitment Activity; Regulation; Research; Route; SAM Domain; Signal Transduction; Structure; System; Tertiary Protein Structure; Variant; Viral; Viral Genome; Virion; Virus; Work; adaptive immunity; database structure; design; exhaustion; experience; genetic element; genome editing; markov model; microbial; molecular sequence database; novel; nucleoside triphosphatase; polymerization; protein profiling; protein structure; sample fixation; tool; trait
Project Abstract
The rapid accumulation of genome sequences and protein structures during the last decade has been paralleled by major advances in sequence database search methods. The powerful Position-Specific Iterated BLAST (PSI-BLAST) method developed at the NCBI forms the basis of our work on protein motif analysis. In addition, Hidden Markov Models (HMMs), profile-against-profile comparison as implemented in the HHSearch method, protein structure comparison methods, homology modeling of protein structure, and genome context analysis were applied extensively and increasingly. Furthermore, custom libraries of protein domain profiles and computational pipelines for novel domain identification have been developed and applied.
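The profile idea behind PSI-BLAST can be illustrated with a minimal sketch. PSI-BLAST builds a position-specific scoring matrix (PSSM) from the hits of each search round; the code below only shows the core log-odds calculation on a toy ungapped alignment. The alignment, the function names, the uniform background frequencies, and the fixed pseudocount are all assumptions made for illustration and are not taken from the project or from the PSI-BLAST implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy ungapped alignment of a hypothetical motif (illustrative data only).
ALIGNMENT = ["GKSGT", "GKTGS", "GKSGS", "AKSGT"]


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score.

    Assumes a uniform background (1/20 per residue) and a flat pseudocount,
    which is a simplification of what real profile methods do.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum the column scores for a window as wide as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the best-scoring window, or None if too short."""
    width = len(pssm)
    scores = [
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores) if scores else None


if __name__ == "__main__":
    pssm = build_pssm(ALIGNMENT)
    print(best_hit(pssm, "MAAGKSGTLLV"))
```

Scanning a query with `best_hit` returns the start of the window that best matches the profile. A real search would add sequence weighting, gapped alignment, and statistical significance estimates, none of which are attempted here.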
The research performed over the last year has led to further progress in the study of the classification, evolution, and functions of several classes of proteins and domains. In particular, we have performed a comprehensive analysis of the relationships among viral capsid proteins. Viruses are the most abundant biological entities on Earth and show remarkable diversity of genome sequences, replication and expression strategies, and virion structures. Evolutionary genomics of viruses has revealed many unexpected connections, but the general scenario(s) for the evolution of the virosphere remains a matter of intense debate among proponents of the cellular regression, escaped genes, and primordial virus world hypotheses. A comprehensive sequence and structure analysis of major virion proteins indicates that they evolved on about 20 independent occasions, and in some of these cases likely ancestors are identifiable among the proteins of cellular organisms. Virus genomes typically consist of distinct structural and replication modules that recombine frequently and can have different evolutionary trajectories. The results of this analysis suggest that, although the replication modules of at least some classes of viruses might descend from primordial selfish genetic elements, bona fide viruses evolved on multiple, independent occasions throughout the course of evolution by the recruitment of diverse host proteins that became major virion components.
In another project, we performed a detailed analysis and classification of the protein domains that comprise the Class 2 CRISPR-Cas systems, the microbial defense machinery that has recently been exploited to develop a new generation of genome editing tools. Class 2 CRISPR-Cas systems are characterized by effector modules that consist of a single multidomain protein, such as Cas9 or Cpf1. We designed a computational pipeline for the discovery of novel Class 2 variants and used it to identify six new CRISPR-Cas subtypes. The diverse properties of these new systems provide potential for the development of versatile tools for genome editing and regulation. We performed a comprehensive census of Class 2 types and subtypes in complete and draft bacterial and archaeal genomes, outlined evolutionary scenarios for the independent origin of different Class 2 CRISPR-Cas systems from mobile genetic elements, and proposed an amended classification and nomenclature of CRISPR-Cas.
In a separate development, we performed an exhaustive computational dissection of the domain architecture of the SAMD9 family proteins, which are involved in antiviral and antitumor responses in humans. We show that the SAMD9 protein family is represented in most animals and also, unexpectedly, in bacteria, in particular actinomycetes. From the N to the C terminus, the core SAMD9 family architecture includes a DNA/RNA-binding AlbA domain, a variant Sir2-like domain, a STAND-like P-loop NTPase, an array of TPR repeats, and an OB-fold domain with predicted RNA-binding properties. Vertebrate SAMD9 family proteins contain the eponymous SAM domain capable of polymerization, whereas some family members from other animals instead contain homotypic adaptor domains of the DEATH superfamily, known as dedicated components of apoptosis networks. Such complex domain architecture is reminiscent of the STAND superfamily NTPases that are involved in various signaling processes, including programmed cell death, in both eukaryotes and prokaryotes. These findings suggest that SAMD9 is a hub of a novel, evolutionarily conserved defense network that remains to be characterized.
In a more theoretically oriented project, we performed a genomic census and evolutionary analysis of repeat arrays in diverse protein families. Protein repeats are considered hotspots of protein evolution, associated with acquisition of new functions and novel phenotypic traits, including disease. Paradoxically, however, repeats are often strongly conserved through long spans of evolution. To resolve this conundrum, it is necessary to directly compare paralogous (horizontal) evolution of repeats within proteins with their orthologous (vertical) evolution through speciation. We developed a rigorous methodology to identify highly periodic repeats with significant sequence similarity, for which evolutionary rates and selection (dN/dS) can be estimated, and systematically characterized their evolution. We showed that horizontal evolution of repeats is markedly accelerated compared with their divergence from orthologues in closely related species. This observation is universal across the diversity of life forms and implies a biphasic evolutionary regime whereby new copies experience rapid functional divergence under the combined effects of strongly relaxed purifying selection and positive selection, followed by fixation and conservation of each individual repeat.
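As a rough illustration of what "highly periodic repeats" means, the sketch below scores each candidate period by the fraction of positions that carry the same residue one period downstream. This is a deliberately simple stand-in and not the project's methodology, which also tests for significant sequence similarity and estimates dN/dS. The function names, the default period range, and the toy sequence are assumptions made for illustration.

```python
def offset_identity(sequence, period):
    """Fraction of positions i where sequence[i] == sequence[i + period]."""
    pairs = len(sequence) - period
    if period <= 0 or pairs <= 0:
        raise ValueError("period must be positive and shorter than the sequence")
    matches = sum(
        1 for i in range(pairs) if sequence[i] == sequence[i + period]
    )
    return matches / pairs


def best_period(sequence, min_period=2, max_period=None):
    """Return (period, identity) for the period with the highest self-identity.

    Ties are broken in favour of the shortest period, because every multiple
    of a true period also scores highly (a harmonic).
    """
    if max_period is None:
        max_period = len(sequence) // 2
    candidates = [
        (offset_identity(sequence, p), -p)
        for p in range(min_period, max_period + 1)
    ]
    if not candidates:
        raise ValueError("sequence is too short for the requested period range")
    identity, negative_period = max(candidates)
    return -negative_period, identity


if __name__ == "__main__":
    # Hypothetical protein segment made of four perfect copies of a 6-residue unit.
    toy = "ACDEFG" * 4
    print(best_period(toy))
```

On diverged repeats a harmonic can outscore the true period, so a practical method would need alignment-based scoring and a significance test rather than raw identity.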
Taken together, these studies expand the known repertoire of protein domains with defined functions and lead to the discovery of novel, biologically important functional systems in diverse organisms, some of which are expected to have practical implications, e.g., in genome engineering. The findings also contribute to the current understanding of the routes of protein evolution.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Eugene V Koonin
The common ancestry of life
- DOI: 10.1186/1745-6150-5-64
- Published: 2010-01-01
- Impact factor: 4.900
- Authors: Eugene V Koonin; Yuri I Wolf
- Corresponding author: Yuri I Wolf
Identification of dephospho-CoA kinase in Thermococcus kodakarensis and the complete CoA biosynthesis pathway
- Published: 2018
- Impact factor: 0
- Authors: Takahiro Shimosaka; Kira S Makarova; Eugene V Koonin; Haruyuki Atomi
- Corresponding author: Haruyuki Atomi
Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins
- DOI: 10.1038/ncomms13570
- Published: 2016-11-18
- Impact factor: 15.700
- Authors: Erez Persi; Yuri I. Wolf; Eugene V Koonin
- Corresponding author: Eugene V Koonin
Evolutionary primacy of sodium bioenergetics
- DOI: 10.1186/1745-6150-3-13
- Published: 2008-04-01
- Impact factor: 4.900
- Authors: Armen Y Mulkidjanian; Michael Y Galperin; Kira S Makarova; Yuri I Wolf; Eugene V Koonin
- Corresponding author: Eugene V Koonin
Classification and evolutionary history of the single-strand annealing proteins, RecT, Redβ, ERF and RAD52
- DOI: 10.1186/1471-2164-3-8
- Published: 2002-03-21
- Impact factor: 3.700
- Authors: Lakshminarayan M Iyer; Eugene V Koonin; L Aravind
- Corresponding author: L Aravind
Other grants by Eugene V Koonin
Finding Protein Sequence Motifs--Methods and Application
- Grant number: 6988455
- Amount: $319,100
Finding Protein Sequence Motifs--methods And Application
- Grant number: 6681337
- Amount: $319,100
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 7969213
- Amount: $319,100
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 8943217
- Amount: $319,100
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 9160910
- Amount: $319,100
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7594460
- Amount: $319,100
Finding Protein Sequence Motifs--methods And Applications
- Grant number: 7735068
- Amount: $319,100
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 6988458
- Amount: $319,100
Comparative Analysis Of Completely Sequenced Genomes
- Grant number: 7316251
- Amount: $319,100
Similar overseas grants
Elucidating the biophysics of pre-fibrillar, toxic tau oligomers: from amino acid motifs to neuronal dysfunction
- Grant number: 10461322
- Fiscal year: 2021
- Amount: $319,100
Elucidating the biophysics of pre-fibrillar, toxic tau oligomers: from amino acid motifs to neuronal dysfunction
- Grant number: 10489810
- Fiscal year: 2021
- Amount: $319,100
Detection of amino acid motifs on the agretopes of antigens highly bound to MHC molecules
- Grant number: 03670243
- Fiscal year: 1991
- Amount: $319,100
- Project type: Grant-in-Aid for General Scientific Research (C)


















