ACTIVE SITE SIGNATURES FOR SFLD: ENOLASE SUPERFAMILY

Basic information

Project summary

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.
A major unsolved problem for structure-function linkage using computational prediction is that, while we can accurately cluster protein sequences and structures with good statistical significance using many types of similarity metrics, it is not clear how those clusters link to functional classes. Simple approaches such as ortholog prediction can achieve good results for sequences that are closely similar or that contain readily identifiable motifs distinguishing functional classes, but for many protein superfamilies successful prediction is far from trivial. This is the case for the functionally diverse superfamilies in the SFLD. These are homologous sets of enzymes that carry out different chemical transformations, using different substrates, but all share a specific chemical functionality or partial reaction. The main purpose of the SFLD is to aid researchers in the curation of these types of superfamilies, to help in the identification of new members of these superfamilies, and to provide an explicit structure-function mapping for these enzymes. Because the different functional families in a given superfamily look similar but perform different specific reactions, they are difficult to annotate and easy to misannotate, with misannotation levels as high as 80% in the archival databases GenBank NR and TrEMBL.
Because sequence information is still becoming available in large volumes, automated methods are required to update the SFLD superfamilies with newly determined sequences and assign them to the appropriate functional families. Improved methods for achieving these functional assignments are urgently needed. Developing such an approach has been a major focus of the RBVI in collaboration with the group of Prof. Jacquelyn Fetrow of Wake Forest University. The active site profiling methods developed by Dr. Fetrow have now been integrated with an approach developed in the Babbitt lab, Genetic Algorithm Search for Patterns in Structures (GASPS), to automatically determine 3D templates capable of distinguishing new superfamily members, so that sequences can be assigned automatically to the specific functional families to which they belong. GASPS will be combined with Fetrow's methods to create sequence and structural motifs for automated clustering of SFLD data.
The core elements of the method include a motif-generating technology called "Fuzzy Functional Forms" (FFF), implemented by the tool Protein Active Site Structure Search (PASSS), and the Deacon Active Site Profiler (DASP), which uses three-dimensional, or structure-based, active-site profiling to identify residues located in the spatial environment around the active site. PASSS uses the FFF technology, describing a protein's functional site by the distances between the alpha carbons of three key residues important to the functional site chemistry and the alpha carbons of adjacent residues. Based on the premise that functionally related proteins should have structural similarity at the functional site, PASSS returns proteins related to the starting known functional site.
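The idea of describing a functional site by alpha-carbon distances among three key residues can be illustrated with a minimal Python sketch. This is not the PASSS implementation: the function names (`distance_signature`, `find_matches`), the 1.0 Å tolerance, the toy coordinates, and the choice to ignore the adjacent-residue distances are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import dist


def distance_signature(coords):
    """Pairwise distances (in angstroms) between alpha-carbon coordinates."""
    return [dist(a, b) for a, b in combinations(coords, 2)]


def find_matches(template, residues, tolerance=1.0):
    """Return residue-number triples whose CA distances fit the template.

    template: three (residue_name, ca_xyz) pairs for the key residues.
    residues: (residue_name, residue_number, ca_xyz) tuples for one structure.
    tolerance: allowed deviation per distance (hypothetical "fuzziness").
    """
    target = distance_signature([xyz for _, xyz in template])
    # Candidate residues for each template position, filtered by residue type.
    pools = [[r for r in residues if r[0] == name] for name, _ in template]
    hits = []
    for triple in product(*pools):
        if len({r[1] for r in triple}) < 3:
            continue  # the three key residues must be distinct
        found = distance_signature([r[2] for r in triple])
        if all(abs(t - f) <= tolerance for t, f in zip(target, found)):
            hits.append(tuple(r[1] for r in triple))
    return hits


# Toy data (invented coordinates, not taken from any real structure).
template = [("LYS", (0.0, 0.0, 0.0)),
            ("ASP", (6.0, 0.0, 0.0)),
            ("GLU", (0.0, 8.0, 0.0))]
structure = [("LYS", 164, (10.0, 10.0, 10.0)),
             ("ASP", 195, (16.3, 10.0, 10.0)),
             ("GLU", 221, (10.0, 18.2, 10.0)),
             ("ASP", 50, (30.0, 30.0, 30.0)),
             ("GLY", 100, (12.0, 12.0, 12.0))]

print(find_matches(template, structure))
```

Because only internal distances are compared, the match does not depend on how the structure is oriented or positioned, which is one reason distance-based templates are convenient for this kind of search.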
DASP expands on this, extracting the residues that are found in the vicinity of the key residues for each protein, creating motifs from these fragments, and using these fragments to search all sequences in a database to return proteins that may share this function. Used together and iteratively, these tools provide a quick way to assign putative functions to both structures and sequences. Preliminary results from this project show exceptional accuracy in distinguishing functionally diverse families in the enolase and kinase superfamilies. The former is one of the annotated superfamilies in the SFLD and serves as a challenging test system for this type of automated effort.
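The fragment-extraction and sequence-search steps described for DASP can be sketched in a few lines of Python. This is an illustrative approximation, not the DASP algorithm: the names `nearby_fragments`, `best_identity`, and `profile_score`, the distance cutoff, the use of alpha-carbon coordinates only, and the ungapped identity score are all assumptions.

```python
from math import dist


def nearby_fragments(residues, key_positions, radius=10.0):
    """Collect residues near the key residues and group them into fragments.

    residues: (position, one_letter_code, ca_xyz) tuples in sequence order.
    key_positions: set of positions of the key functional-site residues.
    radius: distance cutoff in angstroms (hypothetical value).
    """
    keys = [xyz for pos, _, xyz in residues if pos in key_positions]
    selected = [(pos, aa) for pos, aa, xyz in residues
                if any(dist(xyz, k) <= radius for k in keys)]
    fragments, current, last = [], "", None
    for pos, aa in selected:
        if last is not None and pos != last + 1:
            fragments.append(current)  # gap in numbering starts a new fragment
            current = ""
        current += aa
        last = pos
    if current:
        fragments.append(current)
    return fragments


def best_identity(fragment, sequence):
    """Best fraction of identical residues over all ungapped placements."""
    n = len(fragment)
    if n == 0 or len(sequence) < n:
        return 0.0
    return max(sum(a == b for a, b in zip(fragment, sequence[i:i + n])) / n
               for i in range(len(sequence) - n + 1))


def profile_score(fragments, sequence):
    """Average best-placement identity of all fragments against a sequence."""
    if not fragments:
        return 0.0
    return sum(best_identity(f, sequence) for f in fragments) / len(fragments)


# Toy chain laid out on a line, 3.8 angstroms between consecutive CA atoms.
chain = [(i + 1, aa, (3.8 * (i + 1), 0.0, 0.0))
         for i, aa in enumerate("MKDEGALKVT")]
fragments = nearby_fragments(chain, {3, 8}, radius=4.0)
print(fragments, profile_score(fragments, "AAKDEAALRVAA"))
```

In an iterative workflow of the kind the summary describes, sequences scoring above some threshold could be fed back in as new members, although the threshold and the update rule would have to come from the actual method.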

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by PATRICIA CLEMENT BABBITT

THE STRUCTURE-FUNCTION LINKAGE DATABASE
  • Grant number: 8363588
  • Fiscal year: 2011
  • Funding amount: $16,800
LAYING THE FOUNDATIONS FOR GENOMIC ENZYMOLOGY
  • Grant number: 8363593
  • Fiscal year: 2011
  • Funding amount: $16,800
ACTIVE SITE SIGNATURES FOR AUTOMATIC UPDATES OF SFLD SUPERFAMILIES
  • Grant number: 8363621
  • Fiscal year: 2011
  • Funding amount: $16,800
ENZYME ACTIVE SITE TEMPLATES
  • Grant number: 8363587
  • Fiscal year: 2011
  • Funding amount: $16,800
A COMPUTATIONAL ATLAS OF THE T BRUCEI DEGRADOME AS A GUIDE TO DRUG DISCOVERY
  • Grant number: 8363620
  • Fiscal year: 2011
  • Funding amount: $16,800
ACTIVE SITE SIGNATURES FOR SFLD: KINASE SUPERFAMILY
  • Grant number: 8363628
  • Fiscal year: 2011
  • Funding amount: $16,800
ENZYME FUNCTION INITIATIVE
  • Grant number: 8363638
  • Fiscal year: 2011
  • Funding amount: $16,800
ENZYME ACTIVE SITE TEMPLATES
  • Grant number: 8170507
  • Fiscal year: 2010
  • Funding amount: $16,800
ACTIVE SITE SIGNATURES FOR SFLD: ENOLASE SUPERFAMILY
  • Grant number: 8170567
  • Fiscal year: 2010
  • Funding amount: $16,800
ROADMAP FOR DRUG DISCOVERY IN SMALL MOLECULE METABOLISM
  • Grant number: 8170555
  • Fiscal year: 2010
  • Funding amount: $16,800

Similar overseas grants

Cerebral infarction treatment strategy using collagen-like "triple helix peptide" containing functional amino acid sequence
  • Grant number: 23K06972
  • Fiscal year: 2023
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
Establishment of a screening method for functional microproteins independent of amino acid sequence conservation
  • Grant number: 23KJ0939
  • Fiscal year: 2023
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for JSPS Fellows
Effects of amino acid sequence and lipids on the structure and self-association of transmembrane helices
  • Grant number: 19K07013
  • Fiscal year: 2019
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
Construction of electron-transfer amino acid sequence probe with an interaction for protein and cell
  • Grant number: 16K05820
  • Fiscal year: 2016
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
Development of artificial antibody of anti-bitter taste receptor using random amino acid sequence library
  • Grant number: 16K08426
  • Fiscal year: 2016
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
The aa15-17 amino acid sequence in the terminal protein domain of HBV polymerase as a viral factor affecting in vivo as well as in vitro replication activity of the virus.
  • Grant number: 25461010
  • Fiscal year: 2013
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
Amino acid sequence analysis of fossil proteins using mass spectrometry
  • Grant number: 23654177
  • Fiscal year: 2011
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Challenging Exploratory Research
Precise hybrid synthesis of glycoprotein through amino acid sequence-specific introduction of oligosaccharide followed by enzymatic transglycosylation reaction
  • Grant number: 22550105
  • Fiscal year: 2010
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)
Estimating selection on amino-acid sequence polymorphisms in Drosophila
  • Grant number: NE/D00232X/1
  • Fiscal year: 2006
  • Funding amount: $16,800
  • Project category: Research Grant
Construction of a neural network for detecting novel domains from amino acid sequence information only
  • Grant number: 16500189
  • Fiscal year: 2004
  • Funding amount: $16,800
  • Project category: Grant-in-Aid for Scientific Research (C)