ACTIVE SITE SIGNATURES FOR AUTOMATIC UPDATES OF SFLD SUPERFAMILIES
Basic information
- Grant number: 8363621
- Amount: $16,800
- Host institution country: United States
- Fiscal year: 2011
- Funding country: United States
- Project period: 2011-07-01 to 2012-06-30
- Project status: Completed
- Keywords: Active Sites; Amino Acid Sequence; Carbon; Chemicals; Chemistry; Collaborations; Data; Databases; Development; Elements; Environment; Enzymes; Family; Funding; Genbank; Genetic Programming; Grant; Imagery; Informatics; Link; Maps; Methods; Metric; National Center for Research Resources; Orthologous Gene; Pattern; Peptide Sequence Determination; Phosphotransferases; Principal Investigator; Proteins; Reaction; Research; Research Infrastructure; Research Personnel; Resources; Site; Source; Structure; System; Technology; Testing; United States National Institutes of Health; Universities; Update; base; biocomputing; cost; enolase; forest; improved; member; protein structure; tool
Project abstract
This subproject is one of many research subprojects utilizing the resources
provided by a Center grant funded by NIH/NCRR. Primary support for the subproject
and the subproject's principal investigator may have been provided by other sources,
including other NIH sources. The Total Cost listed for the subproject likely
represents the estimated amount of Center infrastructure utilized by the subproject,
not direct funding provided by the NCRR grant to the subproject or subproject staff.
A major unsolved problem for structure-function linkage using
computational prediction is that while we can accurately cluster protein
sequences and structures with good statistical significance based on many
types of similarity metrics, how those clusters link to functional classes
is not clear. Although simple approaches such as ortholog prediction can
achieve good results for sequences that are closely similar or that
contain readily identifiable motifs that distinguish functional classes,
for many protein superfamilies successful prediction is far from trivial.
This is the case for the functionally diverse superfamilies in the SFLD.
These are homologous sets of enzymes that carry out different chemical
transformations, using different substrates, but all share a specific
chemical functionality or partial reaction. The main purpose of the SFLD
is to aid researchers in the curation of these types of superfamilies, to
help in the identification of new members of these superfamilies, and to
provide an explicit structure-function mapping for these enzymes. Because
the different functional families in a given superfamily look similar but
perform different specific reactions, they are difficult to annotate and
easy to misannotate, showing levels of misannotation as high as 80% in the
archival databases GenBank NR and TrEMBL. Because sequence information is
still becoming available in large volumes, automated methods are required to
update the SFLD superfamilies with newly determined sequences and assign
them to the appropriate functional families. Clearly, improved methods for
achieving these functional assignments are urgently needed.
Development of an approach to achieve this has been a major focus of the
RBVI in collaboration with the group of Prof. Jacquelyn Fetrow of Wake
Forest University. The active site profiling methods developed by Dr.
Fetrow have now been integrated with an approach developed in the Babbitt
lab, Genetic Algorithm Search for Patterns in Structures: GASPS, to
automatically determine 3D templates capable of distinguishing new
superfamily members for the purpose of automatically assigning sequences
to the specific functional families to which they belong. GASPS will be
combined with Fetrow's methods to create sequence and structural motifs for
automated clustering of SFLD data. The core elements of the method include
a motif-generating technology called "Fuzzy Functional Forms" (FFF),
implemented by the tool Protein Active Site Structure Search (PASSS), and
the Deacon Active Site Profiler (DASP), which uses three-dimensional, or
structure-based, active-site profiling to identify residues located in the
spatial environment around the active site. PASSS uses the FFF
technology, describing a protein's functional site by the distances between
the alpha carbons of three key residues important to the functional site
chemistry and the alpha carbons of adjacent residues. Based on the
premise that functionally related proteins should have structural
similarity at the functional site, PASSS returns proteins related to the
starting known functional site. DASP expands on this, extracting the
residues that are found in the vicinity of the key residues for each
protein, creating motifs from these fragments, and using these fragments
to search all sequences in a database to return proteins that may share
this function. Used together and iteratively, these tools provide a
quick way to assign putative functions to both structures and
sequences.
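The two ideas described above (a distance-based description of three key residues, and the extraction of nearby residues into sequence fragments) can be illustrated with a minimal Python sketch. This is not the PASSS or DASP implementation: the function names, the 1.0 angstrom tolerance, the 6.0 angstrom radius, and all coordinates are assumptions invented for illustration, and comparing sorted distances is a simplification that ignores which residue pairs with which.

```python
import math
from itertools import combinations

# Toy structures: residue number -> (one-letter code, C-alpha xyz in angstroms).
# All coordinates are invented for illustration, not taken from a real protein.
template = {
    10: ("K", (0.0, 0.0, 0.0)),
    11: ("A", (3.8, 0.0, 0.0)),
    45: ("D", (0.0, 8.0, 0.0)),
    46: ("G", (3.8, 8.0, 0.0)),
    80: ("E", (6.0, 4.0, 0.0)),
    120: ("W", (30.0, 30.0, 30.0)),
}
candidate = {
    5: ("K", (1.0, 1.0, 1.0)),
    40: ("D", (1.0, 9.3, 1.0)),
    77: ("E", (7.0, 5.2, 1.0)),
}


def ca_distance_signature(structure, key_residues):
    """Sorted pairwise C-alpha distances between the key residues."""
    coords = [structure[r][1] for r in key_residues]
    return sorted(math.dist(a, b) for a, b in combinations(coords, 2))


def matches_signature(candidate_sig, template_sig, tolerance=1.0):
    """'Fuzzy' match: each distance agrees within `tolerance` angstroms.

    The tolerance value is an assumption chosen for this sketch.
    """
    return len(candidate_sig) == len(template_sig) and all(
        abs(c - t) <= tolerance for c, t in zip(candidate_sig, template_sig)
    )


def residues_near(structure, key_residues, radius=6.0):
    """Residue numbers whose C-alpha lies within `radius` of any key residue."""
    key_coords = [structure[r][1] for r in key_residues]
    return sorted(
        num
        for num, (_, xyz) in structure.items()
        if any(math.dist(xyz, k) <= radius for k in key_coords)
    )


def sequence_fragments(structure, residue_numbers):
    """Join runs of consecutive residue numbers into sequence fragments."""
    fragments, current, previous = [], "", None
    for num in residue_numbers:
        if previous is not None and num != previous + 1:
            fragments.append(current)
            current = ""
        current += structure[num][0]
        previous = num
    if current:
        fragments.append(current)
    return fragments


template_sig = ca_distance_signature(template, [10, 45, 80])
candidate_sig = ca_distance_signature(candidate, [5, 40, 77])
print("signature match:", matches_signature(candidate_sig, template_sig))

nearby = residues_near(template, [10, 45, 80])
print("fragments:", sequence_fragments(template, nearby))
```

In this toy setting the fragments collected around the key residues stand in for the motif fragments that, per the abstract, DASP uses to search sequence databases; the database search itself is not sketched here.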
Preliminary results from this project show exceptional accuracy in
distinguishing functionally diverse families in the enolase and kinase
superfamilies. The former is one of the annotated superfamilies in the SFLD
that serves as a challenging test system for this type of automated
effort.
Project outputs
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other grants by PATRICIA CLEMENT BABBITT
ACTIVE SITE SIGNATURES FOR SFLD: ENOLASE SUPERFAMILY
- Grant number: 8363627
- Fiscal year: 2011
- Amount: $16,800
A COMPUTATIONAL ATLAS OF THE T BRUCEI DEGRADOME AS A GUIDE TO DRUG DISCOVERY
- Grant number: 8363620
- Fiscal year: 2011
- Amount: $16,800
ACTIVE SITE SIGNATURES FOR SFLD: KINASE SUPERFAMILY
- Grant number: 8363628
- Fiscal year: 2011
- Amount: $16,800
ACTIVE SITE SIGNATURES FOR SFLD: ENOLASE SUPERFAMILY
- Grant number: 8170567
- Fiscal year: 2010
- Amount: $16,800
ROADMAP FOR DRUG DISCOVERY IN SMALL MOLECULE METABOLISM
- Grant number: 8170555
- Fiscal year: 2010
- Amount: $16,800