A Database Of Conserved Domain Alignments

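Basic information

  • Grant number:
    6843679
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
  • Funding country:
    United States
  • Project period:
  • Project status:
    Not concluded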

Project abstract

We are producing a database of expert-curated protein domain alignments that describe sequence and 3D-structure conservation within protein families. These alignments are used to produce position-specific score matrices (PSSMs), which may in turn be used in NCBI's web-based protein classification resources. Links to the Conserved Domain Database (CDD) are made by default from NCBI's BLAST resource, http://www.ncbi.nlm.nih.gov/BLAST/, and from protein records in NCBI's PubMed/Entrez browser, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi. Further information about CDD and these search services is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. These servers may be used to identify conserved domains within a protein sequence. They summarize the known functions of family members, using relevant citations from PubMed when possible. They also provide site-specific functional annotation, via sequence and structure alignments and via evidence-based interaction-site features.

The CDD alignment project differs from earlier efforts in two fundamental ways: 3D-structure information is used whenever possible to guide alignments, and an explicit hierarchy of families and subfamilies describes the evolutionary history of each domain. When a 3D structure is known within a domain family, this information is used to define a conserved 3D core structure, a set of ungapped blocks that must be identified in all representative sequences included in the alignment. Representative sequences are aligned to this core structure using threading or structure-based alignment algorithms or, when multiple structures are known, by structure-structure alignment. These procedures ensure the high alignment accuracy needed for accurate transfer of annotation to new family members identified by searching. Explicit hierarchies identify major gene duplication events in the molecular evolution of each family. Our basic strategy is to use domain-sequence clustering methods together with known domain architecture and phylogeny to identify what appear to be ancient orthology groups. These define explicitly annotated "children" of the overall "parent" alignment, and in turn provide more specific functional annotation.

The CDD project employs a high level of automation to produce structure-based alignments, to identify candidate orthology groups, to update CDD alignments with new sequences and structures, and to "publish" the results to web servers. These algorithms and the associated software are described under another project, "Alignment methods for a conserved domain database". This project describes human-expert curation of CDD alignments.

The role of the CDD curators is multifaceted. First, they must survey the relevant scientific literature to produce concise summaries of the known functions of each domain family and to choose citations useful to users of NCBI's web-based classification resources. Curators must also examine the results of automated sequence and structure comparison to infer the location of conserved core blocks, an iterative process that requires judgment about eliminating incomplete or erroneous sequence and structure data. Curators must also identify apparent orthology groups, based on the consensus of results from alternative molecular evolution and clustering methods. The CDD curation project is new, and results over this year consist primarily of recruiting and training PhD biologists as CDD curators. Nonetheless, this group has produced several hundred curated CDD families, which are now available via NCBI's protein classification servers.
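The abstract mentions that ungapped core blocks of an alignment are turned into position-specific score matrices. As a rough illustration only, the Python sketch below builds a log-odds matrix from a small ungapped block. The function names `build_pssm` and `score_segment`, the uniform background frequencies, and the add-one pseudocount are all assumptions made for this example; they are not the CDD project's actual procedure, which the abstract does not specify.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(block, pseudocount=1.0):
    """Build a log-odds matrix from an ungapped block of aligned sequences.

    Returns one dict per column, mapping residue -> log2(frequency / background).
    Assumption: a uniform background over the 20 standard amino acids.
    """
    if not block:
        raise ValueError("block must contain at least one sequence")
    width = len(block[0])
    for seq in block:
        if len(seq) != width:
            raise ValueError("an ungapped block needs equal-length sequences")
        if any(aa not in AMINO_ACIDS for aa in seq):
            raise ValueError("only the 20 standard residues are allowed (no gaps)")

    background = 1.0 / len(AMINO_ACIDS)
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(width):
        counts = Counter(seq[col] for seq in block)
        column = {}
        for aa in AMINO_ACIDS:
            # Pseudocounts keep unobserved residues from scoring minus infinity.
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_segment(pssm, segment):
    """Sum the per-column scores for a segment as long as the block."""
    if len(segment) != len(pssm):
        raise ValueError("segment length must match the block width")
    return sum(column[aa] for column, aa in zip(pssm, segment))


if __name__ == "__main__":
    toy_block = ["ACD", "ACE", "ACD"]  # hypothetical three-column core block
    matrix = build_pssm(toy_block)
    print(score_segment(matrix, "ACD"), score_segment(matrix, "WWW"))
```

A segment that matches the conserved columns scores higher than an unrelated one, which is the property a search service relies on when it assigns a query sequence to a domain family.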

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by STEPHEN H. BRYANT

PROTEIN 3D STRUCTURE COMPARISON
  • Grant number:
    6111066
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
Internet Resources for Structural Bioinformatics
  • Grant number:
    6554463
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
PROTEIN 3D STRUCTURE COMPARISON
  • Grant number:
    6432751
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
INTERNET RESOURCES FOR MOLECULAR 3D STRUCTURE
  • Grant number:
    6111065
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
INTERNET RESOURCES FOR MOLECULAR 3D STRUCTURE
  • Grant number:
    6290484
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
Internet Resources For Structural Bioinformatics
  • Grant number:
    6843566
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
Comparative Analysis of Protein 3-Dimensional Structure
  • Grant number:
    6988454
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
INTERNET RESOURCES FOR MOLECULAR 3D STRUCTURE
  • Grant number:
    6432750
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
Comparative Analysis of Protein 3 Dimensional Structure
  • Grant number:
    6554464
  • Fiscal year:
  • Funding amount:
    --
  • Project type:
A Database Of Conserved Domain Alignments
  • Grant number:
    6681403
  • Fiscal year:
  • Funding amount:
    --
  • Project type:

Similar overseas grants

BIOCHEMICAL EVOLUTION OF ISCHEMIC BRAIN DAMAGE
  • Grant number:
    3396869
  • Fiscal year:
    1980
  • Funding amount:
    --
  • Project type:
BIOCHEMICAL EVOLUTION OF ISCHEMIC BRAIN DAMAGE
  • Grant number:
    3396868
  • Fiscal year:
    1980
  • Funding amount:
    --
  • Project type:
BIOCHEMICAL EVOLUTION OF ISCHEMIC BRAIN DAMAGE
  • Grant number:
    3396863
  • Fiscal year:
    1980
  • Funding amount:
    --
  • Project type:
BIOCHEMICAL EVOLUTION OF ISCHEMIC BRAIN DAMAGE
  • Grant number:
    3396867
  • Fiscal year:
    1980
  • Funding amount:
    --
  • Project type:
A Study of the Biochemical Evolution of the Cephalopods, Based on the Inorganic and Some of the Organic Constituents Of All Their Hard Parts
  • Grant number:
    7905730
  • Fiscal year:
    1979
  • Funding amount:
    --
  • Project type:
    Continuing Grant
Biochemical Evolution of Tetrabranchian Cephalopod Hard Parts
  • Grant number:
    7603725
  • Fiscal year:
    1976
  • Funding amount:
    --
  • Project type:
    Standard Grant
BIOCHEMICAL EVOLUTION
  • Grant number:
    7243349
  • Fiscal year:
    1972
  • Funding amount:
    --
  • Project type:
BIOCHEMICAL EVOLUTION
  • Grant number:
    7137899
  • Fiscal year:
    1971
  • Funding amount:
    --
  • Project type:
Biochemical Evolution
  • Grant number:
    6928751
  • Fiscal year:
    1969
  • Funding amount:
    --
  • Project type:
Biochemical Evolution
  • Grant number:
    67B5303
  • Fiscal year:
    1967
  • Funding amount:
    --
  • Project type: