FINDING PROTEIN SEQUENCE MOTIFS--METHODS AND APPLICATIONS
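Basic information
- Grant number: 2578634
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Start and end dates: to
- Project status: Not concluded
- Source: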
- Keywords: DNA repair; biochemical evolution; brca gene; chemical information system; computer assisted sequence analysis; computer program/software; computer system design/evaluation; developmental genetics; genetic disorder; guanine nucleotide binding protein; guanine nucleotide exchange factors; methyltransferase; protein sequence; protein structure function; protooncogene; statistics/biometry
Project abstract
Sequence information is growing much faster than
experimental data on protein function are accumulating, so
sensitive methods for protein sequence analysis, including
the detection of subtle but functionally important motifs,
are becoming steadily more important. The goals of this
project include the
development of a coherent strategy for delineating protein
superfamilies and predicting protein function, eventually
aiming at the construction of a comprehensive database of
protein functional motifs. The methods used included
sequence database search with individual sequences (the
programs of the BLAST and FASTA families) and multiple
sequence alignments (the HMMer program package, which builds
hidden Markov models from multiple alignments and applies
them to database screening); methods for detection of
motifs in protein sequences, including those developed at
an earlier stage of this project (programs PAST, CAP, MoST,
GIBBS); multiple sequence alignment methods (programs
MACAW, CLUSTALW); methods for partitioning protein
sequences into predicted globular and non-globular domains
(program SEG with varying parameters); methods for
prediction of protein secondary structure (programs PHD,
COILS), transmembrane domains (PHDhtm), and signal peptides
(SignalP); a method for prediction of coding regions in DNA
based on non-homogeneous Markov models (GeneMark); methods
for clustering proteins by sequence similarity (CLUS).
These methods were combined in a sequence analysis strategy
designed primarily for efficient analysis of the sequences
of large, multidomain proteins, which make up the majority
of the products of genes implicated in human diseases. The
protein sequences were first partitioned into
putative globular and non-globular domains, after which
database searches were conducted separately with the
sequences of individual globular domains using a
combination of transitive BLAST searches and motif
analysis. In addition to general purpose sequence
databases, separate, smaller databases were constructed
using information on protein function and/or phylogenetic
origin. Two large data sets, namely the products of genes
involved in animal development and the products of
positionally cloned human disease genes, were analyzed
using these approaches. A variety of previously
uncharacterized but potentially functionally important
domains and motifs were discovered. Two important examples
are a putative FAD-binding domain in the human
choroideremia protein, whose modified dinucleotide-binding
consensus had prevented its earlier detection, and a
domain designated BRCT, which is conserved in a number of
proteins involved in DNA damage-responsive cell cycle
checkpoints, including the product of the human BRCA1 gene
implicated in hereditary breast and ovarian cancers.
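The two-stage strategy described in the abstract (partition a protein into putative globular and non-globular regions, then search transitively with each globular region) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the partitioning uses a plain sliding-window Shannon entropy as a simplified stand-in for the SEG program, the window and threshold values are illustrative placeholders, and `search_fn` is a hypothetical callable representing a single database search such as one BLAST run.

```python
import math
from collections import Counter


def window_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def partition_by_complexity(seq, window=12, threshold=2.2):
    """Split seq into (start, end, label) runs, with end exclusive.

    A position is labelled 'non-globular' when the window around it has
    entropy below the threshold. Simplified stand-in for SEG; the window
    and threshold values here are illustrative assumptions.
    """
    half = window // 2
    labels = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half)
        low = window_entropy(seq[lo:hi]) < threshold
        labels.append("non-globular" if low else "globular")
    # Merge consecutive positions that share a label into runs.
    runs, start = [], 0
    for i in range(1, len(seq) + 1):
        if i == len(seq) or labels[i] != labels[start]:
            runs.append((start, i, labels[start]))
            start = i
    return runs


def transitive_search(seed_id, search_fn, max_rounds=3):
    """Collect sequences reachable from seed_id by repeated searching.

    search_fn(seq_id) -> iterable of hit ids is a hypothetical callable
    standing in for one database search. Each round re-queries with the
    hits that were new in the previous round.
    """
    seen = {seed_id}
    frontier = [seed_id]
    for _ in range(max_rounds):
        next_frontier = []
        for query in frontier:
            for hit in search_fn(query):
                if hit not in seen:
                    seen.add(hit)
                    next_frontier.append(hit)
        if not next_frontier:
            break
        frontier = next_frontier
    return seen
```

In a pipeline of this shape, only the runs labelled `globular` would be passed on as queries, so that compositionally biased regions do not dominate the search results. Capping the number of rounds is one simple way to limit the drift that transitive searching can introduce.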
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by E V KOONIN
COMPUTER-ASSISTED DISSECTION OF ROLLING CIRCLE DNA REPLICATION
- Grant number: 3845128
- Fiscal year:
- Funding amount: --
- Project category:
COMPUTER-ASSISTED STUDY OF FUNCTIONS AND EVOLUTION OF LARGE DNA VIRUS GENOMES
- Grant number: 3845124
- Fiscal year:
- Funding amount: --
- Project category:
FINDING PROTEIN SEQUENCE MOTIFS--METHODS AND APPLICATIONS
- Grant number: 5203632
- Fiscal year:
- Funding amount: --
- Project category:
FINDING PROTEIN SEQUENCE MOTIFS--METHODS AND APPLICATIONS
- Grant number: 3759328
- Fiscal year:
- Funding amount: --
- Project category:
Similar overseas grants
A Study of the Biochemical Evolution of the Cephalopods, Based on the Inorganic and Some of the Organic Constituents of All Their Hard Parts
- Grant number: 7905730
- Fiscal year: 1979
- Funding amount: --
- Project category: Continuing Grant
Biochemical Evolution of Tetrabranchian Cephalopod Hard Parts
- Grant number: 7603725
- Fiscal year: 1976
- Funding amount: --
- Project category: Standard Grant


















