
Statistics Of Sequence Comparison


Basic Information

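Project number:
7594454
Principal investigator:
Stephen F Altschul
Amount:
$194,200
Host institution:
NATIONAL LIBRARY OF MEDICINE
Host institution country:
United States
Project category:
Fiscal year:
Funding country:
United States
Start and end dates:
to
Project status:
Not concluded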

Project Abstract

Work this year included the publication of a new measure of sequence similarity that unifies a traditional measure of alignment similarity with a new measure of compositional similarity. In brief, protein sequence database search programs may be evaluated both for their retrieval accuracy (the ability to separate meaningful from chance similarities) and for the accuracy of their statistical assessments of reported alignments. However, methods for improving statistical accuracy can degrade retrieval accuracy by discarding compositional evidence of sequence relatedness. This evidence may be preserved by combining essentially independent measures of alignment and compositional similarity into a unified measure of sequence similarity. We have studied two measures of compositional similarity and found that one, when combined with alignment similarity, improves the statistical accuracy of blastp, as well as its retrieval accuracy as measured on a SCOP-based test set.

Further work this year focused on developing scoring systems for recognizing correlated positions in multiple sequence alignments. We have built on other published work to confirm that, in alignments of simulated sequences with correlated mutations, a form of normalized mutual information (nmi) appears to be the most effective measure. We have studied the mean and standard deviation of nmi for uncorrelated multiple alignment columns as a function of the number of sequences N and the average per-position relative entropy h.
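As a rough illustration of the quantities named in the abstract, the following Python sketch computes a normalized mutual information score for two alignment columns and the relative entropy of a single column. The abstract does not say which normalization was used, so dividing the mutual information by the joint entropy is an assumption made here for illustration only. The function names and the uniform background frequencies in the usage note are likewise hypothetical, and the link between this per-column relative entropy and the abstract's h is an interpretation, not something the source states.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def normalized_mutual_information(col_a, col_b):
    """Mutual information of two columns divided by their joint entropy.

    Assumption: this normalization (I(A;B) / H(A,B)) is one common choice;
    the source does not specify which form of nmi it uses.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must hold one residue per sequence")
    n = len(col_a)
    h_a = entropy(Counter(col_a).values(), n)
    h_b = entropy(Counter(col_b).values(), n)
    h_ab = entropy(Counter(zip(col_a, col_b)).values(), n)
    if h_ab == 0.0:
        return 0.0  # both columns fully conserved: no covariation signal
    return (h_a + h_b - h_ab) / h_ab


def relative_entropy(column, background):
    """Relative entropy in bits of a column's residue frequencies
    against background frequencies (a dict mapping residue -> probability)."""
    n = len(column)
    return sum(
        (c / n) * math.log2((c / n) / background[residue])
        for residue, c in Counter(column).items()
    )
```

For example, the columns "AAVV" and "LLII" vary together in every sequence and score 1.0, while "AAVV" and "LILI" vary independently and score 0.0. A fully conserved column "AAAA" scored against a hypothetical uniform four-letter background has a relative entropy of 2 bits. With real alignments, small N inflates mutual information estimates even for uncorrelated columns, which is presumably why the abstract studies the mean and standard deviation of nmi as a function of N and h.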
