
Intruding into the midnight zone of protein comparisons

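Basic information

Grant number:
6520482
Principal investigator:
BURKHARD ROST
Amount:
$280,200
Host institution:
COLUMBIA UNIVERSITY HEALTH SCIENCES
Institution country:
United States
Project category:
Fiscal year:
2001
Funding country:
United States
Project period:
2001-04-05 to 2005-03-31
Project status:
Completed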

Source:
https://reporter.nih.gov/project-details/6520482
Keywords:
DNA binding protein; automated data processing; binding sites; biochemical evolution; chemical chain length; computer simulation; computer system design /evaluation; conformation; data management; functional /structural genomics; membrane proteins; molecular biology information system; molecular dynamics; protein sequence; protein structure function; proteomics; statistics /biometry; structural biology

Project abstract

Description (provided by applicant): The 'twilight zone' of protein sequence comparison is the region in which sequence similarity does not suffice to conclude, for example, structural similarity. The vast majority of protein pairs with similar structures populate a 'midnight zone', i.e. their sequences differ too much for sequence-based comparisons. Here, we propose to refine, extend, and specialise methods combining sequence alignment, structure prediction, and functional information. The goal is to unravel hidden similarities in entirely sequenced organisms with a reliable, automatic tool. Towards the end of our project, the sequences for most protein families realised by life will supposedly be available. We hope that our system will correctly detect a relation for most of these.
(1) Prediction-based threading combines sequence alignments with predictions of secondary structure and accessibility to find remote similarities. We hope to considerably improve detection and alignment accuracy by comparing families with families rather than single proteins with single proteins.
(2) About one third of all proteins in worm and fly seem to have long regions lacking regular secondary structure. We hope to develop a method tailored to reliably detect and compare such regions.
(3) No current method finds similarities between extremely diverged membrane proteins. We propose to develop such a method, combining 'membrane threading' with classifications of membrane proteins.
(4) Since sequence comparison in the twilight zone and below is an extremely demanding task, most existing methods have very low levels of accuracy. In practice, experts compare aspects of function between the protein pair under investigation. We want to develop an automatic method that evaluates functional aspects. In particular, we intend to start with proteins binding to DNA. The tasks will be to (i) predict DNA-binding sites in proteins, and (ii) restrict the threading to the subset of proteins for which binding regions were found. In the following step, we hope to use general sequence motifs for the automatic comparison.
(5) Threading entire genomes: the first task will be to find all proteins in an entire organism for which we know the structure. However, the particular edge of our method will be to find remote similarities even in the absence of experimental information about structure.
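
The idea behind prediction-based threading in aim (1), scoring an alignment on both sequence and predicted secondary structure, can be illustrated with a small sketch. This is not the applicants' method: the function names, the simple identity scoring (instead of a real substitution matrix), the H/E/L structure strings, the weights, and the linear gap penalty are all assumptions chosen for illustration, and predicted accessibility is left out.

```python
def combined_score(res_a, res_b, ss_a, ss_b, w_seq=1.0, w_ss=1.0):
    """Score one aligned position from residue identity and
    predicted secondary-structure agreement (hypothetical scheme)."""
    seq_term = 1.0 if res_a == res_b else -1.0
    ss_term = 1.0 if ss_a == ss_b else -1.0
    return w_seq * seq_term + w_ss * ss_term


def align_score(seq_a, ss_a, seq_b, ss_b, gap=-2.0, w_seq=1.0, w_ss=1.0):
    """Global (Needleman-Wunsch style) alignment score of two proteins,
    each given as a residue string plus a predicted secondary-structure
    string of the same length (e.g. H = helix, E = strand, L = other)."""
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("sequence and structure strings must match in length")
    n, m = len(seq_a), len(seq_b)
    # dp[i][j] = best score aligning the first i positions of A
    # with the first j positions of B
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + combined_score(
                seq_a[i - 1], seq_b[j - 1], ss_a[i - 1], ss_b[j - 1],
                w_seq, w_ss,
            )
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


# Two toy proteins with no identical residues but the same predicted structure.
seq_only = align_score("AAAA", "HHHH", "GGGG", "HHHH", w_ss=0.0)
with_ss = align_score("AAAA", "HHHH", "GGGG", "HHHH", w_ss=2.0)
print(seq_only, with_ss)
```

In this toy case the sequence-only score is negative, while weighting the structure term lifts the same pair to a positive score, which is the kind of signal a sequence comparison alone would miss in the 'midnight zone'.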
