Detecting Homology in the "Twilight Zone" of Sequence Similarity

Basic Information

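  • Grant number:
    8055951
  • Principal investigator:
  • Amount:
    $232,500
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Project period:
    2009-04-10 to 2013-03-31
  • Project status:
    Completed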

Project Abstract

DESCRIPTION (provided by applicant): The "protein problem" has remained unsolved despite decades of research [1, 2]. In principle, one expects that the primary amino acid sequence of a protein determines its structure, function, and evolutionary (SF&E) characteristics. Yet there is still no reliable method for predicting the native-state structure of a protein and its function given only its sequence. In addition, inferring the evolutionary relationships among highly divergent protein sequences is a daunting task. In general, when pairwise sequence alignments between protein sequences fall below 25% identity, statistical measurements do not provide support robust enough to identify clear phylogenetic relationships despite intensive research in this area [1, 3, 4]. The recent explosion in the availability of knowledge bases and computational techniques for the analysis of complex data has created an unprecedented opportunity for teasing out invaluable information from protein sequences. Starting with a basic premise that protein sequence encodes information about SF&E, we developed a unified framework for inferring SF&E from sequence information using a knowledge-based approach in which we measure the similarity between a query sequence and a set of biologically relevant profiles in an unbiased manner. Results from this Gestalt Domain Detection Algorithm-Basic Local Alignment Tool (GDDA-BLAST) provide phylogenetic profiles that have the capacity to model SF&E relationships of various proteins. Indeed, GDDA-BLAST is capable of deriving deep phylogenetic relationships for highly divergent proteins in a quantifiable manner [5, 6]. Preliminary results from our computational case study of the highly divergent family of retroelements accord with those previously reported, and demonstrate that GDDA-BLAST measurements can be treated as "fingerprints" that can be used to derive distance estimates and hence phylogenetic relationships without prior information, multiple sequence alignment, or manual editing. We propose that sequence information present within the "twilight zone" of sequence similarity can provide key insight into SF&E relationships among distantly related and/or rapidly evolving proteins. This proposal aims to push our limits of detecting homology within the "twilight zone" of sequence similarity by evaluating and optimizing GDDA-BLAST performance on benchmark and experimental data sets. Armed with these refined GDDA-BLAST measurements, we propose to conduct a comprehensive, ab initio, phylogenetic study of retroelements and RNA-dependent RNA polymerases from the positive-strand family of RNA viruses (+ssRNA). Simultaneously, we will derive high-resolution maps of domain boundaries and empirically validate functional annotations and predictions of key residues for those activities. This work aims to perform translational research from the computer to the laboratory bench top. We expect that the tools and resources generated from this grant will be accessible and user-friendly to the bench scientist, thereby speeding the discovery process of other clinically relevant research endeavors.

PUBLIC HEALTH RELEVANCE: The long-term implication of this proposal is the development of a unified framework for high-resolution and simultaneous measurements of structure, function, and evolution. Should this be possible: (i) functional and evolutionary measurements could quantitatively inform structural modeling to derive accurate atomic resolution protein structures, (ii) structural and functional measurements could inform evolutionary histories to derive accurate evolutionary rates, deep-branch relationships, and homologous spaces within each protein, and (iii) structural and evolutionary measures would inform as to the location of functionalities contained within any protein and the regulatory elements which control these functions. Armed with this information, the speeds at which diseases could be understood and pharmacophores/therapies developed to combat them would likely increase dramatically.
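
The idea of treating per-profile similarity measurements as "fingerprints" and turning them into distance estimates can be illustrated with a minimal Python sketch. It is not the GDDA-BLAST implementation: the score vectors, the sequence names, and the choice of Euclidean distance are all assumptions made for illustration, since the abstract does not specify how scores are computed or compared.

```python
from itertools import combinations
from math import sqrt


def fingerprint_distance(a, b):
    """Euclidean distance between two equal-length score vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must score the same set of profiles")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(fingerprints):
    """Return {(name_i, name_j): distance} for every unordered pair of sequences."""
    return {
        (p, q): fingerprint_distance(fingerprints[p], fingerprints[q])
        for p, q in combinations(sorted(fingerprints), 2)
    }


# Hypothetical similarity scores of three query sequences against four profiles.
fingerprints = {
    "seqA": [0.9, 0.1, 0.0, 0.4],
    "seqB": [0.8, 0.2, 0.0, 0.5],
    "seqC": [0.0, 0.7, 0.9, 0.1],
}

distances = distance_matrix(fingerprints)
# The pair with the smallest distance has the most similar fingerprints.
closest = min(distances, key=distances.get)
```

A pairwise distance table of this kind is the usual input to distance-based tree-building methods such as neighbor joining, which is one way such estimates could lead to a phylogeny without a multiple sequence alignment.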

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (1)

Other grants by RANDEN LEE PATTERSON

Detecting Homology in the "Twilight Zone" of Sequence Similarity
  • Grant number:
    8243153
  • Fiscal year:
    2009
  • Funding amount:
    $232,500
  • Project category:
Detecting Homology in the "Twilight Zone" of Sequence Similarity
  • Grant number:
    7799248
  • Fiscal year:
    2009
  • Funding amount:
    $232,500
  • Project category:
Detecting Homology in the "Twilight Zone" of Sequence Similarity
  • Grant number:
    8288082
  • Fiscal year:
    2009
  • Funding amount:
    $232,500
  • Project category:
The Identity/Role of IP3 Receptor Associated Proteins
  • Grant number:
    6446438
  • Fiscal year:
    2002
  • Funding amount:
    $232,500
  • Project category:
The Identity/Role of IP3 Receptor Associated Proteins
  • Grant number:
    6660382
  • Fiscal year:
    2002
  • Funding amount:
    $232,500
  • Project category:
The Identity/Role of IP3 Receptor Associated Proteins
  • Grant number:
    6643309
  • Fiscal year:
    2002
  • Funding amount:
    $232,500
  • Project category:

Similar Overseas Grants

Cerebral infarction treatment strategy using collagen-like "triple helix peptide" containing functional amino acid sequence
  • Grant number:
    23K06972
  • Fiscal year:
    2023
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Establishment of a screening method for functional microproteins independent of amino acid sequence conservation
  • Grant number:
    23KJ0939
  • Fiscal year:
    2023
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for JSPS Fellows
Effects of amino acid sequence and lipids on the structure and self-association of transmembrane helices
  • Grant number:
    19K07013
  • Fiscal year:
    2019
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Construction of electron-transfer amino acid sequence probe with an interaction for protein and cell
  • Grant number:
    16K05820
  • Fiscal year:
    2016
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Development of artificial antibody of anti-bitter taste receptor using random amino acid sequence library
  • Grant number:
    16K08426
  • Fiscal year:
    2016
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
The aa15-17 amino acid sequence in the terminal protein domain of HBV polymerase as a viral factor affecting in vivo as well as in vitro replication activity of the virus.
  • Grant number:
    25461010
  • Fiscal year:
    2013
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Amino acid sequence analysis of fossil proteins using mass spectrometry
  • Grant number:
    23654177
  • Fiscal year:
    2011
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Challenging Exploratory Research
Precise hybrid synthesis of glycoprotein through amino acid sequence-specific introduction of oligosaccharide followed by enzymatic transglycosylation reaction
  • Grant number:
    22550105
  • Fiscal year:
    2010
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Estimating selection on amino-acid sequence polymorphisms in Drosophila
  • Grant number:
    NE/D00232X/1
  • Fiscal year:
    2006
  • Funding amount:
    $232,500
  • Project category:
    Research Grant
Construction of a neural network for detecting novel domains from amino acid sequence information only
  • Grant number:
    16500189
  • Fiscal year:
    2004
  • Funding amount:
    $232,500
  • Project category:
    Grant-in-Aid for Scientific Research (C)