Structural and Functional Property Integration for Protein Sequence Feature Representations to Enable Advanced Machine Learning Remote Homology Detection

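Basic Information

  • Award number:
    0742553
  • Principal investigator:
  • Amount:
    $200,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2007
  • Funding country:
    United States
  • Start and end dates:
    2007-09-15 to 2009-08-31
  • Project status:
    Completed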

Project Abstract

Determining homology by pairwise sequence alignment is notoriously insensitive, because many proteins with similar structure and function share only 8-10% sequence identity, well below the detection threshold of conventional methods. It has been shown that transforming protein sequences into vectors of properties associated with sequence and structure can significantly improve the detection of these remote homologues (proteins with low sequence identity). However, existing feature-based methods are limited to classifying a protein into a family, which means that a protein cannot be classified unless it falls into a pre-defined family. Methods devised to overcome this limitation by assessing pairwise similarity have primarily relied on network propagation, because pairwise training requires an extremely large training space. For example, a small benchmark dataset of 4000 proteins equates to over 8 million pairs. Unfortunately, these network propagation methods have demonstrated only marginal improvement over the state-of-the-art PSI-BLAST method. This is largely because (1) only a small number of features are used and (2) the network propagation method relies on a similarity network derived from BLAST scores. Thus, the use of statistical discrimination methods to answer the pairwise question has remained beyond reach in the homology detection field. This limitation is a serious technological gap for large-scale genome sequencing, since automated annotation is not possible without highly reliable homology detection. The development of a biologically driven, integrated protein feature representation will significantly improve remote homology detection. Additionally, the use of an SVM, which requires only a linear computation for the classification task, will offer fast computation times. These two components, faster sequence comparisons and improved sensitivity, will break a long-standing time/sensitivity paradigm in the field of remote homology detection. The proposed pairwise SVM implementation can also be applied to other large, diverse real-world science and engineering problems characterized by classification through association. The PI already has a joint faculty appointment at WSU and is currently serving as a committee member for two students. For the proposed work, one additional Ph.D. graduate student from the WSU computer science department will perform thesis work on components of the proposed project, giving her hands-on access to unique supercomputing facilities.
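
The abstract's two quantitative points, the size of the pairwise training space and the linear cost of SVM classification, can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the element-wise absolute difference used as a pair representation, and the weights are all hypothetical choices made for illustration. For 4000 proteins there are 7,998,000 distinct unordered pairs, or 8,002,000 if each protein is also paired with itself, which is consistent with the abstract's figure of roughly 8 million.

```python
from math import comb


def unordered_pairs(n: int, include_self: bool = False) -> int:
    """Count unordered protein pairs among n sequences."""
    return comb(n, 2) + (n if include_self else 0)


def pair_features(a, b):
    """Hypothetical symmetric pair representation (an assumption, not the
    project's method): element-wise absolute difference of feature vectors."""
    return [abs(x - y) for x, y in zip(a, b)]


def linear_score(weights, bias, features):
    """Linear SVM decision value w . x + b; one multiply-add per feature."""
    return sum(w * f for w, f in zip(weights, features)) + bias


n = 4000
print(unordered_pairs(n))                     # distinct pairs only
print(unordered_pairs(n, include_self=True))  # distinct pairs plus self-pairs

# Toy two-feature example with made-up weights; a positive score would be
# read as "homologous" under this sketch.
x = pair_features([1.0, 3.0], [2.0, 1.0])
print(linear_score([0.5, -1.0], 0.25, x))
```

Scoring one pair costs a single pass over its feature vector, which is the sense in which classification is linear; the expense lies in training over millions of pairs.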

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Bobbie-Jo Webb-Robertson

Identification of a specific exporter that enables high production of aconitic acid in Aspergillus pseudoterreus
  • DOI:
    10.1016/j.ymben.2023.09.011
  • Published:
    2023-11-01
  • Journal:
  • Impact factor:
    6.800
  • Authors:
    Shuang Deng;Joonhoon Kim;Kyle R. Pomraning;Yuqian Gao;James E. Evans;Beth A. Hofstad;Ziyu Dai;Bobbie-Jo Webb-Robertson;Samantha M. Powell;Irina V. Novikova;Nathalie Munoz;Young-Mo Kim;Marie Swita;Ana L. Robles;Teresa Lemmon;Rylan D. Duong;Carrie Nicora;Kristin E. Burnum-Johnson;Jon Magnuson
  • Corresponding author:
    Jon Magnuson
Fusion of laboratory and textual data for investigative bioforensics
  • DOI:
    10.1016/j.forsciint.2012.12.016
  • Published:
    2013-03-10
  • Journal:
  • Impact factor:
  • Authors:
    Bobbie-Jo Webb-Robertson;Courtney Corley;Lee Ann McCue;Karen Wahl;Helen Kreuzer
  • Corresponding author:
    Helen Kreuzer
Changes in Placental Lipidomics with Obesity and Gestational Diabetes: Sexual Dimorphism
  • DOI:
    10.1016/j.placenta.2019.06.145
  • Published:
    2019-08-01
  • Journal:
  • Impact factor:
  • Authors:
    Leslie Myatt;Eric Wang;Kelly Stratton;Lisa Bramer;Bobbie-Jo Webb-Robertson;Jennifer Kyle;Kristin Burnum-Johnson
  • Corresponding author:
    Kristin Burnum-Johnson
Placental Proteomics Reveals Sexually Dimorphic Adaptive Changes to Maternal Obesity and Gestational Diabetes
  • DOI:
    10.1016/j.placenta.2021.07.033
  • Published:
    2021-09-01
  • Journal:
  • Impact factor:
  • Authors:
    Leslie Myatt;Eric Wang;Lisa Bramer;Kelly Stratton;Sarah Jarman;Bobbie-Jo Webb-Robertson;Yuqian Gao;Carrie Nicora;Ronald Moore;Jennifer Kyle;Kristin Burnum-Johnson
  • Corresponding author:
    Kristin Burnum-Johnson

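Similar NSFC Grants

Functional data analysis methods for high-dimensional data
  • Award number:
    11001084
  • Year approved:
    2010
  • Funding amount:
    CNY 160,000
  • Project type:
    Young Scientists Fund
Multistage, haplotype and functional tests-based study of the association between the FCAR gene and IgA nephropathy
  • Award number:
    30771013
  • Year approved:
    2007
  • Funding amount:
    CNY 300,000
  • Project type:
    General Program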

Similar Overseas Grants

Development of new INVAR functional materials by clarification of local structure-physical property correlation
  • Award number:
    23KK0088
  • Fiscal year:
    2023
  • Funding amount:
    $200,000
  • Project type:
    Fund for the Promotion of Joint International Research (International Collaborative Research)
Property prediction and functional design of quantified non-integer dimensional nanoscale materials
  • Award number:
    23K04544
  • Fiscal year:
    2023
  • Funding amount:
    $200,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Establishment of thermal property evaluation and performance design method for multi-functional panels utilizing heat insulation and heat storage by wood
  • Award number:
    23K04145
  • Fiscal year:
    2023
  • Funding amount:
    $200,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Functional Property Correlations in Reduced Dimensions
  • Award number:
    RGPAS-2020-00048
  • Fiscal year:
    2022
  • Funding amount:
    $200,000
  • Project type:
    Discovery Grants Program - Accelerator Supplements
Structure-functional property-device performance relationships in electronic and energy materials and devices
  • Award number:
    RGPIN-2022-04640
  • Fiscal year:
    2022
  • Funding amount:
    $200,000
  • Project type:
    Discovery Grants Program - Individual
Functional Property Correlations in Reduced Dimensions
  • Award number:
    RGPIN-2020-04958
  • Fiscal year:
    2022
  • Funding amount:
    $200,000
  • Project type:
    Discovery Grants Program - Individual
Elucidation of photo-functional property of oxyhydroxide and application development to environmental purification and energy generation
  • Award number:
    21H01657
  • Fiscal year:
    2021
  • Funding amount:
    $200,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Functional Property Correlations in Reduced Dimensions
  • Award number:
    RGPIN-2020-04958
  • Fiscal year:
    2021
  • Funding amount:
    $200,000
  • Project type:
    Discovery Grants Program - Individual
Functional Property Correlations in Reduced Dimensions
  • Award number:
    RGPAS-2020-00048
  • Fiscal year:
    2021
  • Funding amount:
    $200,000
  • Project type:
    Discovery Grants Program - Accelerator Supplements
MRI: Acquisition of a Physical Property Measurement System to Study Quantum, Magnetic and Functional Materials and Quantum Devices
  • Award number:
    2018579
  • Fiscal year:
    2020
  • Funding amount:
    $200,000
  • Project type:
    Standard Grant