Construction of a neural network for detecting novel domains from amino acid sequence information only
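Basic information
- Grant number: 16500189
- Principal investigator:
- Amount: $21,100
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (C)
- Fiscal year: 2004
- Funding country: Japan
- Period: 2004 to 2005
- Project status: Completed
- Source:
- Keywords:
Project abstract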
High-throughput structure determination often requires dissecting multi-domain proteins into structural domains that fold independently and whose structures are readily determinable. Domains are usually identified by sequence similarity to entries in domain databases such as Pfam, Cd or SMART. However, other methods are required for detecting novel domains. Such methods need to predict domain regions solely from the information contained in the amino acid sequence of the protein of interest. In this project, we report a neural network, and preliminary results for an SVM (Support Vector Machine), that identify domain boundaries in multi-domain proteins:
1) Domain linker sequences. We constructed a multi-domain protein database based on SCOP and CATH domain boundary definitions. First, we selected domains that do not form inter-domain interactions, where an interaction is defined by the presence of inter-domain VdW contacts, H-bonds or SS-bonds, and that are independently foldable. From this set, we further selected domain boundaries that form loops as defined by DSSP. Domain boundaries that fulfilled both conditions were called linkers and were used for training and testing the neural network and the SVM.
2) We developed a neural network that recognizes domain linkers. Cross-validation indicates that the prediction efficiency of our neural network is 50-60%, compared with about 10% for a random guess.
3) In addition to the above neural network, we developed a domain linker predictor based on SVMlight. We observed prediction efficiencies similar to those of the neural network, but the training time was one fifth of that needed for the neural network.
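The two selection conditions for linkers described in the abstract (no inter-domain interactions, and a boundary that forms a loop according to DSSP) can be illustrated with a minimal Python sketch. This is not the project's code: the `Boundary` record, its field names, and the set of DSSP codes treated as "loop" are all assumptions made for illustration, and the abstract does not say how contacts were counted or which DSSP states were accepted.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    """Hypothetical record for one candidate domain boundary."""
    protein_id: str            # illustrative identifier, not a real entry
    interdomain_contacts: int  # inter-domain VdW contacts + H-bonds + SS-bonds
    dssp: str                  # DSSP states of the boundary residues


# Assumption: turn (T), bend (S) and unassigned states (' ', '-', 'C')
# count as loop; the abstract does not specify the exact definition.
LOOP_CODES = frozenset("TS -C")


def is_loop(dssp: str) -> bool:
    """True if every residue of the segment is in a loop state."""
    return len(dssp) > 0 and all(code in LOOP_CODES for code in dssp)


def select_linkers(boundaries):
    """Keep boundaries that satisfy both conditions from the abstract."""
    return [
        b for b in boundaries
        if b.interdomain_contacts == 0 and is_loop(b.dssp)
    ]


candidates = [
    Boundary("example_a", 0, "TTSS--TT"),   # no contacts, all loop
    Boundary("example_b", 3, "TTSS--TT"),   # loop, but domains interact
    Boundary("example_c", 0, "HHHHTTEE"),   # no contacts, but not a loop
]
linkers = select_linkers(candidates)
```

In this toy input only `example_a` passes both filters; segments selected this way would then serve as the positive examples for training and testing a classifier.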
Project outcomes
Journal articles (22)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
生物物理ハンドブック(石渡信一、桂勲、桐野豊、美宅成樹 編)
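Biophysics Handbook (edited by Ishiwata Shin'ichi, Katsura Isao, Kirino Yutaka and Mitaku Shigeki)
- DOI:
- Publication date: 2006
- Journal:
- Impact factor: 0
- Authors: Chikayama E.; Kurotani A.; Kuroda Y.; Yokoyama S.; 黒田裕
- Corresponding author: 黒田裕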
Bioinformatics Handbook (Edited by Japanese Bioinformatics Society) (Editor Miyuano S. et al.)
- DOI:
- Publication date:
- Journal:
- Impact factor: 0
- Authors: Honda K.; Mori Y.; Yamamoto Y.; Yadohisa H.; Y. Takada; Y. Kuroda
- Corresponding author: Y. Kuroda
Improvement of domain linker prediction by incorporating loop-length-dependent characteristics
- DOI: 10.1002/bip.20361
- Publication date: 2006-01-01
- Journal:
- Impact factor: 2.9
- Authors: Tanaka, T; Yokoyama, S; Kuroda, Y
- Corresponding author: Kuroda, Y
ProteoMix : an integrated and flexible system for interactively analyzing large numbers of protein sequences, Bioinformatic
- DOI:
- Publication date: 2004
- Journal:
- Impact factor: 0
- Authors: Chikayama E.; Kurotani A.; Kuroda Y.; Yokoyama S.
- Corresponding author: Yokoyama S.
Other grants by KURODA Yutaka
Novel screening protocol for multi-SS bond proteins using SEP tags and its application to the development of a minimal Luciferase
- Grant number: 23651213
- Fiscal year: 2011
- Funding amount: $21,100
- Project category: Grant-in-Aid for Challenging Exploratory Research
Development of a novel amino acid solubility propensity scale for the calculation of polypeptide solubility
- Grant number: 21300110
- Fiscal year: 2009
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (B)
Construction of a support vector machine (SVM) for accurately detecting domain boundaries
- Grant number: 18500225
- Fiscal year: 2006
- Funding amount: $21,100
- Project category: Grant-in-Aid for Scientific Research (C)