BBSRC-NSF/BIO - Expanding fold library in the twilight zone to facilitate structure determination of macromolecular machines

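Basic information

Grant number:
BB/S017135/1
Principal investigator:
Sameer Velankar
Amount:
$430,000
Host institution:
European Bioinformatics Institute
Host institution country:
United Kingdom
Project category:
Research Grant
Fiscal year:
2019
Funding country:
United Kingdom
Start and end dates:
2019 to (no data)
Project status:
Completed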

Source:
https://gtr.ukri.org/projects?ref=BB%2FS017135%2F1
Keywords:
BBSRC NSF BIO Expanding fold

Project abstract

The Protein Data Bank (PDB) is the single global archive of three-dimensional (3D) structures of large biological molecules. PDBe (pdbe.org) is the European partner in the global consortium managing the PDB. The PDB is one of the oldest biological archives, with 144,000+ entries and nearly 2 million downloads daily by users worldwide in academic or industry settings, working on topics ranging from food security and human health through to the design of more efficient enzymes for various areas of biotechnology. Despite a steady increase in its holdings (13,000+ entries added in 2017), the growth of the PDB is far outstripped by the growth in the available protein sequence data. Resources like Genome3D (genome3d.eu), funded by the BBSRC, aim to fill the gap in structure coverage of the protein sequence space with reliable predictions of structures. This resource combines data from a number of UK and overseas groups who apply complementary methods for protein structure prediction. These approaches largely model proteins that are closely related to a protein of known structure (i.e. the protein relatives share more than 30% identical residues in their sequences). The Rosetta method for predicting protein structures, a world-leading approach developed by the Baker lab in the USA, was recently enhanced with information derived from evolutionary analyses of protein sequence data, yielding reliable models even for cases where sequence identity between the model and the available experimental structures is very low (below 30%). We will integrate Rosetta models into Genome3D to expand the coverage of structural data for organisms important for health (e.g. human) and food security (e.g. wheat).
This project will also enrich both the experimentally determined and computationally predicted structures with valuable functional annotations, such as information pertaining to surface interfaces, a key ingredient in understanding how proteins interact with each other and with other biological molecules. By focussing on proteins dissimilar to those with known structures, this portal will help fill the gaps in structure coverage of the protein sequence space and will make structure data much more readily available and accessible. Finally, novel visualisation tools integrating the presentation of the predicted and experimentally determined structures will be developed, maintaining a clear distinction between what is predicted and what is experimentally determined. The expanded set of 3D models derived from this project will in turn help to expand the coverage of sequence space even further, since these models can be used to guide the experimental determination of protein structures by powerful new structural biology techniques such as cryo-electron microscopy (cryo-EM). This project will also endeavour, where possible, to improve the assembly of individual protein structures into macromolecular complexes, which can be analysed to determine their biological role. We anticipate that scientists in both academic and industrial sectors (e.g. pharmaceutical companies) will benefit from access to such an integrated portal, assisting them in designing new medicines, understanding the mechanisms of disease, or designing proteins with novel properties. The recent "resolution revolution" in electron microscopy allows near-routine determination of the structures of large molecular machines, but interpreting the experimental results requires a large repertoire of "building blocks", a need which will be partially addressed by the new portal and its provision of expanded domain structure libraries.
The portal will also provide programmatic access to the assembled data, benefiting power users such as software developers and maintainers of other resources.

Project outputs

Journal articles (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Characterizing and explaining impact of disease-associated mutations in proteins without known structures or structural homologues

DOI:
10.1101/2021.11.17.468998
Publication date:
2021-11
Journal:
bioRxiv
Impact factor:
0
Authors:
Neeladri Sen;I. Anishchenko;N. Bordin;I. Sillitoe;S. Velankar;D. Baker;C. Orengo
Corresponding authors:
Neeladri Sen;I. Anishchenko;N. Bordin;I. Sillitoe;S. Velankar;D. Baker;C. Orengo

PDBe and PDBe-KB: Providing high-quality, up-to-date and integrated resources of macromolecular structures to support basic and applied research and education.

DOI:
10.1002/pro.4439
Publication date:
2022-10
Journal:
PROTEIN SCIENCE
Impact factor:
8
Authors:
Varadi, Mihaly;Anyango, Stephen;Appasamy, Sri Devan;Armstrong, David;Bage, Marcus;Berrisford, John;Choudhary, Preeti;Bertoni, Damian;Deshpande, Mandar;Leines, Grisell Diaz;Ellaway, Joseph;Evans, Genevieve;Gaborova, Romana;Gupta, Deepti;Gutmanas, Aleksandras;Harrus, Deborah;Kleywegt, Gerard J.;Bueno, Weslley Morellato;Nadzirin, Nurul;Nair, Sreenath;Pravda, Lukas;Afonso, Marcelo Querino Lima;Sehnal, David;Tanweer, Ahsan;Tolchard, James;Abrams, Charlotte;Dunlop, Roisin;Velankar, Sameer
Corresponding author:
Velankar, Sameer

Other publications by Sameer Velankar

Interactive 3D Macromolecular Structure Data Mining with MolQL and Litemol Suite

DOI:
10.1016/j.bpj.2017.11.308
Publication date:
2018-02-02
Journal:
Conference abstract
Impact factor:
Authors:
David Sehnal;Mandar Deshpande;Alexander Rose;Lukas Pravda;Adam Midlik;Radka Svobodová Vařeková;Saqib Mir;Karel Berka;Sameer Velankar;Jaroslav Koca
Corresponding author:
Jaroslav Koca