Embracing new technologies to streamline, improve and sustain InterPro and its contributing databases
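Basic information
- Grant number: BB/F010435/1
- Principal investigator:
- Amount: $391,600
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2008
- Funding country: United Kingdom
- Period: 2008 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract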
New DNA sequencing technologies have led to a flood of new data being submitted to sequence databases by individual scientists, genome sequencing projects and metagenomics projects. These sequences enter the databases with little or no annotation, limiting their usefulness to the scientific community. This has inspired the development of new tools for automatic annotation of the encoded protein sequences. One of the most successful developments in this area has been the production of so-called protein 'signatures': diagnostic methods that characterise newly determined sequences in terms of the protein families to which they belong and/or the structural or functional domains they contain. Protein signature approaches have been adopted by a number of databases, and ten of the top such resources are integrated into the InterPro database. InterPro and its accompanying protein analysis software tool, InterProScan, are now among the leading protein functional classification resources in the world. However, despite this success, InterPro and its partners are currently suffering from a lack of financial support. The level of funding required to maintain and improve a database of this size is often underestimated. The amount of incoming data is increasing exponentially, and databases now struggle to provide their data to the public in a timely way while maintaining the necessary high standards of data quality. Moreover, as they become more popular and user demands increase, these core databases face mounting pressure not only to keep up with the expanding volume of data and growing community requirements, but also to be early adopters of newly emerging technologies. This proposal aims to resolve these issues by embracing new technologies to enhance and further develop InterPro and its source databases. It aims to streamline production processes both to provide more regular data releases and to cope better with increased volumes of data.
With more formalised Consortium activities and better coordination of them, we will make more efficient use of resources and share tasks to ensure the long-term sustainability of the databases. Specifically, we aim to:
- Streamline data production procedures to enable a faster turn-around time for releasing the data;
- Develop and integrate new annotation tools and standards to make the rate-limiting annotation step quicker and easier, and share tasks, such as annotation, to remove redundancy in effort;
- Work closely together to improve quality-assurance procedures for protein matches;
- Coordinate the upgrade of InterProScan and other HMM-based databases to the latest HMMER version;
- Improve the InterProScan protein domain-finding software;
- Exploit new technologies for database linking and data exchange; and
- Extend the functionality of the Web interface to better meet the needs of the user community.
The planned improvements to InterProScan and the protein match procedures will improve both the quality and the speed of protein functional classification; streamlining the production processes will enable the databases to release new protein domains and families to the public as soon as they become available. New technologies will make it easier to link different databases and will give the public access to data from different sources. They will also open the door to more complex analyses by providing improved programmatic access to the data. In addition, these new processes and technologies will allow InterPro and its member databases to cope with the ever-increasing flood of new data and make it accessible to the public in more regular releases. Ultimately, these improvements will make InterPro and its partners easier and more efficient to maintain, paving the way to a more sustainable future and increasing their benefit and usefulness to the scientific community.
Project outputs
Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
InterPro: the integrative protein signature database.
- DOI: 10.1093/nar/gkn785
- Published: 2009-01
- Journal:
- Impact factor: 14.9
- Authors: Hunter S; Apweiler R; Attwood TK; Bairoch A; Bateman A; Binns D; Bork P; Das U; Daugherty L; Duquenne L; Finn RD; Gough J; Haft D; Hulo N; Kahn D; Kelly E; Laugraud A; Letunic I; Lonsdale D; Lopez R; Madera M; Maslen J; McAnulla C; McDowall J; Mistry J; Mitchell A; Mulder N; Natale D; Orengo C; Quinn AF; Selengut JD; Sigrist CJ; Thimma M; Thomas PD; Valentin F; Wilson D; Wu CH; Yeats C
- Corresponding author: Yeats C
InterPro in 2011: new developments in the family and domain prediction database.
- DOI: 10.1093/nar/gkr948
- Published: 2012-01
- Journal:
- Impact factor: 14.9
- Authors: Hunter S; Jones P; Mitchell A; Apweiler R; Attwood TK; Bateman A; Bernard T; Binns D; Bork P; Burge S; de Castro E; Coggill P; Corbett M; Das U; Daugherty L; Duquenne L; Finn RD; Fraser M; Gough J; Haft D; Hulo N; Kahn D; Kelly E; Letunic I; Lonsdale D; Lopez R; Madera M; Maslen J; McAnulla C; McDowall J; McMenamin C; Mi H; Mutowo-Muellenet P; Mulder N; Natale D; Orengo C; Pesseat S; Punta M; Quinn AF; Rivoire C; Sangrador-Vegas A; Selengut JD; Sigrist CJ; Scheremetjew M; Tate J; Thimmajanarthanan M; Thomas PD; Wu CH; Yeats C; Yong SY
- Corresponding author: Yong SY
Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.
- DOI: 10.1093/nar/gkt263
- Published: 2013-07
- Journal:
- Impact factor: 14.9
- Authors: Mistry J; Finn RD; Eddy SR; Bateman A; Punta M
- Corresponding author: Punta M
Other publications by Alex Bateman
Bioinformatics Applications Note Databases and Ontologies Codex: Exploration of Semantic Changes between Ontology Versions
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Michael Hartung; Anika Groß; E. Rahm; Alex Bateman
- Corresponding author: Alex Bateman
Bioinformatics Advance Access published May 31, 2007
- DOI: 10.1007/s10015-009-0735-5
- Published: 2007
- Journal:
- Impact factor: 0.9
- Authors: Alex Bateman
- Corresponding author: Alex Bateman
Other grants by Alex Bateman
Improving accuracy, coverage, and sustainability of functional protein annotation in InterPro, Pfam and FunFam using Deep Learning methods
- Grant number: BB/X018660/1
- Fiscal year: 2024
- Funding amount: $391,600
- Project category: Research Grant
UKRI/BBSRC-NSF/BIO: Unifying Pfam protein sequence and ECOD structural classifications with structure models
- Grant number: BB/X012492/1
- Fiscal year: 2023
- Funding amount: $391,600
- Project category: Research Grant
Exploiting data driven computational approaches for understanding protein structure and function in InterPro and Pfam
- Grant number: BB/S020381/1
- Fiscal year: 2019
- Funding amount: $391,600
- Project category: Research Grant
Rfam: The community resource for RNA families
- Grant number: BB/S020462/1
- Fiscal year: 2019
- Funding amount: $391,600
- Project category: Research Grant
RNAcentral, the RNA sequence database
- Grant number: BB/N019199/1
- Fiscal year: 2017
- Funding amount: $391,600
- Project category: Research Grant
Rfam: Towards a sustainable resource for understanding the genomic functional ncRNA repertoire
- Grant number: BB/M011690/1
- Fiscal year: 2015
- Funding amount: $391,600
- Project category: Research Grant
Keeping pace with protein sequence annotation; consolidating and enhancing Pfam and InterPro's methodologies for functional prediction
- Grant number: BB/L024136/1
- Fiscal year: 2014
- Funding amount: $391,600
- Project category: Research Grant
The RNAcentral database of non-coding RNAs
- Grant number: BB/J019232/1
- Fiscal year: 2012
- Funding amount: $391,600
- Project category: Research Grant
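Similar National Natural Science Foundation of China (NSFC) grants
Circuits of newly identified spinal SNAPR neurons mediate the suppression of intractable itch by SCS electrical stimulation
- Grant number: 82371478
- Approval year: 2023
- Funding amount: CNY 480,000
- Project category: General Program
Phenomenological study of tau lepton decays and new physics models
- Grant number: 11005033
- Approval year: 2010
- Funding amount: CNY 180,000
- Project category: Young Scientists Fund
Validation of, and efficient intervention at, a new target in the NHR region of HIV gp41
- Grant number: 81072676
- Approval year: 2010
- Funding amount: CNY 330,000
- Project category: General Program
Study of multi-lepton final states of new physics signals at hadron colliders
- Grant number: 10675110
- Approval year: 2006
- Funding amount: CNY 360,000
- Project category: General Program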
Similar overseas grants
New, easy to use, low-cost technologies based on DNA origami biosensing to achieve distributed screening for AMR and improved antibiotic prescribing
- Grant number: MR/Y034481/1
- Fiscal year: 2024
- Funding amount: $391,600
- Project category: Research Grant
Piecing together the Neutrino Mass Puzzle in Search of New Particles with Precision Oscillation Experiments and Quantum Technologies
- Grant number: ST/W003880/2
- Fiscal year: 2024
- Funding amount: $391,600
- Project category: Fellowship
NEW TECHNOLOGIES FOR AFRICAN SWINE FEVER VACCINES
- Grant number: 10091291
- Fiscal year: 2024
- Funding amount: $391,600
- Project category: EU-Funded
PHOtolysis Reaction Mechanisms by Emerging and New Technologies - PhoRMENT
- Grant number: 2885177
- Fiscal year: 2023
- Funding amount: $391,600
- Project category: Studentship
Elucidating the critical role of Wee1 in GIST
- Grant number: 10681775
- Fiscal year: 2023
- Funding amount: $391,600
- Project category:
Northwestern University O'Brien Kidney National Resource Center
- Grant number: 10754080
- Fiscal year: 2023
- Funding amount: $391,600
- Project category:
Technologies for High-Throughput Mapping of Antigen Specificity to B-Cell-Receptor Sequence
- Grant number: 10734412
- Fiscal year: 2023
- Funding amount: $391,600
- Project category:
Bioethical, Legal, and Anthropological Study of Technologies (BLAST)
- Grant number: 10831226
- Fiscal year: 2023
- Funding amount: $391,600
- Project category:
Soft robotic sensor arrays for fast and efficient mapping of cardiac arrhythmias.
- Grant number: 10760164
- Fiscal year: 2023
- Funding amount: $391,600
- Project category: