ProteomeHarvest - Excel/XML Bridge for User-friendly Proteomics Data Collection

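Basic information

  • Grant number:
    BB/E00573X/1
  • Principal investigator:
  • Amount:
    $64,000
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2006
  • Funding country:
    United Kingdom
  • Duration:
    2006 to (no data)
  • Project status:
    Completed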

Project abstract

Today, scientific experiments in molecular biology in general, and in proteomics in particular, are often done on a large scale, producing large numbers of individual data items. These large data sets then form the basis of scientific publications. Often only relatively few results actually contribute to the final conclusions reached by the researcher, but the complete result sets can provide valuable knowledge to other researchers comparing them with their own results. However, for others to understand how the experiments were done, the experiments need to be described in a very detailed manner. To avoid 'comparing apples and pears', this description needs to be done systematically, using established rules or standards on how to describe experiments. In addition, the data needs to be easily accessible to other researchers, which can best be achieved by entering it into large databases accessible over the internet. Overall, a lot of effort is needed to describe a large experiment in the detailed, standardised manner that allows others to understand it.
In other projects, we are working on setting up common rules for the description of proteomics experiments. However, even the best rules are useless if they are not applied. As scientists, like everybody else, tend to do only the minimum amount of work needed to achieve their goals, their experiment descriptions are often incomplete, focussing only on the aspects they consider relevant. They also tend to tire quickly of properly entering data into databases if they have to install complicated, unfamiliar tools on their computers. On the other hand, there are programs they know well, because they use them almost every day to manage their data. The main purpose of this proposal is to use one such tool, Microsoft Excel, to develop forms which allow scientists to enter their results into a database as easily as possible.
Biologists are used to Excel: they are familiar with its functionality, and they nearly always have it installed on their computers. We plan to develop Excel forms which are as user-friendly as possible but still capture all the data necessary to describe the results of a large experiment appropriately, according to established rules and standards. While Excel is often used to store experiment results, this is often done in a very unsystematic manner, and it is usually very difficult to transfer the data into XML, a file format which is nowadays practically the standard way of entering data into databases. Also, so far it has been difficult to use and regularly update controlled vocabularies in Excel. Controlled vocabularies are lists of the words that may be entered in a specific field of a form; they avoid typing errors and ensure that everybody uses the same word for the same thing.
In this project, we propose to develop advanced Excel forms for proteomics data harvesting. These forms should provide researchers with an easy tool to store their data in a systematic manner, ready to be sent to a database. The forms will be able to communicate with a database on the internet to obtain up-to-date controlled vocabularies, and they will be able to send the data directly, in the form of XML, to a database on the internet. We will develop and test these forms for the existing PRIDE proteomics database, making use of the existing database for data storage and using OLS, the Ontology Lookup Service developed as part of PRIDE, to keep the controlled vocabularies in the Excel forms up to date. By providing Excel forms as a user-friendly way to store proteomics data and send it to public databases, we hope to convince researchers to invest a little extra effort in making their valuable data accessible to their colleagues, and thus to maximise the use of data that has already been paid for by the taxpayer.
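
As a rough illustration of the two ideas in the abstract (checking form entries against a controlled vocabulary, then turning the rows into XML), a minimal Python sketch might look like the following. Everything in it is an assumption made for illustration: the field names, the vocabulary terms, the sample values and the XML layout are hypothetical, and it does not reproduce the PRIDE XML schema, the OLS interface or the Excel forms the project proposes.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary: the only values allowed in each field.
# A real form would refresh these lists from an online service.
CONTROLLED_VOCABULARY = {
    "instrument": {"LTQ Orbitrap", "Q-TOF"},
    "species": {"Homo sapiens", "Mus musculus"},
}


def validate_row(row):
    """Return error messages for values that are not in the vocabulary."""
    errors = []
    for field, allowed in CONTROLLED_VOCABULARY.items():
        value = row.get(field, "")
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not a permitted term")
    return errors


def rows_to_xml(rows):
    """Serialise validated rows into a simple, made-up XML layout."""
    root = ET.Element("experiment")
    for index, row in enumerate(rows, start=1):
        errors = validate_row(row)
        if errors:
            raise ValueError(f"row {index}: " + "; ".join(errors))
        sample = ET.SubElement(root, "sample", id=str(index))
        for field, value in row.items():
            ET.SubElement(sample, field).text = value
    return ET.tostring(root, encoding="unicode")


# One spreadsheet row, represented here as a dictionary of column -> value.
rows = [
    {"species": "Homo sapiens", "instrument": "Q-TOF", "protein": "P12345"},
]
xml_text = rows_to_xml(rows)
print(xml_text)
```

Rejecting a row before any XML is written mirrors the purpose of a controlled vocabulary described above: a misspelt or non-standard term is caught at entry time rather than after submission to a database.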

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Rolf Apweiler

In Silico Characterization of Proteins: UniProt, InterPro and Integr8
  • DOI:
    10.1007/s12033-007-9003-x
  • Published:
    2007-10-04
  • Journal:
  • Impact factor:
    2.500
  • Authors:
    Nicola Jane Mulder;Paul Kersey;Manuela Pruess;Rolf Apweiler
  • Corresponding author:
    Rolf Apweiler
Linking publication, gene and protein data
  • DOI:
    10.1038/ncb1495
  • Published:
    2006-11-01
  • Journal:
  • Impact factor:
    19.100
  • Authors:
    Paul Kersey;Rolf Apweiler
  • Corresponding author:
    Rolf Apweiler
Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions
  • DOI:
    10.1186/1741-7007-5-44
  • Published:
    2007-10-09
  • Journal:
  • Impact factor:
    4.500
  • Authors:
    Samuel Kerrien;Sandra Orchard;Luisa Montecchi-Palazzi;Bruno Aranda;Antony F Quinn;Nisha Vinod;Gary D Bader;Ioannis Xenarios;Jérôme Wojcik;David Sherman;Mike Tyers;John J Salama;Susan Moore;Arnaud Ceol;Andrew Chatr-aryamontri;Matthias Oesterheld;Volker Stümpflen;Lukasz Salwinski;Jason Nerothin;Ethan Cerami;Michael E Cusick;Marc Vidal;Michael Gilson;John Armstrong;Peter Woollard;Christopher Hogue;David Eisenberg;Gianni Cesareni;Rolf Apweiler;Henning Hermjakob
  • Corresponding author:
    Henning Hermjakob
Whither systems medicine?
  • DOI:
    10.1038/emm.2017.290
  • Published:
    2018-03-02
  • Journal:
  • Impact factor:
    12.900
  • Authors:
    Rolf Apweiler;Tim Beissbarth;Michael R Berthold;Nils Blüthgen;Yvonne Burmeister;Olaf Dammann;Andreas Deutsch;Friedrich Feuerhake;Andre Franke;Jan Hasenauer;Steve Hoffmann;Thomas Höfer;Peter LM Jansen;Lars Kaderali;Ursula Klingmüller;Ina Koch;Oliver Kohlbacher;Lars Kuepfer;Frank Lammert;Dieter Maier;Nico Pfeifer;Nicole Radde;Markus Rehm;Ingo Roeder;Julio Saez-Rodriguez;Ulrich Sax;Bernd Schmeck;Andreas Schuppert;Bernd Seilheimer;Fabian J Theis;Julio Vera;Olaf Wolkenhauer
  • Corresponding author:
    Olaf Wolkenhauer

Other grants by Rolf Apweiler

ARGENT: ARgentinian GEnomics for Tuberculosis
  • Grant number:
    EP/T015446/1
  • Fiscal year:
    2019
  • Amount:
    $64,000
  • Project category:
    Research Grant
Database on demand - creating customized sequence databases for efficient protein identification
  • Grant number:
    BB/F016255/1
  • Fiscal year:
    2008
  • Amount:
    $64,000
  • Project category:
    Research Grant
Embracing new technologies to streamline improve and sustain InterPro and its contributing databases
  • Grant number:
    BB/F010508/1
  • Fiscal year:
    2008
  • Amount:
    $64,000
  • Project category:
    Research Grant
Further development of the QuickGO web interface for browsing and retrieving Gene Ontology Annotation data
  • Grant number:
    BB/E023541/1
  • Fiscal year:
    2007
  • Amount:
    $64,000
  • Project category:
    Research Grant

Similar overseas grants

Colorado Preparation in Interdisciplinary Knowledge to Excel PREP
  • Grant number:
    10344884
  • Fiscal year:
    2022
  • Amount:
    $64,000
  • Project category:
Supporting Undergraduates from Community College to Excel and Succeed in STEM
  • Grant number:
    2130435
  • Fiscal year:
    2022
  • Amount:
    $64,000
  • Project category:
    Standard Grant
Colorado Preparation in Interdisciplinary Knowledge to Excel PREP
  • Grant number:
    10559519
  • Fiscal year:
    2022
  • Amount:
    $64,000
  • Project category:
GLYCOTwinning: Building Networks to Excel in Glycosciences
  • Grant number:
    10052578
  • Fiscal year:
    2022
  • Amount:
    $64,000
  • Project category:
    EU-Funded
Intervention to Help Orient Men to Excel (IN-HOME): A culturally appropriate CHW training program to reduce minority caregiver burden
  • Grant number:
    10600554
  • Fiscal year:
    2022
  • Amount:
    $64,000
  • Project category:
Computer Science Indigenous Community of Learners United to Develop, Excel, and Succeed
  • Grant number:
    2130371
  • Fiscal year:
    2021
  • Amount:
    $64,000
  • Project category:
    Standard Grant
The EXCEL Project: A collaborative approach to improve outcomes of Australian patients with acute heart failure and cardiac arrest requiring extracorporeal life support
  • Grant number:
    nhmrc : GNT1152793
  • Fiscal year:
    2018
  • Amount:
    $64,000
  • Project category:
    Partnerships
PFI:AIR - TT: The RULE project: Read Understand Learn & Excel
  • Grant number:
    1640492
  • Fiscal year:
    2016
  • Amount:
    $64,000
  • Project category:
    Standard Grant
E2CDA: Type I: EXtremely Energy Efficient Collective ELectronics (EXCEL)
  • Grant number:
    1640081
  • Fiscal year:
    2016
  • Amount:
    $64,000
  • Project category:
    Continuing Grant
Food Excel Workshop Series (Holland College, Canada's Smartest Kitchen)
  • Grant number:
    503196-2016
  • Fiscal year:
    2016
  • Amount:
    $64,000
  • Project category:
    Connect Grants Level 2