SOFTWARE FOR MASS SPECTRAL DATA CONVERSION / AUTOMATIC PROTEOMIC ANALYSIS

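Basic information

  • Grant number:
    8170864
  • Principal investigator:
  • Amount:
    $5,900
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2010
  • Funding country:
    United States
  • Project period:
    2010-06-01 to 2011-05-31
  • Project status:
    Completed

Project abstract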

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. With a multitude of MS instruments and data analysis software platforms available for MS and proteomics, manipulating and managing the resulting data sets becomes difficult. We have created a software application that converts processed MS data files obtained on a variety of instruments into several common formats accepted by different software applications. We have further developed the program to support the mzXML format (Pedrioli et al., 2004) and to incorporate a front-end interface that can be linked to several web-based database search engines, including Mascot, ProteinProspector and BUPID (a peptide mass fingerprinting program based on a log-likelihood ratio model, developed here at BUSM) (Tong et al., 2005). The data processing software was developed using Microsoft Visual Basic 6.0. To add support for the mzXML format, we used MSXML 4.0 as the XML parser and built a Visual C++ library to decode the Base64-encoded peak list data in the mzXML file. Other supported data formats are intermediate files converted from raw data files using the manufacturers' software: LC MS/MS data are processed with Analyst QS (ABI/Sciex) or MassLynx/PLGS2.1 (Waters), MALDI MS data with MOverZ (Proteometrics LLC), and FTMS data with BUDA (O'Connor, http://www.bumc.bu.edu/FTMS). The BUPID program was developed in C under Linux and made accessible to the main program through a CGI-based web interface. The shell data conversion program provides a user-friendly GUI and can also be operated in an unattended batch processing mode.
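To illustrate the Base64 decoding step mentioned above, here is a minimal Python sketch. It is not the project's Visual C++ library. It assumes the common mzXML layout in which the peak list is stored as big-endian (network byte order) floating-point values, alternating m/z and intensity, at 32- or 64-bit precision. The function names `decode_mzxml_peaks` and `encode_mzxml_peaks` are hypothetical, and compressed peak lists are not handled.

```python
import base64
import struct


def _format_char(precision: int) -> str:
    """Map an mzXML precision value (32 or 64) to a struct format character."""
    if precision == 32:
        return "f"
    if precision == 64:
        return "d"
    raise ValueError("precision must be 32 or 64")


def decode_mzxml_peaks(encoded: str, precision: int = 32):
    """Decode a Base64 peak string into a list of (m/z, intensity) pairs.

    Assumes big-endian floats stored as alternating m/z and intensity values.
    """
    raw = base64.b64decode(encoded)
    width = precision // 8
    if len(raw) % (2 * width) != 0:
        raise ValueError("peak data is not a whole number of (m/z, intensity) pairs")
    count = len(raw) // width
    values = struct.unpack(f">{count}{_format_char(precision)}", raw)
    # Even positions hold m/z values, odd positions hold intensities.
    return list(zip(values[0::2], values[1::2]))


def encode_mzxml_peaks(peaks, precision: int = 32) -> str:
    """Inverse of decode_mzxml_peaks, useful for round-trip checks."""
    flat = [value for pair in peaks for value in pair]
    raw = struct.pack(f">{len(flat)}{_format_char(precision)}", *flat)
    return base64.b64encode(raw).decode("ascii")
```

For example, `decode_mzxml_peaks(encode_mzxml_peaks([(100.5, 2000.0)]))` returns the original pair, because both values are exactly representable as 32-bit floats. Values that are not exactly representable will come back rounded at 32-bit precision.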
Testing of the program was performed on existing MALDI-TOF MS, MALDI-FT MS and LC MS/MS data sets obtained in house. The program allowed large volumes of data obtained on different instruments to be converted to the formats of several commercially and publicly available search engines. Files were then submitted to the search engines for protein identification with the search settings specified by the user. Our implementation of the mzXML format introduced by the Institute for Systems Biology afforded the benefits of a common data format: summation of results obtained on different MS platforms, comparative analysis of MS methodology, and archiving of data. We added capabilities for interpretation of top-down tandem mass spectra of proteins (BUPID-top down) and for linking database assignments to the functionality of proteins (STRAP). These results were presented as posters at ASMS and other scientific meetings; the STRAP manuscript was published in early 2010.
The search algorithm Boston University Protein Identifier (BUPID) provides a robust and accurate statistical model for protein identification using MS data. The algorithm offers a number of important features: 1. With the log-likelihood ratio as its scoring function, the algorithm can best distinguish correctly assigned peptides from incorrect assignments. 2. Matching peaks with a background-dependent threshold offers more flexibility and accuracy than the traditional mass window. 3. The statistical model provides similar or better results compared with conventional database search engines. We use the log-likelihood ratio to calculate the probability that a protein is present in the sample. The model distinguishes two hypotheses: (1) H0, that a set of peaks in the spectrum is generated by the random background; and (2) HA, that the same set of peaks is generated by peptides corresponding to a specific protein.
A peak is included in the set if the probability that it is produced by the protein is greater than the probability that it is produced by the random background. Final results are ranked by the E-value of their probability score, using the sequence information of the protein. We have compared the performance of the BUPID server with that of several other public web-based database search engines. Peptide map data sets were obtained from existing ongoing projects. BUPID database search results had on average 27% more true positives in the top 10 predictions compared with MASCOT results. Within the top 100 predictions, BUPID showed 27% more true positives than MASCOT. In addition, BUPID was able to find all five human hemoglobin proteins within the top 20 in 6 cases, whereas MASCOT succeeded in one case. When using peaks with higher than 5% relative intensity, the differences are 28% and 24% within the top 10 and top 100 predictions, respectively. Recent efforts have been directed toward the next stage of the software, for interpretation of top-down tandem MS data from the LTQ-Orbitrap MS and ICR-FTMS instruments. A manuscript that includes the use of BUPID Top-down has been submitted for publication.
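The log-likelihood ratio scoring described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not BUPID's actual model: it assumes a Gaussian mass-error density for peaks produced by the protein (HA) and a constant density for random background peaks (H0). The names `llr_score`, `sigma` and `background_density` are hypothetical, and the E-value ranking step is omitted.

```python
import math


def gaussian_pdf(error: float, sigma: float) -> float:
    """Density of a zero-mean Gaussian mass error (assumed model for HA)."""
    return math.exp(-0.5 * (error / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def llr_score(observed_mz, theoretical_mz, sigma=0.05, background_density=0.5):
    """Return (score, matched_peaks) for one candidate protein.

    observed_mz: peak masses from the spectrum.
    theoretical_mz: predicted peptide masses for the candidate protein.
    sigma, background_density: illustrative parameters, not published values.
    """
    if not theoretical_mz:
        return 0.0, []
    score = 0.0
    matched = []
    for mz in observed_mz:
        nearest = min(theoretical_mz, key=lambda t: abs(t - mz))
        p_protein = gaussian_pdf(mz - nearest, sigma)  # likelihood under HA
        p_background = background_density              # likelihood under H0
        # Keep the peak only if the protein explains it better than background.
        if p_protein > p_background:
            score += math.log(p_protein / p_background)
            matched.append(mz)
    return score, matched
```

Under these assumptions the acceptance window is not fixed: a peak is kept only while the Gaussian density exceeds the background density, so a denser background narrows the effective mass tolerance. Candidate proteins would then be ranked by their scores.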

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Catherine E. Costello

Phencyclidine (Sernylan) poisoning
  • DOI:
    10.1016/s0022-3476(73)80385-3
  • Publication date:
    1973-11-01
  • Journal:
  • Impact factor:
  • Authors:
    William L. Nyhan; Harry C. Shirkey; Craig B. Liden; Frederick H. Lovejoy; Catherine E. Costello
  • Corresponding author:
    Catherine E. Costello
若年肥満者における尿中カルボニル物質による血圧上昇の予測
Prediction of blood pressure elevation from urinary carbonyl compounds in young obese individuals
  • DOI:
  • Publication date:
    2011
  • Journal:
  • Impact factor:
    0
  • Authors:
    Garry L. Corthals; Catherine E. Costello; Eric W. Deutsch; Bruno Domon; William Hancock; Fuchu He; Denis Hochstrasser; Gyorgy Marko-Varga; Ghasem Hosseini Salekdeh; Salvatore Sechi; Michael Snyder; Sudhir Srivastava; Mathias Uhlen; Cathy H. Hu; Tadashi Y; 佐藤恵美子
  • Corresponding author:
    佐藤恵美子
Differential Labeling of Reversible Protein-Oxidation and S-Palmitoylation Using the Biotin-Switch Assay
  • DOI:
    10.1016/j.freeradbiomed.2011.10.069
  • Publication date:
    2011-11-01
  • Journal:
  • Impact factor:
  • Authors:
    Dagmar J. Haeussler; Vikas Kumar; Joseph R. Burgoyne; Yuhuan Ji; Cheng Lin; Catherine E. Costello; David R. Pimental; Richard A. Cohen; Markus M. Bachschmid
  • Corresponding author:
    Markus M. Bachschmid

Other grants of Catherine E. Costello

Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
  • Grant number:
    10204050
  • Fiscal year:
    2019
  • Amount:
    $5,900
  • Project category:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
  • Grant number:
    9976561
  • Fiscal year:
    2019
  • Amount:
    $5,900
  • Project category:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
  • Grant number:
    9810729
  • Fiscal year:
    2019
  • Amount:
    $5,900
  • Project category:
MALDI-TOF/TOF MS TO SUPPORT BIOMEDICAL RESEARCH
  • Grant number:
    8247392
  • Fiscal year:
    2012
  • Amount:
    $5,900
  • Project category:
PROTEIN CYSTEINE POST-TRANSLATIONAL MODIFICATION IN AMYLOIDOSIS
  • Grant number:
    8365496
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:
BUSM SEMINARS, LECTURES AND SABBATICAL ON MASS SPECTROMETRY
  • Grant number:
    8365520
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:
MICROSCALE SAMPLE PREPARATION FOR MASS SPECTROMETRY
  • Grant number:
    8365509
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:
OXIDATIVE POST-TRANSLATIONAL MODIFICATIONS IN CARDIOVASCULAR DISEASE
  • Grant number:
    8365547
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:
ELECTRON TRANSFER DISSOCIATION OF GLYCANS AND GLYCOCONJUGATES
  • Grant number:
    8365562
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:
LIPID METABOLITES AND PATHWAYS STRATEGY CONSORTIUM
  • Grant number:
    8365525
  • Fiscal year:
    2011
  • Amount:
    $5,900
  • Project category:

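Similar NSFC (National Natural Science Foundation of China) grants

Exploration and practice of informatized management of science fund archival materials
  • Grant number:
  • Approval year:
    2022
  • Amount:
    CNY 100,000
  • Project category:
Tiered and categorized management of National Natural Science Foundation of China project archives under a single-set system
  • Grant number:
    J2224001
  • Approval year:
    2022
  • Amount:
    CNY 300,000
  • Project category:
    Special Program
Exploration and practice of informatized management of science fund archival materials
  • Grant number:
    52242312
  • Approval year:
    2022
  • Amount:
    CNY 100,000
  • Project category:
    Special Program
Dynamic sharing of electronic health records under a zero-trust architecture
  • Grant number:
    72274077
  • Approval year:
    2022
  • Amount:
    CNY 450,000
  • Project category:
    General Program
Diversity of scuticociliate ciliates (subclass Scuticociliatia) in the estuarine wetlands of Jiaozhou Bay and establishment of archival records
  • Grant number:
  • Approval year:
    2021
  • Amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund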

Similar overseas grants

M-ISIC: A Multimodal Open-Source International Skin Imaging Collaboration Informatics Platform for Automated Skin Cancer Detection
  • Grant number:
    10528944
  • Fiscal year:
    2022
  • Amount:
    $5,900
  • Project category:
M-ISIC: A Multimodal Open-Source International Skin Imaging Collaboration Informatics Platform for Automated Skin Cancer Detection
  • Grant number:
    10689201
  • Fiscal year:
    2022
  • Amount:
    $5,900
  • Project category:
Multi-site Data for Nutrition Studies in Healthy Early Childhood
  • Grant number:
    10676921
  • Fiscal year:
    2022
  • Amount:
    $5,900
  • Project category:
Data Science Core
  • Grant number:
    10687829
  • Fiscal year:
    2019
  • Amount:
    $5,900
  • Project category:
Better Outcomes for Children: Promoting Excellence in Healthcare Genomics to Inform Policy
  • Grant number:
    9901995
  • Fiscal year:
    2015
  • Amount:
    $5,900
  • Project category: