SOFTWARE FOR MASS SPECTRAL DATA CONVERSION / AUTOMATIC PROTEOMIC ANALYSIS
Basic information
- Grant number: 7722964
- Principal investigator:
- Amount: about $5,200
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2008
- Funding country: United States
- Project period: 2008-06-01 to 2009-05-31
- Project status: Completed
- Source:
- Keywords: Algorithms; Archives; Boston; Computer Retrieval of Information on Scientific Projects Database; Computer software; Data; Data Analyses; Data Files; Data Set; Databases; E2F Transcription Factor 1; Extensible Markup Language; Fingerprint; Funding; Grant; Hemoglobin; Housing; Human; Imagery; Institutes; Institution; Libraries; Link; Linux; Manufacturer Name; Mass Spectrum Analysis; Methodology; Modeling; Numbers; Online Systems; Operative Surgical Procedures; Peptide Mapping; Peptides; Performance; Pliability; Probability; Process; Proteins; Proteomics; Relative (related person); Research; Research Personnel; Resources; Running; Sampling; Score; Sodium Dodecyl Sulfate-PAGE; Source; Specific qualifier value; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Statistical Models; Systems Biology; Testing; United States National Institutes of Health; Universities; Visual; Water; Writing; base; comparative; computerized data processing; graphical user interface; instrument; instrumentation; programs; tool; user-friendly; web based interface
Project abstract
This subproject is one of many research subprojects utilizing the
resources provided by a Center grant funded by NIH/NCRR. The subproject and
investigator (PI) may have received primary funding from another NIH source,
and thus could be represented in other CRISP entries. The institution listed is
for the Center, which is not necessarily the institution for the investigator.
With so many MS instruments and data analysis software platforms available for MS and proteomics, it becomes difficult to manipulate and manage the resulting data sets. We have created a software application that converts processed MS data files obtained on a variety of instruments into several common formats accepted by different software applications. We have further developed the program to support the mzXML format (Pedrioli et al., 2004) and to incorporate a front-end interface that can be linked to several web-based database search engines, including Mascot, ProteinProspector and BUPID (a peptide mass fingerprinting program based on a log-likelihood ratio model developed here at BUSM) (Tong et al., 2005). The data processing software was developed using Microsoft Visual Basic 6.0. To add support for the mzXML format, we used MSXML 4.0 as the XML parser and built a Visual C++ library to decode the Base64-encoded peak list data in the mzXML file. The other supported data formats are intermediate files converted from raw data files using the manufacturers' software: LC-MS/MS data are processed with Analyst QS (ABI/Sciex) or MassLynx/PLGS 2.1 (Waters), MALDI MS data with MOverZ (Proteometrics LLC), and FTMS data with BUDA (O'Connor, http://www.bumc.bu.edu/FTMS). The BUPID program was developed in C under Linux and made accessible to the main program through a CGI-based web interface. The shell data conversion program provides a user-friendly GUI and can be operated in an unattended batch processing mode. The program was tested on existing MALDI-TOF MS, MALDI-FT MS and LC-MS/MS data sets obtained in house. It allowed large volumes of data obtained on different instruments to be converted to the formats of several commercially and publicly available search engines.
Files were then submitted to the search engines for protein identification, with the search settings specified by the user. For a batch of files, the search settings need to be specified only once, which allows unattended operation. Result files are automatically saved in HTML format and can be viewed directly inside the program. Our recent implementation of the mzXML format, introduced by the Institute for Systems Biology, affords the benefits of a common data format: summation of results obtained on different MS platforms, comparative analysis of MS methodology, and archiving of data so that they can be analyzed later in house or at a different facility. The software provides an easy-to-use graphical interface for automatic MS data conversion and database searching. It can also easily be extended to more MS data types and linked to more database search engines.
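As an illustration of the Base64 peak-list decoding step described above, here is a minimal Python sketch. It is not the project's Visual C++ library. It assumes the common mzXML convention in which the peaks element holds alternating m/z and intensity values as big-endian (network byte order) floats, with a precision of 32 or 64 bits. The function names `decode_peaks` and `encode_peaks` are hypothetical.

```python
import base64
import struct


def decode_peaks(encoded: str, precision: int = 32) -> list[tuple[float, float]]:
    """Decode a Base64 peak list into (m/z, intensity) pairs.

    Assumption: values are big-endian floats stored as alternating
    m/z and intensity, as in typical mzXML files.
    """
    if precision not in (32, 64):
        raise ValueError("precision must be 32 or 64")
    code, width = ("f", 4) if precision == 32 else ("d", 8)
    raw = base64.b64decode(encoded)
    if len(raw) % width != 0:
        raise ValueError("byte length is not a multiple of the value width")
    count = len(raw) // width
    if count % 2 != 0:
        raise ValueError("expected an even number of values (m/z, intensity)")
    values = struct.unpack(f">{count}{code}", raw)
    # Even positions are m/z values, odd positions are intensities.
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(peaks: list[tuple[float, float]], precision: int = 32) -> str:
    """Inverse of decode_peaks, useful for round-trip checks."""
    if precision not in (32, 64):
        raise ValueError("precision must be 32 or 64")
    code = "f" if precision == 32 else "d"
    flat = [value for pair in peaks for value in pair]
    raw = struct.pack(f">{len(flat)}{code}", *flat)
    return base64.b64encode(raw).decode("ascii")
```

A round trip such as `decode_peaks(encode_peaks([(500.25, 1000.0)]))` returns the original pair when the values are exactly representable at the chosen precision.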
The new search algorithm, Boston University Protein Identifier (BUPID), provides a robust and accurate statistical model for protein identification using MS data. The algorithm offers a number of new features: 1. By using the log-likelihood ratio as its scoring function, the algorithm can best distinguish correctly assigned peptides from incorrect assignments. 2. Matching peaks with a background-dependent threshold offers more flexibility and accuracy than the traditional mass window. 3. The statistical model provides similar or better results compared with conventional database search engines. We use the log-likelihood ratio to calculate the probability that a protein is present in the sample. The model distinguishes two hypotheses: H0, that a set of peaks in the spectrum is generated by the random background; and HA, that the same set of peaks is generated by peptides corresponding to a specific protein. A peak is included in the set if the probability that it is produced by the protein is more significant than the probability that it is produced by the random background. Final results are ranked by the E-value of their probability score using the sequence information of the protein. We have compared the performance of the BUPID server with that of several other public web-based database search engines. Peptide map data sets were obtained from existing ongoing projects. On average, BUPID database search results had 27% more true positives than MASCOT results in the top 10 predictions. Within the top 100 predictions, BUPID showed 27% more true positives than MASCOT. In addition, BUPID found all five human hemoglobin proteins within the top 20 in 6 cases, whereas MASCOT succeeded in one case. When only peaks with higher than 5% relative intensity are used, the spreads are 28% and 24% within the top 10 and top 100 predictions, respectively.
With another MALDI data set (transcription factor E2F1 protein purified and separated on 1D SDS-PAGE), all five search engines returned similar results. BUPID also provides various data visualization tools that many users find useful, including a combined view of a protein mixture and mass spectra of shared or similar peptides in different proteins. A typical BUPID run takes 2 to 3 minutes on a Pentium IV PC.
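The H0/HA comparison described above can be illustrated with a toy Python sketch. This is not the BUPID model, whose densities and parameters are not given in this abstract. As stated assumptions, the sketch uses a Gaussian mass-error density under HA and a uniform background density under H0. The names `peak_log_ratio` and `score_protein` and the parameters `sigma` and `background_density` are hypothetical.

```python
import math


def peak_log_ratio(observed_mz: float, theoretical_mz: list[float],
                   sigma: float, background_density: float) -> float:
    """Log-likelihood ratio log(P(peak | HA) / P(peak | H0)) for one peak.

    Assumption: under HA the mass error to the nearest theoretical
    peptide mass is Gaussian with standard deviation sigma; under H0
    peaks are uniformly distributed with the given density per Da.
    """
    delta = min(abs(observed_mz - t) for t in theoretical_mz)
    # Work in log space so that large mass errors do not underflow.
    log_signal = -0.5 * (delta / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return log_signal - math.log(background_density)


def score_protein(observed: list[float], theoretical_mz: list[float],
                  sigma: float = 0.05, background_density: float = 0.001):
    """Sum the log ratios of peaks better explained by the protein.

    A peak is kept only when its log ratio is positive, so the accepted
    mass error depends on the background density rather than on a fixed
    mass window.
    """
    ratios = [(mz, peak_log_ratio(mz, theoretical_mz, sigma, background_density))
              for mz in observed]
    matched = [mz for mz, ratio in ratios if ratio > 0]
    score = sum(ratio for _, ratio in ratios if ratio > 0)
    return score, matched
```

For example, `score_protein([1000.50, 1500.80, 2000.00], [1000.52, 1500.75])` keeps the first two peaks and rejects the third, because only the first two lie close enough to a theoretical mass to beat the background. Ranking candidate proteins by E-value, as the abstract describes, would require a further step that is not sketched here.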
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Catherine E. Costello
Phencyclidine (Sernylan) poisoning
- DOI: 10.1016/s0022-3476(73)80385-3
- Published: 1973-11-01
- Journal:
- Impact factor:
- Authors: William L. Nyhan; Harry C. Shirkey; Craig B. Liden; Frederick H. Lovejoy; Catherine E. Costello
- Corresponding author: Catherine E. Costello
Inactivation of Minar2 in mice hyperactivates mTOR signaling and results in obesity
- DOI: 10.1016/j.molmet.2023.101744
- Published: 2023-07-01
- Journal:
- Impact factor: 6.600
- Authors: Saran Lotfollahzadeh; Chaoshuang Xia; Razie Amraei; Ning Hua; Konstantin V. Kandror; Stephen R. Farmer; Wenyi Wei; Catherine E. Costello; Vipul Chitalia; Nader Rahimi
- Corresponding author: Nader Rahimi
RETRACTED ARTICLE: Endoperoxide formation by an α-ketoglutarate-dependent mononuclear non-haem iron enzyme
- DOI: 10.1038/nature15519
- Published: 2015-11-02
- Journal:
- Impact factor: 48.500
- Authors: Wupeng Yan; Heng Song; Fuhang Song; Yisong Guo; Cheng-Hsuan Wu; Ampon Sae Her; Yi Pu; Shu Wang; Nathchar Naowarojna; Andrew Weitz; Michael P. Hendrich; Catherine E. Costello; Lixin Zhang; Pinghua Liu; Yan Jessie Zhang
- Corresponding author: Yan Jessie Zhang
Prediction of blood pressure elevation from urinary carbonyl compounds in young obese individuals (若年肥満者における尿中カルボニル物質による血圧上昇の予測)
- DOI:
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Garry L. Corthals; Catherine E. Costello; Eric W. Deutsch; Bruno Domon; William Hancock; Fuchu He; Denis Hochstrasser; Gyorgy Marko-Varga; Ghasem Hosseini Salekdeh; Salvatore Sechi; Michael Snyder; Sudhir Srivastava; Mathias Uhlen; Cathy H. Hu; Tadashi Y; 佐藤恵美子
- Corresponding author: 佐藤恵美子
De novo glycan sequencing by electronic excitation dissociation MS²-guided MS³ analysis on an Omnitrap-Orbitrap hybrid instrument
- DOI: 10.1039/d3sc00870c
- Published: 2023-06-21
- Journal:
- Impact factor: 7.400
- Authors: Juan Wei; Dimitris Papanastasiou; Mariangela Kosmopoulou; Athanasios Smyrnakis; Pengyu Hong; Nafisa Tursumamat; Joshua A. Klein; Chaoshuang Xia; Yang Tang; Joseph Zaia; Catherine E. Costello; Cheng Lin
- Corresponding author: Cheng Lin
Other grants by Catherine E. Costello
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 10204050
- Fiscal year: 2019
- Amount: about $5,200
- Project type:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 9976561
- Fiscal year: 2019
- Amount: about $5,200
- Project type:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 9810729
- Fiscal year: 2019
- Amount: about $5,200
- Project type:
MALDI-TOF/TOF MS TO SUPPORT BIOMEDICAL RESEARCH
- Grant number: 8247392
- Fiscal year: 2012
- Amount: about $5,200
- Project type:
PROTEIN CYSTEINE POST-TRANSLATIONAL MODIFICATION IN AMYLOIDOSIS
- Grant number: 8365496
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
BUSM SEMINARS, LECTURES AND SABBATICAL ON MASS SPECTROMETRY
- Grant number: 8365520
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
MICROSCALE SAMPLE PREPARATION FOR MASS SPECTROMETRY
- Grant number: 8365509
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
OXIDATIVE POST-TRANSLATIONAL MODIFICATIONS IN CARDIOVASCULAR DISEASE
- Grant number: 8365547
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
ELECTRON TRANSFER DISSOCIATION OF GLYCANS AND GLYCOCONJUGATES
- Grant number: 8365562
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
LIPID METABOLITES AND PATHWAYS STRATEGY CONSORTIUM
- Grant number: 8365525
- Fiscal year: 2011
- Amount: about $5,200
- Project type:
Similar overseas grants
Sediment Drilling Facility for environmental and genetic archives
- Grant number: LE240100064
- Fiscal year: 2024
- Amount: about $5,200
- Project type: Linkage Infrastructure, Equipment and Facilities
Aerial Archives of Race and American-Occupied Japan
- Grant number: 24K03721
- Fiscal year: 2024
- Amount: about $5,200
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Understanding biosphere-geosphere coevolution through carbonate-associated phosphate, community archives, and open-access education in rural schools
- Grant number: 2338055
- Fiscal year: 2024
- Amount: about $5,200
- Project type: Continuing Grant
Designing a Bridging Model Using Learning Content Information LOD to Link School Education and Digital Archives
- Grant number: 23H03695
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Grant-in-Aid for Scientific Research (B)
Doris Lessing's Archives: Communism, Decolonisation and Literary Practice
- Grant number: 2888789
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Studentship
Building a sustainable future for anthropology's archives: Researching primary source data lifecycles, infrastructures, and reuse
- Grant number: 2314762
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Standard Grant
Reading Writing Lives: Publishing & Preserving Australian Literary Archives
- Grant number: DP230101797
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Discovery Projects
Integrated High-Definition Visualization of Digital Archives for Borobudur Temple
- Grant number: 22KJ3026
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Grant-in-Aid for JSPS Fellows
Research on multilingual data integration for digital archives of Japanese culture
- Grant number: 23K11780
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Grant-in-Aid for Scientific Research (C)
A Preliminary Study for Constructing International Network of Image Archives on Afghan Cultural Heritages
- Grant number: 23K00915
- Fiscal year: 2023
- Amount: about $5,200
- Project type: Grant-in-Aid for Scientific Research (C)