SOFTWARE FOR MASS SPECTRAL DATA CONVERSION / AUTOMATIC PROTEOMIC ANALYSIS
Basic Information
- Grant number: 7955889
- Principal investigator:
- Amount: $4,700
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2009
- Funding country: United States
- Period: 2009-06-01 to 2010-05-31
- Status: Completed
- Source:
- Keywords: Algorithms; Archives; Biology; Boston; Computer Retrieval of Information on Scientific Projects Database; Computer software; Data; Data Analyses; Data Files; Data Set; Databases; Development; Extensible Markup Language; Fingerprint; Funding; Grant; Hemoglobin; Housing; Human; Institutes; Institution; Journals; Libraries; Link; Linux; Manufacturer Name; Manuscripts; Mass Spectrum Analysis; Medicine; Methodology; Modeling; Online Systems; Operative Surgical Procedures; Peptide Mapping; Peptides; Performance; Probability; Process; Proteins; Proteomics; Relative (related person); Research; Research Personnel; Resources; Sampling; Source; Specific qualifier value; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Statistical Models; Systems Biology; Testing; United States National Institutes of Health; Universities; Visual; Water; Writing; base; comparative; computerized data processing; data format; flexibility; graphical user interface; instrument; instrumentation; meetings; posters; programs; user-friendly; web based interface
Project Abstract
This subproject is one of many research subprojects utilizing the
resources provided by a Center grant funded by NIH/NCRR. The subproject and
investigator (PI) may have received primary funding from another NIH source,
and thus could be represented in other CRISP entries. The institution listed is
for the Center, which is not necessarily the institution for the investigator.
With a multitude of MS instruments and data analysis software platforms available for MS and proteomics, it becomes difficult to manipulate and manage the resulting data sets. We have created a software application that converts processed MS data files obtained on a variety of instruments into several common formats accepted by different software applications. We have further developed the program to add support for the mzXML format (Pedrioli et al., 2004) and to incorporate a front-end interface that may be linked to several web-based database search engines, including Mascot, ProteinProspector and BUPID (a peptide mass fingerprinting program based on a log-likelihood ratio model developed here at BUSM) (Tong et al., 2005). The data processing software was developed using Microsoft Visual Basic 6.0. To add support for the mzXML format, we used MSXML 4.0 as an XML parser and built a Visual C++ library to decode the Base64-encoded peak list data in the mzXML file. Other supported data formats are intermediate files converted from raw data files using software from the manufacturers: LC MS/MS data are processed with Analyst QS (ABI/Sciex) or MassLynx/PLGS2.1 (Waters); MALDI MS data with MOverZ (Proteometrics LLC); and FTMS data with BUDA (O'Connor, http://www.bumc.bu.edu/FTMS). The BUPID program was developed in C under Linux and made accessible to the main program through a CGI-based web interface. The shell data conversion program was written to provide a user-friendly GUI that may also be operated in an unattended batch processing mode.
The program was tested on existing MALDI-TOF MS, MALDI-FT MS and LC MS/MS data sets obtained in house. It allowed the conversion of large volumes of data obtained on different instruments to the formats of several commercially and publicly available search engines. Files were then submitted for protein identification to the search engines with the search settings specified by the user. For a batch of files, the search settings need to be specified only once, thus allowing unattended operation. Result files are automatically saved in HTML format and can then be viewed directly inside the program. Our recent implementation of the mzXML format introduced by the Institute for Systems Biology affords the benefits of a common data format: summation of results obtained on different MS platforms, comparative analysis of MS methodology, and archiving of data so that they may be analyzed at a later date in house or at a different facility. The software provides an easy-to-use graphical interface for automatic MS data conversion and database searching. It can also easily be expanded to handle more MS data types and linked to more database search engines. During the current year, we have added capabilities for interpretation of top-down tandem mass spectra of proteins (BUPID-top down) and for linking of database assignments to functionality of proteins (STRAP). These results have been presented as posters at ASMS and other scientific meetings, and manuscripts have been written for journal submission.
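The abstract describes decoding Base64-encoded peak lists from mzXML files with a Visual C++ library. The following Python sketch is only an illustration of that decoding step, not the project's library. It assumes the common mzXML layout: uncompressed peak data stored as alternating m/z and intensity values in network (big-endian) byte order, at 32-bit or 64-bit precision. The function names `decode_peaks` and `encode_peaks` are hypothetical.

```python
import base64
import struct


def _format_code(precision: int) -> str:
    """Map an mzXML precision (32 or 64 bits) to a struct format code."""
    if precision == 32:
        return "f"
    if precision == 64:
        return "d"
    raise ValueError("precision must be 32 or 64")


def decode_peaks(encoded: str, precision: int = 32) -> list[tuple[float, float]]:
    """Decode a Base64 peak list into (m/z, intensity) pairs.

    Assumes uncompressed, big-endian floats in m/z-intensity order.
    """
    code = _format_code(precision)
    width = precision // 8
    raw = base64.b64decode(encoded)
    if len(raw) % (2 * width) != 0:
        raise ValueError("peak data does not contain whole (m/z, intensity) pairs")
    count = len(raw) // width
    values = struct.unpack(f">{count}{code}", raw)
    # Even positions hold m/z values, odd positions hold intensities.
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(peaks: list[tuple[float, float]], precision: int = 32) -> str:
    """Inverse of decode_peaks, useful for round-trip checks."""
    code = _format_code(precision)
    flat = [value for pair in peaks for value in pair]
    raw = struct.pack(f">{len(flat)}{code}", *flat)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    example = encode_peaks([(500.25, 1200.0), (1000.5, 350.0)])
    for mz, intensity in decode_peaks(example):
        print(mz, intensity)
```

Files that use compressed peak data would need a decompression step before unpacking, which this sketch does not attempt.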
The new search algorithm, Boston University Protein Identifier (BUPID), provides a robust and accurate statistical model for protein identification using MS data. The algorithm offers a number of new features: 1. Using the log-likelihood ratio as the scoring function, the algorithm can best distinguish correctly assigned peptides from incorrect assignments. 2. Matching peaks with a background-dependent threshold offers more flexibility and accuracy than the traditional mass window. 3. The statistical model provides similar or better results compared with conventional database search engines. We use the log-likelihood ratio to calculate the probability that a protein is present in the sample. The model distinguishes two hypotheses: H0, that a set of peaks in the spectrum is generated by the random background; and HA, that the same set of peaks is generated by peptides corresponding to a specific protein. A peak is included in the set if it is more probable that it was produced by the protein than by the random background. Final results are ranked by the E-value of their probability score using the sequence information of the protein. We have compared the performance of the BUPID server with that of several other public web-based database search engines. Peptide map data sets were obtained from existing ongoing projects. BUPID database search results had on average 27% more true positives in the top 10 predictions than MASCOT results. Within the top 100 predictions, BUPID showed 27% more true positives than MASCOT. In addition, BUPID was able to find all five human hemoglobin proteins within the top 20 in 6 cases; MASCOT succeeded in one case. When using only peaks with higher than 5% relative intensity, the spreads are 28% and 24% within the top 10 and top 100 predictions, respectively. Recent efforts have been directed toward development of additional software for interpretation of top-down tandem MS data from the LTQ-Orbitrap MS.
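The H0/HA comparison described above can be illustrated with a small Python sketch. It is not BUPID's actual model, whose likelihood functions the abstract does not specify. As stated assumptions, it uses a Gaussian mass-error model under HA (with a hypothetical standard deviation `sigma`) and a uniform peak density over a hypothetical `mass_range` under H0. A peak is kept only when its log-likelihood ratio is positive, so the effective matching tolerance depends on the assumed background density.

```python
import math


def log_likelihood_ratio_score(
    observed: list[float],
    theoretical: list[float],
    sigma: float = 0.1,
    mass_range: tuple[float, float] = (800.0, 4000.0),
) -> tuple[float, list[float]]:
    """Score one candidate protein against an observed peak list.

    Returns the summed log-likelihood ratio and the peaks that
    contributed to it. All model choices here are illustrative.
    """
    if not theoretical:
        return 0.0, []

    # H0: peaks fall uniformly at random across the mass range.
    log_background = -math.log(mass_range[1] - mass_range[0])
    # Normalisation term of the Gaussian density used for HA.
    log_norm = -math.log(sigma * math.sqrt(2.0 * math.pi))

    score = 0.0
    matched = []
    for mz in observed:
        nearest = min(theoretical, key=lambda mass: abs(mass - mz))
        error = mz - nearest
        # HA: the peak comes from the nearest theoretical peptide mass.
        log_signal = log_norm - 0.5 * (error / sigma) ** 2
        llr = log_signal - log_background
        if llr > 0.0:  # more likely from the protein than from background
            score += llr
            matched.append(mz)
    return score, matched


if __name__ == "__main__":
    observed_peaks = [1000.02, 1500.0, 2500.9]
    candidate_peptides = [1000.0, 2000.0]
    total, used = log_likelihood_ratio_score(observed_peaks, candidate_peptides)
    print(total, used)
```

Working in log space avoids taking the logarithm of a density that has underflowed to zero for peaks far from any theoretical mass. Ranking candidates by E-value, as the abstract describes, would require a further step that is not sketched here.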
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Catherine E. Costello
Phencyclidine (Sernylan) poisoning
- DOI: 10.1016/s0022-3476(73)80385-3
- Published: 1973-11-01
- Journal:
- Impact factor:
- Authors: William L. Nyhan; Harry C. Shirkey; Craig B. Liden; Frederick H. Lovejoy; Catherine E. Costello
- Corresponding author: Catherine E. Costello
Inactivation of Minar2 in mice hyperactivates mTOR signaling and results in obesity
- DOI: 10.1016/j.molmet.2023.101744
- Published: 2023-07-01
- Journal:
- Impact factor: 6.600
- Authors: Saran Lotfollahzadeh; Chaoshuang Xia; Razie Amraei; Ning Hua; Konstantin V. Kandror; Stephen R. Farmer; Wenyi Wei; Catherine E. Costello; Vipul Chitalia; Nader Rahimi
- Corresponding author: Nader Rahimi
RETRACTED ARTICLE: Endoperoxide formation by an α-ketoglutarate-dependent mononuclear non-haem iron enzyme
- DOI: 10.1038/nature15519
- Published: 2015-11-02
- Journal:
- Impact factor: 48.500
- Authors: Wupeng Yan; Heng Song; Fuhang Song; Yisong Guo; Cheng-Hsuan Wu; Ampon Sae Her; Yi Pu; Shu Wang; Nathchar Naowarojna; Andrew Weitz; Michael P. Hendrich; Catherine E. Costello; Lixin Zhang; Pinghua Liu; Yan Jessie Zhang
- Corresponding author: Yan Jessie Zhang
Prediction of blood pressure elevation from urinary carbonyl compounds in young obese subjects
- DOI:
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Garry L. Corthals; Catherine E. Costello; Eric W. Deutsch; Bruno Domon; William Hancock; Fuchu He; Denis Hochstrasser; Gyorgy Marko-Varga; Ghasem Hosseini Salekdeh; Salvatore Sechi; Michael Snyder; Sudhir Srivastava; Mathias Uhlen; Cathy H. Hu; Tadashi Y; 佐藤恵美子
- Corresponding author: 佐藤恵美子
De novo glycan sequencing by electronic excitation dissociation MS²-guided MS³ analysis on an Omnitrap-Orbitrap hybrid instrument
- DOI: 10.1039/d3sc00870c
- Published: 2023-06-21
- Journal:
- Impact factor: 7.400
- Authors: Juan Wei; Dimitris Papanastasiou; Mariangela Kosmopoulou; Athanasios Smyrnakis; Pengyu Hong; Nafisa Tursumamat; Joshua A. Klein; Chaoshuang Xia; Yang Tang; Joseph Zaia; Catherine E. Costello; Cheng Lin
- Corresponding author: Cheng Lin
Other grants by Catherine E. Costello
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 10204050
- Fiscal year: 2019
- Funding amount: $4,700
- Project type:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 9976561
- Fiscal year: 2019
- Funding amount: $4,700
- Project type:
Legacy Support During Closure of the Mass Spectrometry Resource for Biology and Medicine
- Grant number: 9810729
- Fiscal year: 2019
- Funding amount: $4,700
- Project type:
MALDI-TOF/TOF MS TO SUPPORT BIOMEDICAL RESEARCH
- Grant number: 8247392
- Fiscal year: 2012
- Funding amount: $4,700
- Project type:
PROTEIN CYSTEINE POST-TRANSLATIONAL MODIFICATION IN AMYLOIDOSIS
- Grant number: 8365496
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
BUSM SEMINARS, LECTURES AND SABBATICAL ON MASS SPECTROMETRY
- Grant number: 8365520
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
MICROSCALE SAMPLE PREPARATION FOR MASS SPECTROMETRY
- Grant number: 8365509
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
OXIDATIVE POST-TRANSLATIONAL MODIFICATIONS IN CARDIOVASCULAR DISEASE
- Grant number: 8365547
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
ELECTRON TRANSFER DISSOCIATION OF GLYCANS AND GLYCOCONJUGATES
- Grant number: 8365562
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
LIPID METABOLITES AND PATHWAYS STRATEGY CONSORTIUM
- Grant number: 8365525
- Fiscal year: 2011
- Funding amount: $4,700
- Project type:
Similar overseas grants
Sediment Drilling Facility for environmental and genetic archives
- Grant number: LE240100064
- Fiscal year: 2024
- Funding amount: $4,700
- Project type: Linkage Infrastructure, Equipment and Facilities
Aerial Archives of Race and American-Occupied Japan
- Grant number: 24K03721
- Fiscal year: 2024
- Funding amount: $4,700
- Project type: Grant-in-Aid for Scientific Research (C)
CAREER: Understanding biosphere-geosphere coevolution through carbonate-associated phosphate, community archives, and open-access education in rural schools
- Grant number: 2338055
- Fiscal year: 2024
- Funding amount: $4,700
- Project type: Continuing Grant
Designing a Bridging Model Using Learning Content Information LOD to Link School Education and Digital Archives
- Grant number: 23H03695
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Grant-in-Aid for Scientific Research (B)
Doris Lessing's Archives: Communism, Decolonisation and Literary Practice
- Grant number: 2888789
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Studentship
Integrated High-Definition Visualization of Digital Archives for Borobudur Temple
- Grant number: 22KJ3026
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Grant-in-Aid for JSPS Fellows
Research on multilingual data integration for digital archives of Japanese culture
- Grant number: 23K11780
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Grant-in-Aid for Scientific Research (C)
Building a sustainable future for anthropology's archives: Researching primary source data lifecycles, infrastructures, and reuse
- Grant number: 2314762
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Standard Grant
A Preliminary Study for Constructing International Network of Image Archives on Afghan Cultural Heritages
- Grant number: 23K00915
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Grant-in-Aid for Scientific Research (C)
Reading Writing Lives: Publishing & Preserving Australian Literary Archives
- Grant number: DP230101797
- Fiscal year: 2023
- Funding amount: $4,700
- Project type: Discovery Projects