Galaxy Workflows for Proteomics Informed by Transcriptomics (PIT)

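Basic information

  • Grant number:
    BB/K016075/1
  • Principal investigator:
  • Amount:
    $138,200
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Research Grant
  • Fiscal year:
    2013
  • Funding country:
    United Kingdom
  • Start and end dates:
    2013 to (no data)
  • Project status:
    Completed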

Project abstract

Identifying which proteins are present in a given biological sample, and in what quantities, is essential to understanding many biological processes. A technique called "shotgun proteomics" has become the method of choice for tackling this problem. In a shotgun proteomics analysis proteins are first broken down into more easily analysable segments (peptides) using a cleavage enzyme, then separated using liquid chromatography (LC), prior to individual injection into a tandem mass spectrometer (MS/MS), which breaks peptides into fragments, producing a spectrum of product ions that can be considered as a fingerprint for each peptide. Software is used to match the acquired spectra to peptides and these peptide identifications are then used to infer the presence of proteins. Working out which peptide is represented by each of the acquired spectra is clearly a crucial part of shotgun proteomics. In theory, because we understand the principles of peptide fragmentation, it should be possible to take any peptide spectrum and work out the sequence of the peptide from which it came. In practice this is usually too difficult because the combination of imperfect MS/MS spectra and the huge number of peptides that could potentially exist make incorrect identifications very likely. To circumvent this problem, protein identification software seeks to match peptide spectra only to those peptide sequences that might reasonably be expected to be in the sample. Currently this is done by searching against the sequences of all proteins that the species under study is known to produce (the "proteome"), downloaded from an online database (e.g. UniProt). However, high quality proteomes are only available for a small number of species. 
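The cleavage step described above has a simple computational counterpart, which search software uses to generate the candidate peptides that spectra are matched against. The following is a minimal sketch, not the project's implementation. It assumes trypsin as the cleavage enzyme (the abstract does not name one) and uses the common rule of cutting after K or R unless the next residue is P. The function name `tryptic_digest` and its parameters are hypothetical.

```python
def tryptic_digest(protein: str, missed_cleavages: int = 0, min_length: int = 6) -> list[str]:
    """Return peptides from an in-silico digest of a protein sequence.

    Assumption: trypsin-like specificity (cut after K or R, but not before P).
    """
    # Collect cut positions; 0 and len(protein) bound the first and last peptide.
    sites = [0]
    for i, residue in enumerate(protein[:-1]):
        if residue in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        # Allow the enzyme to skip up to `missed_cleavages` cut sites.
        for skipped in range(missed_cleavages + 1):
            end = start + 1 + skipped
            if end >= len(sites):
                break
            peptide = protein[sites[start]:sites[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


# Example: the R at position 9 is followed by P, so no cut is made there.
print(tryptic_digest("ACDEKGHIRPLMKNQ", min_length=1))
# ['ACDEK', 'GHIRPLMK', 'NQ']
```

Restricting the digest to proteins expected in the sample keeps the candidate peptide list small, which is the reason the abstract gives for searching against a proteome rather than every possible peptide.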
What if you want to do proteomics on a sample from a species for which a proteome is not available, or on a sample from an experiment involving multiple species, or unknown species? We recently developed (and tested, and published) a solution to this problem, which we call proteomics informed by transcriptomics (PIT). The key to PIT is the creation of a sample-specific list of proteins that may be present, derived from gene transcripts found in the sample. Transcripts are copies of genes that are used to make proteins, so by knowing which transcripts are present in a sample we can predict which proteins might be present. The transcripts are found by using a next generation sequencing technique called RNA-seq. Until very recently, RNA-seq involved mapping short reads to a reference genome, but software is now available that can assemble transcripts de novo.
The PIT approach therefore makes it possible to identify and quantify proteins in complex samples when a reference proteome (or genome) is not available. This opens many new areas of research for species that do not have well annotated genomes (which include many pests, pathogens and plants), and also for experiments where proteins from multiple species are present (so-called "metaproteomics") or where the proteome is changing (e.g. during viral infection). There are also a number of additional spin-off benefits such as the ability to find protein variants that are specific to the individual under study (i.e. not present in any reference proteome), and the possibility of annotating genomes.
Currently, the main challenge of the PIT approach is the complexity of the data analysis necessary to integrate the transcriptomic and proteomic data and report results in a way that is useful to biologists. 
The aim of this proposal is therefore to put together a suite of easy to use connected software tools that enable the typical bench scientist to perform the necessary data analysis within an acceptable timescale with no bioinformatics support. To help achieve this we plan to implement the software within the popular Galaxy framework. Galaxy provides an easy to use web browser interface and can take advantage of powerful computing resources.
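One way to picture the central PIT step, turning assembled transcripts into a sample-specific list of candidate proteins, is a six-frame translation that keeps open reading frames above a length cutoff. The sketch below is an illustration under stated assumptions and not the published pipeline: it assumes the standard genetic code, treats any methionine-to-stop stretch as a candidate, and uses hypothetical names (`candidate_proteins`, `min_length`).

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate codon by codon; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def candidate_proteins(transcript: str, min_length: int = 30) -> set[str]:
    """Collect candidate protein sequences from all six reading frames.

    Assumption: a candidate starts at the first M after a stop codon (or the
    frame start) and runs to the next stop or the end of the transcript.
    """
    transcript = transcript.upper().replace("U", "T")
    proteins = set()
    for strand in (transcript, reverse_complement(transcript)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_length:
                    proteins.add(segment[start:])
    return proteins


# Toy transcript; a real cutoff would be far longer than 2 residues.
print(sorted(candidate_proteins("CCATGGCTAAATGACC", min_length=2)))
# ['MAK', 'MT']
```

The resulting sequences would then be written out as a search database (for example in FASTA format) so that acquired spectra are matched only against proteins the sample's own transcripts predict.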

Project outputs

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Galaxy Integrated Omics: Web-based Standards-Compliant Workflows for Proteomics Informed by Transcriptomics.
  • DOI:
    10.1074/mcp.o115.048777
  • Publication date:
    2015-11
  • Journal:
  • Impact factor:
    0
  • Authors:
    Fan J;Saha S;Barker G;Heesom KJ;Ghali F;Jones AR;Matthews DA;Bessant C
  • Corresponding author:
    Bessant C
Proteomics technique opens new frontiers in mobilome research.
  • DOI:
    10.1080/2159256x.2017.1362494
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Davidson AD;Matthews DA;Maringer K
  • Corresponding author:
    Maringer K
Proteomics informed by transcriptomics for characterising active transposable elements and genome annotation in Aedes aegypti.
  • DOI:
    10.1186/s12864-016-3432-5
  • Publication date:
    2017-01-19
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Maringer K;Yousuf A;Heesom KJ;Fan J;Lee D;Fernandez-Sesma A;Bessant C;Matthews DA;Davidson AD
  • Corresponding author:
    Davidson AD
PITDB: a database of translated genomic elements.
  • DOI:
    10.1093/nar/gkx906
  • Publication date:
    2018-01-04
  • Journal:
  • Impact factor:
    14.9
  • Authors:
    Saha S;Chatzimichali EA;Matthews DA;Bessant C
  • Corresponding author:
    Bessant C
High throughput discovery of protein variants using proteomics informed by transcriptomics.
  • DOI:
    10.1093/nar/gky295
  • Publication date:
    2018-06-01
  • Journal:
  • Impact factor:
    14.9
  • Authors:
    Saha S;Matthews DA;Bessant C
  • Corresponding author:
    Bessant C

Other publications by Conrad Bessant

Deriving Meaningful Aspects of Health Related to Physical Activity in Chronic Disease: Concept Elicitation Using Machine Learning–Assisted Coding of Online Patient Conversations
  • DOI:
    10.1016/j.jval.2023.01.022
  • Publication date:
    2023-07-01
  • Journal:
  • Impact factor:
  • Authors:
    Bill Byrom;Conrad Bessant;Fabrizio Smeraldi;Maryam Abdollahyan;Yasemin Bridges;Marzana Chowdhury;Asiyya Tahsin
  • Corresponding author:
    Asiyya Tahsin

Other grants held by Conrad Bessant

PIT-DB: A Resource for Sharing, Annotating and Analysing Translated Genomic Elements
  • Grant number:
    BB/M020118/1
  • Fiscal year:
    2015
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
Proteomics Goes Viral: Novel Resources for Identification and Quantification of Virus Proteins
  • Grant number:
    BB/L018438/1
  • Fiscal year:
    2014
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
An Integrated Open Source Software Resource for Quantitative Proteomics
  • Grant number:
    BB/I001131/2
  • Fiscal year:
    2013
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
An Integrated Open Source Software Resource for Quantitative Proteomics
  • Grant number:
    BB/I001131/1
  • Fiscal year:
    2010
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
X-tracker: a generic quantitation tool for MS-based proteomics:
  • Grant number:
    BB/F016107/1
  • Fiscal year:
    2008
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
Further Development of the Genome Annotating Proteomic Pipeline
  • Grant number:
    BB/E01237X/1
  • Fiscal year:
    2007
  • Funding amount:
    $138,200
  • Project category:
    Research Grant
Bioinformatics for High Throughput Proteomics (Short Course)
  • Grant number:
    BB/D007216/1
  • Fiscal year:
    2006
  • Funding amount:
    $138,200
  • Project category:
    Research Grant

Similar overseas grants

Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
  • Grant number:
    2324714
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
CAREER: Toolkits for Digital/Physical Workflows
  • Grant number:
    2339273
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Continuing Grant
Collaborative Research: GEO OSE Track 2: Project Pythia and Pangeo: Building an inclusive geoscience community through accessible, reusable, and reproducible workflows
  • Grant number:
    2324304
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Elements: Adaptive End-to-End Parallelism for Distributed Science Workflows
  • Grant number:
    2427408
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
  • Grant number:
    2324709
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
  • Grant number:
    2324713
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Project Pythia and Pangeo: Building an inclusive geoscience community through accessible, reusable, and reproducible workflows
  • Grant number:
    2324302
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
  • Grant number:
    2324710
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Developing CI-enabled collaborative workflows to integrate data for the SZ4D (Subduction Zones in Four Dimensions) community
  • Grant number:
    2324711
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant
Collaborative Research: GEO OSE Track 2: Project Pythia and Pangeo: Building an inclusive geoscience community through accessible, reusable, and reproducible workflows
  • Grant number:
    2324303
  • Fiscal year:
    2024
  • Funding amount:
    $138,200
  • Project category:
    Standard Grant