Development and benchmarking of improved computational methods for transcript-level expression analysis using RNA-seq data

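Basic information

  • Grant number:
    BB/J009415/1
  • Principal investigator:
  • Amount:
    $398,100
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Research Grant
  • Fiscal year:
    2012
  • Funding country:
    United Kingdom
  • Duration:
    2012 to (no data)
  • Project status:
    Completed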

Project abstract

After sequencing of the human genome was completed, scientists were surprised to discover that there are far fewer protein-coding genes than previously predicted. One reason that an organism as complex as a human can be built from a relatively small number of genes is that each gene can encode more than one protein. An intermediate molecule, messenger RNA (mRNA), carries the information from the genome in the cell nucleus to the ribosomes, which create proteins. These mRNA molecules are also known as transcripts, and their full complement is termed the transcriptome. Before they mature, these transcripts are edited to form the templates for different proteins. This editing process is called splicing, and the different transcripts that result are called splice variants or isoforms. An additional complexity in the transcriptome arises because each gene has multiple copies (for example, 2 in human and 6 in wheat), and these different copies, called alleles, can be expressed differently under different conditions or in different tissues. The transcriptome is the collection of transcripts, which includes all the allele-specific gene isoforms that are expressed in the cell along with other non-coding RNA molecules.
Splicing and allele usage are fundamental ways in which the function of genes can be modulated in a tissue-specific manner. Developing technologies to accurately measure transcript expression is therefore a necessary step towards understanding and modelling cells and tissues. A recently developed experimental technology called RNA-seq gives unprecedented access to data about the transcriptome. Computational methods are required to interpret these data, which take the form of a list containing millions of short RNA sequence fragments. These fragments are difficult to interpret because, for example, the same fragment could have come from a large number of different gene isoforms. The question is, which one?
Computational methods can be used to answer this question and infer the concentration of different gene isoforms in the sample given these data. In this project we will develop a new computational method, implemented in publicly available free software, which uses advanced statistical procedures to solve this problem. An important distinguishing feature of the method is its ability to associate inferred concentrations with a degree of uncertainty that captures technical and biological sources of error as well as the inherent difficulty of assigning fragments to gene isoforms. We will create benchmark data that allow us to assess the performance of our method and other published methods, allowing researchers and end-users of different methods to understand their properties. Finally, we will adapt an existing computer program, puma, to work with the processed RNA-seq data in order to identify genes which change between conditions, which have similar expression patterns, or which contribute most to the variance in the data.
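The fragment-assignment ambiguity described in the abstract can be illustrated with a minimal sketch. This is not the method developed in the project: it is a plain expectation-maximisation (EM) estimate of the proportion of fragments arising from each isoform, and it ignores transcript length, sequencing error, and the uncertainty estimates that the abstract highlights. The function name and the input format (one set of compatible isoform indices per fragment) are hypothetical and chosen only for illustration.

```python
def estimate_isoform_proportions(compat, n_isoforms, n_iter=200):
    """Illustrative EM for fragment-level isoform proportions.

    compat: list of non-empty sets; compat[i] holds the indices of the
            isoforms that fragment i could have come from (hypothetical format).
    Returns a list of proportions that sums to 1.
    """
    # Start from a uniform guess over isoforms.
    theta = [1.0 / n_isoforms] * n_isoforms
    for _ in range(n_iter):
        counts = [0.0] * n_isoforms
        # E-step: split each fragment across its compatible isoforms
        # in proportion to the current abundance estimates.
        for isoforms in compat:
            total = sum(theta[k] for k in isoforms)
            for k in isoforms:
                counts[k] += theta[k] / total
        # M-step: re-estimate proportions from the fractional counts.
        theta = [c / len(compat) for c in counts]
    return theta


# Toy data: 3 fragments unique to isoform 0, 1 unique to isoform 1,
# and 4 fragments that match both isoforms.
fragments = [{0}] * 3 + [{1}] * 1 + [{0, 1}] * 4
proportions = estimate_isoform_proportions(fragments, n_isoforms=2)
print(proportions)
```

In this toy case the ambiguous fragments end up shared in the same ratio as the uniquely assigned ones, so the estimate settles near 0.75 for isoform 0 and 0.25 for isoform 1. A point estimate like this says nothing about how confident one should be in it, which is the gap the abstract's uncertainty-aware approach is meant to address.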

Project outputs

Journal papers (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Improved variational Bayes inference for transcript expression estimation.
Fast and accurate approximate inference of transcript expression from RNA-seq data
  • DOI:
    10.48550/arxiv.1412.5995
  • Publication date:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hensman J
  • Corresponding author:
    Hensman J
Fast and accurate approximate inference of transcript expression from RNA-seq data.
  • DOI:
    10.1093/bioinformatics/btv483
  • Publication date:
    2015-12-15
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hensman J;Papastamoulis P;Glaus P;Honkela A;Rattray M
  • Corresponding author:
    Rattray M
A Bayesian model selection approach for identifying differentially expressed transcripts from RNA sequencing data.
Bayesian estimation of differential transcript usage from RNA-seq data.

Other publications by Magnus Rattray

Scalable Multi-Output Gaussian Processes with Stochastic Variational Inference
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xiaoyu Jiang;Sokratia Georgaka;Magnus Rattray;Mauricio A. Alvarez
  • Corresponding author:
    Mauricio A. Alvarez
Cumulant dynamics of a population under multiplicative selection, mutation, and drift.
  • DOI:
    10.1006/tpbi.2001.1531
  • Publication date:
    1999
  • Journal:
  • Impact factor:
    1.4
  • Authors:
    Magnus Rattray;Jonathan L. Shapiro
  • Corresponding author:
    Jonathan L. Shapiro
OS152 - Integrating single-cell RNA and spatial transcriptomic data defines altered cell state in human liver fibrosis
  • DOI:
    10.1016/s0168-8278(22)00598-0
  • Publication date:
    2022-07-01
  • Journal:
  • Impact factor:
    33.000
  • Authors:
    Nigel Hammond;Sokratia Georgaka;Syed Murtuza-Baker;Ali Al-Anbaki;Elliot Jokl;Harry Spiers;Ajith Siriwardena;Varinder Athwal;Neil Hanley;Magnus Rattray;Karen Piper Hanley
  • Corresponding author:
    Karen Piper Hanley
Component‐specific clusters for diagnosis and prediction of allergic airway diseases
  • DOI:
    10.1111/cea.14468
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    6.1
  • Authors:
    Rebecca Howard;S. Fontanella;A. Simpson;Clare S. Murray;Adnan Custovic;Magnus Rattray
  • Corresponding author:
    Magnus Rattray
UVAE: Integration of Heterogeneous Unpaired Data with Imbalanced Classes
  • DOI:
    10.1101/2023.12.18.572157
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mike Phuycharoen;Verena Kaestele;Thomas Williams;Lijing Lin;Tracy Hussell;John Grainger;Magnus Rattray
  • Corresponding author:
    Magnus Rattray

Other grants held by Magnus Rattray

Integrating Capture-HiC with omic time course data to uncover the regulatory interactions modulated by genetic variation in disease
  • Grant number:
    MR/N00017X/1
  • Fiscal year:
    2015
  • Funding amount:
    $398,100
  • Project type:
    Research Grant

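Similar National Natural Science Foundation of China (NSFC) grants

Research on DEA-Benchmarking methods and dynamic games for enterprise performance evaluation
  • Grant number:
    70571028
  • Approval year:
    2005
  • Funding amount:
    165,000 yuan
  • Project type:
    General Program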

Similar overseas grants

Bioorthogonal probe development for highly parallel in vivo imaging
  • Grant number:
    10596786
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development and evaluation of a combined X-ray transmission and diffraction imaging system for pathology
  • Grant number:
    10699271
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development, feasibility, and acceptability of Aim to Play, a user-friendly digital application for teacher skills training and physical education activities for K-2 elementary students
  • Grant number:
    10598343
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development of contrast agents to facilitate image-guided surgery
  • Grant number:
    10810184
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Practice Wellness: Equipping home visitors with skills in reflective coaching, parent mediated child development, and occupational wellness to strengthen child outcomes among low-resourced families
  • Grant number:
    10820982
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Reducing the chasm in behavioral health care for older adults with cancer: Development of the Center for Implementation Research in Cancer in Later Life (CIRCL)
  • Grant number:
    10742163
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development of methods to assess geographic variation in reproductive health behaviors and outcomes over the life course
  • Grant number:
    10732184
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development and Pre-Clinical Validation of Quantitative Imaging of Cell State Kinetics (QuICK) for Functional Precision Oncology
  • Grant number:
    10737379
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development and Validation of the Down Syndrome Regression Rating Scales
  • Grant number:
    10781052
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type:
Development of nanodroplet enhanced ultrasonic cavitation technology to enable the study of chromatin accessibility in FFPE tissues
  • Grant number:
    10699112
  • Fiscal year:
    2023
  • Funding amount:
    $398,100
  • Project type: