Unlocking transcript diversity via differential analyses of splice graphs

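Basic information

  • Grant number:
    8473250
  • Principal investigator:
  • Amount:
    $395,600
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2012
  • Funding country:
    United States
  • Project period:
    2012-05-23 to 2015-03-31
  • Project status:
    Completed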

Project abstract

The most basic difference between cells of the same genotype but different phenotypes lies in their transcriptomes. Understanding the difference between two transcriptomes in terms of the RNA molecules present in each, or changes in abundance of specific molecules, can offer valuable insight into the molecular mechanisms of disease, development, and specialization. High-throughput sequencing provides a unique view of the transcriptome in the form of millions or even billions of short reads of nucleotide sequences sampled from the RNA molecules. To date, nearly 1000 such RNA-seq datasets have been deposited in the NCBI Gene Expression Omnibus. Beyond measuring differences in overall expression of genes between samples, there is a critical need to measure differences in expression at the transcript level. Computational tools that can extract significant changes in transcript diversity across populations with RNA-seq are in immediate demand. However, reconstructing the full extent of transcript isoforms from this wealth of data is not a solved problem because of fundamental ambiguities between isoforms at the scale of the short read samples.

We propose a novel approach to the differential analysis of transcriptomes that does not depend on the reconstruction of full-length transcripts, and yet can accurately pinpoint the variation between transcriptomes. Our techniques are data-driven and applicable to any transcriptome, requiring only a reference genome, and do not depend on a priori gene structure annotations. Our research program builds on our highly sensitive and accurate MapSplice alignment algorithm to construct expression-weighted splice graphs (ESGs) from RNA-seq datasets. ESGs can be three orders of magnitude smaller than current RNA-seq datasets, yet fully represent the substantive biological content of such datasets. The ESG representation supports highly efficient analysis techniques that can directly identify and visualize statistically significant differential transcription between samples. Generalizations of the algorithms are proposed to identify co-regulated splicing patterns that are key to biological pathway analyses and systems biology analyses.

We have established an ongoing interactive and collaborative research environment among the co-PIs and co-Is, who include biologists, computer scientists, and a statistician. The proposed computational methods will be tested and refined using RNA-seq data generated from breast cancer cell lines before being applied to three well-curated RNA-seq datasets on lung cancer pathogenesis, stem cells in leukemia, and equine articular cartilage development and repair (a non-model mammalian organism). Experimental validation of differentially expressed transcript isoforms will both improve the accuracy of our methods and propose novel candidates for alternative isoforms associated with lung cancer, leukemia, and chondrocyte differentiation.

The software will be open-source and will be developed as a set of components that can be used on their own or integrated into RNA-seq processing workflows. In particular, we will integrate the components into the Galaxy cloud computing framework hosted on a local server. As such, the methods will be available to researchers worldwide. As components mature, they may be installed on other servers worldwide to provide a convenient and secure way to analyze transcriptomes.

Unveiling the dynamics of the transcriptome at modest cost will revolutionize cellular diagnostics and biomedical research. Genome-wide measurement of transcription variants offers the potential for detailed molecular information about cellular identity and function that will greatly expand traditional histological assessment. Cloud-based access to the methods can turn individual laboratories into small genome centers and will enable individual scientists to assess differences among RNA transcriptomes in a matter of days. Our suite of algorithms will enable biomedical researchers to prioritize candidate genes or different gene ontology categories to investigate further for differential transcription and mechanistic importance between experimental conditions.
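
The abstract does not give an algorithm, so the following is only a minimal sketch of what an expression-weighted splice graph might look like in the simplest case. It assumes that spliced alignments have already been reduced to (donor, acceptor) coordinate pairs, weights each junction edge by its read count, and compares relative junction usage between two samples. The function names `build_esg`, `junction_usage`, and `usage_shift` are hypothetical; this is not MapSplice or the project's ESG method, and it performs no statistical testing.

```python
from collections import Counter


def build_esg(junction_reads):
    """Weight each splice-junction edge by its supporting read count.

    `junction_reads` is an iterable of (donor, acceptor) coordinate pairs,
    one per spliced read. This input format is an assumption made for the
    sketch, not something specified by the abstract.
    """
    graph = Counter()
    for donor, acceptor in junction_reads:
        graph[(donor, acceptor)] += 1
    return graph


def junction_usage(graph, donor):
    """Fraction of reads leaving `donor` that go to each acceptor."""
    outgoing = {edge[1]: w for edge, w in graph.items() if edge[0] == donor}
    total = sum(outgoing.values())
    if total == 0:
        return {}
    return {acceptor: w / total for acceptor, w in outgoing.items()}


def usage_shift(graph_a, graph_b, donor):
    """Change in acceptor usage at `donor` from sample A to sample B."""
    usage_a = junction_usage(graph_a, donor)
    usage_b = junction_usage(graph_b, donor)
    acceptors = set(usage_a) | set(usage_b)
    return {acc: usage_b.get(acc, 0.0) - usage_a.get(acc, 0.0) for acc in acceptors}


if __name__ == "__main__":
    # Toy data: the donor at position 100 splices to acceptor 200 or 300.
    sample_a = build_esg([(100, 200)] * 3 + [(100, 300)])
    sample_b = build_esg([(100, 200)] + [(100, 300)] * 3)
    print(usage_shift(sample_a, sample_b, 100))
```

Comparing per-junction usage rather than whole isoforms mirrors the abstract's point that differences can be located without reconstructing full-length transcripts. A real analysis would also need replicate-aware statistics to decide which shifts are significant.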

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Jinze Liu

Bioinformatics Core
  • Grant number:
    10733313
  • Fiscal year:
    2023
  • Funding amount:
    $395,600
  • Project category:
Unlocking transcript diversity via differential analyses of splice graphs
  • Grant number:
    8660317
  • Fiscal year:
    2012
  • Funding amount:
    $395,600
  • Project category:
Unlocking transcript diversity via differential analyses of splice graphs
  • Grant number:
    8296802
  • Fiscal year:
    2012
  • Funding amount:
    $395,600
  • Project category:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $395,600
  • Project category:
    Research Grant