Computational Methods to Characterize Alternative Splicing from Massive Collections of RNA-seq Data

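Basic information

  • Grant number:
    10021689
  • Principal investigator:
  • Amount:
    $363,200
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-09-20 to 2023-06-30
  • Project status:
    Completed

Project abstract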

Alternative splicing (AS) is a gene regulatory mechanism with important roles in human biology and disease. High-throughput sequencing of RNA (RNA-seq) is making it possible to survey the expressed genes and their alternative splicing variations in a wide variety of cellular conditions. However, the short reads are challenging to analyze, demanding highly sophisticated computational methods that can extract meaningful AS information efficiently, accurately, and comprehensively. While there has been great progress so far, current methods based on assembling the short reads into transcript annotations have reached a plateau.
We propose two innovations that can help overcome these limits. The first is one-step simultaneous analysis of multiple samples in an RNA-seq collection, in contrast with the current two-step approach that analyzes each sample separately and then merges the results. The second is to create and interrogate assembly-free representations of AS. The project will design a suite of tools that leverage the latent information in large collections of samples and in heterogeneous data types to build complete and accurate AS signatures of tissues and cell types, and to elucidate the regulatory circuitry of AS and its functional implications.
Aim 1 will develop a high-performance multi-sample transcript assembly tool, combining subexon graph representations of genes and AS variations, statistical methods for improved feature detection, and search space reduction techniques for efficient sample processing.
Aim 2 will build highly efficient and accurate feature selection tools to detect and characterize assembly-free AS variations (subexons and introns) simultaneously from collections of RNA-seq samples. It will combine novel regularized programs with complex models of intronic 'noise' and other RNA-seq confounders, and will enable analyses of differential splicing and the identification of individual and group-specific variations.
Lastly, Aim 3 will develop a system to comprehensively model the regulatory and functional circuitry of AS and the effects of mutations, starting from deep learning models of sequences and alignments and integrating expression, sequence, epigenetic and mutation data across tissues, cell types and conditions.
We will rigorously test and evaluate all tools in simulations and on large public data sets, as well as on thyroid and head and neck cancer data provided by our collaborators, and we will experimentally validate random subsets of predictions with capillary electrophoresis and qRT-PCR. Collectively, the concepts, methods and tools will establish a new framework for analyzing RNA-seq data that can efficiently tackle the 'big data' challenges, leading to more complete discovery and annotation of AS structure and function in human health and disease.
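The abstract refers to subexons without defining them. The sketch below is only an illustration of one common reading of the term, not the project's method: it assumes a subexon is a maximal stretch of exonic sequence that no exon boundary interrupts, and that exons are given as half-open `(start, end)` coordinate pairs pooled from several transcripts or samples. The function name `split_into_subexons` and the coordinates are hypothetical.

```python
def split_into_subexons(exons):
    """Partition the union of half-open exon intervals [start, end) into subexons.

    Assumption (not from the abstract): a subexon is a maximal interval that
    lies inside at least one exon and contains no exon boundary in its interior.
    """
    # Every exon start or end is a potential subexon boundary.
    boundaries = sorted({pos for start, end in exons for pos in (start, end)})
    subexons = []
    # Consecutive boundaries delimit candidate intervals; keep the exonic ones.
    for left, right in zip(boundaries, boundaries[1:]):
        if any(start <= left and right <= end for start, end in exons):
            subexons.append((left, right))
    return subexons


# Two overlapping exons (e.g. alternative splice sites) and one separate exon.
example_exons = [(100, 200), (150, 300), (400, 500)]
example_subexons = split_into_subexons(example_exons)
# example_subexons == [(100, 150), (150, 200), (200, 300), (400, 500)]
```

In this toy input the overlapping exons are split at every boundary, while the intronic gap between positions 300 and 400 yields no subexon. A real tool would also need strand, chromosome, and read-support information, all of which this sketch omits.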

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Liliana D Florea

Computational Methods to Characterize Alternative Splicing from Massive Collections of RNA-seq Data
  • Grant number:
    10387065
  • Fiscal year:
    2019
  • Amount:
    $363,200
  • Project category:
Computational Methods to Characterize Alternative Splicing from Massive Collections of RNA-seq Data
  • Grant number:
    10218209
  • Fiscal year:
    2019
  • Amount:
    $363,200
  • Project category:
Computational Methods to Characterize Alternative Splicing from Massive Collections of RNA-seq Data
  • Grant number:
    10450006
  • Fiscal year:
    2019
  • Amount:
    $363,200
  • Project category:

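Similar National Natural Science Foundation of China (NSFC) grants

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Grant number:
  • Approval year:
    2024
  • Amount (10,000 CNY):
  • Project category:
    Collaborative Innovation Research Team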

Similar overseas grants

Bioinformatics and Big Data Analytics
  • Grant number:
    CRC-2021-00259
  • Fiscal year:
    2022
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs
Bioinformatics And Big Data Analytics
  • Grant number:
    CRC-2016-00137
  • Fiscal year:
    2021
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs
Byte-sized bioinformatics: introducing Big Data through computational biology
  • Grant number:
    ST/T000872/1
  • Fiscal year:
    2020
  • Amount:
    $363,200
  • Project category:
    Research Grant
Bioinformatics and big data analytics
  • Grant number:
    CRC-2016-00137
  • Fiscal year:
    2020
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs
CLIMB-BIG-DATA: A Cloud Infrastructure for Big-Data Microbial Bioinformatics
  • Grant number:
    MR/T030062/1
  • Fiscal year:
    2020
  • Amount:
    $363,200
  • Project category:
    Research Grant
Mining and Processing Big Data in Bioinformatics: Mouse Phenotyping using Flow Cytometry
  • Grant number:
    505116-2017
  • Fiscal year:
    2019
  • Amount:
    $363,200
  • Project category:
    Postgraduate Scholarships - Doctoral
Bioinformatics and big data analytics
  • Grant number:
    CRC-2016-00137
  • Fiscal year:
    2019
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs
Bioinformatics and big data analytics
  • Grant number:
    CRC-2016-00137
  • Fiscal year:
    2018
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs
Mining and Processing Big Data in Bioinformatics: Mouse Phenotyping using Flow Cytometry
  • Grant number:
    505116-2017
  • Fiscal year:
    2018
  • Amount:
    $363,200
  • Project category:
    Postgraduate Scholarships - Doctoral
Bioinformatics and big data analytics
  • Grant number:
    CRC-2016-00137
  • Fiscal year:
    2017
  • Amount:
    $363,200
  • Project category:
    Canada Research Chairs