A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data

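Basic Information

  • Grant number:
    10170579
  • Principal investigator:
  • Amount:
    $304,600
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-03-12 to 2023-06-30
  • Project status:
    Completed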

Project Abstract

We propose to develop improved, modular pipelines for more accurate and reproducible RNA-seq analyses. RNA-seq experiments are widely used in the biological and biomedical sciences to determine the expression levels of all genes and isoforms across multiple samples. Raw RNA-seq data must be pre-processed to determine the abundances of RNA molecules. State-of-the-art tools for quantifying RNA abundances are fast and efficient, model and correct for common technical biases, and provide estimates of the uncertainty of the abundances. Downstream tools for visualization and statistical testing of abundance should ideally incorporate the uncertainty of abundance estimates from the quantification step, take into account the sampling variability inherent in observations in all sequencing experiments, and estimate, for each transcript, the underlying biological variation in abundances across samples. While isolated tools fulfill a subset of these characteristics, we propose to develop a pipeline that addresses all of them while leveraging the powerful existing infrastructure for gene expression analysis. Our modular approach to improving current RNA-seq analysis pipelines will also seek to make use of the best downstream tools for gene set analysis and dynamic report generation.
Current RNA-seq computational pipelines do not keep track of critical pieces of metadata throughout the analysis, including genome and transcriptome version, so final results cannot reliably be reproduced or put in the correct genomic context, as the information about annotation provenance may be lost. While fast and lightweight tools have been quickly adopted for gene- and transcript-level quantification, they are not yet optimized for certain RNA-seq analysis tasks, such as quantification of allele-specific expression. We have developed a set of top-performing tools for abundance quantification and downstream inference.
We propose to formalize our existing tools into a pipeline, and to build additional tools and infrastructure, that optimally estimates and propagates uncertainty from abundance estimation (described in Aim 1) and that stores critical provenance metadata automatically on the user's behalf; this metadata tagging and propagation will be integrated with community resources (described in Aim 2). Furthermore, we propose building out the capabilities of our existing quantification infrastructure to allow for improved mapping accuracy and more robust and accurate allelic expression estimation (described in Aim 3).

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Michael Isaiah Love

Systematic in vivo characterization of disease-associated regulatory variants
  • Grant number:
    10472058
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
Systematic in vivo characterization of disease-associated regulatory variants
  • Grant number:
    10296745
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
Systematic in vivo characterization of disease-associated regulatory variants
  • Grant number:
    10631225
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
  • Grant number:
    10238765
  • Fiscal year:
    2020
  • Funding amount:
    $304,600
  • Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
  • Grant number:
    10440402
  • Fiscal year:
    2020
  • Funding amount:
    $304,600
  • Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
  • Grant number:
    10318952
  • Fiscal year:
    2018
  • Funding amount:
    $304,600
  • Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
  • Grant number:
    10550143
  • Fiscal year:
    2018
  • Funding amount:
    $304,600
  • Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
  • Grant number:
    10066367
  • Fiscal year:
    2018
  • Funding amount:
    $304,600
  • Project category:

Similar Overseas Grants

Adapting Position-Based Dynamics as a Biophysically Accurate and Efficient Modeling Framework for Dynamic Cell Shapes
  • Grant number:
    24K16962
  • Fiscal year:
    2024
  • Funding amount:
    $304,600
  • Project category:
    Grant-in-Aid for Early-Career Scientists
A Framework for Fast, Accurate, and Explainable Computerized Adaptive Language Test
  • Grant number:
    24K20903
  • Fiscal year:
    2024
  • Funding amount:
    $304,600
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
  • Grant number:
    RGPIN-2020-04571
  • Fiscal year:
    2022
  • Funding amount:
    $304,600
  • Project category:
    Discovery Grants Program - Individual
A hybrid molecular simulation/machine-learning framework for rapid and accurate computation of absolute binding free energies of lead-like molecules
  • Grant number:
    2581380
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
    Studentship
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
  • Grant number:
    RGPIN-2020-04571
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
    Discovery Grants Program - Individual
EAGER: IIBR Informatics: A reinforced imputation framework for accurate gene expression recovery from single-cell RNA-seq data
  • Grant number:
    1945971
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
    Standard Grant
EAGER: Accurate Estimation of Indoor Airborne Virus Transmission based on a Novel Multiscale Data-Driven Framework
  • Grant number:
    2134083
  • Fiscal year:
    2021
  • Funding amount:
    $304,600
  • Project category:
    Standard Grant
A Predictive Framework for Micro-scale Carbonate Diagenesis: Towards More Accurate Reconstructions of Global Climate and Environmental Change
  • Grant number:
    2040145
  • Fiscal year:
    2020
  • Funding amount:
    $304,600
  • Project category:
    Standard Grant
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
  • Grant number:
    RGPIN-2020-04571
  • Fiscal year:
    2020
  • Funding amount:
    $304,600
  • Project category:
    Discovery Grants Program - Individual
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
  • Grant number:
    10238765
  • Fiscal year:
    2020
  • Funding amount:
    $304,600
  • Project category: