A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
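Basic information
- Grant number: 10170579
- Principal investigator:
- Amount: $304,600
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-03-12 to 2023-06-30
- Project status: Completed
- Source:
- Keywords:
Project summary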
PROJECT SUMMARY / ABSTRACT
We propose to develop improved, modular pipelines for more accurate and reproducible RNA-seq analyses. RNA-seq
experiments are widely used in biological and biomedical sciences to determine the expression level of all genes
and isoforms across multiple samples. Raw RNA-seq data must be pre-processed to determine abundances of RNA
molecules. State-of-the-art tools for quantifying RNA abundances are fast and efficient, model and correct for common
technical biases, and provide estimates of the uncertainty of the abundances. Downstream tools for visualization and
statistical testing of abundance ideally should incorporate uncertainty of abundance estimates from the quantification
step, take into account the sampling variability inherent in observations in all sequencing experiments, and estimate, for
each transcript, the underlying biological variation in abundances across samples. While isolated tools fulfill a subset
of the above characteristics, we propose to develop a pipeline which addresses all of these, while at the same time
leveraging the powerful existing infrastructure for gene expression analysis. Our modular approach to improving the
current RNA-seq analysis pipelines will also seek to make use of the best downstream tools for gene set analysis and
dynamic report generation. Current RNA-seq computational pipelines do not keep track of critical pieces of metadata
throughout the analysis, including genome and transcriptome version, such that final results cannot reliably be reproduced
or put in the correct genomic context as the information about annotation provenance may be lost. While fast
and lightweight tools have been quickly adopted for gene- and transcript-level quantification, they are not yet optimized
for certain RNA-seq analysis tasks such as quantification of allele specific expression. We have developed a set of top
performing tools for abundance quantification and downstream inference. We propose to formalize our existing tools
into a pipeline, and build additional tools and infrastructure, which optimally estimates and propagates uncertainty
from abundance estimation (described in Aim 1), and which stores critical provenance metadata automatically on
the user's behalf — this metadata tagging and propagation will be integrated with community resources (described
in Aim 2). Furthermore, we propose building out the capabilities of our existing quantification infrastructure to allow
for improved mapping accuracy and more robust and accurate allelic expression estimation (described in Aim 3).
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants held by Michael Isaiah Love
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10472058
- Fiscal year: 2021
- Funding amount: $304,600
- Project category:
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10296745
- Fiscal year: 2021
- Funding amount: $304,600
- Project category:
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10631225
- Fiscal year: 2021
- Funding amount: $304,600
- Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
- Grant number: 10238765
- Fiscal year: 2020
- Funding amount: $304,600
- Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
- Grant number: 10440402
- Fiscal year: 2020
- Funding amount: $304,600
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10318952
- Fiscal year: 2018
- Funding amount: $304,600
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10550143
- Fiscal year: 2018
- Funding amount: $304,600
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10066367
- Fiscal year: 2018
- Funding amount: $304,600
- Project category:
Similar overseas grants
Adapting Position-Based Dynamics as a Biophysically Accurate and Efficient Modeling Framework for Dynamic Cell Shapes
- Grant number: 24K16962
- Fiscal year: 2024
- Funding amount: $304,600
- Project category: Grant-in-Aid for Early-Career Scientists
A Framework for Fast, Accurate, and Explainable Computerized Adaptive Language Test
- Grant number: 24K20903
- Fiscal year: 2024
- Funding amount: $304,600
- Project category: Grant-in-Aid for Early-Career Scientists
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
- Grant number: RGPIN-2020-04571
- Fiscal year: 2022
- Funding amount: $304,600
- Project category: Discovery Grants Program - Individual
A hybrid molecular simulation/machine-learning framework for rapid and accurate computation of absolute binding free energies of lead-like molecules
- Grant number: 2581380
- Fiscal year: 2021
- Funding amount: $304,600
- Project category: Studentship
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
- Grant number: RGPIN-2020-04571
- Fiscal year: 2021
- Funding amount: $304,600
- Project category: Discovery Grants Program - Individual
EAGER: IIBR Informatics: A reinforced imputation framework for accurate gene expression recovery from single-cell RNA-seq data
- Grant number: 1945971
- Fiscal year: 2021
- Funding amount: $304,600
- Project category: Standard Grant
EAGER: Accurate Estimation of Indoor Airborne Virus Transmission based on a Novel Multiscale Data-Driven Framework
- Grant number: 2134083
- Fiscal year: 2021
- Funding amount: $304,600
- Project category: Standard Grant
A Predictive Framework for Micro-scale Carbonate Diagenesis: Towards More Accurate Reconstructions of Global Climate and Environmental Change
- Grant number: 2040145
- Fiscal year: 2020
- Funding amount: $304,600
- Project category: Standard Grant
Robust and Accurate Equation of State Framework for Modeling Phase Behavior of Reservoir Fluids under Extreme Pressure/Temperature Conditions
- Grant number: RGPIN-2020-04571
- Fiscal year: 2020
- Funding amount: $304,600
- Project category: Discovery Grants Program - Individual
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
- Grant number: 10238765
- Fiscal year: 2020
- Funding amount: $304,600
- Project category:


















