A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
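Basic information
- Grant number: 10238765
- Principal investigator:
- Amount: $295,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-03-12 to 2023-06-30
- Project status: Completed
- Source: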
- Keywords: Address; Adopted; Adoption; Algorithms; Alleles; Archives; Area; Attention; Biological; Biological Assay; Biomedical Research; Characteristics; Communities; Data; Data Set; Databases; Development; Disease; Event; Follow-Up Studies; Gene Expression Profiling; Generations; Genes; Genetic; Genome; Genomics; Goals; Health; Human; Hybrids; Infrastructure; Knowledge; Lead; Location; Measurement; Metadata; Methods; Modeling; Nucleotides; Organism; Phenotype; Process; Protein Isoforms; RNA; RNA Editing; RNA analysis; Reporting; Reproducibility; Reproducibility of Results; Research Personnel; Resources; Salmon; Sampling; Science; Sequence Alignment; Source; Speed; Statistical Data Interpretation; Testing; Time; Transcript; Uncertainty; Variant; Vision; Visualization; Visualization software; analysis pipeline; computational pipelines; cryptography; design; differential expression; experimental study; human error; improved; light weight; task analysis; tool; transcriptome; transcriptome sequencing; transcriptomics; wasting
Project Summary / Abstract
We propose to develop improved, modular pipelines for more accurate and reproducible RNA-seq analyses. RNA-seq experiments are widely used in the biological and biomedical sciences to determine the expression levels of all genes and isoforms across multiple samples. Raw RNA-seq data must be pre-processed to determine the abundances of RNA molecules. State-of-the-art tools for quantifying RNA abundances are fast and efficient, model and correct for common technical biases, and provide estimates of the uncertainty of the abundances. Downstream tools for visualization and statistical testing of abundance should ideally incorporate the uncertainty of abundance estimates from the quantification step, take into account the sampling variability inherent in all sequencing experiments, and estimate, for each transcript, the underlying biological variation in abundance across samples. While isolated tools fulfill a subset of these characteristics, we propose to develop a pipeline that addresses all of them while leveraging the powerful existing infrastructure for gene expression analysis. Our modular approach to improving current RNA-seq analysis pipelines will also make use of the best downstream tools for gene set analysis and dynamic report generation. Current RNA-seq computational pipelines do not keep track of critical pieces of metadata throughout the analysis, including genome and transcriptome version, so final results cannot reliably be reproduced or put in the correct genomic context, because information about annotation provenance may be lost. While fast and lightweight tools have been quickly adopted for gene- and transcript-level quantification, they are not yet optimized for certain RNA-seq analysis tasks such as quantification of allele-specific expression. We have developed a set of top-performing tools for abundance quantification and downstream inference. We propose to formalize our existing tools into a pipeline, and to build additional tools and infrastructure, so that the pipeline optimally estimates and propagates uncertainty from abundance estimation (described in Aim 1) and stores critical provenance metadata automatically on the user's behalf; this metadata tagging and propagation will be integrated with community resources (described in Aim 2). Furthermore, we propose building out the capabilities of our existing quantification infrastructure to allow for improved mapping accuracy and more robust and accurate allelic expression estimation (described in Aim 3).
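The provenance idea in Aim 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that a reference transcriptome is identified by a SHA-256 digest of its sequences and that this digest is stored next to the quantification results. The function names (`fasta_digest`, `provenance_record`, `same_reference`) and the record fields are hypothetical and are not taken from the project's software.

```python
import hashlib


def fasta_digest(fasta_text: str) -> str:
    """Return a SHA-256 hex digest of the sequences in a FASTA string.

    Header text is ignored (only a record boundary marker is hashed) and
    sequences are upper-cased, so the digest depends on sequence content
    and record order, not on transcript names or line wrapping.
    """
    h = hashlib.sha256()
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            h.update(b">")  # mark a record boundary, drop the name
        else:
            h.update(line.upper().encode("ascii"))
    return h.hexdigest()


def provenance_record(fasta_text: str, source: str, release: str) -> dict:
    """Build a small metadata record to store alongside quantification output.

    `source` and `release` are free-text labels supplied by the caller
    (hypothetical fields for this sketch).
    """
    return {
        "source": source,
        "release": release,
        "sha256": fasta_digest(fasta_text),
    }


def same_reference(record: dict, fasta_text: str) -> bool:
    """Check whether a FASTA string matches the reference named in a record."""
    return record["sha256"] == fasta_digest(fasta_text)
```

With a record like this attached to each result, a later analysis step could check that the annotation it loads matches the transcriptome that was actually quantified, rather than relying on file names.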
Project outcomes
- Journal articles: 0
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Other grants by Michael Isaiah Love
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10472058
- Fiscal year: 2021
- Funding amount: $295,000
- Project category:
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10296745
- Fiscal year: 2021
- Funding amount: $295,000
- Project category:
Systematic in vivo characterization of disease-associated regulatory variants
- Grant number: 10631225
- Fiscal year: 2021
- Funding amount: $295,000
- Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
- Grant number: 10170579
- Fiscal year: 2020
- Funding amount: $295,000
- Project category:
A Modular Framework for Accurate, Efficient, and Reproducible Analysis of RNA-Seq Data
- Grant number: 10440402
- Fiscal year: 2020
- Funding amount: $295,000
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10318952
- Fiscal year: 2018
- Funding amount: $295,000
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10550143
- Fiscal year: 2018
- Funding amount: $295,000
- Project category:
pathQTL: Integrative Multi-Omics Causal Inference of Molecular Mechanisms Leading to Neuropsychiatric Illness
- Grant number: 10066367
- Fiscal year: 2018
- Funding amount: $295,000
- Project category:
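Similar grants from the National Natural Science Foundation of China
Long-term, specific modification of functional plasticity in the adult animal visual system using a novel paired visual-electrical stimulation paradigm
- Grant number: 32371047
- Year approved: 2023
- Funding amount: CNY 500,000
- Project category: General Program
Bridging the digital divide among older adults: decision processes, objective barriers, and coping strategies in older adults' adoption of digital technology
- Grant number: 72303205
- Year approved: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Improving ablation rate measurement by suppressing fluid motion and using a dual-energy-spectrum method
- Grant number: 12305261
- Year approved: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Transformer-based tunnel lining crack detection using multiple sparse self-attention mechanisms
- Grant number: 62301339
- Year approved: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Policy incentives, information transmission, and mechanisms for increasing farm households' adoption of rooftop photovoltaic technology
- Grant number: 72304103
- Year approved: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund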
Similar overseas grants
Implementation of Innovative Treatment for Moral Injury Syndrome: A Hybrid Type 2 Study
- Grant number: 10752930
- Fiscal year: 2024
- Funding amount: $295,000
- Project category:
Optimization of electromechanical monitoring of engineered heart tissues
- Grant number: 10673513
- Fiscal year: 2023
- Funding amount: $295,000
- Project category:
The University of Miami AIDS Research Center on Mental Health and HIV/AIDS - Center for HIV & Research in Mental Health (CHARM) Research Core - EIS
- Grant number: 10686546
- Fiscal year: 2023
- Funding amount: $295,000
- Project category:
The RaDIANT Health Systems Intervention for Equity in Kidney Transplantation
- Grant number: 10681998
- Fiscal year: 2023
- Funding amount: $295,000
- Project category:
Extensible Open Source Zero-Footprint Web Viewer for Cancer Imaging Research
- Grant number: 10644112
- Fiscal year: 2023
- Funding amount: $295,000
- Project category: