An interactive tool for in-depth and reproducible analysis of RNA-seq data
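Basic information
- Grant number: 10252004
- Principal investigator:
- Amount: $330,600
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-09-02 to 2024-06-30
- Project status: Completed
- Source: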
- Keywords: Adopted; Arabidopsis; Architecture; Area; Bioconductor; Bioinformatics; Biological; Biology; Classification; Code; Collaborations; Communities; Community Integration; Computer software; Country; DNA Microarray Chip; Data; Data Analyses; Data Set; Databases; Development; Disadvantaged; Disease; Documentation; Encapsulated; Feedback; Floods; Gene set enrichment analysis; Generations; Genes; Genome; Goals; Guanine + Cytosine Composition; Human; Institution; Length; Letters; Meta-Analysis; Methods; MicroRNAs; Modeling; Molecular; Mus; Ontology; Organism; Paper; Pathway Analysis; Pathway interactions; Process; RNA analysis; Reporting; Reproducibility; Research; Research Personnel; Resources; Retrieval; Sampling; Side; Statistical Data Interpretation; Supervision; Surveys; Techniques; Testing; Time; Tissue-Specific Gene Expression; Tissues; Translating; Update; Visit; Visualization; Work; Writing; application programming interface; base; data visualization; design; differential expression; experience; genomic data; graphical user interface; high throughput technology; improved; innovation; insight; interactive tool; knowledge base; large datasets; programs; protein protein interaction; prototype; single-cell RNA sequencing; tool; transcription factor; transcriptome sequencing; transcriptomics; user-friendly; web app; web site
Project summary
Bioinformatic analysis of large genomic datasets is a critical barrier for many biologists, especially
those at smaller research institutions. Leveraging our team's bioinformatics experience, our goal is to
develop an interactive web application that can be used to easily translate RNA sequencing data
into biological insights. We hypothesize that an integrated tool for reproducible, in-depth analysis of
expression data will democratize access to high-throughput technologies and help biologists pinpoint
molecular pathways in large datasets. To that end, we aim to build a carefully designed, user-friendly
pipeline with rich data visualization capacity. As a proof of concept, the team developed a prototype called iDEP
(integrated Differential Expression and Pathway analysis) for the analysis of summarized expression
matrices. Its unique features include (1) comprehensive analytic functionality based on 63 R and
Bioconductor packages, covering exploratory data analysis, clustering, differential gene expression, and
pathway analysis; (2) a massive knowledgebase for automatic gene ID conversion, annotation, and
pathway analysis for over 2,000 archaeal, bacterial, and eukaryotic species; (3) reproducibility of some
core steps through generated R and R Markdown notebooks; (4) application programming interfaces (APIs)
for retrieval of protein-protein interaction networks and KEGG pathway diagrams; and (5) easy access
to about 13,000 processed public RNA-seq datasets from 9 species. Compared with existing tools, the key
innovation is the emphasis on deep integration (tools, annotation, pathways, and public datasets),
user-friendliness, and reproducibility. Even with limited features, iDEP is beginning to be adopted by
researchers from diverse fields.
In this proposal, the team plans to complete the development of iDEP. The goal of Specific Aim 1 is
to (a) re-write iDEP in a modular, object-oriented fashion, (b) make an R package for generating fully
reproducible R Markdown notebooks, and (c) add essential functionalities such as bias correction (batch
effect, GC content, gene length, expression level), time-course analysis, supervised classification, and
additional methods for existing functional modules. We will also enable gene ontology enrichment
analysis for unannotated species using Blast2GO. Specific Aim 2 focuses on (a) substantially
expanding the pathway database for frequently studied species and (b) collecting more uniformly
processed RNA-seq and DNA microarray datasets to facilitate the re-analysis and meta-analysis of
public expression data. In Specific Aim 3, the team will carry out hardware upgrades, rigorous testing,
code review, documentation, and community integration. The development of iDEP can help make
standard RNA-seq analysis accessible to a very broad community of researchers.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Xijin Ge
An interactive tool for in-depth and reproducible analysis of RNA-seq data
- Grant number: 10432078
- Fiscal year: 2020
- Funding amount: $330,600
- Project category:
An interactive tool for in-depth and reproducible analysis of RNA-seq data
- Grant number: 10657551
- Fiscal year: 2020
- Funding amount: $330,600
- Project category:
An interactive tool for in-depth and reproducible analysis of RNA-seq data
- Grant number: 9978200
- Fiscal year: 2020
- Funding amount: $330,600
- Project category:
Large-scale expression analysis of natural antisense transcripts
- Grant number: 8054875
- Fiscal year: 2009
- Funding amount: $330,600
- Project category:
Large-scale expression analysis of natural antisense transcripts
- Grant number: 8248786
- Fiscal year: 2009
- Funding amount: $330,600
- Project category:
Large-scale expression analysis of natural antisense transcripts
- Grant number: 7791283
- Fiscal year: 2009
- Funding amount: $330,600
- Project category:
Similar overseas grants
Single cell level elucidation of local cell death-triggered regeneration mechanism in Arabidopsis
- Grant number: 24K17869
- Fiscal year: 2024
- Funding amount: $330,600
- Project category: Grant-in-Aid for Early-Career Scientists
Deciphering the molecular mechanism of GESENI (GEne Silencing based on ENcoded protein's Intracellular localization) in Arabidopsis sperm cells
- Grant number: 24K18143
- Fiscal year: 2024
- Funding amount: $330,600
- Project category: Grant-in-Aid for Early-Career Scientists
Identification of cell fate specification mechanisms during early embryogenesis in Arabidopsis
- Grant number: 22KF0023
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Grant-in-Aid for JSPS Fellows
The role of ELMOD family proteins and their genetic network in the development of specialized membrane domains on the Arabidopsis pollen surface
- Grant number: 2240972
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Standard Grant
Effects of perturbing polyamine metabolism on development and stress responses in Arabidopsis thaliana
- Grant number: 2887668
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Studentship
Elucidation of plant cell magnesium concentration control mechanism by the Arabidopsis thaliana transport protein AtMRS2-1
- Grant number: 23KJ0503
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Grant-in-Aid for JSPS Fellows
Identification and analysis of genetic variants that enhance the expression of gravitropism in Arabidopsis roots
- Grant number: 23K05483
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Grant-in-Aid for Scientific Research (C)
Genome assessment of temperature adaptability in Arabidopsis halleri ecotypes that adapted to different altitudes
- Grant number: 23H02549
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Grant-in-Aid for Scientific Research (B)
Rotation 1: Circadian clocks in wheat and Arabidopsis
- Grant number: 2886558
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Studentship
Development of yeast protein expression library expressing all Arabidopsis membrane transporters
- Grant number: 23K05696
- Fiscal year: 2023
- Funding amount: $330,600
- Project category: Grant-in-Aid for Scientific Research (C)


















