Continued Development and Maintenance of the MG-RAST Metagenomics Pipeline
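Basic Information
- Grant number: 9906157
- Principal investigator:
- Amount: $704,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2016
- Funding country: United States
- Period: 2016-03-01 to 2022-02-28
- Project status: Completed
- Source: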
- Keywords: Algorithms; Architecture; Biological; Case Study; Clinical; Cloud Computing; Communities; Complex; Computer software; Custom; Data; Data Set; Databases; Detection; Development; Documentation; Engineering; Environment; Enzymes; Failure; Fruit; Generations; Genetic Materials; Genome; Genomics; Goals; Grouping; Health; Housing; Human; Infrastructure; Internet; Knowledge; Maintenance; Measures; Metadata; Metagenomics; Methods; Minerals; Mining; Modeling; Modification; Monitor; Nucleotides; Organism; Patients; Pattern; Performance; Pilot Projects; Play; Population; Privatization; Proteins; Reporting; Reproducibility; Resolution; Resources; Running; Sampling; Schedule; Service delivery model; Services; Speed; System; Taxonomy; Technology; Testing; Time; Update; Variant; Work; base; cloud based; complex data; computer infrastructure; computerized data processing; computing resources; cost; data modeling; design; experience; file format; flexibility; gut microbiota; host-microbe interactions; improved; improved functioning; interest; large datasets; metagenome; microbial; microbial community; microbiome; next generation sequencing; novel; open source; operation; personalized medicine; public health relevance; sequencing platform; tool; usability
Project Abstract
DESCRIPTION (provided by applicant): Metagenomics, the study of microbial populations sampled directly from the environment, affords avenues for discovering novel enzymes via microbial profiling; using microbial shifts as predictors for health; or gauging the sustainability of human operations like mineral mining. However, the volume of metagenomic data is large (e.g., the metagenome of a human's gut microbiota is about 1 gigabase pair in size), and the processing needed to extract meaning from these large datasets is significant: for example, identifying which organisms' genomes are in the sample (taxonomic annotation) and what they are doing (functional annotation) via comparisons with continually updated knowledge databases. These numbers are only growing as experimentalists demand more and more metagenomic analysis runs. Born out of this need, our MG-RAST (Metagenomics-Rapid Annotation) portal, an open-source, high-throughput metagenomics service, has been a major community resource since 2008, housing over 160K datasets and 40K users. However, since its original design, MG-RAST has witnessed the frenetic development of next-generation sequencing technologies, a drastically altered computing landscape (in both hardware and software), changed requirements in terms of the number of users and the volume and diversity of datasets, increasing complexity of pipeline components, and requirements for higher throughput. To adapt to this, MG-RAST has been continually modified. Modifications included upgrading the pipeline components with several algorithmic improvements; deploying a customized data and workflow management system (the SHOCK object store and the AWE workflow manager); and porting MG-RAST to a cloud-based distributed architecture. Notwithstanding our continual, albeit ad hoc, system improvements, our pilot studies have indicated the need for a comprehensive redesign of MG-RAST to keep pace with the needs of the rapidly advancing field of metagenomics.
Our proposed enhancements are based on expressed user requirements, new usage patterns, and flexibility to incorporate new tools, especially for the compute-intensive similarity analysis for queried sequences. Through this project, we propose to accomplish MG-RAST's transformation via (i) improving its functionality and data reproducibility; (ii) improving its software quality and performance through automated monitoring and generation of test suites; and (iii) moving toward a federated infrastructure for metagenomics data. Overall, the successful accomplishment of our aims will support alternate metagenomics service models through federation of services and data and result in a robust, state-of-the-art metagenomics resource. Federation in biomedical pipelines is in general a powerful direction to leverage the expertise of diverse user bases and, reciprocally, benefit its users. Thus, MG-RAST, as a state-of-the-art pipeline, will be capable of supporting an ever-increasing user base, handling larger and more varied datasets, and evolving in concert with new genomics technologies. The ultimate goal is to accelerate advances in end-user applications, e.g., personalized medicine tailored to the patient's microbiome.
Project Outcomes
Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The metagenomic data life-cycle: standards and best practices.
- DOI: 10.1093/gigascience/gix047
- Published: 2017-08-01
- Journal:
- Impact factor: 9.2
- Authors: Ten Hoopen P; Finn RD; Bongo LA; Corre E; Fosso B; Meyer F; Mitchell A; Pelletier E; Pesole G; Santamaria M; Willassen NP; Cochrane G
- Corresponding author: Cochrane G
A Distributed Classifier for MicroRNA Target Prediction with Validation Through TCGA Expression Data.
- DOI: 10.1109/tcbb.2018.2828305
- Published: 2018-07
- Journal:
- Impact factor: 0
- Authors: Ghoshal A; Zhang J; Roth MA; Xia KM; Grama AY; Chaterji S
- Corresponding author: Chaterji S
Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
- DOI: 10.1186/s12918-016-0302-3
- Published: 2016-08-01
- Journal:
- Impact factor: 0
- Authors: Kim SG; Theera-Ampornpunt N; Fang CH; Harwani M; Grama A; Chaterji S
- Corresponding author: Chaterji S
Other grants by Ananth Grama
Continued Development and Maintenance of the MG-RAST Metagenomics Pipeline
- Grant number: 9233909
- Fiscal year: 2016
- Funding amount: $704,900
- Project category:
Similar Overseas Grants
System Architecture of Impact-Resistant Robot with Detection and Prevention of Joint Dislocation Inspired from Biological Intra-Articular Proprioception
- Grant number: 22K17973
- Fiscal year: 2022
- Funding amount: $704,900
- Project category: Grant-in-Aid for Early-Career Scientists
Perturbation of the extracellular architecture to promote the absorption and lymphatic transport of biological macromolecules
- Grant number: LP140100377
- Fiscal year: 2015
- Funding amount: $704,900
- Project category: Linkage Projects
TRR 141: Biological Design and Integrative Structures. Analysis, Simulation and Implementation in Architecture
- Grant number: 231064407
- Fiscal year: 2014
- Funding amount: $704,900
- Project category: CRC/Transregios
Evolutionary processes driving biological variation and diversity as models for exploratory digital design tools in architecture (B02)
- Grant number: 260974942
- Fiscal year: 2014
- Funding amount: $704,900
- Project category: CRC/Transregios
Collaborative Research: ABI: Innovation: The Global Names Architecture, an infrastructure for unifying taxonomic databases and services for managers of biological information.
- Grant number: 1342595
- Fiscal year: 2013
- Funding amount: $704,900
- Project category: Continuing Grant
Collaborative Research: ABI: Innovation: The "Global Names Architecture," an infrastructure for unifying taxonomic databases and services for managers of biological information.
- Grant number: 1062324
- Fiscal year: 2011
- Funding amount: $704,900
- Project category: Continuing Grant
Collaborative Research: ABI: Innovation: The Global Names Architecture, an infrastructure for unifying taxonomic databases and services for managers of biological information.
- Grant number: 1062387
- Fiscal year: 2011
- Funding amount: $704,900
- Project category: Continuing Grant
ABI:Innovation: Collaborative Research: The "Global Names Architecture," an infrastructure for unifying taxonomic databases and services for managers of biological information.
- Grant number: 1062378
- Fiscal year: 2011
- Funding amount: $704,900
- Project category: Continuing Grant
Collaborative Research: ABI: Innovation: The Global Names Architecture, an infrastructure for unifying taxonomic databases and services for managers of biological information
- Grant number: 1062441
- Fiscal year: 2011
- Funding amount: $704,900
- Project category: Continuing Grant
Biophysics of cryopreservation: elucidating the structural architecture and physical mechanisms of both model and complex biological systems
- Grant number: EP/H020616/1
- Fiscal year: 2010
- Funding amount: $704,900
- Project category: Research Grant