Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis
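Basic Information
- Grant number: 10011769
- Principal investigator:
- Amount: $941,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-09-20 to 2022-08-31
- Project status: Completed
- Source: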
- Keywords: Address; Algorithms; Attention; Bioinformatics; Biological; Biomedical Research; Cancer Center; Cancer Gene Mutation; Categories; Classification; Clinic; Clinical; Clinical Data; Clinical Trials; Code; Collaborations; Communities; Complex; Consensus; Correlative Study; Credentialing; Custom; Data; Data Aggregation; Data Analyses; Data Analytics; Data Set; Databases; Deposition; Development; Documentation; Ensure; Evolution; Freezing; Future; General Population; Genome Data Analysis Center; Genome Data Analysis Network; Genomic Data Commons; Goals; Infrastructure; Institutes; Knowledge; Link; Malignant Neoplasms; Manuscripts; Mission; Modeling; Molecular; Morphologic artifacts; National Cancer Institute; Paper; Pathway Analysis; Patient-Focused Outcomes; Patients; Phase; Process; Production; Publications; Publishing; Reporting; Reproducibility; Research Personnel; Running; Sampling; Science; Scientist; Service delivery model; Structure; Suggestion; Summary Reports; System; The Cancer Genome Atlas; Time; TimeLine; Tumor Subtype; Update; Vertebral column; Work; Writing; base; cancer genome; cancer genomics; clinical diagnostics; clinically relevant; cost; data harmonization; disorder subtype; experience; experimental study; genome analysis; high standard; improved; innovation; insight; member; molecular subtypes; operation; prevent; repository; tool; treatment response; user-friendly; whole genome; working group
Project Summary
Abstract
The Cancer Genome Atlas (TCGA) set the standards for large-scale cancer genome
projects worldwide. In the next phase, the National Cancer Institute and its Center for
Cancer Genomics are planning large-scale projects closely tied to clinical questions and
trials. To analyze these data, the NCI is creating a Genome Data
Analysis Network (GDAN) of different types of Genome Data Analysis Centers (GDACs).
Central to this Network is a single Processing GDAC, which will take all the harmonized
data, as stored in the NCI's Genomic Data Commons, and perform higher-level integrated
analyses on these data to support both the Analysis Working Groups (AWGs) within the
Network (which will be formed for each project to perform special analyses of the data and
write manuscripts) as well as the entire biomedical research community.
Herein we propose to build the centralized Processing GDAC on top of our FireCloud
platform, an infrastructure to run large-scale computation on the cloud in a fully rigorous
and reproducible fashion. FireCloud development was based on our experience with
Firehose, the Broad internal platform on which the standard TCGA data and analyses
currently run. We propose to create and operate the GDAN Standard Workflow,
incorporating tools actively developed and used within the GDAN and across the entire
field, with particular emphasis on clinical tools. This Workflow will serve as the starting
point for AWGs and set the highest standards of transparency, reproducibility and rigor for
cancer genome analysis. The results of the Standard Workflow will be stored in a public
database, made accessible via standard APIs, and used together with a continuously
updated database of prior knowledge to create scientific reports that will be made available
to the community before publication. Finally, a major innovation is that AWG
members will be able to log in to FireCloud and rerun the entire workflow, or parts of it,
with their own parameters and subsets of the data, thus making the entire GDAN analysis
fully reproducible and scalable.
Our goals are therefore to: (1) create a global infrastructure for collaborative extreme-scale
cancer analysis; (2) operate the Standard Workflows at scale; (3) rapidly and
continuously evolve the Standard Workflows; and (4) create improved capabilities for
reporting, exploring the results, clinical diagnostics and reproducibility.
Project Outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Diverse mutational landscapes in human lymphocytes.
- DOI: 10.1038/s41586-022-05072-7
- Published: 2022-08
- Journal:
- Impact factor: 64.8
- Authors: Machado, Heather E.; Mitchell, Emily; Obro, Nina F.; Kubler, Kirsten; Davies, Megan; Leongamornlert, Daniel; Cull, Alyssa; Maura, Francesco; Sanders, Mathijs A.; Cagan, Alex T. J.; McDonald, Craig; Belmonte, Miriam; Shepherd, Mairi S.; Braga, Felipe A. Vieira; Osborne, Robert J.; Mahbubani, Krishnaa; Martincorena, Inigo; Laurenti, Elisa; Green, Anthony R.; Getz, Gad; Polak, Paz; Saeb-Parsy, Kourosh; Hodson, Daniel J.; Kent, David G.; Campbell, Peter J.
- Corresponding author: Campbell, Peter J.
Comment on "DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification".
- DOI: 10.1126/science.aas9824
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Stewart, Chip; Leshchiner, Ignaty; Hess, Julian; Getz, Gad
- Corresponding author: Getz, Gad
Other grants awarded to GAD A GETZ
Center for comprehensive proteogenomic data analysis
- Grant number: 10440579
- Fiscal year: 2022
- Funding amount: $941,900
- Project category:
Center for comprehensive proteogenomic data analysis
- Grant number: 10644013
- Fiscal year: 2022
- Funding amount: $941,900
- Project category:
Comprehensive analysis of point mutations in cancer
- Grant number: 10301857
- Fiscal year: 2021
- Funding amount: $941,900
- Project category:
Comprehensive analysis of point mutations in cancer
- Grant number: 10491092
- Fiscal year: 2021
- Funding amount: $941,900
- Project category:
Comprehensive analysis of point mutations in cancer
- Grant number: 10676830
- Fiscal year: 2021
- Funding amount: $941,900
- Project category:
Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis
- Grant number: 9571405
- Fiscal year: 2016
- Funding amount: $941,900
- Project category:
Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis
- Grant number: 9355157
- Fiscal year: 2016
- Funding amount: $941,900
- Project category:
Discovery of clinically distinct CLL subgroups by integrative mapping of large-scale CLL genetic, expression and clinical data
- Grant number: 10005157
- Fiscal year: 2016
- Funding amount: $941,900
- Project category:
Global Infrastructure for Collaborative High-throughput Cancer Genomics Analysis
- Grant number: 9211085
- Fiscal year: 2016
- Funding amount: $941,900
- Project category:
Similar overseas grants
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
- Grant number: 2337776
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
- Grant number: 2338846
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
- Grant number: 2348261
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
- Grant number: 2348346
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
- Grant number: 2348457
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
- Grant number: 2404989
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
- Grant number: 2339669
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
- Grant number: EP/Y029089/1
- Fiscal year: 2024
- Funding amount: $941,900
- Project category: Research Grant