The Integration of Trans-omics for Precision Medicine (TOPMED) and Other Heart, Lung, Blood and Sleep (HLBS) Data Sets with the Data Commons
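Basic Information
- Grant number: 10267909
- Principal investigator:
- Amount: $8.75 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-30 to 2022-03-30
- Project status: Completed
- Source: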
- Keywords: Address; Architecture; Atlases; Automobile Driving; Award; Biological Sciences; Blood; California; Cells; Chicago; Clinical Data; Clinical Management; Complex; Computer software; Costs and Benefits; Creativeness; Data; Data Analyses; Data Commons; Data Science; Data Set; Ecosystem; Encapsulated; Ensure; Environment; Funding; Genome; Genotype-Tissue Expression Project; Goals; Heart; Human; Hybrids; Image; Individual; Institutes; Investigation; Lung; Medical Records; Metals; Modeling; Molecular Profiling; Participant; Philosophy; Production; Readability; Running; Scientist; Sleep; Standardization; Trans-Omics for Precision Medicine; United States National Institutes of Health; Universities; Vision; brain health; cloud based; cohesion; cohort; data format; design; diverse data; genome sequencing; high resolution imaging; human disease; insight; interoperability; model organisms databases; multiple data types; open source; programs; software development; working group
Project Abstract
The life sciences are in the midst of a data revolution. Cheap and accurate genome sequencing is a reality, high-resolution imaging is becoming routine, and clinical data is increasingly stored in machine-readable formats. These breakthroughs have brought us to the threshold of a new era in biomedicine, one where the data sciences hold the potential to propel our understanding and treatment of human disease. Achieving this potential, however, will require creating software platforms that can support storing, sharing, and analyzing data at unlimited scale. In this application, we propose to address this unmet need by bringing together three groups — the University of Chicago, the Broad Institute, and the University of California at Santa Cruz — each with a strong track record of developing production-grade software platforms to support flagship scientific efforts, including the All of Us Cohort Program, the Genome Data Commons (GDC) and its affiliated NCI Cloud Pilots program, and the Human Cell Atlas Data Coordination Platform (HCA DCP). Our goal is to align and integrate our individual efforts at building data platforms, in order to build a cohesive environment that can serve the needs of the NIH Data Commons and beyond. Because these platforms were each developed to fulfill differing use cases, there is currently far more complementarity than overlap between them. For example, Dr. Grossman has extensive expertise in running a hybrid cloud at scale to support the needs of the GDC; this provides cost benefits around data transport and egress that would be invaluable to the NIH Data Commons. Similarly, Dr. Philippakis has developed a cloud-based model of collaborative workspaces (FireCloud) and software for management of secondary data use restrictions (DUOS), and Dr. Paten has long been a leader in developing and implementing standardized APIs as part of the GA4GH. It is this complementarity that motivates us to integrate our efforts. 
In the sections below, we present our plans for creating the Commons Alliance Platform. In addition to having a unified technical vision for what is needed, we are also aligned around a core set of guiding principles:
(1) Open-source: All the software we develop, from user interfaces down to cloud metal, is open-source. This includes not only the software that would be funded via this awarding mechanism, but all software developed and deployed by our team.
(2) Modular and interoperable: A design principle of all complex software undertakings is “separation of concerns,” i.e., the notion that there should be a clean division between architectural components, each encapsulated by well-defined interfaces. We are committed to building modular and interoperable software and, in doing so, encouraging the creation of an ecosystem around them.
(3) Standards-driven: Our team is committed to creating and utilizing standardized APIs and data formats. We have been leaders in GA4GH since its founding, chairing various working groups and driver projects.
(4) Healthy competition: Our consortium’s philosophy is to collaborate on APIs to support interoperability, but compete on implementation to encourage creativity and diversity.
(5) Diversity of data types: We have expertise in multiple data types beyond molecular profiling. In particular, a key goal of All of Us is to collect extensive clinical data in the form of participant-provided data and medical records. Similarly, through the Brain Health Commons, Dr. Grossman will be managing clinical and imaging data. These capabilities will be invaluable as the Commons expands to include additional data types.
(6) Driven by scientific use cases: Our consortium includes many leading scientists, including PIs on awards for model organism databases, GTEx, and TOPMed. We will leverage their insights via driving use cases to ensure that our software enables flagship scientific investigations.
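Project Outcomes
Number of journal articles (6)
Number of monographs (0)
Number of research awards (0)
Number of conference papers (0)
Number of patents (0)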
Progress Toward Cancer Data Ecosystems.
- DOI: 10.1097/ppo.0000000000000318
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Grossman RL
- Corresponding author: Grossman RL
Personalized Pangenome References.
- DOI: 10.1101/2023.12.13.571553
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Sirén, Jouni; Eskandar, Parsa; Ungaro, Matteo Tommaso; Hickey, Glenn; Eizenga, Jordan M; Novak, Adam M; Chang, Xian; Chang, Pi-Chuan; Kolmogorov, Mikhail; Carroll, Andrew; Monlong, Jean; Paten, Benedict
- Corresponding author: Paten, Benedict
The Dockstore: enhancing a community platform for sharing reproducible and accessible computational protocols.
- DOI: 10.1093/nar/gkab346
- Published: 2021-07-02
- Journal:
- Impact factor: 14.9
- Authors: Yuen D; Cabansay L; Duncan A; Luu G; Hogue G; Overbeck C; Perez N; Shands W; Steinberg D; Reid C; Olunwa N; Hansen R; Sheets E; O'Farrell A; Cullion K; O'Connor BD; Paten B; Stein L
- Corresponding author: Stein L
Other grants by ROBERT L. GROSSMAN
Helping to End Addiction Long-term (HEAL) Data Platform
- Grant number: 10906696
- Fiscal year: 2020
- Funding amount: $8.75 million
- Project category:
Helping to End Addiction Long-term (HEAL) Data Platform
- Grant number: 10167308
- Fiscal year: 2020
- Funding amount: $8.75 million
- Project category:
Helping to End Addiction Long-term (HEAL) Data Platform
- Grant number: 10701395
- Fiscal year: 2020
- Funding amount: $8.75 million
- Project category:
The Integration of Trans-omics for Precision Medicine (TOPMED) and Other Heart, Lung, Blood and Sleep (HLBS) Data Sets with the Data Commons
- Grant number: 9569862
- Fiscal year: 2017
- Funding amount: $8.75 million
- Project category:
The Commons Alliance: A Partnership to Catalyze the Creation of an NIH Data Commons
- Grant number: 9559879
- Fiscal year: 2017
- Funding amount: $8.75 million
- Project category:
The Integration of Trans-omics for Precision Medicine (TOPMED) and Other Heart, Lung, Blood and Sleep (HLBS) Data Sets with the Data Commons
- Grant number: 10001102
- Fiscal year: 2017
- Funding amount: $8.75 million
- Project category:
Training of Junior Faculty for Careers in Omics of Lung Diseases
- Grant number: 8575164
- Fiscal year: 2013
- Funding amount: $8.75 million
- Project category:
Training of Junior Faculty for Careers in Omics of Lung Diseases
- Grant number: 9283609
- Fiscal year: 2013
- Funding amount: $8.75 million
- Project category:
Training of Junior Faculty for Careers in Omics of Lung Diseases
- Grant number: 9069942
- Fiscal year: 2013
- Funding amount: $8.75 million
- Project category:
Training of Junior Faculty for Careers in Omics of Lung Diseases
- Grant number: 8722616
- Fiscal year: 2013
- Funding amount: $8.75 million
- Project category:
Similar Overseas Grants
CAREER: Efficient Algorithms for Modern Computer Architecture
- Grant number: 2339310
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Continuing Grant
Hardware-aware Network Architecture Search under ML Training workloads
- Grant number: 2904511
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Studentship
CAREER: Creating Tough, Sustainable Materials Using Fracture Size-Effects and Architecture
- Grant number: 2339197
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant
Travel: Student Travel Support for the 51st International Symposium on Computer Architecture (ISCA)
- Grant number: 2409279
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant
Understanding Architecture Hierarchy of Polymer Networks to Control Mechanical Responses
- Grant number: 2419386
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant
I-Corps: Highly Scalable Differential Power Processing Architecture
- Grant number: 2348571
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant
Collaborative Research: Merging Human Creativity with Computational Intelligence for the Design of Next Generation Responsive Architecture
- Grant number: 2329759
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant
The architecture and evolution of host control in a microbial symbiosis
- Grant number: BB/X014657/1
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Research Grant
RACCTURK: Rock-cut Architecture and Christian Communities in Turkey, from Antiquity to 1923
- Grant number: EP/Y028120/1
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Fellowship
NSF Convergence Accelerator Track M: Bio-Inspired Surface Design for High Performance Mechanical Tracking Solar Collection Skins in Architecture
- Grant number: 2344424
- Fiscal year: 2024
- Funding amount: $8.75 million
- Project category: Standard Grant