The CFDE Workbench
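Basic information
- Grant number: 10851224
- Principal investigator:
- Amount: $1.5 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-18 to 2028-09-17
- Project status: Ongoing
- Source: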
- Keywords: Address; Adopted; Architecture; Archives; Benchmarking; Biological Assay; Biomedical Research; Cataloging; Catalogs; Complex; Computer software; Data; Data Coordinating Center; Data Set; Databases; Dedications; Ecosystem; Education; Education and Outreach; Electronic Mail; Elements; Event; Feedback; Feeds; Funding; Funding Opportunities; Genes; Genus Mentha; Goals; Graph; Group Meetings; Home; Ingestion; Knowledge; Knowledge Portal; Learning; Libraries; Link; LinkedIn; Machine Learning; Metadata; Methods; Pharmaceutical Preparations; Phase; Play; Process; Productivity; Protocols documentation; Publications; Publishing; Research Personnel; Resources; Site; Social Network; System; Time; Twitter; United States; United States National Institutes of Health; Visualization; Work; bioinformatics tool; cell type; chatbot; cost; data analysis pipeline; data ecosystem; data ingestion; data integration; data modeling; data portal; data resource; data tools; design; digital; experience; indexing; knowledge graph; meetings; news; preservation; programs; repository; search engine; social media; tool; transcriptome sequencing; virtual; web app; web portal; working group
Project abstract
The NIH Common Fund (CF) programs have produced transformative datasets, databases,
methods, bioinformatics tools, and workflows that are significantly advancing biomedical research
in the United States and worldwide. Currently, CF programs are mostly isolated from one another, yet
integrating data across CF programs has the potential for synergistic discoveries. In addition,
because CF programs have a time limit of 10 years, sustaining the widely used CF digital
resources after the programs expire is critical. To address these challenges, the NIH established
the Common Fund Data Ecosystem (CFDE) program, which was recently approved to
continue into its second phase. For the second phase of the CFDE, this project will establish
the Data Resource Center (DRC) and the Knowledge Center (KC). Our efforts will culminate in
the CFDE Workbench, which will be composed of three main products: the CFDE
information portal, the CFDE data resource portal, and the CFDE knowledge portal. These three
web portals will be full-stack web applications, each with a backend database, and will be
integrated into one public site.
The CFDE information portal will be the entry point to the other two portals. It will contain
information about the CFDE on a dedicated About page; information about each participating and
non-participating CF program; information about each data coordination center (DCC); links to a
catalog of CF datasets and to a catalog of CF tools and workflows; and news, events, funding
opportunities, standards and protocols, educational programs and opportunities, social media
feeds, and publications.
The CFDE data resource portal will contain metadata, data, workflows, and tools that are the
products of the CF programs and their data coordination centers (DCCs). We will adopt the C2M2
data model for storing the metadata that describe DCC datasets. We will also archive
relatively small omics datasets that do not have a home in widely established repositories and do
not require PHI protection. In addition, we will expand the cataloging to CF tools, APIs, and
workflows. Importantly, we will develop a search engine that will index and present results from
all of these assembled digital assets. Continuing the work established in the CFDE pilot
phase, users of the data portal will also be able to fetch identified datasets through links provided by
the DCCs via the DRS protocol. These will include links to both raw and processed data.
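As a rough illustration of fetching datasets through DRS links, the sketch below converts a hostname-based DRS URI into the HTTPS endpoint where the object's metadata could be requested. It assumes that DRS here refers to the GA4GH Data Repository Service; the abstract does not spell this out. The host `drs.example.org` and the object ID are hypothetical placeholders, not identifiers from any DCC, and compact-identifier DRS URIs are deliberately not handled.

```python
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to its HTTPS object endpoint.

    Assumes the GA4GH DRS v1 convention:
    drs://<hostname>/<id>  ->  https://<hostname>/ga4gh/drs/v1/objects/<id>
    """
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError(f"DRS URI has no object ID: {drs_uri!r}")
    # Percent-encode the ID so that reserved characters survive in the path.
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


# Hypothetical example URI; no network request is made here.
example_url = drs_uri_to_url("drs://drs.example.org/abc123")
```

A client would then issue a GET request to the resulting URL and read the access methods listed in the JSON response to locate the raw or processed files.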
The CFDE knowledge portal will provide access to processed data from CF programs in various
formats, including: 1) knowledge graph assertions; 2) gene, drug, metabolite, and other set
libraries; 3) data matrices ready for machine learning and other AI applications; 4) signatures; and
5) bipartite graphs. In addition, the extract, transform, and load (ETL) scripts that process the data
into these formats will be provided. Because such processed data is relatively small, we will archive
it, mint unique IDs for it, and serve it via APIs. We will also
develop workflows that demonstrate how the processed data can be harmonized. At the same
time, we will document APIs from all CF DCCs and provide example Jupyter Notebooks that
demonstrate how these datasets can be accessed, processed, and combined for integrative
omics analysis. For the knowledge portal, we will also develop a library of tools that utilize these
processed datasets. These tools will share some uniform requirements, enabling a plug-and-play
architecture.
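To make the idea of combining set libraries concrete, here is a minimal sketch that parses two small gene sets and measures their overlap. It assumes the libraries are distributed in the tab-separated GMT layout (term, description, then members), which the abstract does not specify; the set names and the `parse_gmt` and `jaccard` helpers are hypothetical and chosen only for illustration.

```python
def parse_gmt(text: str) -> dict[str, set[str]]:
    """Parse GMT-style text into a mapping of term -> set of members.

    Each line is: term <TAB> description <TAB> member1 <TAB> member2 ...
    """
    library: dict[str, set[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        term, _description, *members = fields
        library[term] = {m for m in members if m}
    return library


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical two-set library; the description column is left empty.
EXAMPLE_GMT = "setA\t\tTP53\tBRCA1\tEGFR\nsetB\t\tTP53\tEGFR\tMYC\n"

library = parse_gmt(EXAMPLE_GMT)
overlap = jaccard(library["setA"], library["setB"])  # 2 shared / 4 total = 0.5
```

The same pattern would apply to drug or metabolite sets, since only the member identifiers change.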
To achieve these goals, we will work collaboratively with the other newly established CFDE
centers, the participating CFDE DCCs, the CFDE NIH team, and relevant external entities and
potential consumers of these three software products. These interactions will take place via
face-to-face meetings, virtual working group meetings, one-on-one meetings, Slack, GitHub,
project management software, and e-mail. Through these interactions, we will establish
standards, workstreams, feedback, and mini projects toward the goal of
developing a lively and productive Common Fund Data Ecosystem.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Avi Ma'ayan
ARCHS4: Massive Mining of Publicly Available RNA Sequencing Data
- Grant number: 10693339
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category:
Proteogenomic translator for cancer biomarker discovery towards precision medicine
- Grant number: 10442088
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category:
ARCHS4: Massive Mining of Publicly Available RNA Sequencing Data
- Grant number: 10527721
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category:
ARCHS4: Massive Mining of Publicly Available RNA Sequencing Data
- Grant number: 10814654
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category:
Proteogenomic translator for cancer biomarker discovery towards precision medicine
- Grant number: 10655588
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10837964
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10468520
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10444350
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category:
The LINCS DCIC Engagement Plan with the CFDE
- Grant number: 10682935
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category:
Knowledge Management Center for Illuminating the Druggable Genome
- Grant number: 10560469
- Fiscal year: 2018
- Funding amount: $1.5 million
- Project category:
Similar overseas grants
How novices write code: discovering best practices and how they can be adopted
- Grant number: 2315783
- Fiscal year: 2023
- Funding amount: $1.5 million
- Project category: Standard Grant
One or Several Mothers: The Adopted Child as Critical and Clinical Subject
- Grant number: 2719534
- Fiscal year: 2022
- Funding amount: $1.5 million
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633211
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category: Studentship
A material investigation of the ceramic shards excavated from the Omuro Ninsei kiln site: Production techniques adopted by Nonomura Ninsei.
- Grant number: 20K01113
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category: Grant-in-Aid for Scientific Research (C)
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2436895
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category: Studentship
A comparative study of disabled children and their adopted maternal figures in French and English Romantic Literature
- Grant number: 2633207
- Fiscal year: 2020
- Funding amount: $1.5 million
- Project category: Studentship
The limits of development: State structural policy, comparing systems adopted in two European mountain regions (1945-1989)
- Grant number: 426559561
- Fiscal year: 2019
- Funding amount: $1.5 million
- Project category: Research Grants
Securing a Sense of Safety for Adopted Children in Middle Childhood
- Grant number: 2236701
- Fiscal year: 2019
- Funding amount: $1.5 million
- Project category: Studentship
A Study on Mutual Funds Adopted for Individual Defined Contribution Pension Plans
- Grant number: 19K01745
- Fiscal year: 2019
- Funding amount: $1.5 million
- Project category: Grant-in-Aid for Scientific Research (C)
Structural and functional analyses of a bacterial protein translocation domain that has adopted diverse pathogenic effector functions within host cells
- Grant number: 415543446
- Fiscal year: 2019
- Funding amount: $1.5 million
- Project category: Research Fellowships