Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
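Basic information
- Award number: 2148700
- Principal investigator:
- Amount: $152,800
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-10-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract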
Scientists collect terabytes of critical data every year. Recently, a strong open science movement has built momentum for the beneficial practice of sharing data across laboratories, universities, and research institutions. Yet sharing data is not enough. Data must be shared in standardized formats and accompanied by curated metadata to allow for tracking, search, and organization. Metadata are essential for scientific discovery, as they are routinely used to complete all data analyses. However, to date, most brain projects have focused on collecting or analyzing data, not on metadata management. Typical metadata records consist of heterogeneous study descriptions, developed at the study release stage, without consistency across records or standard mechanisms to track changes. This project will increase access to brain data and improve metadata handling by combining two NSF-funded projects. It will develop a first-of-its-kind metadata management system able to track data and metadata distributed across heterogeneous geographical locations, storage systems, and data formats. This portion of the project will expand the functionality of DataLad, a project previously funded by NSF. DataLad will also be enhanced to interoperate with major data repositories such as OSF and Figshare. Furthermore, the project will use the NSF-funded cloud computing platform brainlife.io to create a data and metadata marketplace by gathering data from multiple currently separate repositories into a single ecosystem. The goal is to improve interoperability across open science projects and make data and metadata easily searchable and available for computing on national cyberinfrastructure systems, ultimately advancing scientific discovery by increasing data discoverability, utilization, and publication.
This project will generate various technological advances. The core target will be an extensible system capable of automatically gathering metadata from various domains.
It will comprise two major components: 1) a set of metadata parser algorithms that extract metadata from datasets and individual files using a flexible JSON-LD based data structure (with the ability to encode controlled vocabularies where available) and 2) an aggregation procedure that merges the metadata across parsers and stores them in compressed files that are optimized for bandwidth-efficient exchange and can be queried directly or used as input to SQL or graph databases for data discovery applications. Extracted metadata will be included within the same datasets under Git and git-annex version control for unambiguous referencing and versatile data logistics. In parallel, we will improve the interoperability of DataLad with existing data publishing portals (such as Figshare and OSF) by using extracted metadata (e.g., Author, Description) to prefill required fields, and by bundling the entire Git object store within the publication so that published datasets can be installed again by DataLad without any loss of information. To make such published datasets discoverable, we will establish a crowd-sourced registry (with a RESTful API) that will receive announcements of new datasets upon publication and aggregate their metadata to enable querying across datasets and data hosting providers. The final development will be the integration of DataLad within the brainlife.io data marketplace. This will make it possible to search and install datasets on brainlife.io as well as to process the data using the brainlife.io analysis Apps on various NSF-funded national cyberinfrastructure high-throughput computer systems.
A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
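The two components named in the abstract (per-parser extraction into a JSON-LD based structure, then aggregation into compressed, directly queryable files) can be illustrated with a minimal sketch. This is not DataLad's API or file format: the parser functions, the input record layout, the use of schema.org as the vocabulary, and the gzip-compressed JSON Lines output are all assumptions chosen for illustration.

```python
import gzip
import json

# Assumption: schema.org serves as the example vocabulary for the JSON-LD context.
CONTEXT = {"@vocab": "http://schema.org/"}


def parse_dataset_description(record):
    """Hypothetical parser: dataset-level metadata (name, authors)."""
    return {
        "@id": record["id"],
        "@type": "Dataset",
        "name": record["name"],
        "author": record["authors"],
    }


def parse_file_listing(record):
    """Hypothetical parser: file-level metadata for each file in the dataset."""
    return {
        "@id": record["id"],
        "hasPart": [{"@type": "DigitalDocument", "name": f} for f in record["files"]],
    }


# Hypothetical parser registry; new parsers would be appended here.
PARSERS = [parse_dataset_description, parse_file_listing]


def extract(record):
    """Run every parser on one record and merge the results into one document."""
    merged = {"@context": CONTEXT}
    for parser in PARSERS:
        merged.update(parser(record))
    return merged


def aggregate(records, path):
    """Store one merged JSON-LD document per line in a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(extract(record), sort_keys=True) + "\n")


def query(path, author):
    """Query the compressed file directly: names of datasets listing `author`."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [
            doc["name"]
            for doc in map(json.loads, fh)
            if author in doc.get("author", [])
        ]
```

Calling `aggregate(records, "metadata.jsonl.gz")` followed by `query("metadata.jsonl.gz", "A. Author")` would return the names of the matching datasets. The same line-oriented file could also be loaded into an SQL or graph database, which is the alternative use the abstract mentions.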
Project outcomes
Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Development of white matter tracts between and within the dorsal and ventral streams
- DOI: 10.1007/s00429-021-02414-5
- Published: 2021-01
- Journal:
- Impact factor: 3.1
- Authors: S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
- Corresponding author: S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
Neurodesk: an accessible, flexible and portable data analysis environment for reproducible neuroimaging
- DOI: 10.1038/s41592-023-02145-x
- Published: 2024-01-08
- Journal:
- Impact factor: 48
- Authors: Renton,Angela I.;Dao,Thuy T.;Bollmann,Steffen
- Corresponding author: Bollmann,Steffen
Other publications by Franco Pestilli
The visual dorsal and ventral streams communicate through the vertical occipital fasciculus
- DOI:
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: Hiromasa Takemura;Franco Pestilli;Ariel Rokem;Jonathan Winawer;Jason D. Yeatman;Brian A. Wandell
- Corresponding author: Brian A. Wandell
Using fMRI to characterize how cortex represents limb motions
- DOI: 10.1186/1471-2202-15-s1-p126
- Published: 2014-07-21
- Journal:
- Impact factor: 2.300
- Authors: Samir Menon;Jack Zhu;Paul Quigley;Franco Pestilli;Kwabena Boahen;Oussama Khatib
- Corresponding author: Oussama Khatib
Factors Associated with Persisting Post-Concussion Symptoms Among Collegiate Athletes and Military Cadets: Findings from the NCAA-DoD CARE Consortium
- DOI: 10.1007/s40279-024-02168-0
- Published: 2025-01-19
- Journal:
- Impact factor: 9.400
- Authors: Lauren T. Rooks;Giulia Bertò;Paul F. Pasquina;Steven P. Broglio;Thomas W. McAllister;Michael A. McCrea;Franco Pestilli;Nicholas L. Port
- Corresponding author: Nicholas L. Port
574. Separable White Matter Pathways Associated With Counterconditioning and Fear Extinction
- DOI: 10.1016/j.biopsych.2023.02.814
- Published: 2023-05-01
- Journal:
- Impact factor:
- Authors: Patrick Laing;Nicole Keller;Franco Pestilli;Joseph Dunsmoor
- Corresponding author: Joseph Dunsmoor
New technologies for precision brain science: studying individuality and variability in large human populations.
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Franco Pestilli;Cesar Caiafa;竹村浩昌
- Corresponding author: 竹村浩昌
Other grants by Franco Pestilli
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
- Award number: 2203524
- Fiscal year: 2021
- Funding amount: $152,800
- Award type: Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
- Award number: 2148729
- Fiscal year: 2021
- Funding amount: $152,800
- Award type: Standard Grant
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
- Award number: 1912270
- Fiscal year: 2019
- Funding amount: $152,800
- Award type: Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
- Award number: 1734853
- Fiscal year: 2017
- Funding amount: $152,800
- Award type: Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
- Award number: 1636893
- Fiscal year: 2016
- Funding amount: $152,800
- Award type: Standard Grant
Similar overseas grants
CRCNS US-German Collaborative Research Proposal: Neural and computational mechanisms of flexible goal-directed decision making
- Award number: 2309022
- Fiscal year: 2024
- Funding amount: $152,800
- Award type: Standard Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
- Award number: 2207770
- Fiscal year: 2022
- Funding amount: $152,800
- Award type: Continuing Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
- Award number: 2207747
- Fiscal year: 2022
- Funding amount: $152,800
- Award type: Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
- Award number: 2207727
- Fiscal year: 2022
- Funding amount: $152,800
- Award type: Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
- Award number: 2207700
- Fiscal year: 2022
- Funding amount: $152,800
- Award type: Standard Grant
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
- Award number: 10610594
- Fiscal year: 2022
- Funding amount: $152,800
- Award type:
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
- Award number: 10708986
- Fiscal year: 2022
- Funding amount: $152,800
- Award type:
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
- Award number: 2207647
- Fiscal year: 2022
- Funding amount: $152,800
- Award type: Standard Grant
CRCNS Research Proposal: Collaborative Research: Electrophysiome: comprehensive recording and integrated modeling of the C. elegans nervous system
- Award number: 2113003
- Fiscal year: 2021
- Funding amount: $152,800
- Award type: Standard Grant
CRCNS US-France Research Proposal: Collaborative Research: Encoding reward expectation in Drosophila
- Award number: 2113179
- Fiscal year: 2021
- Funding amount: $152,800
- Award type: Standard Grant