Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science

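Basic information

  • Award number:
    1912270
  • Principal investigator:
  • Amount:
    $152,800
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Period:
    2019-12-01 to 2021-12-31
  • Project status:
    Completed

Project abstract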

Scientists collect terabytes of critical data every year. Recently a strong open science movement has generated traction for the beneficial practice of sharing data across laboratories, universities and research institutions. Yet, sharing data is not enough. Data must be shared using standardized formats and accompanied by curated metadata to allow for tracking, search, and organization. Metadata are essential for scientific discovery, as they are routinely used to complete all data analyses. However, to date, most brain projects focus on collecting or analyzing data, not on metadata management. Typical metadata records consist of heterogeneous study descriptions, developed at the study release stage, without consistency across records or standard mechanisms to track changes. This project will increase access to brain data and improve metadata handling by combining two NSF-funded projects. It will develop a first-of-its-kind metadata management system able to track data and metadata distributed across heterogeneous geographical locations, storage systems and data formats. This portion of the project will expand the functionality of DataLad, a project previously funded by NSF. DataLad will also be enhanced to interoperate with major data repositories such as OSF and Figshare. Furthermore, the project will use the NSF-funded cloud computing platform brainlife.io to create a data and metadata marketplace by gathering data from multiple currently separate repositories into a single ecosystem. The goal is to improve interoperability across open science projects and make data and metadata easily searchable and available for computing on national cyberinfrastructure systems, ultimately advancing scientific discovery by increasing data discoverability, utilization, and publication.

This project will generate various technological advances. The core target will be an extensible system capable of automated gathering of metadata from various domains. It will comprise two major components: 1) a set of metadata parser algorithms that extract metadata from datasets and individual files using a flexible JSON-LD based data structure (with the ability to encode controlled vocabularies where available) and 2) an aggregation procedure that merges the extracted metadata across parsers and stores them in compressed files that are optimized for bandwidth-efficient exchange and can be queried directly, or used as input to SQL or graph databases for data discovery applications. Extracted metadata will be included within the same datasets under Git and git-annex version control for unambiguous referencing and versatile data logistics. In parallel, we will improve interoperability of DataLad with existing data publishing portals (such as Figshare and OSF) by using extracted metadata (e.g., Author, Description) to prefill required fields, and also by bundling the entire Git object store within the publication so that such published datasets can be installed again by DataLad without any loss of information. To make such published datasets discoverable, we will establish a crowd-sourced registry (with a RESTful API) which will receive announcements of new datasets upon publication and aggregate their metadata to enable querying across datasets and data hosting providers. The final development will be the integration of DataLad within the brainlife.io data marketplace. This will make it possible to search and install datasets on brainlife.io as well as to process the data using the brainlife.io analysis Apps on various NSF-funded national cyberinfrastructure high-throughput computer systems.

A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
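The two components named in the abstract (per-file metadata parsers feeding a JSON-LD based structure, and an aggregation step that writes a compressed, directly queryable store) can be illustrated with a minimal Python sketch. This is not DataLad's implementation or API: the parser functions, the sidecar-file convention, the schema.org vocabulary choice, and the gzip-compressed JSON-lines layout are all assumptions made for illustration only.

```python
import gzip
import json
import tempfile
from pathlib import Path

# Assumption: schema.org is used as the default vocabulary for the records.
SCHEMA = "https://schema.org/"


def parse_core(path: Path) -> dict:
    """Hypothetical parser: generic file-level metadata (name and size)."""
    return {"@id": path.name, "name": path.name,
            "contentSize": path.stat().st_size}


def parse_sidecar(path: Path) -> dict:
    """Hypothetical parser: descriptive fields from a '<file>.json' sidecar."""
    sidecar = path.with_name(path.name + ".json")
    if not sidecar.exists():
        return {"@id": path.name}
    fields = json.loads(sidecar.read_text())
    wanted = ("author", "description")
    return {"@id": path.name, **{k: fields[k] for k in wanted if k in fields}}


PARSERS = [parse_core, parse_sidecar]


def aggregate(dataset: Path, out: Path) -> None:
    """Merge all parser outputs per file into gzip-compressed JSON lines."""
    with gzip.open(out, "wt", encoding="utf-8") as fh:
        for path in sorted(dataset.iterdir()):
            if not path.is_file() or path.suffix == ".json":
                continue  # sidecars are consumed by parse_sidecar, not listed
            record = {"@context": {"@vocab": SCHEMA}}
            for parser in PARSERS:
                record.update(parser(path))
            fh.write(json.dumps(record, sort_keys=True) + "\n")


def query(store: Path, key: str, value: str) -> list:
    """Scan the compressed store directly; return ids of matching records."""
    with gzip.open(store, "rt", encoding="utf-8") as fh:
        return [rec["@id"] for rec in map(json.loads, fh)
                if rec.get(key) == value]


def demo() -> list:
    """Build a tiny throwaway dataset, aggregate it, and run one query."""
    with tempfile.TemporaryDirectory() as tmp:
        ds = Path(tmp) / "dataset"
        ds.mkdir()
        (ds / "scan.nii").write_bytes(b"\0" * 16)
        (ds / "scan.nii.json").write_text(
            json.dumps({"author": "A. Researcher", "description": "demo"}))
        (ds / "notes.txt").write_text("no sidecar")
        store = Path(tmp) / "metadata.jsonl.gz"
        aggregate(ds, store)
        return query(store, "author", "A. Researcher")
```

Calling `demo()` returns the identifiers of the records whose `author` field matches, here only the file that has a sidecar. One record per line keeps the store streamable, so a query can scan it without decompressing to disk, and the same lines could be loaded into an SQL or graph database as the abstract suggests.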

Project outcomes

Journal articles (16)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Development of white matter tracts between and within the dorsal and ventral streams
  • DOI:
    10.1007/s00429-021-02414-5
  • Published:
    2021-01
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
  • Corresponding author:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
Classifyber, a robust streamline-based linear classifier for white matter bundle segmentation
  • DOI:
    10.1016/j.neuroimage.2020.117402
  • Published:
    2021-01-01
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Berto, Giulia;Bullock, Daniel;Olivetti, Emanuele
  • Corresponding author:
    Olivetti, Emanuele
Triple visual hemifield maps in a case of optic chiasm hypoplasia
  • DOI:
    10.1016/j.neuroimage.2020.116822
  • Published:
    2020-07-15
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Ahmadi, Khazar;Fracasso, Alessio;Hoffmann, Michael B.
  • Corresponding author:
    Hoffmann, Michael B.
Tractostorm: The what, why, and how of tractography dissection reproducibility
  • DOI:
    10.1002/hbm.24917
  • Published:
    2020-01-10
  • Journal:
  • Impact factor:
    4.8
  • Authors:
    Rheault, Francois;De Benedictis, Alessandro;Descoteaux, Maxime
  • Corresponding author:
    Descoteaux, Maxime

Other publications by Franco Pestilli

The visual dorsal and ventral streams communicate through the vertical occipital fasciculus
  • DOI:
  • Published:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hiromasa Takemura;Franco Pestilli;Ariel Rokem;Jonathan Winawer;Jason D. Yeatman;Brian A. Wandell
  • Corresponding author:
    Brian A. Wandell
Using fMRI to characterize how cortex represents limb motions
  • DOI:
    10.1186/1471-2202-15-s1-p126
  • Published:
    2014-07-21
  • Journal:
  • Impact factor:
    2.300
  • Authors:
    Samir Menon;Jack Zhu;Paul Quigley;Franco Pestilli;Kwabena Boahen;Oussama Khatib
  • Corresponding author:
    Oussama Khatib
Factors Associated with Persisting Post-Concussion Symptoms Among Collegiate Athletes and Military Cadets: Findings from the NCAA-DoD CARE Consortium
  • DOI:
    10.1007/s40279-024-02168-0
  • Published:
    2025-01-19
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Lauren T. Rooks;Giulia Bertò;Paul F. Pasquina;Steven P. Broglio;Thomas W. McAllister;Michael A. McCrea;Franco Pestilli;Nicholas L. Port
  • Corresponding author:
    Nicholas L. Port
574. Separable White Matter Pathways Associated With Counterconditioning and Fear Extinction
  • DOI:
    10.1016/j.biopsych.2023.02.814
  • Published:
    2023-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Patrick Laing;Nicole Keller;Franco Pestilli;Joseph Dunsmoor
  • Corresponding author:
    Joseph Dunsmoor
New technologies for precision brain science: studying individuality and variability in large human populations.
  • DOI:
  • Published:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Franco Pestilli;Cesar Caiafa;& 竹村浩昌.
  • Corresponding author:
    & 竹村浩昌.

Other grants by Franco Pestilli

NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    2203524
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    2148729
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    1734853
  • Fiscal year:
    2017
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    1636893
  • Fiscal year:
    2016
  • Award amount:
    $152,800
  • Award type:
    Standard Grant

Similar overseas grants

CRCNS US-German Collaborative Research Proposal: Neural and computational mechanisms of flexible goal-directed decision making
  • Award number:
    2309022
  • Fiscal year:
    2024
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207770
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Continuing Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207747
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207727
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207700
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10610594
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207647
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10708986
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS US-France Research Proposal: Collaborative Research: Encoding reward expectation in Drosophila
  • Award number:
    2113179
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant