Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science

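Basic information

  • Award number:
    1912270
  • Principal investigator:
  • Amount:
    $152,800
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Start and end dates:
    2019-12-01 to 2021-12-31
  • Project status:
    Completed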

Project abstract

Scientists collect terabytes of critical data every year. Recently, a strong open science movement has generated traction for the beneficial practice of sharing data across laboratories, universities, and research institutions. Yet sharing data is not enough. Data must be shared using standardized formats and accompanied by curated metadata to allow for tracking, search, and organization. Metadata are essential for scientific discovery, as they are routinely used to complete all data analyses. However, to date, most brain projects focus on collecting or analyzing data, not on metadata management. Typical metadata records consist of heterogeneous study descriptions, developed at the study release stage, without consistency across records or standard mechanisms to track changes. This project will increase access to brain data and improve metadata handling by combining two NSF-funded projects. It will develop a first-of-its-kind metadata management system able to track data and metadata distributed across heterogeneous geographical locations, storage systems, and data formats. This portion of the project will expand the functionality of a previously funded NSF project, DataLad. DataLad will also be enhanced to interoperate with major data repositories such as OSF and Figshare. Furthermore, the project will use the NSF-funded cloud computing platform brainlife.io to create a data and metadata marketplace by gathering data from multiple currently separated repositories into a single ecosystem. The goal is to improve interoperability across open science projects and make data and metadata easily searchable and available for computing on national cyberinfrastructure systems, ultimately advancing scientific discovery by increasing data discoverability, utilization, and publication.

This project will generate various technological advances. The core target will be an extensible system capable of automated gathering of metadata from various domains. It will be composed of two major components: 1) a set of metadata parser algorithms that extract metadata from datasets and individual files using a flexible JSON-LD based data structure (with the ability to encode controlled vocabularies where available) and 2) an aggregation procedure that merges the extracted metadata across parsers and stores them in compressed files that are optimized for bandwidth-efficient exchange and can be queried directly, or used as input into SQL or graph databases for data discovery applications. Extracted metadata will be included within the same datasets under Git and git-annex version control for unambiguous referencing and versatile data logistics. In parallel development we will improve interoperability of DataLad with existing data publishing portals (such as Figshare and OSF) by taking advantage of extracted metadata (e.g., Author, Description) to prefill required fields, and also by bundling the entire Git object store within the publication so that such published datasets can be installed back by DataLad without any loss of information. To make such published datasets discoverable, we will establish a crowd-sourced registry (with a RESTful API) which will receive announcements of the availability of new datasets upon publication and aggregate their metadata to enable querying across datasets and data hosting providers. The final development will be the integration of DataLad within the brainlife.io data marketplace. This will make it possible to search and install datasets on brainlife.io as well as to process the data utilizing the brainlife.io analysis Apps on various NSF-funded national cyberinfrastructure high-throughput computer systems.

A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
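The two components described in the abstract (per-file metadata parsers and an aggregation step that writes a compressed, directly queryable file) can be illustrated with a minimal Python sketch. This is not DataLad's implementation or API: the parser functions, the `extension` key, the choice of the schema.org context, and the gzip-compressed JSON Lines layout are all assumptions made for illustration only.

```python
import gzip
import json
import tempfile
from pathlib import Path


def size_parser(path: Path) -> dict:
    """Hypothetical parser: report the file size in bytes."""
    return {"contentSize": path.stat().st_size}


def name_parser(path: Path) -> dict:
    """Hypothetical parser: report the file name and its extension."""
    return {"name": path.name, "extension": path.suffix.lstrip(".")}


# Assumed registry of parsers; more could be appended per domain.
PARSERS = [size_parser, name_parser]


def extract(dataset: Path) -> list:
    """Run every parser on every file and merge the results per file."""
    records = []
    for path in sorted(p for p in dataset.rglob("*") if p.is_file()):
        record = {
            "@context": "https://schema.org/",  # illustrative vocabulary
            "@type": "MediaObject",
            "@id": path.relative_to(dataset).as_posix(),
        }
        for parser in PARSERS:
            record.update(parser(path))
        records.append(record)
    return records


def save(records, target: Path) -> None:
    """Store one JSON record per line in a gzip-compressed file."""
    with gzip.open(target, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, sort_keys=True) + "\n")


def query(target: Path, **criteria):
    """Yield stored records whose fields equal all given criteria."""
    with gzip.open(target, "rt", encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            if all(record.get(k) == v for k, v in criteria.items()):
                yield record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "dataset"
        dataset.mkdir()
        (dataset / "scan.nii").write_bytes(b"\x00" * 16)
        (dataset / "README.txt").write_text("demo")
        store = Path(tmp) / "metadata.jsonl.gz"  # kept outside the dataset
        save(extract(dataset), store)
        for hit in query(store, extension="nii"):
            print(hit["@id"], hit["contentSize"])
```

Storing one record per line keeps the compressed file streamable, so a query can scan it without loading everything into memory; the same records could equally be loaded into an SQL or graph database, as the abstract suggests.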

Project outcomes

Journal articles (16)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Development of white matter tracts between and within the dorsal and ventral streams
  • DOI:
    10.1007/s00429-021-02414-5
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
  • Corresponding author:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
Classifyber, a robust streamline-based linear classifier for white matter bundle segmentation
  • DOI:
    10.1016/j.neuroimage.2020.117402
  • Publication date:
    2021-01-01
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Berto, Giulia;Bullock, Daniel;Olivetti, Emanuele
  • Corresponding author:
    Olivetti, Emanuele
Tractostorm: The what, why, and how of tractography dissection reproducibility
  • DOI:
    10.1002/hbm.24917
  • Publication date:
    2020-01-10
  • Journal:
  • Impact factor:
    4.8
  • Authors:
    Rheault, Francois;De Benedictis, Alessandro;Descoteaux, Maxime
  • Corresponding author:
    Descoteaux, Maxime
Triple visual hemifield maps in a case of optic chiasm hypoplasia
  • DOI:
    10.1016/j.neuroimage.2020.116822
  • Publication date:
    2020-07-15
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Ahmadi, Khazar;Fracasso, Alessio;Hoffmann, Michael B.
  • Corresponding author:
    Hoffmann, Michael B.
Other publications by Franco Pestilli

The visual dorsal and ventral streams communicate through the vertical occipital fasciculus
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hiromasa Takemura;Franco Pestilli;Ariel Rokem;Jonathan Winawer;Jason D. Yeatman;Brian A. Wandell
  • Corresponding author:
    Brian A. Wandell
Using fMRI to characterize how cortex represents limb motions
  • DOI:
    10.1186/1471-2202-15-s1-p126
  • Publication date:
    2014-07-21
  • Journal:
  • Impact factor:
    2.300
  • Authors:
    Samir Menon;Jack Zhu;Paul Quigley;Franco Pestilli;Kwabena Boahen;Oussama Khatib
  • Corresponding author:
    Oussama Khatib
Factors Associated with Persisting Post-Concussion Symptoms Among Collegiate Athletes and Military Cadets: Findings from the NCAA-DoD CARE Consortium
  • DOI:
    10.1007/s40279-024-02168-0
  • Publication date:
    2025-01-19
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Lauren T. Rooks;Giulia Bertò;Paul F. Pasquina;Steven P. Broglio;Thomas W. McAllister;Michael A. McCrea;Franco Pestilli;Nicholas L. Port
  • Corresponding author:
    Nicholas L. Port
574. Separable White Matter Pathways Associated With Counterconditioning and Fear Extinction
  • DOI:
    10.1016/j.biopsych.2023.02.814
  • Publication date:
    2023-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Patrick Laing;Nicole Keller;Franco Pestilli;Joseph Dunsmoor
  • Corresponding author:
    Joseph Dunsmoor
New technologies for precision brain science: studying individuality and variability in large human populations.
  • DOI:
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Franco Pestilli;Cesar Caiafa;Hiromasa Takemura (竹村浩昌)
  • Corresponding author:
    Hiromasa Takemura (竹村浩昌)

Other grants by Franco Pestilli

Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    2148729
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    2203524
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    1734853
  • Fiscal year:
    2017
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    1636893
  • Fiscal year:
    2016
  • Award amount:
    $152,800
  • Project type:
    Standard Grant

Similar overseas grants

CRCNS US-German Collaborative Research Proposal: Neural and computational mechanisms of flexible goal-directed decision making
  • Award number:
    2309022
  • Fiscal year:
    2024
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207770
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
    Continuing Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207747
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207727
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207700
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10610594
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10708986
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207647
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Project type:
    Standard Grant
CRCNS US-France Research Proposal: Collaborative Research: Encoding reward expectation in Drosophilia
  • Award number:
    2113179
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Project type:
    Standard Grant