Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science

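Basic Information

  • Award number:
    1912270
  • Principal investigator:
  • Amount:
    $152,800
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-12-01 to 2021-12-31
  • Project status:
    Completed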

Project Abstract

Scientists collect terabytes of critical data every year. Recently, a strong open science movement has built momentum for sharing data across laboratories, universities, and research institutions. Yet sharing data is not enough. Data must be shared in standardized formats and accompanied by curated metadata to allow for tracking, search, and organization. Metadata are essential for scientific discovery, as they are routinely used to complete all data analyses. However, to date, most brain projects have focused on collecting or analyzing data, not on metadata management. Typical metadata records consist of heterogeneous study descriptions, developed at the study release stage, without consistency across records or standard mechanisms to track changes. This project will increase access to brain data and improve metadata handling by combining two NSF-funded projects. It will develop a first-of-its-kind metadata management system able to track data and metadata distributed across heterogeneous geographical locations, storage systems, and data formats. This portion of the project will expand the functionality of a previously NSF-funded project, DataLad. DataLad will also be enhanced to interoperate with major data repositories such as OSF and Figshare. Furthermore, the project will use the NSF-funded cloud computing platform brainlife.io to create a data and metadata marketplace by gathering data from multiple currently separate repositories into a single ecosystem. The goal is to improve interoperability across open science projects and make data and metadata easily searchable and available for computing on national cyberinfrastructure systems, ultimately advancing scientific discovery by increasing data discoverability, utilization, and publication.
This project will generate various technological advances. The core target will be an extensible system capable of automated gathering of metadata from various domains. It will comprise two major components: 1) a set of metadata parser algorithms that extract metadata from datasets and individual files using a flexible JSON-LD based data structure (with the ability to encode controlled vocabularies where available) and 2) an aggregation procedure that merges the extracted metadata across parsers and stores them in compressed files that are optimized for bandwidth-efficient exchange and can be queried directly, or used as input to SQL or graph databases for data discovery applications. Extracted metadata will be included within the same datasets under Git and git-annex version control for unambiguous referencing and versatile data logistics. In parallel, we will improve the interoperability of DataLad with existing data publishing portals (such as Figshare and OSF) by using extracted metadata (e.g., Author, Description) to prefill required fields, and by bundling the entire Git object store within the publication so that DataLad can reinstall such published datasets without any loss of information. To make such published datasets discoverable, we will establish a crowd-sourced registry (with a RESTful API) that receives announcements of new datasets upon publication and aggregates their metadata to enable querying across datasets and data hosting providers. The final development will be the integration of DataLad within the brainlife.io data marketplace. This will make it possible to search and install datasets on brainlife.io, as well as to process the data using the brainlife.io analysis Apps on various NSF-funded national cyberinfrastructure high-throughput computing systems.
A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
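
The two-component design described in the abstract (parsers that emit JSON-LD, followed by an aggregation step that writes compressed, directly queryable files) can be illustrated with a minimal sketch. This is not DataLad's implementation or API: the function names `parse_dataset`, `parse_files`, `aggregate`, and `query`, the use of schema.org as the vocabulary, and the gzip-compressed JSON layout are all assumptions made for illustration only.

```python
import gzip
import json
import tempfile
from pathlib import Path

CONTEXT = "https://schema.org/"  # assumed vocabulary; the abstract names none


def parse_dataset(root: Path, author: str, description: str) -> list[dict]:
    """Hypothetical dataset-level parser: one JSON-LD node for the dataset."""
    return [{"@id": ".", "@type": "Dataset",
             "author": author, "description": description}]


def parse_files(root: Path) -> list[dict]:
    """Hypothetical file-level parser: one JSON-LD node per regular file."""
    nodes = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        nodes.append({"@id": path.relative_to(root).as_posix(),
                      "@type": "MediaObject",
                      "contentSize": path.stat().st_size})
    return nodes


def aggregate(parser_outputs: list[list[dict]], out_file: Path) -> None:
    """Merge nodes sharing an @id across parsers, then store them gzip-compressed."""
    merged: dict[str, dict] = {}
    for nodes in parser_outputs:
        for node in nodes:
            merged.setdefault(node["@id"], {}).update(node)
    doc = {"@context": CONTEXT, "@graph": list(merged.values())}
    with gzip.open(out_file, "wt", encoding="utf-8") as fh:
        json.dump(doc, fh)


def query(out_file: Path, node_type: str) -> list[dict]:
    """Query the compressed aggregate directly, without loading it into a database."""
    with gzip.open(out_file, "rt", encoding="utf-8") as fh:
        doc = json.load(fh)
    return [n for n in doc["@graph"] if n.get("@type") == node_type]


def demo() -> tuple[list[dict], list[dict]]:
    """Build a throwaway dataset, aggregate its metadata, and query it back."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "dataset"
        root.mkdir()
        (root / "scan.txt").write_text("example")
        out_file = Path(tmp) / "aggregate.json.gz"  # kept outside the dataset
        aggregate([parse_dataset(root, "A. Author", "Demo dataset"),
                   parse_files(root)], out_file)
        return query(out_file, "Dataset"), query(out_file, "MediaObject")
```

Calling `demo()` returns the dataset-level node and the file-level nodes read back from the compressed aggregate. The same `@graph` list could instead be loaded into an SQL or graph database, which is the alternative the abstract mentions.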

Project Outcomes

Journal articles (16)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Development of white matter tracts between and within the dorsal and ventral streams
  • DOI:
    10.1007/s00429-021-02414-5
  • Publication date:
    2021-01
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
  • Corresponding author:
    S. Vinci-Booher;B. Caron;D. Bullock;K. James;F. Pestilli
Classifyber, a robust streamline-based linear classifier for white matter bundle segmentation
  • DOI:
    10.1016/j.neuroimage.2020.117402
  • Publication date:
    2021-01-01
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Berto, Giulia;Bullock, Daniel;Olivetti, Emanuele
  • Corresponding author:
    Olivetti, Emanuele
Triple visual hemifield maps in a case of optic chiasm hypoplasia
  • DOI:
    10.1016/j.neuroimage.2020.116822
  • Publication date:
    2020-07-15
  • Journal:
  • Impact factor:
    5.7
  • Authors:
    Ahmadi, Khazar;Fracasso, Alessio;Hoffmann, Michael B.
  • Corresponding author:
    Hoffmann, Michael B.
Tractostorm: The what, why, and how of tractography dissection reproducibility
  • DOI:
    10.1002/hbm.24917
  • Publication date:
    2020-01-10
  • Journal:
  • Impact factor:
    4.8
  • Authors:
    Rheault, Francois;De Benedictis, Alessandro;Descoteaux, Maxime
  • Corresponding author:
    Descoteaux, Maxime

Other publications by Franco Pestilli

The visual dorsal and ventral streams communicate through the vertical occipital fasciculus
  • DOI:
  • Publication date:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hiromasa Takemura;Franco Pestilli;Ariel Rokem;Jonathan Winawer;Jason D. Yeatman;Brian A. Wandell
  • Corresponding author:
    Brian A. Wandell
Using fMRI to characterize how cortex represents limb motions
  • DOI:
    10.1186/1471-2202-15-s1-p126
  • Publication date:
    2014-07-21
  • Journal:
  • Impact factor:
    2.300
  • Authors:
    Samir Menon;Jack Zhu;Paul Quigley;Franco Pestilli;Kwabena Boahen;Oussama Khatib
  • Corresponding author:
    Oussama Khatib
Factors Associated with Persisting Post-Concussion Symptoms Among Collegiate Athletes and Military Cadets: Findings from the NCAA-DoD CARE Consortium
  • DOI:
    10.1007/s40279-024-02168-0
  • Publication date:
    2025-01-19
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Lauren T. Rooks;Giulia Bertò;Paul F. Pasquina;Steven P. Broglio;Thomas W. McAllister;Michael A. McCrea;Franco Pestilli;Nicholas L. Port
  • Corresponding author:
    Nicholas L. Port
574. Separable White Matter Pathways Associated With Counterconditioning and Fear Extinction
  • DOI:
    10.1016/j.biopsych.2023.02.814
  • Publication date:
    2023-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Patrick Laing;Nicole Keller;Franco Pestilli;Joseph Dunsmoor
  • Corresponding author:
    Joseph Dunsmoor
New technologies for precision brain science: studying individuality and variability in large human populations.
  • DOI:
  • Publication date:
    2016
  • Journal:
  • Impact factor:
    0
  • Authors:
    Franco Pestilli;Cesar Caiafa;竹村浩昌
  • Corresponding author:
    竹村浩昌

Other grants by Franco Pestilli

Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    2148729
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    2203524
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
NCS-FO: Connectome mapping algorithms with application to community services for big data neuroscience
  • Award number:
    1734853
  • Fiscal year:
    2017
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
BD Spokes: SPOKE: MIDWEST: Collaborative: Advanced Computational Neuroscience Network (ACNN)
  • Award number:
    1636893
  • Fiscal year:
    2016
  • Award amount:
    $152,800
  • Award type:
    Standard Grant

Similar international grants

CRCNS US-German Collaborative Research Proposal: Neural and computational mechanisms of flexible goal-directed decision making
  • Award number:
    2309022
  • Fiscal year:
    2024
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207770
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Continuing Grant
CRCNS US-Spain Research Proposal: Collaborative Research: Tracking and modeling the neurobiology of multilingual speech recognition
  • Award number:
    2207747
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207727
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207700
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10610594
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
CRCNS Research Proposal: Collaborative Research: US-German Collaboration toward a biophysically principled network model of transcranial magnetic stimulation (TMS)
  • Award number:
    10708986
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
Collaborative Research: CRCNS Research Proposal: Adaptive Decision Rules in Dynamic Environments
  • Award number:
    2207647
  • Fiscal year:
    2022
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
Collaborative Proposal: CRCNS US-German Data Sharing Proposal: DataLad - a decentralized system for integrated discovery, management, and publication of digital objects of science
  • Award number:
    2148700
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant
CRCNS US-France Research Proposal: Collaborative Research: Encoding reward expectation in Drosophilia
  • Award number:
    2113179
  • Fiscal year:
    2021
  • Award amount:
    $152,800
  • Award type:
    Standard Grant