CSSI Elements: DataSwarm: A User-Level Framework for Data Intensive Scientific Applications

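Basic Information

  • Award number:
    1931348
  • Principal investigator:
  • Amount:
    $563,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Start and end dates:
    2019-09-01 to 2024-08-31
  • Project status:
    Completed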

Project Abstract

This project creates a capability that will support the construction of large, data intensive scientific applications that must run on top of national cyberinfrastructure, such as large campus clusters, NSF extreme-scale computing facilities, the Open Science Grid, and commercial clouds. The new capability (DataSwarm) brings data requirements and software dependencies to the target cyberinfrastructure systems, and deploys them as and when required, rather than having these requirements pre-installed on the target systems. The motivation comes from applications in high energy physics, molecular dynamics, and quantum chemistry.
The main motivation of the work is the challenge of scalable computing frameworks. Based on a prior development by the Principal Investigator (Work Queue), the current project provides technical innovation in three areas:
(1) Molecular Task Composition. Molecular task composition is used as an abstraction for the precise construction of tasks that require a custom software environment, large data input, and a scratch data area to capture the outputs. By expressing these aspects explicitly instead of implicitly, the project improves the storage efficiency of large numbers of tasks.
(2) In-Situ Data Management. In-situ storage management is performed to offset the increased storage consumption likely to occur under molecular task composition, avoiding unpredictable failures of tasks due to storage exhaustion.
(3) Precision Provenance. Precision provenance of both data objects and task components enables the efficient re-use of resources across multiple runs, as well as precise incremental changes to complex workflows.
For this project, the three key elements addressed are the software environment, input data, and a scratch data area. These elements are usually independently managed; here, they are bound together to form temporary "molecules" for task execution.
The three applications included in this project represent three typical types of complex data and complex software dependencies. They include custom late-stage data analysis codes in high energy physics, complex multidimensional optimization, and ensemble molecular dynamics, respectively.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (9)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications
Poster: Robust Meta-Workflow Management with Mufasa
PONCHO: Dynamic Package Synthesis for Distributed and Serverless Python Applications
  • DOI:
    10.1145/3526060.3535459
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sly-Delgado, Barry;Locascio, Nick;Simonetti, David;Wiseman, Brett;Tovar, Ben;Thain, Douglas
  • Corresponding author:
    Thain, Douglas
An Empirical Study of Package Dependencies and Lifetimes in Binder Python Containers
Software Environments in Binder Containers
  • DOI:
    10.5281/zenodo.4891790
  • Published:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shaffer, Tim;Chard, Kyle;Thain, Douglas
  • Corresponding author:
    Thain, Douglas

Other publications by Douglas Thain

Experience with BXGrid: a data repository and computing grid for biometrics research
Multiple Bypass: Interposition Agents for Distributed Computing

Other grants by Douglas Thain

CSR: Small: Accelerating Data Intensive Scientific Workflows with Consistency Contracts
  • Award number:
    2317556
  • Fiscal year:
    2023
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
REU Site: Data Intensive Scientific Computing
  • Award number:
    1560363
  • Fiscal year:
    2016
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
SI2-SSE: Scaling up Science on Cyberinfrastructure with the Cooperative Computing Tools
  • Award number:
    1642409
  • Fiscal year:
    2016
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
Collaborative Research: Software Sustainability: an SI^2 PI Workshop
  • Award number:
    1419132
  • Fiscal year:
    2014
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
SI2-SSE: Connecting Cyberinfrastructure with the Cooperative Computing Tools
  • Award number:
    1148330
  • Fiscal year:
    2012
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
Collaborative Research: II-New: Distributed Research Testbed (DiRT)
  • Award number:
    0855047
  • Fiscal year:
    2009
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
CAREER: Data Intensive Grid Computing on Active Storage Clusters
  • Award number:
    0643229
  • Fiscal year:
    2007
  • Funding amount:
    $563,000
  • Project type:
    Continuing Grant
HECURA: Deconstructing Clusters for High End Biometric Applications
  • Award number:
    0621434
  • Fiscal year:
    2007
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
SGER: Enabling Electronic Self-Defense with Dynamic Identities
  • Award number:
    0549087
  • Fiscal year:
    2005
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant

Similar international grants

CAREER: Investigating Biogeographic Hypotheses and Drivers of Diversification in Neotropical Harvestmen (Opiliones: Laniatores) Using Ultraconserved Elements
  • Award number:
    2337605
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Continuing Grant
ECCS-EPSRC Micromechanical Elements for Photonic Reconfigurable Zero-Static-Power Modules
  • Award number:
    EP/X025381/1
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Research Grant
BRC-BIO: Epigenetic Regulation of Transposable Elements in Maize
  • Award number:
    2334573
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
SUstainable EuroPean Rare Earth Elements production value chain from priMary Ores
  • Award number:
    10091569
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    EU-Funded
Investigating Energy Transfer Pathways in Lanthanoid Elements
  • Award number:
    DP240103097
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Discovery Projects
Pioneering alpine epigenomics to discover adaptive genetic elements
  • Award number:
    DE240100184
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Discovery Early Career Researcher Award
Impact of impurity elements on the corrosion performance of high strength 6xxx aluminium alloys
  • Award number:
    2906344
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Studentship
Using whole genome sequencing to identify non-coding elements associated with diabetes and related traits across ancestries
  • Award number:
    MR/Y003748/1
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Research Grant
ERI: Self-Recognizing Composite Structural Elements (SR-CSEs): Multifunctional, Sustainable and Reliable
  • Award number:
    2347554
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Standard Grant
REU Site: Elements of Sustainability
  • Award number:
    2348001
  • Fiscal year:
    2024
  • Funding amount:
    $563,000
  • Project type:
    Continuing Grant