SHF: Medium: Collaborative Research: ECC: Ephemeral Coherence Cohort for I/O Containerization and Disaggregation


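Basic Information

  • Award number:
    1763547
  • Principal investigator:
  • Amount:
    $500,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Period:
    2018-06-01 to 2025-05-31
  • Project status:
    Ongoing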

Project Abstract

Leadership computing facilities for high-performance computing (HPC) invest heavily in file and storage systems. The reason is that the storage system is often the Achilles' heel of an HPC system, as it is prone to contention, congestion, and performance variability. This problem is getting worse due to: (a) the increased importance of data-driven HPC and the growth in the amount of data generated by large-scale simulations; and (b) the slower growth of disk speed compared to CPU speed. The addition of high-bandwidth persistent memory devices as burst buffers brings new opportunities for fast caching of application data while still allowing data persistence. However, the conventional approach of exploiting burst buffers as yet another caching layer cannot reduce the lengthy and costly data processing steps in the deep I/O stack or reconcile occasional contention inside the complex storage system. This project therefore seeks to exploit burst buffers as repositories of persistent, application-specific parallel file systems, with a lifetime commensurate with the lifetime of an application or an application campaign on an HPC system. This is a collaborative project between the University of Illinois at Urbana-Champaign and Florida State University. The project formulates a research framework called Ephemeral Coherence Cohort (ECC) that offers an abstraction to represent the active collection of application data through containerization, insulate I/O activities across different applications, and enable storage disaggregation for ephemeral allocation and dynamic utilization of burst buffers. The proposed ECC framework aims to enhance a variety of mission-critical applications running on the Department of Energy and National Science Foundation leadership computing facilities. The project strengthens the collaboration between the University of Illinois at Urbana-Champaign and Florida State University.
The project plans to organize panels and birds-of-a-feather sessions on burst buffer research at upcoming HPC conferences and to collaborate with leaders of supercomputing centers so that techniques from this research reach the wider community.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Compression of Time Evolutionary Image Data through Predictive Deep Neural Networks
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Roy, Rupak; Sato, Kento; Bhattacharya, Subhadeep; Fang, Xingang; Joti, Yasumasa; Hatsui, Takaki; Hiraki, Toshiyuki; Guo, Jian; Yu, Weikuan
  • Corresponding author:
    Yu, Weikuan
SVAGC: Garbage Collection with a Scalable Virtual Address Swapping Technique
DFMan: A Graph-based Optimization of Dataflow Scheduling on High-Performance Computing Systems
Efficient User-Level Storage Disaggregation for Deep Learning
Accurate classification of depression through optimized machine learning models on high-dimensional noisy data
  • DOI:
    10.1016/j.bspc.2021.103237
  • Publication date:
    2022-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xingang Fang; Julia Klawohn; Alexander De Sabatino; Harsh Kundnani; Jon Ryan; Weikuan Yu; G. Hajcak
  • Corresponding author:
    Xingang Fang; Julia Klawohn; Alexander De Sabatino; Harsh Kundnani; Jon Ryan; Weikuan Yu; G. Hajcak

Other publications by Weikuan Yu

Performance Evaluation of FPGA-Based Biological Applications
  • DOI:
  • Publication date:
    2007
  • Journal:
  • Impact factor:
    0
  • Authors:
    O. Storaasli; Weikuan Yu; D. Strenski; James Maltby
  • Corresponding author:
    James Maltby
Ad Hoc File Systems for High-Performance Computing
  • DOI:
    10.1007/s11390-020-9801-1
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    A. Brinkmann; K. Mohror; Weikuan Yu; P. Carns; Toni Cortes; S. Klasky; Alberto Miranda; F. Pfreundt; R. Ross; Marc
  • Corresponding author:
    Marc
JVM-Bypass for Efficient Hadoop Shuffling
Performance evaluation and tuning of BioPig for genomic analysis
Understanding I/O Behavior in Scientific Workflows on High Performance Computing Systems
  • DOI:
  • Publication date:
    2019
  • Journal:
  • Impact factor:
    0
  • Authors:
    Fahim Chowdhury; Francesco Di; A. Moody; Elsa Gonsiorowski; K. Mohror; Weikuan Yu
  • Corresponding author:
    Weikuan Yu

Other grants held by Weikuan Yu

Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
  • Award number:
    2403089
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
SaTC: CORE: Small: Realizing Enhanced Authentication in the Mobile Era
  • Award number:
    2131143
  • Fiscal year:
    2021
  • Amount:
    $500,000
  • Project type:
    Standard Grant
IRES Track-1: I/O Research for Data-Intensive Analytics and Deep Learning
  • Award number:
    1952302
  • Fiscal year:
    2020
  • Amount:
    $500,000
  • Project type:
    Standard Grant
CRI: II-New: A Software Defined Infrastructure for Cross-Layer Research on Reconfigurable Architecture and Systems
  • Award number:
    1822737
  • Fiscal year:
    2018
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Eager: Collaborative Research: DiRecMR: Reconciling the Dichotomy of MapReduce for Efficient Speculation and Resilience
  • Award number:
    1744336
  • Fiscal year:
    2017
  • Amount:
    $500,000
  • Project type:
    Standard Grant
CSR: Small: XooMR: Cross-Layer and Cross-Phase Cooperation for Fair and Efficient MapReduce
  • Award number:
    1564647
  • Fiscal year:
    2015
  • Amount:
    $500,000
  • Project type:
    Standard Grant
EAGER: Tadoop: A Dual-Purpose Framework Taming the Bipolarity of Storage and Communication for High-Performance Computing and Data Analytics
  • Award number:
    1561041
  • Fiscal year:
    2015
  • Amount:
    $500,000
  • Project type:
    Standard Grant
EAGER: Tadoop: A Dual-Purpose Framework Taming the Bipolarity of Storage and Communication for High-Performance Computing and Data Analytics
  • Award number:
    1432892
  • Fiscal year:
    2014
  • Amount:
    $500,000
  • Project type:
    Standard Grant
CSR: Small: XooMR: Cross-Layer and Cross-Phase Cooperation for Fair and Efficient MapReduce
  • Award number:
    1320016
  • Fiscal year:
    2013
  • Amount:
    $500,000
  • Project type:
    Standard Grant
II-New: A Compute and Storage Cluster for Multidisciplinary Research on Computer Systems and Scientific Simulations
  • Award number:
    1059376
  • Fiscal year:
    2011
  • Amount:
    $500,000
  • Project type:
    Standard Grant

Similar overseas grants

Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
  • Award number:
    2403134
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
  • Award number:
    2402804
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
  • Award number:
    2403408
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code
  • Award number:
    2423813
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
  • Award number:
    2402806
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
  • Award number:
    2403135
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
  • Award number:
    2403409
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
  • Award number:
    2402805
  • Fiscal year:
    2024
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: High-Performance, Verified Accelerator Programming
  • Award number:
    2313024
  • Fiscal year:
    2023
  • Amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
  • Award number:
    2311295
  • Fiscal year:
    2023
  • Amount:
    $500,000
  • Project type:
    Continuing Grant