CAREER: Dynamic Management of Compressed Arrays for High-Performance Computing Applications
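Basic information
- Award number: 1943114
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-04-01 to 2025-03-31
- Status: Not concluded
- Source:
- Keywords:
Project abstract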
High-performance computing (HPC) has enabled significant advances across all fields of science and engineering by allowing researchers to simulate complex phenomena that are difficult, if not impossible, to study in a normal laboratory setting. As new HPC systems come online, computational speed far exceeds the speed of data movement, so data movement can limit application performance and system throughput. However, this disparity makes it possible to spend computational time to lower the bandwidth required to move data in HPC applications, mitigating performance bottlenecks. This project investigates the performance and utility of data compression and aggregation techniques to reduce the volume of data communicated, computed on, and stored by large-scale scientific applications. An outcome of this project is a transformative data management runtime that enables science and engineering applications requiring large amounts of memory to run on cheaper and more common systems with less memory. Thus, the throughput of workloads can be greatly improved, facilitating research progress in their respective areas. Furthermore, reducing the amount of memory required by an application allows it to run larger and more detailed problems, allowing scientists and engineers to run and analyze previously intractable experiments. Finally, this project seeks to broaden undergraduates' use and understanding of HPC by creating a multi-semester hands-on research course. This course engages STEM students in building, designing, and using the next generation of HPC systems and applications and prepares them with the cross-disciplinary skills needed to succeed in on-campus research opportunities, graduate school, and the modern workforce.
This project improves current state-of-the-art lossy and lossless data compression by adding logic to dynamically manage compressed data, reducing the performance impact of high-cost compression and decompression times. The data management runtime allows application users and developers to select a subset of variables to register. From the point of allocation, registered data resides compressed in main memory and remains compressed during all inter- and intra-process data motion. For inter-node communication, the runtime aggregates messages with the same destination node before transmission. Data needed for computation are decompressed into a reconfigurable software-managed cache whose prefetcher decompresses data ahead of use, limiting delays on the critical path of the application. For variables that use lossy compression, the runtime seeks to keep accumulated error within what the application can tolerate by dynamically altering the lossy compression error bound. This project evaluates the data management runtime on a diverse set of proxy applications and production-level applications with varying memory requirements, communication-to-computation ratios, communication patterns, and data access patterns.
This project is jointly funded by the Software and Hardware Foundations Program in the CCF Division and the Established Program to Stimulate Competitive Research (EPSCoR). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
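The abstract describes data that stays compressed in memory and is decompressed into a software-managed cache, with a prefetcher working ahead of use. The following is a minimal illustrative sketch of that idea, not the project's runtime: the class name `CompressedArray`, its parameters, the use of lossless `zlib`, the LRU eviction policy, and the synchronous next-chunk prefetch are all assumptions made for the example, and the sketch is read-only.

```python
import zlib
from array import array
from collections import OrderedDict


class CompressedArray:
    """Hypothetical sketch: float64 values held as zlib-compressed chunks."""

    def __init__(self, values, chunk_len=1024, cache_chunks=4):
        self.chunk_len = chunk_len
        self.length = len(values)
        self.cache_chunks = cache_chunks  # should be >= 2 so prefetch cannot evict the chunk in use
        # Data is compressed at allocation time, one chunk at a time.
        self._chunks = [
            zlib.compress(array("d", values[i:i + chunk_len]).tobytes())
            for i in range(0, self.length, chunk_len)
        ]
        # Software-managed cache of decompressed chunks, kept in LRU order.
        self._cache = OrderedDict()
        self.decompressions = 0  # counter used to observe cache behavior

    def _load(self, chunk_id):
        """Return a decompressed chunk, decompressing only on a cache miss."""
        if chunk_id in self._cache:
            self._cache.move_to_end(chunk_id)
            return self._cache[chunk_id]
        chunk = array("d")
        chunk.frombytes(zlib.decompress(self._chunks[chunk_id]))
        self.decompressions += 1
        self._cache[chunk_id] = chunk
        if len(self._cache) > self.cache_chunks:
            self._cache.popitem(last=False)  # evict the least recently used chunk
        return chunk

    def prefetch(self, chunk_id):
        """Decompress a chunk before it is needed (synchronously, for simplicity)."""
        if 0 <= chunk_id < len(self._chunks):
            self._load(chunk_id)

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        chunk_id, offset = divmod(index, self.chunk_len)
        value = self._load(chunk_id)[offset]
        # Assumed access pattern: sequential, so the next chunk is prefetched.
        self.prefetch(chunk_id + 1)
        return value
```

With a sequential scan, each chunk is decompressed once and the next chunk is already resident when the scan reaches it. A real runtime would presumably prefetch asynchronously and support writes and lossy compressors, none of which is modeled here.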
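The abstract also states that, for inter-node communication, messages with the same destination node are aggregated before transmission. A minimal sketch of that step follows, under stated assumptions: `MessageAggregator`, `split_messages`, the `send(node, payload)` callback, and the 4-byte length-prefix framing are hypothetical names and choices for illustration, not the project's API. In the described runtime the payloads would already be compressed; here they are arbitrary bytes.

```python
from collections import defaultdict


class MessageAggregator:
    """Hypothetical sketch: buffer outgoing messages per destination node."""

    def __init__(self, send):
        # `send(node, payload)` is an assumed transport callback.
        self._send = send
        self._pending = defaultdict(list)

    def post(self, dest_node, payload: bytes):
        """Queue a message instead of sending it immediately."""
        self._pending[dest_node].append(payload)

    def flush(self):
        """Send one combined message per destination; return how many were sent."""
        sent = 0
        for node, payloads in self._pending.items():
            # Length-prefix each payload so the receiver can split them again.
            combined = b"".join(
                len(p).to_bytes(4, "little") + p for p in payloads
            )
            self._send(node, combined)
            sent += 1
        self._pending.clear()
        return sent


def split_messages(combined: bytes):
    """Inverse of the framing used in MessageAggregator.flush."""
    out, pos = [], 0
    while pos < len(combined):
        size = int.from_bytes(combined[pos:pos + 4], "little")
        out.append(combined[pos + 4:pos + 4 + size])
        pos += 4 + size
    return out
```

Three posted messages for two destination nodes result in two transmissions, trading a small framing overhead for fewer, larger messages.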
Project outputs
Journal articles (13)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Analyzing the Energy Consumption of Synchronous and Asynchronous Checkpointing Strategies
- DOI: 10.1109/supercheck56652.2022.00006
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Wilkins, Grant; Gossman, Mikaila J.; Nicolae, Bogdan; Smith, Melissa C.; Calhoun, Jon C.
- Corresponding author: Calhoun, Jon C.
Exploring Data Reduction Techniques for Additive Manufacturing Analysis
- DOI: 10.1109/drbsd56682.2022.00008
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Nichols, Coleman; Fulp, Megan Hickman; DeBardeleben, Nathan; Calhoun, Jon C.
- Corresponding author: Calhoun, Jon C.
Posits and the state of numerical representations in the age of exascale and edge computing
- DOI: 10.1002/spe.3022
- Published: 2021-09
- Journal:
- Impact factor: 0
- Authors: Alexandra Poulos; S. Mckee; J. C. Calhoun
- Corresponding author: Alexandra Poulos; S. Mckee; J. C. Calhoun
Towards Combining Error-bounded Lossy Compression and Cryptography for Scientific Data
- DOI: 10.1109/hpec49654.2021.9622874
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Shan, Ruiwen; Di, Sheng; Calhoun, Jon C.; Cappello, Franck
- Corresponding author: Cappello, Franck
ARC: An Automated Approach to Resiliency for Lossy Compressed Data via Error Correcting Codes
- DOI: 10.1145/3431379.3460638
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Dakota Fulp; Alexandra Poulos; Robert Underwood; Jon C. Calhoun
- Corresponding author: Dakota Fulp; Alexandra Poulos; Robert Underwood; Jon C. Calhoun
Other publications by Jon Calhoun
Recovering Detectable Uncorrectable Errors via Spatial Data Prediction
- DOI: 10.1145/3624062.3624120
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Kristen Guernsey; Sarah Placke; Alexandra Poulos; Jon Calhoun
- Corresponding author: Jon Calhoun
Multifacets of lossy compression for scientific data in the Joint-Laboratory of Extreme Scale Computing
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Franck Cappello; Sheng Di; Robert Underwood; Dingwen Tao; Jon Calhoun; Yoshii Kazutomo; Kento Sato; Amarjit Singh; Luc Giraud; Emmanuel Agullo; Xavier Yepes; Mario Acosta; Sian Jin; Jiannan Tian; Frédéric Vivien; Bo Zhang; Kentaro Sano; Tomohiro Ueno; Thomas Grützmacher; H. Anzt
- Corresponding author: H. Anzt
Evaluating the Resiliency of Posits for Scientific Computing
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Benjamin Schlueter; Jon Calhoun; Alexandra Poulos
- Corresponding author: Alexandra Poulos
Lossy and Lossless Compression for BioFilm Optical Coherence Tomography (OCT)
- DOI: 10.1145/3624062.3625125
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: M. Faykus; Jon Calhoun; Melissa C. Smith
- Corresponding author: Melissa C. Smith
Other grants by Jon Calhoun
CDS&E: HAM3R: Heterogeneous Automated Management of Multiscale Methods and Resources
- Award number: 2204011
- Fiscal year: 2022
- Amount: $500,000
- Award type: Standard Grant
SHF: Small: Using Error-Bounded Lossy Compression to Improve High-Performance Computing Systems and Applications
- Award number: 1910197
- Fiscal year: 2019
- Amount: $500,000
- Award type: Standard Grant
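Similar grants from the National Natural Science Foundation of China
Dynamic Credit Rating with Feedback Effects
- Award number:
- Approval year: 2024
- Amount (10,000 CNY):
- Award type: Research Fund for International Scientists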
Similar overseas grants
CAREER: Set-Based Dynamic Modeling and Control for Trustworthy Energy Management Systems
- Award number: 2336007
- Fiscal year: 2024
- Amount: $500,000
- Award type: Standard Grant
CAREER: Dynamic connectivity: a research and educational frontier for sustainable environmental management under climate and land use uncertainty
- Award number: 2340161
- Fiscal year: 2024
- Amount: $500,000
- Award type: Continuing Grant
CAREER: Using Mobile Sensors for Traffic Knowledge Extraction and Dynamic Network Management
- Award number: 1719551
- Fiscal year: 2016
- Amount: $500,000
- Award type: Continuing Grant
CAREER: Self-Organizing Demand Side Management for Smart Grid: A Dynamic Game-Theoretic Framework
- Award number: 1149735
- Fiscal year: 2012
- Amount: $500,000
- Award type: Standard Grant
CAREER: Self-Organizing Demand Side Management for Smart Grid: A Dynamic Game-Theoretic Framework
- Award number: 1253516
- Fiscal year: 2012
- Amount: $500,000
- Award type: Standard Grant
CAREER: A Time-Scale Decomposition Approach to Dynamic-Oriented Resource Management for Wireless Networks
- Award number: 1150169
- Fiscal year: 2012
- Amount: $500,000
- Award type: Continuing Grant
CAREER: Using Mobile Sensors for Traffic Knowledge Extraction and Dynamic Network Management
- Award number: 1055555
- Fiscal year: 2011
- Amount: $500,000
- Award type: Continuing Grant
Self-directed career management in the transition from university to work: Dynamic development and interaction with career and organizational outcomes
- Award number: 193632229
- Fiscal year: 2011
- Amount: $500,000
- Award type: Research Grants
CAREER: SMART: Scalable Adaptive Runtime Management Algorithms and Toolkit for Large-Scale Dynamic Scientific Applications
- Award number: 0953371
- Fiscal year: 2010
- Amount: $500,000
- Award type: Continuing Grant
CAREER: SMART: Scalable Adaptive Runtime Management Algorithms and Toolkit for Large-Scale Dynamic Scientific Applications
- Award number: 1128805
- Fiscal year: 2010
- Amount: $500,000
- Award type: Continuing Grant