OAC Core: Small: Collaborative Research: Scalable Run-Time for Highly Parallel, Heterogeneous Systems
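Basic information
- Award number: 1909015
- Principal investigator:
- Amount: $250,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-07-01 to 2023-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract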
Supercomputing has become an essential tool in many scientific fields, supports advances in engineering and medicine, and contributes to national security. Progress in many areas depends on continued improvements in the performance and usability of supercomputers. Communication between processes is a critical component of this effort and is the target of this project. The project departs from traditional communication protocols and instead focuses on middle-ground solutions between hardware and software. This approach potentially reduces communication overheads, better matches the functionality of the communication library to the capabilities of modern communication adapters, and better matches the library to the requirements of modern parallel computing frameworks and applications. By improving the communication capabilities of computational platforms, this project will promote faster and more flexible communication and shorten the time to completion of scientific applications. It therefore increases the scientific throughput of existing and future cyberinfrastructure platforms. The research and educational outcomes of this project are closely related and will train new generations of researchers and engineers, leading to a more efficient and globally competent workforce. This project thus aligns with the NSF's mission to promote the progress of science and to advance national prosperity and welfare through science, and it serves the national interest. The project brings together a multidisciplinary team and aims to break away from the limitations of standards such as the Message Passing Interface (MPI), pointing the way toward handling the needs of future computational frameworks and high-end systems.
To this end, the project (1) designs and implements a communication library with new communication primitives to enable fast coordination with no serial bottleneck, to manage irregular, fine-grained communication, and to provide new efficient synchronization mechanisms; (2) demonstrates the value of this library by using it to accelerate multiple task-based runtimes (Legion, PaRSEC) and communication libraries (MPI and GASNet); (3) demonstrates the value of hardware support by porting key components to a programmable NIC; and (4) delivers improvements and extensions to mainstream communication libraries to provide the new functionality. This work puts a special emphasis on emerging programming models, such as Legion or PaRSEC, and on emerging application domains, such as graph analytics. It aims at an orthogonal design in which different mechanisms for associating a producer buffer with a consumer buffer can be composed with different mechanisms for synchronizing producer and consumer, and in which mechanisms can be specialized so as to allow efficient hardware support. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
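The orthogonal design described in the abstract can be pictured with a small in-process toy. This is only an illustrative sketch under stated assumptions, not the project's library or API: the class names (`TagMatching`, `DirectAddress`, `CallbackSync`, `CounterSync`) and the `deliver` function are hypothetical, and no network or NIC is involved. It shows one axis choosing how a producer's data is associated with a consumer buffer, and a second, independent axis choosing how the consumer is notified.

```python
from typing import Callable, Dict, List


# Axis 1 (hypothetical): how a producer's data finds a consumer buffer.
class TagMatching:
    """Consumer posts buffers under a tag; the producer names only the tag."""

    def __init__(self) -> None:
        self.posted: Dict[int, List[bytearray]] = {}

    def post(self, tag: int, buf: bytearray) -> None:
        self.posted.setdefault(tag, []).append(buf)

    def resolve(self, target: int) -> bytearray:
        # Consume the first buffer that was posted for this tag.
        return self.posted[target].pop(0)


class DirectAddress:
    """Consumer exposes buffers by slot; the producer names the slot (put-style)."""

    def __init__(self) -> None:
        self.slots: Dict[int, bytearray] = {}

    def post(self, slot: int, buf: bytearray) -> None:
        self.slots[slot] = buf

    def resolve(self, target: int) -> bytearray:
        return self.slots[target]


# Axis 2 (hypothetical): how the consumer learns that the data has arrived.
class CallbackSync:
    def __init__(self, on_complete: Callable[[bytearray], None]) -> None:
        self.on_complete = on_complete

    def signal(self, buf: bytearray) -> None:
        self.on_complete(buf)


class CounterSync:
    def __init__(self) -> None:
        self.count = 0

    def signal(self, buf: bytearray) -> None:
        self.count += 1


def deliver(association, sync, target: int, payload: bytes) -> None:
    """Copy the payload into the resolved buffer, then notify the consumer."""
    buf = association.resolve(target)
    buf[: len(payload)] = payload
    sync.signal(buf)


# Any association mechanism composes with any synchronization mechanism.
received: List[bytes] = []
tags = TagMatching()
tags.post(7, bytearray(4))
deliver(tags, CallbackSync(lambda b: received.append(bytes(b))), 7, b"ping")

slots = DirectAddress()
window = bytearray(4)
slots.post(0, window)
counter = CounterSync()
deliver(slots, counter, 0, b"pong")
```

Because `deliver` depends only on a `resolve` method and a `signal` method, either association class can be paired with either synchronization class, which is the composability the abstract refers to.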
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Callback-based completion notification using MPI Continuations
- DOI: 10.1016/j.parco.2021.102793
- Year published: 2021
- Journal:
- Impact factor: 1.4
- Authors: Schuchart, Joseph; Samfass, Philipp; Niethammer, Christoph; Gracia, José; Bosilca, George
- Corresponding author: Bosilca, George
Other publications by George Bosilca
An evaluation of User-Level Failure Mitigation support in MPI
- DOI: 10.1007/s00607-013-0331-3
- Date published: 2013-05-29
- Journal:
- Impact factor: 2.800
- Authors: Wesley Bland; Aurelien Bouteiller; Thomas Herault; Joshua Hursey; George Bosilca; Jack J. Dongarra
- Corresponding author: Jack J. Dongarra
Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors
- DOI:
- Year published: 2023
- Journal:
- Impact factor: 2.7
- Authors: Sameer Deshmukh; Rio Yokota; George Bosilca
- Corresponding author: George Bosilca
Self-healing network for scalable fault-tolerant runtime environments
- DOI: 10.1016/j.future.2009.04.001
- Date published: 2010-03-01
- Journal:
- Impact factor:
- Authors: Thara Angskun; Graham Fagg; George Bosilca; Jelena Pješivac-Grbović; Jack Dongarra
- Corresponding author: Jack Dongarra
Kernel-assisted and topology-aware MPI collective communications on multicore/many-core platforms
- DOI: 10.1016/j.jpdc.2013.01.015
- Date published: 2013-07-01
- Journal:
- Impact factor:
- Authors: Teng Ma; George Bosilca; Aurelien Bouteiller; Jack J. Dongarra
- Corresponding author: Jack J. Dongarra
Other grants of George Bosilca
Collaborative Research: Frameworks: Production quality Ecosystem for Programming and Executing eXtreme-scale Applications (EPEXA)
- Award number: 1931384
- Fiscal year: 2019
- Amount: $250,000
- Award type: Standard Grant
SPX: Collaborative Research: Cross-layer Application-Aware Resilience at Extreme Scale (CAARES)
- Award number: 1725692
- Fiscal year: 2017
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: SI2-SSI: EVOLVE: Enhancing the Open MPI Software for Next Generation Architectures and Applications
- Award number: 1664142
- Fiscal year: 2017
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: SI2-SSI: Task-Based Environment for Scientific Simulation at Extreme Scale (TESSE)
- Award number: 1450300
- Fiscal year: 2015
- Amount: $250,000
- Award type: Standard Grant
SI2-SSE: Collaborative Research: ADAPT: Next Generation Message Passing Interface (MPI) Library - Open MPI
- Award number: 1339820
- Fiscal year: 2013
- Amount: $250,000
- Award type: Standard Grant
G8 Initiative: Collaborative Research: ECS: Enabling Climate Simulation at Extreme Scale
- Award number: 1063019
- Fiscal year: 2011
- Amount: $250,000
- Award type: Continuing Grant
Collaborative: CSR-AES: System Support for Auto-tuning MPI Applications
- Award number: 0720678
- Fiscal year: 2007
- Amount: $250,000
- Award type: Continuing Grant
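Similar grants from the National Natural Science Foundation of China (NSFC)
Molecular mechanism by which cholesterol hydroxylase CH25H promotes degradation of the hepatitis B virus Core and Pre-core proteins independently of its enzymatic activity
- Award number: 82371765
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program
Development of 5f-in-core GTH pseudopotentials and basis sets for actinides
- Award number: 22303037
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Mechanistic study of Core-matched prodrug co-assemblies built on a synthetic lethality strategy to overcome tumor drug resistance
- Award number:
- Approval year: 2022
- Amount: CNY 520,000
- Award type:
Mechanism by which the Salmonella Typhimurium LPS core promotes dendritic cell migration via CD209/SphK1 and aggravates inflammatory bowel disease
- Award number:
- Approval year: 2022
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Application and mechanism of a core-shell synchronous vascularized bone tissue engineering strategy based on precise exosome regulation
- Award number:
- Approval year: 2020
- Amount: CNY 550,000
- Award type:
Precise preparation and functional exploration of Core M3 mannosyl peptides of dystroglycan
- Award number: 92053110
- Approval year: 2020
- Amount: CNY 700,000
- Award type: Major Research Plan
Molecular mechanism by which deficiency of Core 1 O-glycan mucins induces gastritis and mediates the progression from chronic gastritis to gastric cancer
- Award number: 81902805
- Approval year: 2019
- Amount: CNY 205,000
- Award type: Young Scientists Fund
Core-merging giant impact events in the late accretion stage of the proto-Earth: new insights into core accretion, core-mantle equilibration, and core-mantle boundary structure
- Award number: 41973063
- Approval year: 2019
- Amount: CNY 650,000
- Award type: General Program
RBM38 regulates HBV replication by assisting Pol-ε in binding and recruiting core
- Award number: 31900138
- Approval year: 2019
- Amount: CNY 240,000
- Award type: Young Scientists Fund
Workshop on CORDEX-CORE regional climate simulation and projection
- Award number: 41981240365
- Approval year: 2019
- Amount: CNY 15,000
- Award type: International (Regional) Cooperation and Exchange Program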
Similar overseas grants
Collaborative Research: OAC Core: Small: Anomaly Detection and Performance Optimization for End-to-End Data Transfers at Scale
- Award number: 2412329
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
OAC Core: SHF: SMALL: ICURE -- In-situ Analytics with Compressed or Summary Representations for Extreme-Scale Architectures
- Award number: 2333899
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
OAC Core: SHF: SMALL: ICURE -- In-situ Analytics with Compressed or Summary Representations for Extreme-Scale Architectures
- Award number: 2007775
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: CNS core: OAC core: Small: New Techniques for I/O Behavior Modeling and Persistent Storage Device Configuration
- Award number: 2008324
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: OAC Core: Small: Anomaly Detection and Performance Optimization for End-to-End Data Transfers at Scale
- Award number: 2007789
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: CNS core: OAC core: Small: New Techniques for I/O Behavior Modeling and Persistent Storage Device Configuration
- Award number: 2008072
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
Collaborative Research: OAC Core: Small: Efficient and Policy-driven Burst Buffer Sharing
- Award number: 2008388
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
OAC Core: Small: Collaborative Research: Conversational Agents for Supporting Sustainable Implementation and Systemic Diffusion of Cyberinfrastructure and Science Gateways
- Award number: 2007100
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
OAC Core: SMALL: DeepJIMU: Model-Parallelism Infrastructure for Large-scale Deep Learning by Gradient-Free Optimization
- Award number: 2007976
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant
OAC Core: Small: Collaborative Research: Conversational Agents for Supporting Sustainable Implementation and Systemic Diffusion of Cyberinfrastructure and Science Gateways
- Award number: 2006816
- Fiscal year: 2020
- Amount: $250,000
- Award type: Standard Grant