Collaborative Research: SI2-SSI: EVOLVE: Enhancing the Open MPI Software for Next Generation Architectures and Applications
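Basic information
- Award number: 1663887
- Principal investigator:
- Amount: $308,800
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-06-01 to 2022-05-31
- Project status: Completed
- Source:
- Keywords:
Project abstract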
For nearly two decades, the Message Passing Interface (MPI) has been an essential part of the high-performance computing ecosystem and consequently a key enabler of important scientific breakthroughs. It is a fundamental building block for most large-scale simulations in physics, chemistry, biology, materials science, and engineering. Open MPI is an open-source implementation of the MPI specification, widely used and adopted by the research community as well as industry. The Open MPI library is jointly developed and maintained by a consortium of academic institutions, national labs, and industrial partners. It is installed on virtually all large-scale computer systems in the US as well as in the rest of the world. The goal of this project is to enhance and modernize the Open MPI library in the context of the ongoing evolution of modern computer systems, and to ensure its future operability on all upcoming architectures. We aim to implement fundamental software techniques that can be used in many-core systems to execute MPI-based parallel applications more efficiently, and to tolerate process and memory failures at all scales, from current systems up to the extreme scales expected before the end of the decade.
Open MPI is an open-source implementation of the Message Passing Interface (MPI) specification. The MPI API is currently being extended to consider the needs of application developers in terms of efficiency, productivity, and resilience. The project will also support academic involvement in the design, development, and evaluation of the Open MPI software, and ensure academic presence in the MPI Forum. The goal of this proposal is to enhance the Open MPI software library, focusing on two aspects:
(1) Extend Open MPI to support new features of the MPI specification. Open MPI will continue to support all new features of current and upcoming MPI specifications. The two most significant areas within the context of this proposal are (a) extensions to better support hybrid programming models and (b) support for fault tolerance in MPI applications. To improve support for hybrid programming models, the MPI Forum is currently considering introducing the notion of MPI Endpoints, which could be used by different threads of an MPI rank to instantiate multiple separate communication contexts. The goal within this project is to develop an implementation of endpoints to support effective hybrid programming models, and to extend the concept to other aspects of parallel applications such as file I/O operations. One of the project partners (UTK) leads the current proposal in the MPI Forum to expose failures and ensure the continued execution of MPI applications. In the context of this SSI proposal, the goal is to harden, improve, and expand the support of the existing ULFM (User-Level Failure Mitigation) implementation in Open MPI and thus enable end users to design application-specific resilience approaches for future platforms.
(2) Enhance the Open MPI core to support new architectures and improve scalability. While Open MPI has demonstrated very good scalability in the past, there is significant work to be done to ensure similarly good performance on future architectures. Specifically, we propose a groundbreaking rework of the startup environment that will improve process launch scalability, increase support for asynchronous progress of operations, enable support for accelerators, and reduce sensitivity to system noise. The project will also enhance the support for file I/O operations as part of the Open MPI package by expanding our work on highly scalable collective I/O operations through delegation and by exploring the use of burst buffers as temporary storage.
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
On Overlapping Communication and File I/O in Collective Write Operation
- DOI: 10.1109/ipdpsw50202.2020.00175
- Published: 2020-05
- Journal:
- Impact factor: 0
- Authors: Raafat Feki; E. Gabriel
- Corresponding authors: Raafat Feki; E. Gabriel
Parallel I/O on Compressed Data Files: Semantics, Algorithms, and Performance Evaluation
- DOI: 10.1109/ccgrid49817.2020.00-74
- Published: 2020-05
- Journal:
- Impact factor: 0
- Authors: S. Singh; E. Gabriel
- Corresponding authors: S. Singh; E. Gabriel
Other publications by Edgar Gabriel
A Robust and Efficient Message Passing Library for Volunteer Computing Environments
- DOI: 10.1007/s10723-010-9172-x
- Published: 2010-11-18
- Journal:
- Impact factor: 2.900
- Authors: Rakhi Anand; Troy LeBlanc; Edgar Gabriel; Jaspal Subhlok
- Corresponding author: Jaspal Subhlok
Other grants by Edgar Gabriel
SI2-SSE: Collaborative Research: ADAPT: Next Generation Message Passing Interface (MPI) Library - Open MPI
- Award number: 1339763
- Fiscal year: 2013
- Funding amount: $308,800
- Project type: Standard Grant
SI2-SSI: Collaborative Research: A Glass Box Approach to Enabling Open, Deep Interactions in the HPC Toolchain
- Award number: 1148052
- Fiscal year: 2012
- Funding amount: $308,800
- Project type: Standard Grant
II-NEW: A Heterogeneous Testbed for Exploring Emerging HPC Tools, Programming Languages, and Applications
- Award number: 0958464
- Fiscal year: 2010
- Funding amount: $308,800
- Project type: Continuing Grant
CAREER: Dynamic Run-Time Optimization of Parallel, Adaptive and Hybrid Applications
- Award number: 0846002
- Fiscal year: 2009
- Funding amount: $308,800
- Project type: Continuing Grant
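Similar NSFC (National Natural Science Foundation of China) grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Funding amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Funding amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Year approved: 2008
- Funding amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Funding amount: CNY 450,000
- Project type: General Program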
Similar overseas grants
Collaborative Research: SI2-SSI: Expanding Volunteer Computing
- Award number: 2039142
- Fiscal year: 2020
- Funding amount: $308,800
- Project type: Standard Grant
SI2-SSI: Collaborative Research: Einstein Toolkit Community Integration and Data Exploration
- Award number: 2114580
- Fiscal year: 2020
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: SI2-SSI: Expanding Volunteer Computing
- Award number: 2001752
- Fiscal year: 2019
- Funding amount: $308,800
- Project type: Standard Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743178
- Fiscal year: 2018
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743185
- Fiscal year: 2018
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743180
- Fiscal year: 2018
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743179
- Fiscal year: 2018
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: NISC SI2-S2I2 Conceptualization of CFDSI: Model, Data, and Analysis Integration for End-to-End Support of Fluid Dynamics Discovery and Innovation
- Award number: 1743191
- Fiscal year: 2018
- Funding amount: $308,800
- Project type: Continuing Grant
Collaborative Research: SI2-SSI: Expanding Volunteer Computing
- Award number: 1664022
- Fiscal year: 2017
- Funding amount: $308,800
- Project type: Standard Grant
Collaborative Research: SI2-SSI: Cyberinfrastructure for Advancing Hydrologic Knowledge through Collaborative Integration of Data Science, Modeling and Analysis
- Award number: 1664061
- Fiscal year: 2017
- Funding amount: $308,800
- Project type: Standard Grant