
ExtraPeak - Automatic Performance Modeling of HPC Applications with Multiple Model Parameters


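Basic Information

Grant number:
323299120
Principal investigator:
Professor Dr. Felix Wolf
Amount:
--
Host institution:
Departement Informatik
Host institution country:
Germany
Project category:
Research Grants
Fiscal year:
2017
Funding country:
Germany
Project period:
2016-12-31 to 2021-12-31
Project status:
Completed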

Source:
https://gepris.dfg.de/gepris/projekt/323299120?language=en
Keywords:
ExtraPeak Automatic Performance Modeling HPC

Project Abstract

Applications of ever-increasing complexity combined with rapidly growing volumes of data create an insatiable demand for computing power. However, the operational and procurement costs of the supercomputers needed to run them are tremendous. Minimizing runtime and energy consumption of a code is therefore an economic imperative. Tuning complex HPC applications requires the clever exploration of their design and configuration space. Especially on supercomputers, however, this space is so large that its exhaustive traversal via performance experiments is too expensive, if not impossible. Performance models, which express performance metrics such as the execution time as an equation in parameters such as the number of cores or the size of the input problem, allow this space to be explored more efficiently. Unfortunately, creating performance models manually is extremely laborious for large real-world applications. Further, to ensure that applications are free of performance bugs, it is often not enough to analyze any single aspect, such as processor count or problem size. The effect that one varying parameter has on performance must be understood not only in a vacuum, but also in the context of the variation of other relevant parameters, including algorithmic options, input characteristics, or tuning parameters such as tiling.

Recent advances in automatic empirical performance modeling, i.e., the generation of performance models based on a limited set of performance experiments, aspire to bridge this gap. However, while models with one parameter can be handled quite easily, performance modeling with multiple parameters poses significant challenges, namely (i) the identification of performance-relevant parameters, (ii) the resource-aware design of the required performance experiments, (iii) the diversity of possible model functions, and (iv) the efficient traversal of a complex high-dimensional model search space.

In this project, we will develop an automatic empirical approach that allows performance modeling of any combination of application execution parameters. Solution components to tackle the above challenges include prior source-code analysis and a feedback-guided process of performance-data acquisition and model generation. The goal is to obtain insightful performance models that enable a wide range of uses, from performance predictions for balanced machine design to detailed performance tuning. Our approach will help application developers understand complex performance tradeoffs and ultimately improve the performance of their code. We will integrate our method into the open-source performance-modeling tool Extra-P, which is currently restricted to models with only a single parameter. Finally, as a specific application of our approach, we will devise a novel co-design method that will replace error-prone and time-consuming back-of-the-envelope calculations to define the requirements for future machine procurements.
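
To make the idea of empirical performance modeling with more than one parameter concrete, the following Python sketch fits a model of execution time as a function of a core count p and a problem size n from a small set of measurements. It is an illustration only and is not the algorithm used by Extra-P or by this project: the single-term model shape, the candidate exponent sets, the function names (`fit_line`, `term`, `fit_model`, `predict`), and the synthetic measurements are all assumptions made for the example.

```python
import math
from itertools import product


def fit_line(xs, ys):
    """Least-squares fit of y = c0 + c1 * x. Returns (c0, c1, rss) or None."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # term is constant over the data, so it cannot be fitted
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    rss = sum((y - (c0 + c1 * x)) ** 2 for x, y in zip(xs, ys))
    return c0, c1, rss


def term(p, n, exps):
    """Hypothetical model term p^a * log2(p)^b * n^c."""
    a, b, c = exps
    return p ** a * math.log2(p) ** b * n ** c


def fit_model(measurements, powers=(0, 0.5, 1, 2), log_powers=(0, 1)):
    """Try every candidate term and keep the one with the smallest residual.

    measurements: list of ((p, n), time) pairs.
    The exhaustive search over a small hypothesis set is an assumption of
    this sketch, chosen for simplicity.
    """
    ys = [t for _, t in measurements]
    best = None
    for exps in product(powers, log_powers, powers):
        if exps == (0, 0, 0):
            continue  # the constant is already represented by c0
        xs = [term(p, n, exps) for (p, n), _ in measurements]
        fit = fit_line(xs, ys)
        if fit is not None and (best is None or fit[2] < best["rss"]):
            best = {"c0": fit[0], "c1": fit[1], "rss": fit[2], "exps": exps}
    return best


def predict(model, p, n):
    """Evaluate the fitted model at a configuration that was not measured."""
    return model["c0"] + model["c1"] * term(p, n, model["exps"])


# Synthetic, noise-free measurements generated from t = 5 + 0.01 * n * log2(p).
measurements = [((p, n), 5.0 + 0.01 * n * math.log2(p))
                for p in (2, 4, 8, 16, 32) for n in (100, 200, 400, 800)]
model = fit_model(measurements)
print(model["exps"], predict(model, 64, 1600))
```

Even this toy version hints at the challenges listed above: the number of candidate terms grows multiplicatively with each added parameter, and the 20 measurements used here would each correspond to a full application run in practice.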
