SHF: Medium: Collaborative Research: A comprehensive methodology to pursue reproducible accuracy in ensemble scientific simulations on multi- and many-core platforms
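Basic Information
- Award number: 1728850
- Principal investigator:
- Amount: $370,200
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-01-01 to 2020-05-31
- Status: Completed
- Source:
- Keywords: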
Project Abstract
Ensemble simulations of scientific phenomena typically run for weeks or even months on high-performance computing clusters. The already high level of concurrency of these computing environments is expected to significantly increase in the near future, causing simulations to suffer not only from numerical errors due to limited arithmetic precision but also from the non-determinism in the execution associated with multithreading. Ultimately this trend can compromise the simulation results and break the scientific community's trust in ensemble simulations. This project tackles this problem and defines a methodology to enable the reproducible accuracy of large ensemble simulations on exascale platforms that include multi- and many-core processors. This project moves along two major fronts. First, the investigators identify common sources of accuracy errors and study their accumulation, propagation, and runtime effects in a controlled environment. This phase includes three research activities: (i) generating code motifs that model those computations that may lead to accuracy errors; (ii) providing multiple implementations of these motifs, called code inspectors, targeting different parallel platforms; and (iii) evaluating the accuracy and runtime of these implementations using a variety of datasets and stress conditions. Second, by installing these code inspectors in real scientific code bases, the investigators study their behavior in uncertain environments. This phase includes two research activities: (i) prioritizing code segments based on quantitative impact scores and matching segments to inspector motifs; and (ii) finding the optimal code inspector implementations and patching the code with them so as to optimize the overall result variance. The applications targeted in this project are deterministic chaotic applications including n-body atomic system simulations and astrophysical simulations.
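The non-determinism described in the abstract can be illustrated with a small sketch. It assumes a reduction whose accumulation order depends on thread scheduling, modeled here by summing the same values in every possible order. The helper names `naive_sum` and `results_over_orderings` are hypothetical and are not taken from the project; `math.fsum` from the Python standard library stands in for an order-independent alternative and is not claimed to be one of the project's code inspectors.

```python
import itertools
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


def results_over_orderings(values):
    """Sum the same values in every order and collect the distinct results.

    Each permutation models one possible accumulation order of a
    parallel reduction.
    """
    naive, exact = set(), set()
    for perm in itertools.permutations(values):
        naive.add(naive_sum(perm))
        # math.fsum returns the correctly rounded sum, so its result
        # does not depend on the order of the inputs.
        exact.add(math.fsum(perm))
    return naive, exact


# 1.0 is below the spacing of doubles near 1e16, so it is lost whenever
# it is added to 1e16 or -1e16 before the two large terms cancel.
naive, exact = results_over_orderings([1e16, 1.0, -1e16])
print(sorted(naive))  # [0.0, 1.0]
print(sorted(exact))  # [1.0]
```

The naive accumulation yields two different answers for the same input, depending only on order, while the correctly rounded sum yields one. In a long-running chaotic simulation, a discrepancy of this kind can be amplified over many time steps.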
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Michela Becchi
Editorial: Special Issue on Computing Frontiers
- DOI: 10.1007/s11265-019-1439-2
- Publication date: 2019-01-21
- Journal:
- Impact factor: 1.800
- Authors: Francesca Palumbo; Michela Becchi
- Corresponding author: Michela Becchi
Other grants held by Michela Becchi
SHF: Small: Collaborative Research: Accelerated Data Transformation: A Software-Hardware Stack for Transducers
- Award number: 1907863
- Fiscal year: 2019
- Funding amount: $370,200
- Award type: Standard Grant
CSR: Small: Middleware Technologies for Multi-Accelerator Clusters
- Award number: 1812727
- Fiscal year: 2018
- Funding amount: $370,200
- Award type: Standard Grant
SHF: Small: Collaborative Research: The Automata Programming Paradigm for Genomic Analysis
- Award number: 1740583
- Fiscal year: 2017
- Funding amount: $370,200
- Award type: Standard Grant
CAREER: Compiler and Runtime Support for Irregular Applications on Many-core Processors
- Award number: 1741683
- Fiscal year: 2017
- Funding amount: $370,200
- Award type: Continuing Grant
NeTS: Small: A Language-Based Approach to Deep Packet Inspection: from Theory to Practice
- Award number: 1724934
- Fiscal year: 2017
- Funding amount: $370,200
- Award type: Standard Grant
CAREER: Compiler and Runtime Support for Irregular Applications on Many-core Processors
- Award number: 1452454
- Fiscal year: 2015
- Funding amount: $370,200
- Award type: Continuing Grant
SHF: Medium: Collaborative Research: A comprehensive methodology to pursue reproducible accuracy in ensemble scientific simulations on multi- and many-core platforms
- Award number: 1513603
- Fiscal year: 2015
- Funding amount: $370,200
- Award type: Standard Grant
SHF: Small: Collaborative Research: The Automata Programming Paradigm for Genomic Analysis
- Award number: 1421765
- Fiscal year: 2014
- Funding amount: $370,200
- Award type: Standard Grant
NeTS: Small: A Language-Based Approach to Deep Packet Inspection: from Theory to Practice
- Award number: 1319748
- Fiscal year: 2013
- Funding amount: $370,200
- Award type: Standard Grant
CSR: Small: Scheduling and Virtualization Technologies for Heterogeneous Clusters with Many-core Devices
- Award number: 1216756
- Fiscal year: 2012
- Funding amount: $370,200
- Award type: Standard Grant
Similar overseas grants
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
- Award number: 2403134
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402804
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
- Award number: 2403408
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code
- Award number: 2423813
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402806
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
- Award number: 2403135
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
- Award number: 2403409
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402805
- Fiscal year: 2024
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: High-Performance, Verified Accelerator Programming
- Award number: 2313024
- Fiscal year: 2023
- Funding amount: $370,200
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
- Award number: 2311295
- Fiscal year: 2023
- Funding amount: $370,200
- Award type: Continuing Grant