XPS: EXPL: CCA: Collaborative Research: Nixing Scale Bugs in HPC Applications
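Basic Information
- Award number: 1439002
- Principal investigator:
- Amount: $150,000
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2014
- Funding country: United States
- Period: 2014-09-01 to 2018-08-31
- Project status: Completed
- Source:
- Keywords: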
Project Abstract
Large-scale simulation is a fundamental component of modern science and engineering. Unfortunately, programs written to perform simulations on large-scale parallel computers frequently suffer from software defects that stem from sheer scale and from the variety of parallelization approaches employed. Especially egregious are software bugs that occur when large resource allocations (e.g., memory requests) are made. Formally based active-testing techniques are essential for locating such defects. However, these testing tools are themselves seldom run on parallel machines, let alone at large scale, which makes finding scale bugs with high assurance difficult and very time-consuming. Efforts to parallelize verification tools should reuse existing technology for easy parallelization, result collection, and fault handling. Key innovations of this project include the insight that large-scale verification runs can be described as workflows, which makes it possible to take advantage of already available distributed computing platforms, in particular Swift/T from Argonne. The complementary backgrounds of the PIs are well matched to the need to advance both the formal aspects and distributed verification in the context of three widely used concurrency models: MPI, OpenMP, and CUDA.
This work will help create a public, distributed, formal active-testing framework. The tools and case-study software driving this research will be maintained by the PIs and released freely under open-source licenses through websites and repositories. They will facilitate large-scale debugging of scientific simulation codes by researchers and software developers in academia, government labs, and industry. The project will also generate pedagogical material and best practices, helping educate students in the use of existing workflow-based problem-solving approaches. It will help train present and future scientists, engineers, and programmers, thus helping maintain our nation's leadership in computing, homeland and energy security, and STEM education.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ganesh Gopalakrishnan
Binary Decision Diagrams as Minimal DFA
- DOI: 10.1201/9781315148175-20
- Publication date: 2019-03
- Journal:
- Impact factor: 0
- Authors: Ganesh Gopalakrishnan
- Corresponding author: Ganesh Gopalakrishnan
FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators
- DOI: 10.48550/arxiv.2403.00232
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Xinyi Li; Ang Li; Bo Fang; Katarzyna Swirydowicz; Ignacio Laguna; Ganesh Gopalakrishnan
- Corresponding author: Ganesh Gopalakrishnan
Observations and modeling of symmetric instability in the ocean interior in the Northwestern Equatorial Pacific
- DOI: https://doi.org/10.1038/s43247-022-00362-4
- Publication date: 2022
- Journal:
- Impact factor: 7.9
- Authors: Hui Zhou; William K. Dewar; Wenlong Yang; Hengchang Liu; Xu Chen; Rui Li; Chuanyu Liu; Ganesh Gopalakrishnan
- Corresponding author: Ganesh Gopalakrishnan
Retroperitoneal lymphatics on CT and MR
- DOI: 10.1007/s00261-006-9036-9
- Publication date: 2006-08-31
- Journal:
- Impact factor: 2.200
- Authors: Shalini Govil; Asha Justus; Raghuram Lakshminarayanan; Sukria Nayak; Antony Devasia; Ganesh Gopalakrishnan
- Corresponding author: Ganesh Gopalakrishnan
Observations and modeling of symmetric instability in the ocean interior in the Northwestern Equatorial Pacific
- DOI: 10.1038/s43247-022-00362-4
- Publication date: 2022-02
- Journal:
- Impact factor: 7.9
- Authors: Hui Zhou; William K. Dewar; Wenlong Yang; Hengchang Liu; Xu Chen; Rui Li; Chuanyu Liu; Ganesh Gopalakrishnan
- Corresponding author: Ganesh Gopalakrishnan
Other grants held by Ganesh Gopalakrishnan
REU Site: Trust and Reproducibility of Intelligent Computation
- Award number: 2244492
- Fiscal year: 2023
- Funding amount: $150,000
- Project category: Standard Grant
FMiTF: Track-2: Rigorous and Scalable Formal Floating-Point Error Analysis from LLVM
- Award number: 2319507
- Fiscal year: 2023
- Funding amount: $150,000
- Project category: Standard Grant
Collaborative Research: FMitF: Track-1: Correctness at Both Ends: Rigorous ML Meets Efficient Sparse Implementations
- Award number: 2124100
- Fiscal year: 2021
- Funding amount: $150,000
- Project category: Standard Grant
Collaborative Research: SHF: Medium: Practical and Rigorous Correctness Checking and Correctness Preservation for Irregular Parallel Programs
- Award number: 1956106
- Fiscal year: 2020
- Funding amount: $150,000
- Project category: Standard Grant
FMiTF: Track II: Rigorous and Versatile Float-Point Precision Analysis and Tuning
- Award number: 1918497
- Fiscal year: 2019
- Funding amount: $150,000
- Project category: Standard Grant
SHF: Small: Indy: Toward Safe and Fast Compiler Flags
- Award number: 1817073
- Fiscal year: 2018
- Funding amount: $150,000
- Project category: Standard Grant
SHF: Medium: Hierarchical Tuning of Floating-Point Computations
- Award number: 1704715
- Fiscal year: 2017
- Funding amount: $150,000
- Project category: Standard Grant
2017 Software Infrastructure for Sustained Innovation (SI2) Principal Investigator Workshop
- Award number: 1702722
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
EAGER: Application-driven Data Precision Selection Methods
- Award number: 1643056
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
SI2-SSE: Scalable Multifaceted Graphical Processing Unit (GPU) Program Debugging
- Award number: 1535032
- Fiscal year: 2015
- Funding amount: $150,000
- Project category: Standard Grant
Similar overseas grants
XPS: EXPL: FP: Collaborative Research: SPANDAN: Scalable Parallel Algorithms for Network Dynamics Analysis
- Award number: 1924486
- Fiscal year: 2018
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: Enabling An Ecosystem of Parallel Programming Abstractions
- Award number: 1628929
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: Cache Management for Data Parallel Architecture
- Award number: 1628401
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: Hippogriff: Efficient Heterogeneous Servers for Data Centers and Cloud Services
- Award number: 1629395
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: Exploring the Design Space of Augmented Memory Controllers with Native Support for In-Memory Data Storage
- Award number: 1629201
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: Write Locality Theory and Optimization for Hybrid Memory
- Award number: 1629376
- Fiscal year: 2016
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: DSD: A Memristive Hardware Platform for Large Scale Combinatorial Optimization
- Award number: 1533762
- Fiscal year: 2015
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: CCA: Verification and Optimization Tools for Heterogeneous Memory Consistency Models
- Award number: 1533837
- Fiscal year: 2015
- Funding amount: $150,000
- Project category: Standard Grant
AitF: EXPL: Collaborative Research: Approximate Discrete Programming for Real-Time Systems
- Award number: 1535902
- Fiscal year: 2015
- Funding amount: $150,000
- Project category: Standard Grant
XPS: EXPL: FP: Symmetric Queries as a Building Block for Efficient Parallel Query Evaluation
- Award number: 1606557
- Fiscal year: 2015
- Funding amount: $150,000
- Project category: Standard Grant