Collaborative Research: OAC Core: Enabling Extremely Fine-grained Parallelism on Modern Many-core Architectures
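Basic information
- Award number: 2107283
- Principal investigator:
- Amount: $166,300
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-07-01 to 2024-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract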
Computer systems are becoming increasingly complex: multisocket systems with many-core processors and general-purpose graphics processors have the potential to address the needs of demanding applications at the node level. Programmability and efficiency are often hard to achieve together because hardware parallelism has grown by several orders of magnitude, to thousands of computing units on a chip. Task parallelism is an important type of parallelism in which computation is broken down into a set of inter-dependent tasks that can be executed concurrently on various computing units. To achieve strong scaling and high levels of effective parallelism, today's parallel languages increasingly need to support over-decomposition (many more tasks than cores) in order to improve performance, hide latency caused by blocking operations, and otherwise achieve maximum speedup. Efficient support for fine-grained parallelism across the growing range of scales seen in modern and future hardware is expected to enhance the productivity of parallel programmers. Trends suggest that most of the Top500 high-performance computing systems will likely employ hardware that this work directly targets. The project aims to conduct a high-impact education program in distributed parallel programming with broad reach, encouraging student internships grounded in real-world challenges, and paving the way for technology transfer from research to open-source projects. Special emphasis is placed on engaging women and underrepresented minorities. This education facet will create a new and more accessible foundation for fluency in parallel computing for scientists and engineers.
This work explores novel data structures and algorithms that allow for scalable runtime and execution models for fine-grained parallelism at sub-microsecond timescales. Preliminary work by the PIs at the language and runtime levels suggests a path to achieving this.
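As a rough illustration of the over-decomposition idea described in this abstract, the sketch below splits one computation into many more tasks than workers, so that a fixed pool always has queued work to pick up. It is a minimal Python sketch and not the project's runtime: the names `chunk_bounds`, `sum_of_squares`, and `overdecomposed_sum`, and the `tasks_per_worker` factor of 16, are assumptions made for illustration. Standard CPython threads are also unlikely to speed up this CPU-bound loop, so the example shows only the task structure, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n_items, n_tasks):
    """Split range(n_items) into n_tasks contiguous, near-equal (start, stop) pairs."""
    base, extra = divmod(n_items, n_tasks)
    bounds, start = [], 0
    for i in range(n_tasks):
        # The first `extra` chunks take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def sum_of_squares(bounds):
    """One fine-grained task: sum i*i over a half-open index range."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def overdecomposed_sum(n_items, n_workers, tasks_per_worker=16):
    """Create many more tasks than workers (illustrative factor, an assumption)."""
    n_tasks = max(1, min(n_items, n_workers * tasks_per_worker))
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map queues every task up front; idle workers pull the next one.
        return sum(pool.map(sum_of_squares, chunk_bounds(n_items, n_tasks)))
```

With 4 workers and the default factor, `overdecomposed_sum(1000, 4)` queues 64 tasks of roughly 16 items each. Shrinking the tasks further improves load balance but raises per-task scheduling overhead, which is the cost the abstract's sub-microsecond runtime work aims to reduce.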
The project objectives are: 1) a unifying runtime that enables task granularities measured in cycles: design, analysis, and implementation of building blocks for efficient fine-grained computing on diverse node hardware; 2) evaluating the performance of these building blocks in the context of real parallel systems and application kernels on a range of computer architectures; 3) measuring the performance and scalability impact of the runtime on benchmark kernels and real applications; and 4) integrating this research with education programs from undergraduate to graduate levels through new course material on parallel computing. This high-risk/high-reward research is geared towards yielding transformative improvements in the ease and efficiency of programming parallel machines at every scale. The contributions lie in the realization of productive, implicitly parallel high-level languages optimized for single-node deployments on many-core architectures, supporting fine-grained parallelism measured in cycles and enabling an entirely new class of many-task computing applications. The dataflow architecture makes implicit parallelism tractable with a programming model whose impact could rival that of MATLAB, R, and Python, with the added benefit that the same code could also run on a distributed system or on large-scale HPC systems. Thus, a scientist would be able to write a program once, run it at any suitable scale, and have it seamlessly use the most appropriate granularity for each component of the hardware. This work's innovations in dataflow architecture will be broadly applicable to a number of existing parallel programming systems such as OpenMP, Swift/Parsl, and CUDA/OpenCL, both in executing fine-grained parallelism efficiently and in adding support for implicit parallelism where possible.
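The dataflow style of implicit parallelism mentioned in this abstract can be sketched in a few lines: the programmer passes futures between ordinary function calls, and a task starts only once all of its inputs are available. The Python sketch below is an assumption-laden illustration, not the project's language or runtime. `DataflowPool` is a hypothetical helper built on the standard `concurrent.futures` module, and it constructs `Future` objects directly, which the standard library intends mainly for executor implementations and tests.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class DataflowPool:
    """Hypothetical helper: run fn once every Future among its arguments is done."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, fn, *args):
        result = Future()
        deps = [a for a in args if isinstance(a, Future)]
        remaining = len(deps)
        lock = threading.Lock()

        def run():
            try:
                # All dependencies are done here, so result() does not block.
                values = [a.result() if isinstance(a, Future) else a for a in args]
                result.set_result(fn(*values))
            except Exception as exc:
                # A failed input or a failing fn marks this task as failed too.
                result.set_exception(exc)

        def on_dep_done(_):
            nonlocal remaining
            with lock:
                remaining -= 1
                ready = remaining == 0
            if ready:
                self._pool.submit(run)

        if not deps:
            self._pool.submit(run)
        for dep in deps:
            # Fires immediately if dep is already done.
            dep.add_done_callback(on_dep_done)
        return result

    def shutdown(self):
        # Call only after collecting results; tasks still waiting on inputs
        # cannot be scheduled once the pool is shut down.
        self._pool.shutdown(wait=True)
```

For example, submitting `pow(2, 10)` and `pow(3, 4)` and then an addition that takes both returned futures lets the two `pow` calls run independently, with the addition scheduled only when both finish. The dependency graph is never written out by the programmer; it follows from which futures are passed where.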
Target hardware includes Intel/AMD x86, ThunderX/2 ARM, IBM Power9, and NVIDIA/AMD GPUs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal papers (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Enabling Extremely Fine-grained Parallelism via Scalable Concurrent Queues on Modern Many-core Architectures
- DOI: 10.1109/mascots53633.2021.9614292
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Nookala, Poornima; Dinda, Peter; Hale, Kyle C.; Chard, Kyle; Raicu, Ioan
- Corresponding author: Raicu, Ioan
Other publications by Kyle Chard
Walking the cost-accuracy tightrope: balancing trade-offs in data-intensive genomics
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: K. Leung; M. Kimball; Jason Pitt; A. Woodard; Kyle Chard
- Corresponding author: Kyle Chard
GreenFaaS: Maximizing Energy Efficiency of HPC Workloads with FaaS
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Alok V. Kamatar; Valerie Hayot; Y. Babuji; André Bauer; Gourav Rattihalli; Ninad Hogade; D. Milojicic; Kyle Chard; Ian Foster
- Corresponding author: Ian Foster
Enabling real-time multi-messenger astrophysics discoveries with deep learning
- DOI: 10.1038/s42254-019-0097-4
- Published: 2019-10-03
- Journal:
- Impact factor: 39.500
- Authors: E. A. Huerta; Gabrielle Allen; Igor Andreoni; Javier M. Antelis; Etienne Bachelet; G. Bruce Berriman; Federica B. Bianco; Rahul Biswas; Matias Carrasco Kind; Kyle Chard; Minsik Cho; Philip S. Cowperthwaite; Zachariah B. Etienne; Maya Fishbach; Francisco Forster; Daniel George; Tom Gibbs; Matthew Graham; William Gropp; Robert Gruendl; Anushri Gupta; Roland Haas; Sarah Habib; Elise Jennings; Margaret W. G. Johnson; Erik Katsavounidis; Daniel S. Katz; Asad Khan; Volodymyr Kindratenko; William T. C. Kramer; Xin Liu; Ashish Mahabal; Zsuzsa Marka; Kenton McHenry; J. M. Miller; Claudia Moreno; M. S. Neubauer; Steve Oberlin; Alexander R. Olivas; Donald Petravick; Adam Rebei; Shawn Rosofsky; Milton Ruiz; Aaron Saxton; Bernard F. Schutz; Alex Schwing; Ed Seidel; Stuart L. Shapiro; Hongyu Shen; Yue Shen; Leo P. Singer; Brigitta M. Sipocz; Lunan Sun; John Towns; Antonios Tsokaros; Wei Wei; Jack Wells; Timothy J. Williams; Jinjun Xiong; Zhizhen Zhao
- Corresponding author: Zhizhen Zhao
Unveiling Temporal Performance Deviation: Leveraging Clustering in Microservices Performance Analysis
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: André Bauer; Timo Dittus; Martin Straesser; Alok V. Kamatar; Matt Baughman; Lukas Beierlieb; Marius Hadry; Daniel Grillmeyer; Yannik Lubas; Samuel Kounev; Ian Foster; Kyle Chard
- Corresponding author: Kyle Chard
A terminology for scientific workflow systems
- DOI: 10.1016/j.future.2025.107974
- Published: 2026-01-01
- Journal:
- Impact factor: 6.100
- Authors: Frédéric Suter; Tainã Coleman; İlkay Altintaş; Rosa M. Badia; Bartosz Balis; Kyle Chard; Iacopo Colonnelli; Ewa Deelman; Paolo Di Tommaso; Thomas Fahringer; Carole Goble; Shantenu Jha; Daniel S. Katz; Johannes Köster; Ulf Leser; Kshitij Mehta; Hilary Oliver; J.-Luc Peterson; Giovanni Pizzi; Loïc Pottier; Rafael Ferreira da Silva
- Corresponding author: Rafael Ferreira da Silva
Other grants by Kyle Chard
Collaborative Research: Frameworks: Diamond: Democratizing Large Neural Network Model Training for Science
- Award number: 2311769
- Fiscal year: 2023
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: REU Site: BigDataX: From theory to practice in Big Data computing at eXtreme scales
- Award number: 2150501
- Fiscal year: 2022
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: Sustainability: A Community-Centered Approach for Supporting and Sustaining Parsl
- Award number: 2209919
- Fiscal year: 2022
- Amount: $166,300
- Project type: Standard Grant
Frameworks: Collaborative Research: ChronoLog: A High-Performance Storage Infrastructure for Activity and Log Workloads
- Award number: 2104008
- Fiscal year: 2021
- Amount: $166,300
- Project type: Standard Grant
CCRI: Planning: Collaborative Research: Infrastructure for Enabling Systematic Development and Research of Scientific Workflow Management Systems
- Award number: 2016682
- Fiscal year: 2020
- Amount: $166,300
- Project type: Standard Grant
CSR: Small: Cost-Aware Cloud Profiling, Prediction, and Provisioning as a Service
- Award number: 1816611
- Fiscal year: 2018
- Amount: $166,300
- Project type: Standard Grant
REU Site: Collaborative Research: BigDataX: From theory to practice in Big Data computing at eXtreme scales
- Award number: 1757970
- Fiscal year: 2018
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: SI2-SSI: Swift/E: Integrating Parallel Scripted Workflow into the Scientific Software Ecosystem
- Award number: 1550588
- Fiscal year: 2016
- Amount: $166,300
- Project type: Standard Grant
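Similar NSFC grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Project type: General Program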
Similar overseas grants
Collaborative Research: OAC CORE: Federated-Learning-Driven Traffic Event Management for Intelligent Transportation Systems
- Award number: 2414474
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403312
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: Large-Scale Spatial Machine Learning for 3D Surface Topology in Hydrological Applications
- Award number: 2414185
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: Learning AI Surrogate of Large-Scale Spatiotemporal Simulations for Coastal Circulation
- Award number: 2402947
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: Distributed Graph Learning Cyberinfrastructure for Large-scale Spatiotemporal Prediction
- Award number: 2403313
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: Learning AI Surrogate of Large-Scale Spatiotemporal Simulations for Coastal Circulation
- Award number: 2402946
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403088
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403090
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC: Core: Harvesting Idle Resources Safely and Timely for Large-scale AI Applications in High-Performance Computing Systems
- Award number: 2403399
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant
Collaborative Research: OAC Core: CropDL - Scheduling and Checkpoint/Restart Support for Deep Learning Applications on HPC Clusters
- Award number: 2403089
- Fiscal year: 2024
- Amount: $166,300
- Project type: Standard Grant