Collaborative Research: SHF:SMALL: Compile-Parallelize-Schedule-Retarget-Repeat (EASER) Paradigm for Dealing with Extreme Heterogeneity

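Basic information

  • Award number:
    2146852
  • Principal investigator:
  • Amount:
    $250,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-06-15 to 2023-07-31
  • Project status:
    Completed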

Project abstract

Heterogeneity in computing refers to having a variety of devices present within one computing system or even within one node of a cluster. A number of technological trends are making a high degree of heterogeneity inevitable in High Performance Computing (HPC), leading to research along many directions. The traditional scheduling problem, which refers to taking a set of programs to be executed and mapping them to the available resources, becomes more complicated in the presence of such heterogeneity, as the schedulers need to interact with the compiler also. The goal of this project is to consider new paradigms for application execution in view of these developments and conduct research in developing predictions of execution times, compilation, parallelization, and scheduling. Traditionally, deciding (likely manually) how an application is to be parallelized, compilation, and cluster-level scheduling are done sequentially and independently. The investigators posit that their isolated treatment is not going to be acceptable when one tries to optimize for multi-tenant heterogeneous clusters. Instead, the investigators envision a requirement that can be referred to as EASER -- compilE-pArallelize-Schedule-rEtarget-Repeat. To elaborate on the vision, in the EASER paradigm the compiler first maps the core functions to a specific device, generating predictions of execution time that are input to the parallelization approach selection module, and together they produce a final executable. Subsequently, this binary is presented to the scheduler, which assesses the job queue and might suggest alternative configuration(s)/device(s). If so, a retargeting module is to be invoked, leading to a potential repetition of the above steps. This project develops, supports, and evaluates the EASER framework in the context of a cluster that executes emerging machine learning (ML) workloads. 
Research is proposed in the following areas: 1) Compiler-Driven Performance Prediction -- It includes a novel strategy that comprises a general model for predicting SIMD/VLIW performance and an operator classification based approach to developing a memory hierarchy performance model. 2) Integrated Job Scheduling and Parallelization Strategy Selection -- Building on the performance prediction models, these two (conventionally independent) modules are integrated, by including parameterized and incremental parallelization strategy selection methods and aggressively reducing the search space in scheduling methods. 3) Retargeting Compiler -- By classifying optimizations as either architecture-dependent or independent, a retargeting compiler for ML workloads will be developed. This project will also make several contributions to education and human resource development. Both investigators will be introducing course(s) (material) at the intersection of computer systems and machine learning, bringing attention to ML-related workloads in computer systems education. A majority of funds at each University will be used to support Ph.D. students in their research, who will be trained to work across traditional (sub-) areas. Both investigators are strongly committed to increasing diversity in computing fields and have a strong record of supervising members of underrepresented groups in their research programs. Building on their Universities' existing connections, they will be further working on improving diversity at all levels. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
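
The compile, parallelize, schedule, retarget, repeat loop described in the abstract can be pictured with a minimal Python sketch. This is only an illustration of the control flow, not the project's implementation: the device names, the `PREDICTED_SECONDS` cost table, the strategy rule, and every function name here are hypothetical stand-ins for the compiler's performance model, the parallelization selector, and the cluster scheduler.

```python
from dataclasses import dataclass

# Hypothetical cost table standing in for compiler-driven performance prediction.
PREDICTED_SECONDS = {"gpu": 2.0, "cpu": 8.0}


@dataclass
class Plan:
    device: str
    strategy: str
    predicted_seconds: float


def compile_for(device):
    """Compile step: map core functions to a device and predict execution time."""
    return PREDICTED_SECONDS[device]


def select_strategy(predicted_seconds):
    """Parallelize step: pick a strategy from the prediction (toy rule, an assumption)."""
    return "data-parallel" if predicted_seconds > 4.0 else "single-device"


def schedule(plan, queue_wait):
    """Schedule step: return a device expected to finish sooner, or None to accept."""
    def finish(device):
        return queue_wait[device] + PREDICTED_SECONDS[device]

    best = min(queue_wait, key=finish)
    return best if finish(best) < finish(plan.device) else None


def easer(initial_device, queue_wait, max_rounds=4):
    """Run the loop until the scheduler accepts the plan or the round budget is spent."""
    device = initial_device
    for _ in range(max(1, max_rounds)):
        predicted = compile_for(device)
        plan = Plan(device, select_strategy(predicted), predicted)
        suggestion = schedule(plan, queue_wait)
        if suggestion is None:
            return plan
        device = suggestion  # retarget to the suggested device, then repeat
    return plan


if __name__ == "__main__":
    # The GPU queue is long, so the scheduler suggests retargeting to the CPU.
    print(easer("gpu", {"gpu": 30.0, "cpu": 1.0}))
```

In this toy run the first plan targets the GPU, the scheduler sees a long GPU queue and suggests the CPU, and the second pass recompiles for the CPU and picks a different parallelization strategy before the plan is accepted. The round limit is one simple way to bound the repetition; the abstract does not say how the real framework terminates.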

Project outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
End-to-End LU Factorization of Large Matrices on GPUs

Other publications by Gagan Agrawal

MMIS-07, 08: Mining Multiple Information Sources Workshop Report
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xingquan Zhu; Gagan Agrawal; Yuri Breitbart; Ruoming Jin
  • Corresponding author:
    Ruoming Jin
Middleware for data mining applications on clusters and grids
  • DOI:
    10.1016/j.jpdc.2007.06.007
  • Publication date:
    2008-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Leonid Glimcher; Ruoming Jin; Gagan Agrawal
  • Corresponding author:
    Gagan Agrawal
POSTER: MDS-044 Cancer Disparities in Survival of Patients With Hematologic Malignancies in the Context of Social Determinants of Health: A Systematic Review
  • DOI:
    10.1016/s2152-2650(23)00577-3
  • Publication date:
    2023-09-01
  • Journal:
  • Impact factor:
  • Authors:
    Marisol Miranda-Galvis; Kellen Tjioe; Andrew Balas; Gagan Agrawal; Jorge Cortes
  • Corresponding author:
    Jorge Cortes
Organizing Records for Retrieval in Multi-Dimensional Range Searchable Encryption
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mahdieh Heidaripour; Ladan Kian; Maryam Rezapour; Mark Holcomb; Benjamin Fuller; Gagan Agrawal; Hoda Maleki
  • Corresponding author:
    Hoda Maleki
The interaction between social determinants of health and cervical cancer survival: A systematic review
  • DOI:
    10.1016/j.ygyno.2023.12.020
  • Publication date:
    2024-02-01
  • Journal:
  • Impact factor:
    4.100
  • Authors:
    Kellen Cristine Tjioe; Marisol Miranda-Galvis; Marian Symmes Johnson; Gagan Agrawal; E. Andrew Balas; Jorge E. Cortes
  • Corresponding author:
    Jorge E. Cortes

Other grants by Gagan Agrawal

Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)
  • Award number:
    2230945
  • Fiscal year:
    2023
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)
  • Award number:
    2341378
  • Fiscal year:
    2023
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
OAC Core: SHF: SMALL: ICURE -- In-situ Analytics with Compressed or Summary Representations for Extreme-Scale Architectures
  • Award number:
    2333899
  • Fiscal year:
    2023
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
SHF: Small: K-Way Speculation for Mapping Applications with Dependencies on Modern HPC Systems
  • Award number:
    2334273
  • Fiscal year:
    2023
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF:SMALL: Compile-Parallelize-Schedule-Retarget-Repeat (EASER) Paradigm for Dealing with Extreme Heterogeneity
  • Award number:
    2333895
  • Fiscal year:
    2023
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
OAC Core: SHF: SMALL: ICURE -- In-situ Analytics with Compressed or Summary Representations for Extreme-Scale Architectures
  • Award number:
    2007775
  • Fiscal year:
    2020
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
OAC Core: SHF: SMALL: ICURE -- In-situ Analytics with Compressed or Summary Representations for Extreme-Scale Architectures
  • Award number:
    2034850
  • Fiscal year:
    2020
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
SHF: Small: K-Way Speculation for Mapping Applications with Dependencies on Modern HPC Systems
  • Award number:
    2007793
  • Fiscal year:
    2020
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
II-New: Infrastructure for Energy-Aware High Performance Computing (HPC) and Data Analytics on Heterogeneous Systems
  • Award number:
    1513120
  • Fiscal year:
    2015
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
SI2-SSE: Collaborative Research: Software Elements for Transfer and Analysis of Large-Scale Scientific Data
  • Award number:
    1339757
  • Fiscal year:
    2013
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant

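Similar grants from the National Natural Science Foundation of China (NSFC)

Construction of firmly bonded interfaces and friction-reducing lubrication mechanisms in hydrogel-modified ceramic artificial joints
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Domain dynamics of lead zirconate-based antiferroelectrics and their regulation mechanisms
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Adsorption and immobilization of soil cadmium contamination by iron-loaded biochar and the mechanisms of microbial synergy
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Mechanism by which the SREBP transcription factor BbSre1 negatively regulates production of antifungal substances in Beauveria bassiana
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Time-varying encoding of joint-motion feedback in myoelectric prosthetic hands for restoring movement perception in amputees
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Carbon dot/microfluidic confinement-enhanced luminescence sensing for rapid emergency water-quality testing
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Physics-informed hybrid modeling and non-collocated control methods for flexible piezoelectric solar arrays
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Regularity of the stochastic three-dimensional Burgers equation
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Mechanism by which kynurenine promotes cardiac fibrosis in chronic kidney disease by activating granulocytic MDSCs via the AhR/STAT3 axis
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project
Machine learning studies of magnetism, centered on graph neural networks
  • Award number:
  • Approval year:
    2025
  • Funding amount:
    0.0 (unit: 10,000 CNY)
  • Project type:
    Provincial/municipal-level project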

Similar overseas grants

Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
  • Award number:
    2331302
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Small: LEGAS: Learning Evolving Graphs At Scale
  • Award number:
    2331301
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
  • Award number:
    2403134
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Small: Efficient and Scalable Privacy-Preserving Neural Network Inference based on Ciphertext-Ciphertext Fully Homomorphic Encryption
  • Award number:
    2412357
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
  • Award number:
    2402804
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
  • Award number:
    2403408
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code
  • Award number:
    2423813
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
  • Award number:
    2402806
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
  • Award number:
    2403135
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Tiny Chiplets for Big AI: A Reconfigurable-On-Package System
  • Award number:
    2403409
  • Fiscal year:
    2024
  • Funding amount:
    $250,000
  • Project type:
    Standard Grant