OAC: Small: Data Locality Optimization for Sparse Matrix/Tensor Computations

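Basic information

Award number:
2009007
Principal investigator:
Ponnuswamy Sadayappan
Amount:
$499,400 (approx.)
Host institution:
University of Utah
Host institution country:
United States
Project type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Project period:
2020-07-01 to 2024-06-30
Project status:
Completed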

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2009007&HistoricalAwards=false
Keywords:
OAC Small Data Locality Optimization

Project abstract

The cost of data movement vastly exceeds the cost of executing arithmetic operations on current computers, and the imbalance is only expected to get worse. Hence the minimization of data movement in the implementation of algorithms is critical. Tiling is a well-known technique for data-locality optimization and is widely used in compilers as well as in high-performance numerical libraries for dense matrix/tensor computations. However, data-locality optimization for sparse computations is a significant challenge, in large part because the data access patterns are not known a priori. This project proposes a plan of research to systematically explore a number of issues pertaining to data-locality optimization for sparse matrix/tensor computations. The project identifies an important subclass of sparse computations used in machine learning and data analytics, and proposes tools and techniques to enable high-performance parallel implementations on multicore CPUs and GPUs. The broader impact of the project will be the enhancement of programmer productivity and the enabling of software portability and high performance for applications in data analytics and machine learning.

The challenge of data-locality optimization for the data-dependent and irregular access patterns that occur with sparse matrix/tensor computations will be addressed through research along multiple directions:
1) Compact signatures for sparse matrices: the strong relationship between the data access patterns for key sparse matrix primitives of use in machine learning and data analytics drives the development of one-dimensional signature vectors that capture the essential characteristics of the two-dimensional sparsity pattern as it pertains to needed data movement in a memory hierarchy;
2) Sparse tiling: Sparse matrix signature vectors will serve as a basis for dynamic decisions, based on target platform characteristics, for tile size selection and scheduling of tiles for load-balanced execution;
3) Matrix renumbering/reordering: The impact of row/column reordering on the performance of sparse matrix primitives will be investigated, and new reordering schemes will be devised to enhance data locality for key sparse matrix/tensor primitives;
4) Sparse microkernels: Microkernels will be developed and optimized for CPUs/GPUs, and used as the lowest-level building blocks that execute the innermost tiles in the tiled execution of sparse matrix/tensor computations;
5) Architecture-aware performance prediction: Models will be developed that combine analysis of predicted data-movement volume with machine learning using algorithmic and architectural features.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
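As a rough illustration of directions 1) and 2) in the abstract, the sketch below computes a one-dimensional summary of a CSR sparsity pattern and runs a sparse-times-dense product one row panel at a time. The signature used here (the number of distinct columns touched by each row panel), the function names, and the panel size are assumptions made for illustration only; they are not the signature definition or tiling scheme developed in the project.

```python
def dense_to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays (row_ptr, col_idx, vals)."""
    row_ptr, col_idx, vals = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals


def panel_signature(row_ptr, col_idx, panel_rows):
    """Hypothetical 1-D signature: distinct columns touched by each row panel.

    In a sparse-times-dense product, this count equals the number of rows of
    the dense operand that the panel has to read.
    """
    n_rows = len(row_ptr) - 1
    sig = []
    for start in range(0, n_rows, panel_rows):
        end = min(start + panel_rows, n_rows)
        sig.append(len(set(col_idx[row_ptr[start]:row_ptr[end]])))
    return sig


def spmm_row_panels(row_ptr, col_idx, vals, B, panel_rows):
    """Compute C = A @ B with A in CSR, processing one row panel (tile) at a time."""
    n_rows = len(row_ptr) - 1
    n_cols_b = len(B[0])
    C = [[0.0] * n_cols_b for _ in range(n_rows)]
    for start in range(0, n_rows, panel_rows):
        end = min(start + panel_rows, n_rows)
        for i in range(start, end):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                a, b_row = vals[k], B[col_idx[k]]
                for j in range(n_cols_b):
                    C[i][j] += a * b_row[j]
    return C


# Small made-up example: a 4x4 sparse matrix A and a 4x2 dense matrix B.
A = [[1, 0, 0, 2],
     [0, 0, 0, 3],
     [0, 4, 0, 0],
     [0, 5, 6, 0]]
B = [[1, 0], [0, 1], [1, 1], [2, 0]]

row_ptr, col_idx, vals = dense_to_csr(A)
print(panel_signature(row_ptr, col_idx, panel_rows=2))  # -> [2, 2]
print(spmm_row_panels(row_ptr, col_idx, vals, B, panel_rows=2))
```

A lower signature value for a panel means fewer rows of the dense operand need to be brought into fast memory for that panel, which is the kind of information a tile-size or scheduling heuristic could plausibly use.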

Project outcomes

Journal papers (3)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Sparsity-Aware Tensor Decomposition

DOI:
10.1109/ipdps53621.2022.00097
Publication date:
2022
Journal:
2022 IEEE International Parallel and Distributed Processing Symposium
Impact factor:
0
Authors:
Kurt, Sureyya Emre;Raje, Saurabh;Sukumaran-Rajam, Aravind;Sadayappan, P.
Corresponding author:
Sadayappan, P.

Communication Optimization for Distributed Execution of Graph Neural Networks

DOI:
10.1109/ipdps54959.2023.00058
Publication date:
2023-05
Journal:
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Impact factor:
0
Authors:
Süreyya Emre Kurt;Jinghua Yan;Aravind Sukumaran-Rajam;Prashant Pandey;P. Sadayappan
Corresponding authors:
Süreyya Emre Kurt;Jinghua Yan;Aravind Sukumaran-Rajam;Prashant Pandey;P. Sadayappan

Efficient Tiled Sparse Matrix Multiplication through Matrix Signatures

DOI:
10.1109/sc41405.2020.00091
Publication date:
2020-11
Journal:
SC20: International Conference for High Performance Computing, Networking, Storage and Analysis
Impact factor:
0
Authors:
Süreyya Emre Kurt;Aravind Sukumaran-Rajam;F. Rastello;P. Sadayappan
Corresponding authors:
Süreyya Emre Kurt;Aravind Sukumaran-Rajam;F. Rastello;P. Sadayappan