Collaborative Research: PPoSS: LARGE: A Full-Stack Architecture for Sparse Computation

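Basic information

Award number:
2217099
Principal investigator:
Daniel Sanchez Martin
Amount:
$2.25 million
Institution:
Massachusetts Institute of Technology
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-10-01 to 2027-09-30
Project status:
Ongoing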

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2217099&HistoricalAwards=false
Keywords:
Collaborative Research PPoSS LARGE Full

Project abstract

Computer systems have been designed and optimized primarily for dense computations, i.e., those that process regularly structured data. But current systems are ill-suited to sparse computations, i.e., those that process unstructured data. Sparse computations are very common because many relations and interactions are sparse. For example, most people are not friends and most neurons are not directly connected. Sparse computations take advantage of this sparsity by encoding and processing only meaningful relations, such as storing only the non-zero elements of a matrix. These applications are crucial in many domains, like deep learning, data analytics, and scientific computing, but their irregular structure makes them inefficient and hard to scale in current systems, wasting billions of dollars yearly. This project aims to redesign the computing stack to provide first-class support for sparse computations. The project's novelties include a full system stack that spans programming languages, compilers, and specialized hardware architectures and large-scale computer systems. The project's impacts include making future parallel systems much more versatile, scalable, energy efficient and easier to program.

This project takes a coordinated approach across the system stack to unlock the performance and scalability of sparse computations, because they pose challenges that cannot be addressed at a single layer. For example, sparse computations have a rich space of choices in algorithm, data representation, and schedule, which current languages and compilers cannot capture or optimize properly. The right choice of algorithm and data representation are often unknown in advance and may change at run-time, thwarting the rigid division between current compilers and schedulers. Irregular, data-dependent control and memory accesses stymie compiler analysis, hinder parallelization, make poor use of hardware, and introduce numerous side channels that thwart security. Finally, their data-intensive nature is a poor match to the processors and accelerators pervasive in current clusters and datacenters, which optimize for compute operations rather than to minimize data movement. To tackle these challenges, this project will develop a full system stack spanning domain-specific languages, a tightly integrated compiler and scheduler, and specialized hardware architectures and high-performance, multi-node computer systems and networks. This stack is built around a unifying abstraction, a novel sparse intermediate representation that (1) encodes semantic information on key sparse data structures and their iterations, (2) enables optimizing compiler transformations and dynamic scheduling decisions, and (3) can be easily compiled to parallel architectures, including graphics processing units (GPUs), general-purpose processors, our proposed specialized architecture, and their combination. The full stack will be designed with security at the forefront, leveraging novel cross-layer techniques to achieve secure high performance. This system will be rigorously evaluated using a broad set of sparse applications and at a wide range of system scales, including large-scale clusters with hundreds of GPUs or tens of specialized processors. By innovating across the full software and hardware stack, these techniques will achieve performance, scalability, and efficiency gains that single-layer approaches cannot provide.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
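As an illustration of the abstract's example of storing only the non-zero elements of a matrix, the sketch below uses the widely known compressed sparse row (CSR) layout. It is not taken from the project: the function names `dense_to_csr` and `csr_matvec` and the sample matrix are hypothetical, chosen only for this example. The indirect lookup `vec[col_idx[k]]` is one instance of the data-dependent memory access the abstract describes.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Hypothetical helper for illustration; only non-zero entries are stored.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)   # the non-zero value itself
                col_idx.append(j)  # the column it came from
        # row_ptr[i + 1] marks where row i ends in `values`
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector, visiting only stored entries."""
    result = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect, data-dependent access: the index comes from the data.
            acc += values[k] * vec[col_idx[k]]
        result.append(acc)
    return result


dense = [
    [5, 0, 0, 0],
    [0, 0, 8, 0],
    [0, 0, 0, 0],
    [0, 6, 0, 3],
]
values, col_idx, row_ptr = dense_to_csr(dense)
y = csr_matvec(values, col_idx, row_ptr, [1, 2, 3, 4])
print(values, col_idx, row_ptr, y)
```

Here the 16-entry matrix is represented by four stored values plus index arrays, and the multiply performs four multiply-adds instead of sixteen. The trade-off is the irregular access pattern into `vec`.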

Project outputs

Journal papers (7)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

SecureLoop: Design Space Exploration of Secure DNN Accelerators

DOI:
10.1145/3613424.3614273
Publication date:
2023-10
Journal:
2023 56th IEEE/ACM International Symposium on Microarchitecture (MICRO)
Impact factor:
0
Authors:
Kyungmi Lee; Mengjia Yan; J. Emer; A. Chandrakasan
Corresponding authors:
Kyungmi Lee; Mengjia Yan; J. Emer; A. Chandrakasan

The Sparse Abstract Machine

DOI:
10.1145/3582016.3582051
Publication date:
2022-08
Journal:
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
Impact factor:
0
Authors:
Olivia Hsu; Maxwell Strange; Jaeyeon Won; Ritvik Sharma; K. Olukotun; J. Emer; M. Horowitz; Fredrik Kjolstad
Corresponding authors:
Olivia Hsu; Maxwell Strange; Jaeyeon Won; Ritvik Sharma; K. Olukotun; J. Emer; M. Horowitz; Fredrik Kjolstad

Metior: A Comprehensive Model to Evaluate Obfuscating Side-Channel Defense Schemes

DOI:
10.1145/3579371.3589073
Publication date:
2023
Journal:
50th Annual International Symposium on Computer Architecture
Impact factor:
0
Authors:
Deutsch, Peter W.; Na, Weon Taek; Bourgeat, Thomas; Emer, Joel S.; Yan, Mengjia
Corresponding authors:
Yan, Mengjia

Spatula: A Hardware Accelerator for Sparse Matrix Factorization

DOI:
10.1145/3613424.3623783
Publication date:
2023-10
Journal:
2023 56th IEEE/ACM International Symposium on Microarchitecture (MICRO)
Impact factor:
0
Authors:
Axel Feldmann; Daniel Sanchez
Corresponding authors:
Axel Feldmann; Daniel Sanchez

ISOSceles: Accelerating Sparse CNNs through Inter-Layer Pipelining

DOI:
10.1109/hpca56546.2023.10071080
Publication date:
2023-02
Journal:
2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
Impact factor:
0
Authors:
Yifan Yang; J. Emer; Daniel S. Sanchez
Corresponding authors:
Yifan Yang; J. Emer; Daniel S. Sanchez