CAREER: Achieving Real-Time Machine Learning with Sparsification-Compilation Co-design

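Basic Information

Award number:
2047516
Principal investigator:
Bin Ren
Amount:
About $493,700
Institution:
College of William and Mary
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Period:
2021-10-01 to 2026-09-30
Status:
Ongoing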

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2047516&HistoricalAwards=false
Keywords:
CAREER Achieving Real Time Machine

Project Abstract

Machine Learning (ML), particularly Deep Learning (DL), has achieved great success in recent years, especially with the use of Deep Neural Networks (DNNs) of different types. Varied DNNs serve as the state-of-the-art foundation and core enabler of many key applications, such as robotics, high-quality video stream processing, augmented reality, wearable devices, and smart health devices. Achieving high accuracy typically requires DNNs with large and complex model structures, which translates into high computing requirements for both training and inference. Accelerating training on a modern High-Performance Computing (HPC) node or cluster, and accelerating inference on a lower-end, power-efficient device, have both emerged as major challenges. This project focuses on this problem, viewing DNN training and inference as HPC workloads that need to exploit available multi-level parallelism, complex memory hierarchies, and device heterogeneity, while automating the optimizations through a compiler. If this project succeeds, it will, for the first time, enable real-time machine learning for many edge devices, enabling greater success of ML-based end applications that are important for society, the economy, and other science and engineering areas. This project will also make several contributions towards both education and improving diversity, including: (1) introducing HPC in an ML course, and ML workload optimization experience in both undergraduate systems and graduate research courses, particularly with interesting demonstration videos; (2) reaching out to undergraduates with the goal of creating interest in (systems) research, and to K-12 students with the goal of attracting underrepresented groups to computer science.

The key idea of this project for addressing the above challenge is sparsification-compilation co-design. It first introduces a general sparsification idea called fine-grained structured pruning, which prunes the weights according to certain fine-grained structures and preserves non-zero weights in a more regular way. Based on this idea, this project designs a high-level abstraction called layer-wise intermediate representation (IR) to capture the sparsity information with the goal of enabling aggressive compiler optimizations. Building on a successful application of this idea to two-dimensional DNNs, this project undertakes a comprehensive agenda to fully realize the benefits of this approach. First, it unifies the acceleration of Convolutional Neural Networks and Recurrent Neural Networks with a more general fine-grained structured pruning instance and a set of enhanced compiler-based automatic optimizations. Second, it improves the pruning or retraining process itself by extending the compiler optimizations from inference to pruning and by exploiting domain properties to carry out optimized application-level checkpointing. Third, it extends the (compiler-automated) optimization framework to support high-dimensional and extremely deep DNNs. Finally, it explores data reuse across DNNs for situations where multiple DNNs are co-executed on the same device.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
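To make the idea of fine-grained structured pruning more concrete, here is a minimal sketch of one possible instance: the weight matrix is split into horizontal blocks of rows, and within each block only the columns with the largest L2 norm are kept. The abstract does not specify which structures the project uses, so the block shape, the norm-based scoring, and the names `prune_block_columns`, `block_rows`, and `keep_ratio` are all assumptions made for illustration. The returned per-block list of surviving columns is the kind of regular sparsity metadata that a layer-wise representation could record, although the project's actual IR is not described here.

```python
import math


def prune_block_columns(weights, block_rows, keep_ratio):
    """Zero out the weakest columns inside each horizontal block of rows.

    Hypothetical illustration only: the block shape and the L2-norm
    scoring are assumptions, not the project's documented method.
    Returns (pruned_matrix, kept_columns_per_block).
    """
    n_rows = len(weights)
    n_cols = len(weights[0]) if weights else 0
    n_keep = max(1, math.ceil(n_cols * keep_ratio))
    pruned = [row[:] for row in weights]  # copy so the input is untouched
    kept = []  # per block: sorted column indices that survive

    for start in range(0, n_rows, block_rows):
        block = range(start, min(start + block_rows, n_rows))
        # Score each column by its squared L2 norm within this block.
        scores = [sum(weights[r][c] ** 2 for r in block) for c in range(n_cols)]
        # Keep the highest-scoring columns; ties go to the lower index.
        ranked = sorted(range(n_cols), key=lambda c: (-scores[c], c))
        keep = sorted(ranked[:n_keep])
        kept.append(keep)
        keep_set = set(keep)
        for r in block:
            for c in range(n_cols):
                if c not in keep_set:
                    pruned[r][c] = 0.0
    return pruned, kept
```

Because every row in a block shares the same surviving columns, the non-zero weights form small dense sub-blocks rather than scattered entries, which is what "preserves non-zero weights in a more regular way" suggests.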

Project Outcomes

Journal papers (17)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Toward Efficient Interactions between Python and Native Libraries

DOI:
Published:
2021
Journal:
The 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE
Impact factor:
0
Authors:
Tan, J;Chen, C;Liu, Z;Ren, R;Song, R;Shen, X;Liu, X
Corresponding author:
Liu, X

SparCL: Sparse Continual Learning on the Edge

DOI:
10.48550/arxiv.2209.09476
Published:
2022-09
Journal:
ArXiv
Impact factor:
0
Authors:
Zifeng Wang;Zheng Zhan;Yifan Gong;Geng Yuan;Wei Niu;T. Jian;Bin Ren;Stratis Ioannidis;Yanzhi Wang;Jennifer G. Dy
Corresponding authors:
Zifeng Wang;Zheng Zhan;Yifan Gong;Geng Yuan;Wei Niu;T. Jian;Bin Ren;Stratis Ioannidis;Yanzhi Wang;Jennifer G. Dy

Decentralized Application-Level Adaptive Scheduling for Multi-Instance DNNs on Open Mobile Devices

DOI:
Published:
2023
Journal:
Impact factor:
0
Authors:
Hsin-Hsuan Sung;Jou-An Chen;Weiguo Niu;Jiexiong Guan;Bin Ren;Xipeng Shen
Corresponding authors:
Hsin-Hsuan Sung;Jou-An Chen;Weiguo Niu;Jiexiong Guan;Bin Ren;Xipeng Shen

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

DOI:
10.1145/3495532
Published:
2021-11
Journal:
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Impact factor:
0
Authors:
Yifan Gong;Geng Yuan;Zheng Zhan;Wei Niu;Zhengang Li;Pu Zhao;Yuxuan Cai;Sijia Liu;Bin Ren;Xue Lin;Xulong Tang;Yanzhi Wang
Corresponding authors:
Yifan Gong;Geng Yuan;Zheng Zhan;Wei Niu;Zhengang Li;Pu Zhao;Yuxuan Cai;Sijia Liu;Bin Ren;Xue Lin;Xulong Tang;Yanzhi Wang

GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices Based on Fine-Grained Structured Weight Sparsity

DOI:
10.1109/tpami.2021.3089687
Published:
2021-06
Journal:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Impact factor:
23.6
Authors:
Wei Niu;Zhengang;Xiaolong Ma;Peiyan Dong;Gang Zhou;Xuehai Qian;Xue Lin;Yanzhi Wang;Bin Ren
Corresponding authors:
Wei Niu;Zhengang;Xiaolong Ma;Peiyan Dong;Gang Zhou;Xuehai Qian;Xue Lin;Yanzhi Wang;Bin Ren

Other publications by Bin Ren

Development of arteriolar niche and self-renewal of breast cancer stem cells by lysophosphatidic Acid/protein kinase D signaling

DOI:
Published:
2021
Journal:
Impact factor:
0
Authors:
Yinan Jiang;Yichen Guo;Jinjin Hao;R. Guenter;J. Lathia;A. Beck;R. Hattaway;D. Hurst;Q. Wang;Yehe Liu;Qi Cao;H. Krontiras;He;R. Silverstein;Bin Ren
Corresponding author:
Bin Ren