SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
Basic Information
- Award number: 1919117
- Principal investigator:
- Amount: $350K
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-10-01 to 2024-09-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract
With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely adopted because of their high accuracy, excellent scalability, and self-adaptiveness. Many applications employ DNNs as the core technology, such as face detection, speech recognition, and scene parsing. To meet the high accuracy requirements of these applications, DNN models are becoming deeper and larger and are evolving at a fast pace. They are computation- and memory-intensive and pose serious challenges to the conventional von Neumann architecture used in computing. The key problem addressed by this project is how to accelerate deep learning: not only inference, but also training and model compression, which have not received enough attention in prior research. This endeavor has the potential to enable the design of fast and energy-efficient deep learning systems whose applications are found in our daily lives, ranging from autonomous driving, through mobile devices, to IoT systems, thus benefiting society at large.

The outcome of this project is FASTLEAP, a Field Programmable Gate Array (FPGA)-based platform for accelerating deep learning. The platform takes a dataset as input and outputs a model that is trained, pruned, and mapped onto the FPGA, optimized for fast inference. The project will utilize emerging FPGA technologies that have access to High Bandwidth Memory (HBM) and include floating-point DSP units. Vertically, FASTLEAP integrates innovations from multiple levels of the whole system stack: algorithm, architecture, and efficient FPGA hardware implementation. Horizontally, it embraces systematic DNN model compression and the associated FPGA-based training, as well as FPGA-based inference acceleration of the compressed DNN models. The platform will be delivered as a complete solution, with both a software tool chain and a hardware implementation, to ensure ease of use.

At the algorithm level of FASTLEAP, the proposed Alternating Direction Method of Multipliers for Neural Networks (ADMM-NN) framework will perform unified weight pruning and quantization, given the training data, a target accuracy, and the characteristics of the target FPGA platform (performance models, inter-accelerator communication). The training procedure in ADMM-NN is performed on a platform with multiple FPGA accelerators, dictated by the architecture-level optimizations on communication and parallelism. Finally, the optimized FPGA inference design is generated from the trained and compressed DNN model, accounting for FPGA performance modeling.

The project will address the following SPX research areas: 1) Algorithms: bridging the gap between deep learning developments in theory and system implementations that are cognizant of the platform's performance model. 2) Applications: scaling deep learning for domains such as image processing. 3) Architecture and Systems: automatic generation of deep learning designs on FPGA that optimize area, energy efficiency, latency, and throughput.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
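To make the compression step concrete: ADMM-based weight pruning, of the kind the ADMM-NN framework above builds on, rewrites constrained training as two coupled subproblems, alternating a gradient step on the loss with a Euclidean projection onto the constraint set, plus a dual update. The sketch below illustrates that loop on a toy least-squares problem; it is a minimal illustration under assumed names, data, and hyperparameters (project_topk, rho, lr), not the project's actual ADMM-NN implementation.

```python
# Minimal ADMM-based weight pruning on a toy least-squares problem.
# Illustrative only: the toy task, hyperparameters, and helper names are
# assumptions, not the FASTLEAP / ADMM-NN codebase.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: recover a sparse weight vector from X @ w ~ y.
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[:5] = rng.normal(size=5)        # ground truth uses only 5 weights
y = X @ w_true + 0.01 * rng.normal(size=100)

def project_topk(v, k):
    """Euclidean projection onto {v : at most k nonzeros}:
    keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

k, rho, lr = 5, 1.0, 0.05
W = rng.normal(size=20)                # trainable weights
Z = project_topk(W, k)                 # auxiliary variable in the constraint set
U = np.zeros(20)                       # scaled dual variable

for _ in range(500):
    # W-step: one gradient step on loss(W) + (rho/2) * ||W - Z + U||^2
    grad = X.T @ (X @ W - y) / len(y) + rho * (W - Z + U)
    W -= lr * grad
    # Z-step: project (W + U) onto the sparsity constraint set
    Z = project_topk(W + U, k)
    # Dual update: accumulate the remaining constraint violation W - Z
    U += W - Z

W_pruned = project_topk(W, k)          # final hard pruning
print("nonzeros:", np.count_nonzero(W_pruned),
      "mse:", float(np.mean((X @ W_pruned - y) ** 2)))

# Quantization fits the same template: the Z-step projects onto a discrete
# set of levels instead of (or in addition to) the sparsity set.
def project_quantized(v, levels):
    """Snap each entry of v to its nearest quantization level."""
    levels = np.asarray(levels)
    return levels[np.argmin(np.abs(v[:, None] - levels[None, :]), axis=1)]
```

The appeal of this formulation, and a reason it suits unified pruning plus quantization, is that the hard combinatorial constraint is isolated in the Z-step, where the projection has a closed form, while the W-step stays ordinary gradient-based training.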
Project Results
Journal articles (13)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding
- DOI: 10.1007/978-3-031-19775-8_3
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Geng Yuan;Sung-En Chang;Qing Jin;Alec Lu;Yanyu Li;Yushu Wu;Zhenglun Kong;Yanyue Xie;Peiyan Dong;Minghai Qin;Xiaolong Ma;Xulong Tang;Zhenman Fang;Yanzhi Wang
- Corresponding author: Geng Yuan;Sung-En Chang;Qing Jin;Alec Lu;Yanyu Li;Yushu Wu;Zhenglun Kong;Yanyue Xie;Peiyan Dong;Minghai Qin;Xiaolong Ma;Xulong Tang;Zhenman Fang;Yanzhi Wang
Advancing Model Pruning via Bi-level Optimization
- DOI: 10.48550/arxiv.2210.04092
- Publication date: 2022-10
- Journal:
- Impact factor: 0
- Authors: Yihua Zhang;Yuguang Yao;Parikshit Ram;Pu Zhao;Tianlong Chen;Mingyi Hong;Yanzhi Wang;Sijia Liu
- Corresponding author: Yihua Zhang;Yuguang Yao;Parikshit Ram;Pu Zhao;Tianlong Chen;Mingyi Hong;Yanzhi Wang;Sijia Liu
Non-Structured DNN Weight Pruning--Is It Beneficial in Any Platform?
- DOI: 10.1109/tnnls.2021.3063265
- Publication date: 2021-03-18
- Journal:
- Impact factor: 10.4
- Authors: Ma, Xiaolong;Lin, Sheng;Wang, Yanzhi
- Corresponding author: Wang, Yanzhi
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
- DOI: 10.1007/978-3-030-58601-0_37
- Publication date: 2020-01
- Journal:
- Impact factor: 0
- Authors: Xiaolong Ma;Wei Niu;Tianyun Zhang;Sijia Liu;Fu-Ming Guo;Sheng Lin;Hongjia Li;Xiang Chen;Jian Tang;Kaisheng Ma;Bin Ren;Yanzhi Wang
- Corresponding author: Xiaolong Ma;Wei Niu;Tianyun Zhang;Sijia Liu;Fu-Ming Guo;Sheng Lin;Hongjia Li;Xiang Chen;Jian Tang;Kaisheng Ma;Bin Ren;Yanzhi Wang
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
- DOI: 10.1109/cvpr52729.2023.00597
- Publication date: 2023-03
- Journal:
- Impact factor: 0
- Authors: Xuan Shen;Yaohua Wang;Ming Lin;Yi-Li Huang;Hao Tang;Xiuyu Sun;Yanzhi Wang
- Corresponding author: Xuan Shen;Yaohua Wang;Ming Lin;Yi-Li Huang;Hao Tang;Xiuyu Sun;Yanzhi Wang
Other Publications by Yanzhi Wang
Design and Evaluation of Deep Learning Accelerators Using Superconductor Logic Families
- DOI:
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Qiuyun Xu;Yanzhi Wang;Naoki Takeuchi;Nobuyuki Yoshikawa
- Corresponding author: Nobuyuki Yoshikawa
Dynamics and control of spin entanglement
- DOI:
- Publication date: 2016
- Journal:
- Impact factor: 0
- Authors: Qiuyun Xu;Yanzhi Wang;Naoki Takeuchi;Nobuyuki Yoshikawa;井澤佳那子,羽藤英二,菊池雅彦,石神孝裕,川名義輝,杉本保男;寺地徳之,フィオリ アレクサンドレ,桐谷範彦,谷本 智,ゲラール エチェン,小出康夫;小林研介;Zenji Horita;S. Tarucha
- Corresponding author: S. Tarucha
Resource allocation optimization in a data center with energy storage devices
- DOI:
- Publication date: 2014
- Journal:
- Impact factor: 0
- Authors: Shuang Chen;Yanzhi Wang;Massoud Pedram
- Corresponding author: Massoud Pedram
Ultra-broad bandwidth low-dispersion mirror with smooth dispersion and high laser damage resistance.
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 3.6
- Authors: Yuhui Zhang;Yanzhi Wang;Yu Chen;Ye Lu;Xinliang Wang;Fanyu Kong;Zhihao Wang;Chang Chen;Yi Xu;Yuxin Leng;Hongbo He;J. Shao
- Corresponding author: J. Shao
Share Repurchases as a Potential Tool to Mislead Investors
- DOI: 10.2139/ssrn.1485583
- Publication date: 2009
- Journal:
- Impact factor: 0
- Authors: K. Chan;D. Ikenberry;I. Lee;Yanzhi Wang
- Corresponding author: Yanzhi Wang
Other Grants by Yanzhi Wang
Collaborative Research: CSR: Small: Expediting Continual Online Learning on Edge Platforms through Software-Hardware Co-designs
- Award number: 2312158
- Fiscal year: 2023
- Amount: $350K
- Project category: Standard Grant
FET: SHF: Small: Collaborative: Advanced Circuits, Architectures and Design Automation Technologies for Energy-efficient Single Flux Quantum Logic
- Award number: 2008514
- Fiscal year: 2020
- Amount: $350K
- Project category: Standard Grant
CNS Core: Small: Collaborative: Content-Based Viewport Prediction Framework for Live Virtual Reality Streaming
- Award number: 1909172
- Fiscal year: 2019
- Amount: $350K
- Project category: Standard Grant
IRES Track I: U.S.-Japan International Research Experience for Students on Superconducting Electronics
- Award number: 1854213
- Fiscal year: 2019
- Amount: $350K
- Project category: Standard Grant
Similar Overseas Grants
SPX: Collaborative Research: Automated Synthesis of Extreme-Scale Computing Systems Using Non-Volatile Memory
- Award number: 2408925
- Fiscal year: 2023
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Scalable Neural Network Paradigms to Address Variability in Emerging Device based Platforms for Large Scale Neuromorphic Computing
- Award number: 2401544
- Fiscal year: 2023
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Intelligent Communication Fabrics to Facilitate Extreme Scale Computing
- Award number: 2412182
- Fiscal year: 2023
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
- Award number: 2318628
- Fiscal year: 2022
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: NG4S: A Next-generation Geo-distributed Scalable Stateful Stream Processing System
- Award number: 2202859
- Fiscal year: 2022
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
- Award number: 2333009
- Fiscal year: 2022
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Memory Fabric: Data Management for Large-scale Hybrid Memory Systems
- Award number: 2132049
- Fiscal year: 2021
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Automated Synthesis of Extreme-Scale Computing Systems Using Non-Volatile Memory
- Award number: 2113307
- Fiscal year: 2020
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Intelligent Communication Fabrics to Facilitate Extreme Scale Computing
- Award number: 1918987
- Fiscal year: 2019
- Amount: $350K
- Project category: Standard Grant
SPX: Collaborative Research: Parallel Algorithm by Blocks - A Data-centric Compiler/runtime System for Productive Programming of Scalable Parallel Systems
- Award number: 1919021
- Fiscal year: 2019
- Amount: $350K
- Project category: Standard Grant