SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
Basic Information
- Award Number: 1919117
- Principal Investigator:
- Amount: $350,000
- Host Institution:
- Host Institution Country: United States
- Award Type: Standard Grant
- Fiscal Year: 2019
- Funding Country: United States
- Project Period: 2019-10-01 to 2024-09-30
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used because of their high accuracy, excellent scalability, and self-adaptiveness. Many applications employ DNNs as the core technology, such as face detection, speech recognition, and scene parsing. To meet the high accuracy requirements of various applications, DNN models are becoming deeper and larger, and are evolving at a fast pace. They are computation and memory intensive and pose significant challenges to the conventional Von Neumann architecture used in computing. The key problem addressed by the project is how to accelerate deep learning, not only inference, but also training and model compression, which have not received enough attention in prior research. This endeavor has the potential to enable the design of fast and energy-efficient deep learning systems, applications of which are found in our daily lives, ranging from autonomous driving, through mobile devices, to IoT systems, thus benefiting society at large. The outcome of this project is FASTLEAP, a Field Programmable Gate Array (FPGA)-based platform for accelerating deep learning. The platform takes a dataset as input and outputs a model that is trained, pruned, and mapped onto the FPGA, optimized for fast inference. The project will utilize emerging FPGA technologies that have access to High Bandwidth Memory (HBM) and contain floating-point DSP units. From a vertical perspective, FASTLEAP integrates innovations from multiple levels of the whole system stack: algorithm, architecture, and efficient FPGA hardware implementation. From a horizontal perspective, it embraces systematic DNN model compression and associated FPGA-based training, as well as FPGA-based inference acceleration of the compressed DNN models. The platform will be delivered as a complete solution, with both a software tool chain and a hardware implementation to ensure ease of use.
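The compression flow just described (prune, then quantize, then map for inference) can be sketched at the tensor level. The NumPy sketch below is a minimal illustration only; the 50% sparsity target and 8-bit uniform quantization are assumptions chosen for the example, not parameters fixed by the project.

```python
import numpy as np

def prune_by_magnitude(w, sparsity=0.5):
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization: round to a `bits`-bit integer grid and rescale."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    if scale == 0.0:
        return w.copy()
    return np.round(w / scale) * scale

# Toy weight matrix: prune half the entries, then quantize the survivors.
w = np.arange(1.0, 17.0).reshape(4, 4)
w_compressed = quantize_uniform(prune_by_magnitude(w, sparsity=0.5), bits=8)
```

Real pipelines prune per layer and retrain between steps; this sketch only shows the two tensor-level transformations that make a model cheaper to map onto FPGA block RAM and DSP units.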
At the algorithm level of FASTLEAP, the proposed Alternating Direction Method of Multipliers for Neural Networks (ADMM-NN) framework will perform unified weight pruning and quantization, given the training data, target accuracy, and target FPGA platform characteristics (performance models, inter-accelerator communication). The training procedure in ADMM-NN is performed on a platform with multiple FPGA accelerators, dictated by architecture-level optimizations on communication and parallelism. Finally, the optimized FPGA inference design is generated from the trained, compressed DNN model, accounting for FPGA performance modeling. The project will address the following SPX research areas: 1) Algorithms: bridging the gap between deep learning developments in theory and their system implementations, cognizant of the performance model of the platform. 2) Applications: scaling deep learning for domains such as image processing. 3) Architecture and Systems: automatic generation of deep learning designs on FPGAs, optimizing area, energy efficiency, latency, and throughput. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
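ADMM-based weight pruning splits the constrained problem (minimize the training loss subject to a sparsity limit) into a loss-side update, a projection onto the sparse set, and a dual update. The sketch below is a toy illustration, not the ADMM-NN implementation: a simple quadratic stands in for the training loss, and the step size, penalty `rho`, and iteration count are illustrative assumptions.

```python
import numpy as np

def project_sparse(z, k):
    """Euclidean projection onto {x : at most k nonzero entries}."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z), axis=None)[-k:]  # indices of the k largest magnitudes
    out.flat[keep] = z.flat[keep]
    return out

def admm_prune(w0, grad_loss, k, rho=1.0, lr=0.1, iters=50):
    """Minimize loss(w) subject to ||w||_0 <= k via ADMM:
    alternate a gradient step on the augmented Lagrangian (w-update),
    projection onto the k-sparse set (z-update), and a dual update (u)."""
    w = w0.copy()
    z = project_sparse(w, k)
    u = np.zeros_like(w)
    for _ in range(iters):
        # w-update: gradient step on loss(w) + (rho/2) * ||w - z + u||^2
        w = w - lr * (grad_loss(w) + rho * (w - z + u))
        # z-update: project the shifted weights onto the sparsity constraint
        z = project_sparse(w + u, k)
        # dual update
        u = u + w - z
    return project_sparse(w, k)  # final hard pruning

# Toy problem: loss(w) = ||w - t||^2, so grad_loss(w) = 2 * (w - t).
t = np.array([3.0, 0.1, -2.0, 0.05])
w = admm_prune(np.zeros(4), lambda w: 2.0 * (w - t), k=2)
```

On this toy loss, the iterate keeps the two large target entries (3.0 and -2.0) and drives the small ones to zero. This separation is the appeal of the ADMM formulation: the dense training update and the combinatorial sparsity constraint are handled in separate, simple subproblems.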
Project Outcomes
Journal Articles (13)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding
- DOI: 10.1007/978-3-031-19775-8_3
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Geng Yuan;Sung-En Chang;Qing Jin;Alec Lu;Yanyu Li;Yushu Wu;Zhenglun Kong;Yanyue Xie;Peiyan Dong;Minghai Qin;Xiaolong Ma;Xulong Tang;Zhenman Fang;Yanzhi Wang
- Corresponding Author: Geng Yuan;Sung-En Chang;Qing Jin;Alec Lu;Yanyu Li;Yushu Wu;Zhenglun Kong;Yanyue Xie;Peiyan Dong;Minghai Qin;Xiaolong Ma;Xulong Tang;Zhenman Fang;Yanzhi Wang
CSB-RNN: a faster-than-realtime RNN acceleration framework with compressed structured blocks
- DOI: 10.1145/3392717.3392749
- Publication Date: 2020-05
- Journal:
- Impact Factor: 0
- Authors: Runbin Shi;Peiyan Dong;Tong Geng;Yuhao Ding;Xiaolong Ma;Hayden Kwok-Hay So;M. Herbordt;Ang Li;Yanzhi Wang
- Corresponding Author: Runbin Shi;Peiyan Dong;Tong Geng;Yuhao Ding;Xiaolong Ma;Hayden Kwok-Hay So;M. Herbordt;Ang Li;Yanzhi Wang
Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration
- DOI: 10.1145/3495532
- Publication Date: 2021-11
- Journal:
- Impact Factor: 0
- Authors: Yifan Gong;Geng Yuan;Zheng Zhan;Wei Niu;Zhengang Li;Pu Zhao;Yuxuan Cai;Sijia Liu;Bin Ren;Xue Lin;Xulong Tang;Yanzhi Wang
- Corresponding Author: Yifan Gong;Geng Yuan;Zheng Zhan;Wei Niu;Zhengang Li;Pu Zhao;Yuxuan Cai;Sijia Liu;Bin Ren;Xue Lin;Xulong Tang;Yanzhi Wang
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
- DOI: 10.1145/3453483.3454083
- Publication Date: 2021-01-01
- Journal:
- Impact Factor: 0
- Authors: Niu, Wei;Guan, Jiexiong;Ren, Bin
- Corresponding Author: Ren, Bin
Advancing Model Pruning via Bi-level Optimization
- DOI: 10.48550/arxiv.2210.04092
- Publication Date: 2022-10
- Journal:
- Impact Factor: 0
- Authors: Yihua Zhang;Yuguang Yao;Parikshit Ram;Pu Zhao;Tianlong Chen;Min-Fong Hong;Yanzhi Wang;Sijia Liu
- Corresponding Author: Yihua Zhang;Yuguang Yao;Parikshit Ram;Pu Zhao;Tianlong Chen;Min-Fong Hong;Yanzhi Wang;Sijia Liu
Other Publications by Yanzhi Wang
Meaningful Use of Inhaled Nitric Oxide (iNO): a Cross-Sectional National Survey
- DOI: 10.1007/s42399-021-00818-2
- Publication Date: 2021
- Journal:
- Impact Factor: 0
- Authors: Mina Hafzalah;Yanzhi Wang;S. Tripathi
- Corresponding Author: S. Tripathi
Proteolysis Targeting Chimeras (PROTACs) Based on Imatinib Induced Degradation of BCR‐ABL in K562 Cells
- DOI:
- Publication Date: 2023
- Journal:
- Impact Factor: 2.1
- Authors: Chuang Li;P. Zhang;Gaojie Chang;Mingyue Pan;Feng Lu;Jiahao Huang;Yanzhi Wang;Qingyan Zhao;Bingxia Sun;Yuting Cui;Feng Sang
- Corresponding Author: Feng Sang
Progress of Solid‐state Electrolytes Used in Organic Secondary Batteries
- DOI: 10.1002/celc.202101005
- Publication Date: 2021-10
- Journal:
- Impact Factor: 4
- Authors: Shaolong Wang;Jing Lv;Xuehan Wang;Haixia Cui;Weiwei Huang;Yanzhi Wang
- Corresponding Author: Yanzhi Wang
A Yolk-Shell Structured Metal-Organic Framework with Encapsulated Iron-Porphyrin and Its Derived Bimetallic Nitrogen-Doped Porous Carbon for An Efficient Oxygen Reduction Reaction
- DOI: 10.1039/d0ta00962h
- Publication Date: 2020
- Journal:
- Impact Factor: 11.9
- Authors: Chaochao Zhang;Hao Yang;Dan Zhong;Yang Xu;Yanzhi Wang;Qi Yuan;Zuozhong Liang;Bin Wang;Wei Zhang;Haoquan Zheng;Tao Cheng;Rui Cao
- Corresponding Author: Rui Cao
The Optimal Machine Life in Tesla
- DOI: 10.2991/aebmr.k.220307.144
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Zeyun Lu;Jilin Lyu;Zhengyang Wan;Yanzhi Wang
- Corresponding Author: Yanzhi Wang
Other Grants by Yanzhi Wang
Collaborative Research: CSR: Small: Expediting Continual Online Learning on Edge Platforms through Software-Hardware Co-designs
- Award Number: 2312158
- Fiscal Year: 2023
- Amount: $350,000
- Award Type: Standard Grant
FET: SHF: Small: Collaborative: Advanced Circuits, Architectures and Design Automation Technologies for Energy-efficient Single Flux Quantum Logic
- Award Number: 2008514
- Fiscal Year: 2020
- Amount: $350,000
- Award Type: Standard Grant
IRES Track I: U.S.-Japan International Research Experience for Students on Superconducting Electronics
- Award Number: 1854213
- Fiscal Year: 2019
- Amount: $350,000
- Award Type: Standard Grant
CNS Core: Small: Collaborative: Content-Based Viewport Prediction Framework for Live Virtual Reality Streaming
- Award Number: 1909172
- Fiscal Year: 2019
- Amount: $350,000
- Award Type: Standard Grant
Similar NSFC Grants
Research on the relationships among team human-capital hierarchy types, team collaboration processes, and team effectiveness outcomes in the digital-intelligence context
- Award Number: 72372084
- Award Year: 2023
- Amount: CNY 400,000
- Award Type: General Program
Research on collaboration models and performance-improvement strategies for online healthcare teams
- Award Number: 72371111
- Award Year: 2023
- Amount: CNY 410,000
- Award Type: General Program
Research on interaction control methods for collaborative robots in physical human-robot collaborative tasks
- Award Number: 62373044
- Award Year: 2023
- Amount: CNY 500,000
- Award Type: General Program
Research on key technologies of digital-twin-based human-robot collaborative intelligent surgical robots for craniomaxillofacial surgery
- Award Number: 82372548
- Award Year: 2023
- Amount: CNY 490,000
- Award Type: General Program
Mechanism of A-type crystalline resistant starch in regulating cooperative butyrate production by gut bacteria
- Award Number: 32302064
- Award Year: 2023
- Amount: CNY 300,000
- Award Type: Young Scientists Fund
Similar Overseas Grants
SPX: Collaborative Research: Automated Synthesis of Extreme-Scale Computing Systems Using Non-Volatile Memory
- Award Number: 2408925
- Fiscal Year: 2023
- Amount: $350,000
- Award Type: Standard Grant
SPX: Collaborative Research: Scalable Neural Network Paradigms to Address Variability in Emerging Device based Platforms for Large Scale Neuromorphic Computing
- Award Number: 2401544
- Fiscal Year: 2023
- Amount: $350,000
- Award Type: Standard Grant
SPX: Collaborative Research: Intelligent Communication Fabrics to Facilitate Extreme Scale Computing
- Award Number: 2412182
- Fiscal Year: 2023
- Amount: $350,000
- Award Type: Standard Grant
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
- Award Number: 2318628
- Fiscal Year: 2022
- Amount: $350,000
- Award Type: Standard Grant
SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
- Award Number: 2333009
- Fiscal Year: 2022
- Amount: $350,000
- Award Type: Standard Grant