SHF: Medium: Training Sparse Neural Networks with Co-Designed Hardware Accelerators: Enabling Model Optimization and Scientific Exploration
Basic Information
- Award Number: 1763747
- Principal Investigator: Keith Chugg
- Amount: $1,199,800
- Host Institution:
- Host Institution Country: United States
- Project Type: Continuing Grant
- Fiscal Year: 2018
- Funding Country: United States
- Project Period: 2018-07-01 to 2023-06-30
- Project Status: Completed
- Source:
- Keywords:
Project Abstract
Machine learning systems are critical drivers of new technologies such as near-perfect automatic speech recognition, autonomous vehicles, computer vision, and natural language understanding. The underlying inference engine for many of these systems is based on neural networks. Before a neural network can be used for these inference tasks, it must be trained using a data corpus of known input-output pairs. This training process is very computationally intensive, with current systems requiring weeks to months of time on graphics processing units (GPUs) or central processing units in the cloud. As more data becomes available, the problem of long training times is further exacerbated because larger, more effective network models become desirable. The theoretical understanding of neural networks is limited, so experimentation and empirical optimization remain the primary tools for understanding deep neural networks and innovating in the field. However, the ability to conduct larger-scale experiments is becoming concentrated among a few large entities with the necessary financial and computational resources. Even for those with such resources, the painfully long experimental cycle for training neural networks means that large-scale searches and optimizations over the neural network model structure are not performed. The ultimate goal of this research project is to democratize and distribute the ability to conduct large-scale neural network training and model optimization at high speed, using hardware accelerators. Reducing the training time from weeks to hours will allow researchers to run many more experiments, gaining knowledge of the fundamental inner workings of deep learning systems. The hardware accelerators are also much more energy efficient than the existing GPU-based training paradigm, so advances made in this project can significantly reduce the energy consumption required for neural network training tasks.

This project comprises an interdisciplinary research plan that spans theory, hardware architecture and design, software control, and system integration. A new class of neural networks that have pre-defined sparsity is being explored. These sparse neural networks are co-designed with a very flexible, high-speed, energy-efficient hardware architecture that maximizes circuit speed for any model size in a given Field Programmable Gate Array (FPGA) chip. This algorithm-hardware co-design is a key research theme that differentiates this approach from previous research, which enforces sparsity during the training process in a manner incompatible with parallel hardware acceleration. In particular, the proposed architecture operates on all network layers simultaneously, executing the forward- and back-propagation passes in parallel, fully pipelined across layers. With high-precision arithmetic, a speed-up of about 5X relative to GPUs is expected. Using log-domain arithmetic, these gains are expected to increase to 100X or larger. Software and algorithms are being developed to manage multiple FPGA boards, simplifying and automating the model search and training process. These algorithms exploit the ability to reconfigure the FPGAs to trade speed for accuracy, a capability lacking in GPUs. These software tools will also serve as a bridge to popular Python libraries used by the machine learning community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
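To make the abstract's two key ideas concrete, the sketches below are illustrative only; they are not taken from the project's codebase, and all names and parameters (mask, density, and so on) are hypothetical. The first shows pre-defined sparsity: a binary connectivity mask is fixed before training and applied to both the weights and their gradients, so the sparsity pattern never changes during training, which is what makes the computation amenable to a fixed, pipelined hardware datapath.

```python
# Minimal sketch of training with pre-defined sparsity (illustrative;
# not the project's actual implementation). The connectivity mask is
# chosen before training begins and never changes.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 784, 100
density = 0.1  # hypothetical: keep 10% of the connections

# Pre-defined sparsity: fix the binary connectivity mask up front.
mask = (rng.random((n_in, n_out)) < density).astype(np.float32)
W = rng.normal(0.0, 0.1, (n_in, n_out)).astype(np.float32) * mask

def forward(x):
    # Only the masked-in weights are ever nonzero.
    return x @ W

def sgd_step(x, grad_out, lr=0.01):
    # Masking the gradient preserves the fixed sparsity pattern,
    # so hardware can hard-wire the surviving connections.
    global W
    W -= lr * ((x.T @ grad_out) * mask)

# Toy usage: one forward pass and one update on random data.
x = rng.normal(size=(32, n_in)).astype(np.float32)
grad_out = rng.normal(size=(32, n_out)).astype(np.float32)
_ = forward(x)
sgd_step(x, grad_out)
```

The second sketch illustrates why log-domain arithmetic is attractive in FPGA logic: a multiplication becomes an addition of logarithms, trading expensive hardware multipliers for cheap adders. This is a toy demonstration of the general principle, not the project's actual number format.

```python
# Toy illustration of log-domain multiplication:
# log(a*b) = log(a) + log(b), so a multiply becomes an add.
import math

def log_mul(log_a, log_b):
    return log_a + log_b

a, b = 3.0, 7.0
assert math.isclose(math.exp(log_mul(math.log(a), math.log(b))), a * b)
```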
Project Outcomes
Journal Articles (22)
Monographs (0)
Research Awards (0)
Conference Papers (0)
Patents (0)
Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation
- DOI:
- Publication Date: 2021
- Journal:
- Impact Factor: 0
- Authors: Souvik Kundu;Qirui Sun;Yao Fu;M. Pedram;P. Beerel
- Corresponding Author: Souvik Kundu;Qirui Sun;Yao Fu;M. Pedram;P. Beerel
BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
- DOI:
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: Souvik Kundu, Shikai Wang
- Corresponding Author: Souvik Kundu, Shikai Wang
Predicting Throughput of Distributed Stochastic Gradient Descent
- DOI: 10.1109/tpds.2022.3151739
- Publication Date: 2022
- Journal:
- Impact Factor: 5.3
- Authors: Zhuojin Li;Marco Paolieri;L. Golubchik;Sung-Han Lin;Wumo Yan
- Corresponding Author: Zhuojin Li;Marco Paolieri;L. Golubchik;Sung-Han Lin;Wumo Yan
Pre-Defined Sparsity for Low-Complexity Convolutional Neural Networks
- DOI: 10.1109/tc.2020.2972520
- Publication Date: 2020-01
- Journal:
- Impact Factor: 3.7
- Authors: Souvik Kundu;M. Nazemi;M. Pedram;K. Chugg;P. Beerel
- Corresponding Author: Souvik Kundu;M. Nazemi;M. Pedram;K. Chugg;P. Beerel
Performance and Revenue Analysis of Hybrid Cloud Federations with QoS Requirements
- DOI: 10.1109/cloud55607.2022.00055
- Publication Date: 2022
- Journal:
- Impact Factor: 0
- Authors: B. Song, M. Paolieri
- Corresponding Author: B. Song, M. Paolieri
Other Grants by Keith Chugg
Two Dimensional Parallel Signaling and Detection Techniques With Applications To Volume Optical Memories
- Award Number: 9616663
- Fiscal Year: 1996
- Project Type: Standard Grant
Similar Overseas Grants
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Award Number: 2321102
- Fiscal Year: 2024
- Project Type: Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Award Number: 2321103
- Fiscal Year: 2024
- Project Type: Standard Grant
DESIGN: Creating cultural change in small to medium-sized professional societies: a training network approach
- Award Number: 2334964
- Fiscal Year: 2024
- Project Type: Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Award Number: 2321104
- Fiscal Year: 2024
- Project Type: Standard Grant
Collaborative Research: Implementation: Medium: Secure, Resilient Cyber-Physical Energy System Workforce Pathways via Data-Centric, Hardware-in-the-Loop Training
- Award Number: 2320972
- Fiscal Year: 2023
- Project Type: Standard Grant
Collaborative Research: Implementation: Medium: Secure, Resilient Cyber-Physical Energy System Workforce Pathways via Data-Centric, Hardware-in-the-Loop Training
- Award Number: 2320975
- Fiscal Year: 2023
- Project Type: Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Cross-Disciplinary Training for Joint Cyber-Physical Systems and IoT Security
- Award Number: 2230086
- Fiscal Year: 2023
- Project Type: Continuing Grant
Collaborative Research: CyberTraining: Implementation: Medium: Cross-Disciplinary Training for Joint Cyber-Physical Systems and IoT Security
- Award Number: 2230087
- Fiscal Year: 2023
- Project Type: Continuing Grant
Collaborative Research: CyberTraining: Implementation: Medium: CyberInfrastructure Training and Education for Synchrotron X-Ray Science (X-CITE)
- Award Number: 2320375
- Fiscal Year: 2023
- Project Type: Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Cyber Training for Open Science in Climate, Water and Environmental Sustainability
- Award Number: 2230093
- Fiscal Year: 2023
- Project Type: Standard Grant