
Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs (NA^3Os)

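Basic Information

Grant number:
524986327
Principal investigator:
Professor Dr.-Ing. Jörg Henkel
Amount:
--
Host institution:
CES - Chair for Embedded Systems
Host institution country:
Germany
Project category:
Research Grants
Fiscal year:
Funding country:
Germany
Project period:
Project status:
Ongoing

Source:
https://gepris.dfg.de/gepris/projekt/524986327?language=en
Keywords:
Neural Approximate Accelerator Architecture Optimization

Project Abstract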

Deep learning has changed both how complex technical problems are solved and the quality of the solutions. Many advances, most of them from the last decade, have already had a profound impact on systems used in everyday life. The background is that technical systems are often so complex that it is infeasible to build models accurate enough to serve as a basis for classical optimization techniques. These are the scenarios where Deep Neural Networks (DNNs) shine. The drawback, however, is the high computational demand of processing DNNs; memory and energy requirements, among others, are often very high as well. This proposal presents an approach to deploying DNNs successfully in systems with very limited resources, particularly FPGAs, thus enabling efficient TinyML implementations. Our investigations emphasize a unique combination of compression techniques, such as pruning and quantization, with emerging approximate computing principles. For FPGAs in particular, we want to investigate the opportunities offered by approximate arithmetic units. Moreover, we want to exploit FPGA-specific resources such as DSPs and BRAMs to provide highly resource- and energy-efficient hardware implementations of DNNs. To the best of our knowledge, this proposal presents the first important steps toward optimizing the deployment of DNNs on approximate and reconfigurable hardware. This involves investigating innovative mapping and design space exploration techniques. Combining micro-architectural peculiarities with the approximate computing paradigm promises a controllable trade-off between the quality of DNN results and the computational resources needed. The final goal is to develop a co-search methodology that spans the neural network architecture, its optimization, and the synthesis of approximated DNN accelerators on FPGAs. Further research includes the analysis of DNN robustness and energy trade-offs.
In summary, we propose the first steps toward successfully deploying DNNs on highly resource-constrained FPGA systems while exploiting approximate computing principles.
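
The combination of pruning, quantization, and approximate arithmetic described in the abstract can be illustrated with a small software model. The sketch below is not the project's method: the function names (`prune`, `quantize`, `approx_mul`, `dot`), the operand-truncating multiplier, and all parameter values are assumptions chosen only to show how dropping operand bits trades result quality for what would be a smaller multiplier in hardware.

```python
import random


def prune(weights, sparsity):
    """Magnitude pruning: zero roughly the smallest `sparsity` fraction of weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Weights tied with the threshold are pruned too, so slightly more than k may be zeroed.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]


def quantize(values, bits=8):
    """Symmetric uniform quantization to signed integers; returns (ints, scale)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = peak / qmax
    return [round(v / scale) for v in values], scale


def approx_mul(a, b, drop_bits):
    """Hypothetical approximate multiplier: clear the low `drop_bits` bits of each
    operand's magnitude before multiplying (drop_bits=0 gives the exact product)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    ta = (abs(a) >> drop_bits) << drop_bits
    tb = (abs(b) >> drop_bits) << drop_bits
    return sign * ta * tb


def dot(weights, inputs, sparsity=0.5, bits=8, drop_bits=0):
    """Dot product of one neuron after pruning, quantization, and approximate multiplies."""
    qw, sw = quantize(prune(weights, sparsity), bits)
    qx, sx = quantize(inputs, bits)
    acc = sum(approx_mul(w, x, drop_bits) for w, x in zip(qw, qx))
    return acc * sw * sx  # rescale the integer accumulator back to real values


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    w = [rng.uniform(-1, 1) for _ in range(64)]
    x = [rng.uniform(-1, 1) for _ in range(64)]
    exact = sum(a * b for a, b in zip(w, x))
    for drop in (0, 1, 2, 3):
        approx = dot(w, x, sparsity=0.5, bits=8, drop_bits=drop)
        print(f"drop_bits={drop}: result={approx:+.4f}, abs error={abs(approx - exact):.4f}")
```

Running the script prints the absolute error for each truncation level, which is the kind of quality-versus-resource curve a design space exploration would sweep. A real FPGA flow would measure LUT, DSP, and BRAM usage from synthesis rather than infer it from `drop_bits`.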
