SHF: Small: Collaborative Research: Retraining-free Concurrent Test and Diagnosis in Emerging Neural Network Accelerators

Basic Information

Award number:
1909854
Principal investigator:
Chengmo Yang
Amount:
$265K
Institution:
University of Delaware
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Period:
2019-10-01 to 2024-09-30
Status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1909854&HistoricalAwards=false
Keywords:
SHF Small Collaborative Research Retraining

Project Abstract

Neural networks have become the go-to tool for solving many real-world recognition and classification problems in computer vision, language processing, life sciences and finance. While promising, smart and intelligent data interpretation via deep learning is extremely power hungry. To conduct power-efficient deep learning on battery-constrained edge platforms, one promising solution is to use hardware accelerators built with emerging non-volatile memory (NVM) devices, which offer high density, extremely low power consumption, as well as in-situ and parallelized data processing. While these advances are enticing, NVM devices also impose extra challenges, as their design and manufacturing technology are far less mature than CMOS. Furthermore, NVM technologies are likely to exhibit new types of errors, such as read/write disturbance, values drifting over time, and short data retention time. These errors can accumulate while the accelerator is running a deep learning application, and without careful mitigation could lead to significant accuracy degradation. To assuage these concerns, this project will develop a self-healing framework for NVM-based neural network accelerators integrating a test, diagnosis, and recovery loop that monitors and maintains the health of the accelerator. Results of this project will (1) deepen the understanding of interactions among hardware defects and errors, NVM-based accelerators, and machine learning, (2) increase community awareness of post-fabrication error debugging and fixing techniques, (3) enrich the computer engineering course curriculum, and (4) train and promote students of diverse backgrounds for both the workforce and research. This project will investigate, characterize, and mitigate errors that will affect the adoption of NVM-based neural network accelerators. 
While existing solutions focus on fixing errors observed at fabrication time, this project targets the NVM-specific errors that will occur over the life of the accelerator, not just at the time of manufacturing. The project will lead to four outcomes, namely, (1) measurement and characterization of the error resilience capability of neural networks with different topologies and data types, (2) cost-effective approaches for deploying neural networks alongside NVM-based accelerators which exhibit new and diverse error patterns without involving costly retraining, (3) methods for generating neural network inputs as test vectors which will be tuned to be sensitive to different levels of error accumulation and accuracy loss and will provide real-time accelerator health statistics, and (4) an algorithm and device level co-diagnosis procedure which identifies and protects the most critical and vulnerable components of the neural network and the accelerator. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (6)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Exploring Image Selection for Self-Testing in Neural Network Accelerators

DOI:
10.1109/isvlsi54635.2022.00076
Published:
2022-07
Venue:
2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
Impact factor:
0
Authors:
Fanruo Meng;Chengmo Yang
Corresponding authors:
Fanruo Meng;Chengmo Yang

Monitoring the Health of Emerging Neural Network Accelerators with Cost-effective Concurrent Test

DOI:
10.1109/dac18072.2020.9218675
Published:
2020-07
Venue:
2020 57th ACM/IEEE Design Automation Conference (DAC)
Impact factor:
0
Authors:
Qi Liu;Tao Liu;Zihao Liu;Wujie Wen;Chengmo Yang
Corresponding authors:
Qi Liu;Tao Liu;Zihao Liu;Wujie Wen;Chengmo Yang

A Self-Test Framework for Detecting Fault-induced Accuracy Drop in Neural Network Accelerators

DOI:
10.1145/3394885.3431519
Published:
2021-01
Venue:
2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC)
Impact factor:
0
Authors:
Fanruo Meng;Fateme S. Hosseini;Chengmo Yang
Corresponding authors:
Fanruo Meng;Fateme S. Hosseini;Chengmo Yang

Tolerating Defects in Low-Power Neural Network Accelerators Via Retraining-Free Weight Approximation

DOI:
10.1145/3477016
Published:
2021-09
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Impact factor:
0
Authors:
Fateme S. Hosseini;Fanruo Meng;Chengmo Yang;Wujie Wen;Rosario Cammarota
Corresponding authors:
Fateme S. Hosseini;Fanruo Meng;Chengmo Yang;Wujie Wen;Rosario Cammarota

NeuroPots: Realtime Proactive Defense against Bit-Flip Attacks in Neural Networks

DOI:
Published:
2023
Venue:
USENIX Security Symposium
Impact factor:
0
Authors:
Liu, Q;Yin, J;Wen, W;Yang, C;Sha, S
Corresponding author:
Sha, S

Other publications by Chengmo Yang

Power efficient branch prediction through early identification of branch addresses

DOI:
10.1145/1176760.1176782
Published:
2006
Venue:
2011 IEEE Computer Society Annual Symposium on VLSI
Impact factor:
0
Authors:
Chengmo Yang;A. Orailoglu
Corresponding author:
A. Orailoglu

A DWM-Based Stack Architecture Implementation for Energy Harvesting Systems

DOI:
10.1145/3126543
Published:
2017
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Impact factor:
0
Authors:
Hoda Aghaei Khouzani;Chengmo Yang
Corresponding author:
Chengmo Yang

Power-aware and cost-efficient state encoding in non-volatile memory based FPGAs

DOI:
10.1109/vlsi-soc.2017.8203455
Published:
2017
Venue:
2017 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC)
Impact factor:
0
Authors:
Yuan Xue;Abraham Mcllvaine;Chengmo Yang
Corresponding author:
Chengmo Yang

Processor reliability enhancement through compiler-directed register file peak temperature reduction

DOI:
10.1109/dsn.2009.5270305
Published:
2009
Venue:
2009 IEEE/IFIP International Conference on Dependable Systems & Networks
Impact factor:
0
Authors:
Chengmo Yang;A. Orailoglu
Corresponding author:
A. Orailoglu

Behavioral Synthesis for Hardware Security

DOI:
10.1007/978-3-030-78841-4
Published:
2022
Venue:
Behavioral Synthesis for Hardware Security
Impact factor:
0
Authors:
Srinivas Katkoori;Omkar Dokur;Rajeev Joshi;Kavya Lakshmi Kalyanam;Md Adnan Zaman;Ariful Islam;Nandeesha Veeranna;Benjamin Carrion Schafer;Rajat Pranesh Santikellur;Subhra Chakraborty;S. Bhunia;Hannah Badier;Jean;Philippe Coussy;Guy Gogniat;C. Pilato;D. Sciuto;Francesco Regazzoni;Siddharth Garg;Ramesh Karri;Anirban Sengupta;Mahendra Rathor;Matthew Lewandowski;Chen Liu;Chengmo Yang;Farhath Zareen;Robert Karam;S. T. C. Konigsmark;Wei Ren;Martin D. F. Wong;Deming Chen;Mike Borowczak;Ranga Vemuri;Steffen Peter;T. Givargis;Wei Hu;Armaiti Ardeshiricham;Lingjuan Wu;Ryan Kastner;Christian Pilato;S. Islam
Corresponding author:
S. Islam