SHF:Small: Benchmarking of Transient and Intermittent Errors and Their Application to Microarchitecture
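Basic information
- Award number: 1219186
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2012
- Funding country: United States
- Period: 2012-08-01 to 2016-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract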
Computing infrastructure has been a driving force of socio-economic progress over the past several decades. From drug discovery to space exploration, every scientific and engineering domain relies on computer systems to accurately analyze complex datasets. Historically, computational accuracy has been taken for granted in all these disciplines, but this notion is changing. While rapidly shrinking transistor dimensions deliver exponential power and performance benefits, the trend is also creating several unwanted side effects in computer system reliability. Two types of errors will become prevalent in the near future: (1) multi-bit soft errors, in which alpha particles and neutrons cause multiple bits to flip at the same time, and (2) intermittent errors, which occur due to stress accumulation over the lifetime of a computer. It is therefore critical to benchmark the impact of these errors on the lifetime of a computer chip.
Only when the impact is accurately measured is it possible to judiciously deploy solutions that improve reliability. Since any protection scheme comes with a cost, it is necessary to understand when a particular scheme under consideration, such as parity or a single-error-correcting, double-error-detecting (SEC-DED) code, is too much or too little.
This project presents two solutions, one for benchmarking multi-bit soft errors and one for benchmarking intermittent errors. First, the project will develop a unified methodology to benchmark the impact of single-bit and multi-bit soft errors on caches protected with an arbitrary protection scheme, such as an interleaved, block-level, or word-level error-correcting code. Such a benchmarking framework will significantly enhance a computer designer's ability to objectively evaluate the performance, power, and reliability tradeoffs of the various schemes proposed for protecting caches.
Second, this research develops a methodology to benchmark the vulnerability of an instruction set architecture (ISA) to intermittent errors. Each instruction in an ISA specification is annotated with the amount of stress it is expected to place on the underlying microarchitecture of a chip. This stress information from the ISA is combined with the operating conditions of the chip to continuously monitor intermittent error probability during application execution. Any unwanted degradation in chip reliability is then handled by software exception handlers, which trigger redundant execution of vulnerable code.
Broader societal impact will result from these research solutions. Benchmarking is essential to objectively evaluate the cost-benefit tradeoffs of the various solutions currently being proposed to tackle reliability concerns. Without benchmarking, building a system to meet reliability specifications is a guessing game. By providing the right set of tools to initiate just-in-time error correction and recovery mechanisms, a computer designer can significantly lower the cost of providing reliable computations.
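The abstract's remark that a protection scheme can be "too much or too little" can be made concrete with a small sketch. The Python below is an illustration only, not the project's benchmarking methodology: the function names, the burst model (a run of physically adjacent flipped bits within one cache row), and the rule that physical bit p belongs to codeword p mod k under k-way bit interleaving are all assumptions chosen for the example. It counts how many flips land in each codeword and reports what parity or SEC-DED is guaranteed to do with that many flips.

```python
from collections import Counter


def flips_per_codeword(burst_start, burst_len, interleave):
    """Count flipped bits per codeword for a burst of adjacent physical bits.

    Assumed layout: with `interleave`-way bit interleaving, physical bit p of
    a row belongs to codeword p % interleave. interleave=1 models a row that
    holds a single, non-interleaved codeword.
    """
    counts = Counter()
    for p in range(burst_start, burst_start + burst_len):
        counts[p % interleave] += 1
    return dict(counts)


def outcome(flips, scheme):
    """Return the guaranteed behavior of `scheme` for `flips` errors in one codeword."""
    if flips == 0:
        return "clean"
    if scheme == "parity":
        # A single parity bit detects any odd number of flips and misses even ones.
        return "detected" if flips % 2 == 1 else "undetected"
    if scheme == "secded":
        if flips == 1:
            return "corrected"
        if flips == 2:
            return "detected"
        # Three or more flips exceed what SEC-DED guarantees.
        return "beyond guarantee"
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    # Same 4-bit burst, without interleaving and with 4-way interleaving.
    for ways in (1, 4):
        per_word = flips_per_codeword(burst_start=5, burst_len=4, interleave=ways)
        print(ways, {w: outcome(n, "secded") for w, n in sorted(per_word.items())})
```

In this toy model, a 4-bit burst without interleaving places all four flips in one codeword, which is more than SEC-DED guarantees to handle, whereas 4-way interleaving leaves one flip in each of four codewords, each of which SEC-DED can correct. A real evaluation would also have to weigh the area, power, and latency cost of each scheme, which this sketch ignores.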
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Murali Annavaram
A privacy mechanism for mobile-based urban traffic monitoring
- DOI: 10.1016/j.pmcj.2014.12.007
- Published: 2015-07-01
- Journal:
- Impact factor:
- Authors: Chi Wang; Hua Liu; Kwame-Lante Wright; Bhaskar Krishnamachari; Murali Annavaram
- Corresponding author: Murali Annavaram
Differentially Private Next-Token Prediction of Large Language Models
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: James Flemings; Meisam Razaviyayn; Murali Annavaram
- Corresponding author: Murali Annavaram
Other grants held by Murali Annavaram
SHF: Small: ML Accelerator Cohort Architecture
- Award number: 2224319
- Fiscal year: 2022
- Amount: $400,000
- Project type: Standard Grant
Student Travel Support for the 2018 International Symposium on Computer Architecture (ISCA)
- Award number: 1812942
- Fiscal year: 2018
- Amount: $400,000
- Project type: Standard Grant
SHF:Small: Accelerating Graph Analytics Through Coordinated Storage, Memory and Computing Advances
- Award number: 1719074
- Fiscal year: 2017
- Amount: $400,000
- Project type: Standard Grant
IEEE International Symposium on Workload Characterization (IISWC) Student Subsidy Proposal
- Award number: 1104542
- Fiscal year: 2011
- Amount: $400,000
- Project type: Standard Grant
CAREER: From Nonstop-Monitoring to Nano-ISA: An Adaptive Multi-Dimensional Framework for Processor Reliability
- Award number: 0954211
- Fiscal year: 2010
- Amount: $400,000
- Project type: Continuing Grant
CSR-PSCE,SM: Trade-offs Between Static Power, Performance and Reliability in Future Chip Multiprocessors
- Award number: 0834799
- Fiscal year: 2008
- Amount: $400,000
- Project type: Standard Grant
CSR-PSCE,SM: A Holistic Design Approach to Reliability Using 3D Stacked
- Award number: 0834798
- Fiscal year: 2008
- Amount: $400,000
- Project type: Standard Grant
CT-ISG: A Game Theoretic Framework for Privacy Preservation in Community-Based Mobile Applications
- Award number: 0831545
- Fiscal year: 2008
- Amount: $400,000
- Project type: Standard Grant
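Similar grants from the National Natural Science Foundation of China (NSFC)
Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
- Award number:
- Year approved: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
- Award number: n/a
- Year approved: 2022
- Amount: CNY 100,000
- Project type: Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
- Award number: 32000033
- Year approved: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Mechanism of small RNA regulation of the biocontrol function of Bacillus amyloliquefaciens FZB42
- Award number: 31972324
- Year approved: 2019
- Amount: CNY 580,000
- Project type: General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
- Award number: 81900988
- Year approved: 2019
- Amount: CNY 210,000
- Project type: Young Scientists Fund
Molecular mechanism of pigeon milk secretion, analyzed by small RNA sequencing
- Award number: 31802058
- Year approved: 2018
- Amount: CNY 260,000
- Project type: Young Scientists Fund
Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
- Award number: 31870821
- Year approved: 2018
- Amount: CNY 560,000
- Project type: General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
- Award number: 31772128
- Year approved: 2017
- Amount: CNY 600,000
- Project type: General Program
Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
- Award number: 81704176
- Year approved: 2017
- Amount: CNY 200,000
- Project type: Young Scientists Fund
Regulation of small RNA biosynthesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
- Award number: 91640114
- Year approved: 2016
- Amount: CNY 850,000
- Project type: Major Research Plan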
Similar overseas grants
Powering Small Craft with a Novel Ammonia Engine
- Award number: 10099896
- Fiscal year: 2024
- Amount: $400,000
- Project type: Collaborative R&D
"Small performances": investigating the typographic punches of John Baskerville (1707-75) through heritage science and practice-based research
- Award number: AH/X011747/1
- Fiscal year: 2024
- Amount: $400,000
- Project type: Research Grant
Fragment to small molecule hit discovery targeting Mycobacterium tuberculosis FtsZ
- Award number: MR/Z503757/1
- Fiscal year: 2024
- Amount: $400,000
- Project type: Research Grant
Bacteriophage control of host cell DNA transactions by small ORF proteins
- Award number: BB/Y004426/1
- Fiscal year: 2024
- Amount: $400,000
- Project type: Research Grant
Windows for the Small-Sized Telescope (SST) Cameras of the Cherenkov Telescope Array (CTA)
- Award number: ST/Z000017/1
- Fiscal year: 2024
- Amount: $400,000
- Project type: Research Grant
CSR: Small: Leveraging Physical Side-Channels for Good
- Award number: 2312089
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
CSR: Small: Multi-FPGA System for Real-time Fraud Detection with Large-scale Dynamic Graphs
- Award number: 2317251
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
AF: Small: Problems in Algorithmic Game Theory for Online Markets
- Award number: 2332922
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
Collaborative Research: FET: Small: Algorithmic Self-Assembly with Crisscross Slats
- Award number: 2329908
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant
NeTS: Small: ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates
- Award number: 2331111
- Fiscal year: 2024
- Amount: $400,000
- Project type: Standard Grant