Investigation of Reliability-Constrained On-Chip Networks
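Basic information
- Award number: 0541417
- Principal investigator:
- Amount: $375,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2006
- Funding country: United States
- Duration: 2006-04-15 to 2012-03-31
- Project status: Completed
- Source:
- Keywords:
Project abstract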
Abstract 0541417
Alexander Sawchuk
Los Angeles, CA
Investigation of Reliability-Constrained On-Chip Networks

Among the many challenges computer architects will face over the next decade and beyond is the growing demand for reliable on-chip communication between system microarchitecture functional domains. Continued increases in scaling and integration of transistor and wiring resources are allowing more system functions to be implemented on chip, but also more circuit defects and variability. Recent trends toward partitioning the system microarchitecture into multiple on-chip compute domains in the form of functional unit blocks, tiles and processor cores mitigate chip-crossing delays and facilitate chip survivability. That is, it helps to prevent system performance and cost from being encumbered by deep submicron technology scaling. With these developments, support for low latency, high throughput, and fault tolerant communication is becoming more and more critical within the on-chip network used to interconnect the compute domains. Much recent research is directed toward the design of on-chip networks to meet certain cost/performance goals (chip area, latency and throughput), but very little architecture research explores on-chip network reliability issues specific to the problem of hard faults, which is recognized as a growing problem.

In this research, we investigate reliability challenges and techniques for on-chip networks that will meet manufacturing yield and chip reliability targets as technology scales into the deep submicron regime. The goal is to understand the problem more fully and to develop on-chip network techniques for efficient resource and reliability management, fault isolation, dynamic reconfiguration and fault recovery to allow fault-stricken microarchitectures partitioned across a chip to have increased usability and prolonged life. We endeavor to increase understanding of chip failure mechanisms (their causes and impact); appropriately model them as related specifically to on-chip networks; develop approaches and techniques that will allow on-chip networks (in cooperation with techniques for other components of the chip microarchitecture) to be resilient to hard faults; evaluate and assess the benefit of the proposed techniques under expected workloads and common-case operational conditions; and, furthermore, understand the tradeoffs in using the proposed fault-resilient on-chip network techniques; that is, identify those situations in which various techniques can be most usefully applied given the existence of other possible constraints.

The Intellectual Merit of this research is substantial. The research is timely as it addresses an important issue that will only worsen with continuing advancements in technology scaling. The research will culminate with key contributions made in (1) increasing our understanding of the fundamental design, process, and operational mechanisms most responsible for on-chip interconnect failures and (2) producing original and promising techniques for increasing on-chip interconnect reliability and chip reliability as a whole. Beyond the specific results produced by the models and simulation environments we will develop through this project, these tool artifacts will likely have a profound impact on future research infrastructure and education for years to come. They will be invaluable assets to researchers, students, and practitioners for understanding, developing, evaluating, and trading off alternative reliability techniques as demanded by advanced technologies and systems. The tools we develop will be made publicly available and are expected to have widespread use. The results of this research will also be widely disseminated through publications.

The Broader Impact of this research is significant and far-reaching. This research can have a profound impact on the success of near-future nanoscale technologies (molecular, quantum, etc.) used to implement integrated circuits beyond the CMOS era, as ICs implemented in these technologies are expected to have substantially more hard faults (orders of magnitude) than CMOS ICs. Reliability techniques such as the ones that will be derived from this research will be critical to systems implemented in these technologies as well as those implemented in future deep submicron technology. In the nearer term, many of the ideas coming from this research may be transferable to system-level networks, where form-factor constraints often are not as rigid as they are on-chip.
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Timothy Pinkston
Collaborative Research: SHF: Small: Architecture Innovations for Enabling Simultaneous Translation at the Edge
- Award number: 2223484
- Fiscal year: 2022
- Funding amount: $375,000
- Project type: Standard Grant
SHF: Small: Collaborative Research: Design of Many-core NoCs for the Dark Silicon Era
- Award number: 1619472
- Fiscal year: 2016
- Funding amount: $375,000
- Project type: Standard Grant
SHF: Small: Enhancing Power, Performance, and Resource Efficiency of Many-core NoCs
- Award number: 1321131
- Fiscal year: 2013
- Funding amount: $375,000
- Project type: Standard Grant
EAGER: Network-Driven Shared Resource Design and Management in Multicores
- Award number: 0946388
- Fiscal year: 2009
- Funding amount: $375,000
- Project type: Standard Grant
Efficient Adaptive Techniques for Irregular Switch-based Networks
- Award number: 9812137
- Fiscal year: 1998
- Funding amount: $375,000
- Project type: Standard Grant
CAREER: Optically-Interconnected Fully-Adaptive Network Router
- Award number: 9624251
- Fiscal year: 1996
- Funding amount: $375,000
- Project type: Standard Grant
System-level Integration of Optics into Multiprocessor Interconnect Architecture
- Award number: 9411587
- Fiscal year: 1994
- Funding amount: $375,000
- Project type: Standard Grant
Similar overseas grants
CAREER: Enhanced Reliability and Efficiency of Software Regression Testing in the Presence of Flaky Tests
- Award number: 2338287
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Continuing Grant
A Secure Hub for Access, Reliability, and Exchange of Data (SHARED)
- Award number: 2346746
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Standard Grant
CAREER: Energy Storage Systems for Dynamic Reliability of Modern Clean Smart Grid
- Award number: 2339456
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Continuing Grant
Eliminating localised wear of air foil thrust bearing for improved reliability and life of fuel cell system
- Award number: 10089986
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Collaborative R&D
CAREER: Understanding Fiber Bundle Failure Mechanics for Ultra-high Reliability Applications
- Award number: 2339223
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Standard Grant
Exploring physical reservoir computing mechanisms by ultra-thin Si nanoresonators for enhancing computational reliability
- Award number: 24K08219
- Fiscal year: 2024
- Funding amount: $375,000
- Project type: Grant-in-Aid for Scientific Research (C)
SBIR Phase II: A software-based tool for beyond visual line of sight (BVLOS) drone's connection reliability enhancement
- Award number: 2304143
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Cooperative Agreement
Auditing the accuracy of entertainment AI systems to increase reliability and trust.
- Award number: 10075659
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Grant for R&D
SaTC: CORE: Small: Mitigating Threats of Physical-Domain Signal Injections on Security, Reliability, and Safety of Sensing and Control Systems
- Award number: 2231682
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Continuing Grant
IUCRC Planning Grant Carnegie Mellon University: Center for Materials Data Science for Reliability and Degradation (MDS-Rely)
- Award number: 2310663
- Fiscal year: 2023
- Funding amount: $375,000
- Project type: Standard Grant