Investigation of Reliability-Constrained On-Chip Networks

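Basic information

  • Award number:
    0541417
  • Principal investigator:
  • Amount:
    $375,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2006
  • Funding country:
    United States
  • Project period:
    2006-04-15 to 2012-03-31
  • Project status:
    Completed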

Project abstract

Abstract 0541417. Alexander Sawchuk, Los Angeles, CA. Investigation of Reliability-Constrained On-Chip Networks.

Among the many challenges computer architects will face over the next decade and beyond is the growing demand for reliable on-chip communication between system microarchitecture functional domains. Continued increases in scaling and integration of transistor and wiring resources are allowing more system functions to be implemented on chip, but they also bring more circuit defects and variability. Recent trends toward partitioning the system microarchitecture into multiple on-chip compute domains, in the form of functional unit blocks, tiles, and processor cores, mitigate chip-crossing delays and facilitate chip survivability. That is, such partitioning helps prevent system performance and cost from being encumbered by deep submicron technology scaling. With these developments, support for low-latency, high-throughput, and fault-tolerant communication is becoming increasingly critical within the on-chip network used to interconnect the compute domains. Much recent research is directed toward the design of on-chip networks to meet certain cost/performance goals (chip area, latency, and throughput), but very little architecture research explores on-chip network reliability issues specific to hard faults, which are recognized as a growing problem.

In this research, we investigate reliability challenges and techniques for on-chip networks that will meet manufacturing yield and chip reliability targets as technology scales into the deep submicron regime. The goal is to understand the problem more fully and to develop on-chip network techniques for efficient resource and reliability management, fault isolation, dynamic reconfiguration, and fault recovery, so that fault-stricken microarchitectures partitioned across a chip have increased usability and prolonged life. We endeavor to increase understanding of chip failure mechanisms (their causes and impact); appropriately model them as related specifically to on-chip networks; develop approaches and techniques that will allow on-chip networks (in cooperation with techniques for other components of the chip microarchitecture) to be resilient to hard faults; evaluate and assess the benefit of the proposed techniques under expected workloads and common-case operational conditions; and, furthermore, understand the tradeoffs in using the proposed fault-resilient on-chip network techniques; that is, identify those situations in which various techniques can be most usefully applied given the existence of other possible constraints.

The Intellectual Merit of this research is substantial. The research is timely, as it addresses an important issue that will only worsen with continuing advancements in technology scaling. The research will culminate with key contributions made in (1) increasing our understanding of the fundamental design, process, and operational mechanisms most responsible for on-chip interconnect failures and (2) producing original and promising techniques for increasing on-chip interconnect reliability and chip reliability as a whole. Beyond the specific results produced by the models and simulation environments we will develop through this project, these tool artifacts will likely have a profound impact on future research infrastructure and education for years to come. They will be invaluable assets to researchers, students, and practitioners for understanding, developing, evaluating, and trading off alternative reliability techniques as demanded by advanced technologies and systems. The tools we develop will be made publicly available and are expected to have widespread use. The results of this research will also be widely disseminated through publications.

The Broader Impact of this research is significant and far-reaching. This research can have a profound impact on the success of near-future nanoscale technologies (molecular, quantum, etc.) used to implement integrated circuits beyond the CMOS era, as ICs implemented in these technologies are expected to have substantially more hard faults (orders of magnitude) than CMOS ICs. Reliability techniques such as the ones that will be derived from this research will be critical to systems implemented in these technologies as well as those implemented in future deep submicron technology. In the nearer term, many of the ideas coming from this research may be transferable to system-level networks, where form-factor constraints often are not as rigid as they are on chip.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Timothy Pinkston

Collaborative Research: SHF: Small: Architecture Innovations for Enabling Simultaneous Translation at the Edge
  • Award number:
    2223484
  • Fiscal year:
    2022
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
SHF: Small: Collaborative Research: Design of Many-core NoCs for the Dark Silicon Era
  • Award number:
    1619472
  • Fiscal year:
    2016
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
SHF: Small: Enhancing Power, Performance, and Resource Efficiency of Many-core NoCs
  • Award number:
    1321131
  • Fiscal year:
    2013
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
EAGER: Network-Driven Shared Resource Design and Management in Multicores
  • Award number:
    0946388
  • Fiscal year:
    2009
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
Efficient Adaptive Techniques for Irregular Switch-based Networks
  • Award number:
    9812137
  • Fiscal year:
    1998
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
CAREER: Optically-Interconnected Fully-Adaptive Network Router
  • Award number:
    9624251
  • Fiscal year:
    1996
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
System-level Integration of Optics into Multiprocessor Interconnect Architecture
  • Award number:
    9411587
  • Fiscal year:
    1994
  • Award amount:
    $375,000
  • Project type:
    Standard Grant

Similar overseas grants

CAREER: Enhanced Reliability and Efficiency of Software Regression Testing in the Presence of Flaky Tests
  • Award number:
    2338287
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Continuing Grant
A Secure Hub for Access, Reliability, and Exchange of Data (SHARED)
  • Award number:
    2346746
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
CAREER: Energy Storage Systems for Dynamic Reliability of Modern Clean Smart Grid
  • Award number:
    2339456
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Continuing Grant
CAREER: Understanding Fiber Bundle Failure Mechanics for Ultra-high Reliability Applications
  • Award number:
    2339223
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Standard Grant
Eliminating localised wear of air foil thrust bearing for improved reliability and life of fuel cell system
  • Award number:
    10089986
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Collaborative R&D
Exploring physical reservoir computing mechanisms by ultra-thin Si nanoresonators for enhancing computational reliability
  • Award number:
    24K08219
  • Fiscal year:
    2024
  • Award amount:
    $375,000
  • Project type:
    Grant-in-Aid for Scientific Research (C)
SBIR Phase II: A software-based tool for beyond visual line of sight (BVLOS) drone's connection reliability enhancement
  • Award number:
    2304143
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Project type:
    Cooperative Agreement
Auditing the accuracy of entertainment AI systems to increase reliability and trust.
  • Award number:
    10075659
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Project type:
    Grant for R&D
SaTC: CORE: Small: Mitigating Threats of Physical-Domain Signal Injections on Security, Reliability, and Safety of Sensing and Control Systems
  • Award number:
    2231682
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Project type:
    Continuing Grant
IUCRC Planning Grant Carnegie Mellon University: Center for Materials Data Science for Reliability and Degradation (MDS-Rely)
  • Award number:
    2310663
  • Fiscal year:
    2023
  • Award amount:
    $375,000
  • Project type:
    Standard Grant