SHF: Small: Addressing Challenges for the Next Decade of Massively Parallel NUMA Accelerators

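Basic Information

  • Award number:
    1910924
  • Principal investigator:
  • Amount:
    $495,400
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Duration:
    2019-10-01 to 2023-09-30
  • Project status:
    Completed

Project Abstract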

The physical and economic principles that enabled Dennard scaling and Moore's law in the semiconductor industry have reached their breaking point. As the number of transistors that can be economically fabricated on a single chip plateaus, the processor industry has pivoted to single-package computing systems composed of multiple sub-components known as chiplets. Chiplets, which communicate via high-bandwidth on-package networks, offer the potential for transparent performance scaling into the next decade. However, chiplets introduce challenging non-uniform memory access (NUMA) characteristics into single-package systems that have traditionally not been subject to these effects. This project develops techniques to overcome the challenges of non-uniform memory accesses on high-performance single- and multi-package systems without programmer intervention. Exploring programmer-transparent scaling mechanisms improves the portability and lifetime of programs, decreasing the cost and complexity of software. Through the creation of course content and undergraduate summer internships, the project fosters an understanding of how to program machines in a post-Moore world and how compute accelerators should be designed to minimize the impact on the end programmer as system complexity increases.

This project develops coordinated data placement and thread scheduling algorithms that leverage static information from the compiler and dynamic information from the runtime system to inform data placement and hardware-based thread scheduling. It advances the state of the art by developing an open-source Graphics Processing Unit (GPU) simulator with a hierarchical interconnect that can be used to model both chiplet-based GPUs and multi-GPU systems. The researchers are exploring compiler-informed data placement and thread scheduling in GPUs. Initial results demonstrate that a static analysis of the code can predict the data accessed by GPU threadblocks. Analysis shows that it is possible to determine which threads in a grid share memory pages, and the manner of that sharing, by building new static techniques that add an additional dimension to decades of work on compilers for sequential code. Using static information, in combination with runtime information provided by GPU drivers, the researchers are developing advanced data placement, prefetching, and thread scheduling algorithms. Both future chiplet-based designs and existing multi-GPU systems benefit from the development of these algorithms. Looking beyond the high-bandwidth memory used in GPUs today, the project explores the system-level implications of heterogeneous memory in a chiplet-based system. Data placement and thread scheduling have even more importance in GPU systems of the future that make use of high-bandwidth memory, traditional dynamic random-access memory, and non-volatile memory. The problem sizes in such systems are anticipated to be so large that opportunistic data placement and thread scheduling are even more critical than in conventional systems. The project uses sharing patterns based on the inter-kernel producer-consumer nature of machine learning workloads to change the program's code layout, runtime data placement, and threadblock scheduling algorithm to maximize locality in multi-node systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
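
The abstract's central idea, that statically predicted page accesses can guide both threadblock scheduling and data placement, can be illustrated with a toy model. The sketch below is not the project's algorithm; it assumes a kernel in which thread t of block b reads a[b * block_dim + t], 4 KiB pages, 4-byte elements, and first-touch page placement, and all function names are hypothetical. It counts how many (threadblock, page) pairs end up on different chiplets under two scheduling policies.

```python
PAGE_SIZE = 4096   # bytes per memory page (assumed)
ELEM_SIZE = 4      # bytes per array element, e.g. float32 (assumed)


def pages_for_block(block_id, block_dim):
    """Pages read by one threadblock when thread t accesses a[block_id * block_dim + t].

    This stands in for what a static analysis of an affine index could predict.
    """
    first_byte = block_id * block_dim * ELEM_SIZE
    last_byte = (block_id * block_dim + block_dim - 1) * ELEM_SIZE
    return set(range(first_byte // PAGE_SIZE, last_byte // PAGE_SIZE + 1))


def round_robin(block, num_blocks, num_chiplets):
    """Baseline policy: interleave consecutive threadblocks across chiplets."""
    return block % num_chiplets


def contiguous(block, num_blocks, num_chiplets):
    """Locality-aware policy: keep neighbouring threadblocks on the same chiplet."""
    per_chiplet = -(-num_blocks // num_chiplets)  # ceiling division
    return block // per_chiplet


def remote_pairs(num_blocks, block_dim, num_chiplets, schedule):
    """Count (threadblock, page) pairs whose page is homed on another chiplet.

    Pages use first-touch placement: each page is homed on the chiplet of the
    lowest-numbered threadblock that accesses it.
    """
    page_home = {}
    remote = 0
    for block in range(num_blocks):
        chiplet = schedule(block, num_blocks, num_chiplets)
        for page in pages_for_block(block, block_dim):
            home = page_home.setdefault(page, chiplet)
            if home != chiplet:
                remote += 1
    return remote


if __name__ == "__main__":
    for policy in (round_robin, contiguous):
        print(policy.__name__, remote_pairs(8, 256, 2, policy))
```

With 8 threadblocks of 256 threads on 2 chiplets, four consecutive threadblocks share each page in this model, so interleaving them across chiplets forces half of them to read remote memory, while the contiguous policy keeps every access local.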

Project Outcomes

Journal papers (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Mitigating GPU Core Partitioning Performance Effects
SIMR: Single Instruction Multiple Request Processing for Energy-Efficient Data Center Microservices
Locality-Centric Data and Threadblock Management for Massive GPUs
Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling
Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads

Other publications by Timothy Rogers

Surgical Lymph Node Staging in Extremity Rhabdomyosarcoma: The EpSSG RMS 2005 Trial Experience
  • DOI:
    10.1245/s10434-025-17908-3
  • Publication date:
    2025-07-24
  • Journal:
  • Impact factor:
    3.500
  • Authors:
    Sheila Terwisscha van Scheltinga; Johannes H. M. Merks; Florent Guerin; Timothy Rogers; Ross J. Craigie; Gabriela Guillén; Federica De Corti; Patrizia Dall’Igna; Raquel Dávila Fajardo; Gianni Bisogno; Andrea Ferrari; Daniel Orbach; Meriel Jenney; Julia C. Chisholm; Véronique Minard-Colin; Maya Cesen; Nina Jehanno; Laura S. Hiemcke-Jiwa; Ilaria Zanetti; Beatrice Coppadoro; Alida F. W. van der Steeg; Max M. van Noesel; Marc H. W. A. Wijnen
  • Corresponding author:
    Marc H. W. A. Wijnen
A tale of 3 testes? A rare presentation of lipoblastoma with a novel karyotype
  • DOI:
    10.1016/j.jpedsurg.2009.10.093
  • Publication date:
    2010-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Andrew Robb; Timothy Rogers; Guy Nicholls
  • Corresponding author:
    Guy Nicholls
Self-Reported Emotions in Simulation-Based Learning: Active Participants vs. Observers.
The BEST study--a prospective study to compare business class versus economy class air travel as a cause of thrombosis.
Analyzing the Communication Gap Between the Instructional Design Consultant and the Faculty Member in the Design and Development Process of a Web-Based Course
  • DOI:
  • Publication date:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    Timothy Rogers
  • Corresponding author:
    Timothy Rogers

Other grants by Timothy Rogers

Autonomous Modelling Solutions for Operational Structural Dynamic Systems
  • Award number:
    EP/W002140/1
  • Fiscal year:
    2022
  • Funding amount:
    $495,400
  • Award type:
    Research Grant
CAREER: Accessible Accelerators: Leveraging Productive Software on Efficient Hardware
  • Award number:
    1943379
  • Fiscal year:
    2020
  • Funding amount:
    $495,400
  • Award type:
    Continuing Grant

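Similar National Natural Science Foundation of China (NSFC) grants

Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
  • Award number:
  • Approval year:
    2024
  • Funding amount:
    CNY 0
  • Award type:
    Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
  • Award number:
  • Approval year:
    2022
  • Funding amount:
    CNY 100,000
  • Award type:
    Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
  • Award number:
    32000033
  • Approval year:
    2020
  • Funding amount:
    CNY 240,000
  • Award type:
    Young Scientists Fund
Mechanism by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
  • Award number:
    31972324
  • Approval year:
    2019
  • Funding amount:
    CNY 580,000
  • Award type:
    General Program
Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
  • Award number:
    81900988
  • Approval year:
    2019
  • Funding amount:
    CNY 210,000
  • Award type:
    Young Scientists Fund
Function and mechanism of key gut-bacterial small RNAs in the onset and progression of Crohn's disease
  • Award number:
    31870821
  • Approval year:
    2018
  • Funding amount:
    CNY 560,000
  • Award type:
    General Program
Molecular mechanism of pigeon milk secretion analyzed by small RNA sequencing
  • Award number:
    31802058
  • Approval year:
    2018
  • Funding amount:
    CNY 260,000
  • Award type:
    Young Scientists Fund
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
  • Award number:
    31772128
  • Approval year:
    2017
  • Funding amount:
    CNY 600,000
  • Award type:
    General Program
Immune-regulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis based on small RNA-seq
  • Award number:
    81704176
  • Approval year:
    2017
  • Funding amount:
    CNY 200,000
  • Award type:
    Young Scientists Fund
Regulation of small RNA biosynthesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
  • Award number:
    91640114
  • Approval year:
    2016
  • Funding amount:
    CNY 850,000
  • Award type:
    Major Research Plan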

Similar overseas grants

CPS: Small: Infusing Quantum Computing, Decomposition, and Learning for Addressing Cyber-Physical Systems Optimization Challenges
  • Award number:
    2312086
  • Fiscal year:
    2023
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
Addressing Human Factors Challenges for Control Room Operators in Small Modular Reactors
  • Award number:
    580479-2022
  • Fiscal year:
    2022
  • Funding amount:
    $495,400
  • Award type:
    Alliance Grants
Passive membrane filtration: addressing drinking water quality challenges in small, rural and/or marginalized communities
  • Award number:
    558389-2020
  • Fiscal year:
    2021
  • Funding amount:
    $495,400
  • Award type:
    Alliance Grants
CHS: Small: Collaborative Research: Structured Data Peer Production: Addressing Challenges and Leveraging Opportunities
  • Award number:
    1815507
  • Fiscal year:
    2018
  • Funding amount:
    $495,400
  • Award type:
    Continuing Grant
TWC: Small: Collaborative: The Master Print: Investigating and Addressing Vulnerabilities in Fingerprint-based Authentication Systems
  • Award number:
    1617466
  • Fiscal year:
    2016
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
TWC: Small: Collaborative: The Master Print: Investigating and Addressing Vulnerabilities in Fingerprint-based Authentication Systems
  • Award number:
    1618750
  • Fiscal year:
    2016
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
NeTS: Small: Addressing End-system Bottlenecks in High-speed Networks
  • Award number:
    1528087
  • Fiscal year:
    2015
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
TWC: Small: Addressing the challenges of cryptocurrencies: Security, anonymity, stability
  • Award number:
    1421689
  • Fiscal year:
    2014
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
NeTS: Small: Mobile mmWaves: Addressing the Cellular Capacity Crisis with 60 GHz Picocells
  • Award number:
    1317153
  • Fiscal year:
    2013
  • Funding amount:
    $495,400
  • Award type:
    Standard Grant
Addressing Protein Synthesis Regulation within Small Numbers of Discrete Neurons
  • Award number:
    10586226
  • Fiscal year:
    2013
  • Funding amount:
    $495,400
  • Award type: