CAREER: Building Scalable and Reliable Composable Computer Architectures
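Basic information
- Award number: 2341039
- Principal investigator:
- Amount: $498,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-07-01 to 2029-06-30
- Status: Ongoing
- Source:
- Keywords:
Project abstract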
In the post-Moore era, computing platforms have become more diverse and heterogeneous. With the evolution of packaging and interconnect technology, multiple computing and memory components are integrated into a single processor package. High-bandwidth, coherent interconnects enable multiple accelerators and memory components on a platform to together achieve server-scale computing power. Although this new paradigm of computing platforms enables more optimal processor designs for domain-specific computing, its scalability is unclear. Fast interconnects between intra- and inter-chip components do not necessarily lead to linear speedup unless communication is carefully handled. This project aims to keep up with the performance projections of Moore's law in the post-Moore era through scalable architecture-level solutions. As graphics processing units (GPUs) are increasingly important for accelerating big-data workloads, this project will focus on architecting highly scalable and reliable GPU platforms that can achieve almost linear speedup as the number of GPU chiplet modules and memory devices scales. The presented research tools and virtual memory systems will advance state-of-the-art architectures with coherent and scalable communication among intra- and inter-GPU chiplet components. The presented architecture design will be able to accelerate emerging big-data workloads without needing access to expensive cloud or data center supercomputers. The research findings will be incorporated into new and existing undergraduate and graduate courses as well as K-12 outreach programs.
This project aims to address the following research questions: 1) How can all the integrated computing and memory components be managed so that they communicate efficiently? Can the conventional virtual memory system handle large volumes of address translations? 2) How can scalable and sustainable performance be achieved over multi-level non-uniform memory access (NUMA) architectures? Can consistent data access latency be enforced? This project answers these questions through two technical thrusts. The first thrust will design research tools that enable design exploration of scalable and heterogeneous platforms. Then, efficient virtual memory systems and page mapping algorithms will be architected. Unlike existing solutions, the methods presented in this project will exploit the unique GPU execution model while enabling coherent communication among intra- and inter-GPU packages. The second thrust will explore methods to enforce sustainable performance on the target multi-GPU systems with disaggregated memories. These new platforms face the emerging challenge of deeper NUMA levels than conventional systems, because individual computing and memory components can be integrated through multiple levels of extensible switches. This thrust will design efficient memory management and prefetch algorithms, which together ensure that data is ready within 1-2 NUMA distances.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Hyeran Jeon
Pilot Register File: Energy Efficient Partitioned Register File for GPUs
- DOI: 10.1109/hpca.2017.47
- Year: 2017
- Journal:
- Impact factor: 0
- Authors: Mohammad Abdel; A. Shafaei; Hyeran Jeon; Massoud Pedram; M. Annavaram
- Corresponding author: M. Annavaram
Understanding Scalability of Multi-GPU Systems
- DOI:
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Yuan Feng; Hyeran Jeon
- Corresponding author: Hyeran Jeon
Locality-Aware GPU Register File
- DOI: 10.1109/lca.2019.2959298
- Year: 2019
- Journal:
- Impact factor: 2.3
- Authors: Hyeran Jeon; Hodjat Asghari Esfeden; N. Abu; Daniel Wong; S. Elango
- Corresponding author: S. Elango
Architectural Vulnerability Modeling and Analysis of Integrated Graphics Processors
- DOI:
- Year: 2013
- Journal:
- Impact factor: 0
- Authors: Hyeran Jeon; Mark Wilkening; Vilas Sridharan; Sudhanva Gurumurthi; G. Loh
- Corresponding author: G. Loh
Improving Energy Efficiency of GPUs through Data Compression and Compressed Execution
- DOI: 10.1109/tc.2016.2619348
- Year: 2017
- Journal:
- Impact factor: 3.7
- Authors: Sangpil Lee; Keunsoo Kim; Gunjae Koo; Hyeran Jeon; M. Annavaram; W. Ro
- Corresponding author: W. Ro
Other grants by Hyeran Jeon
Travel: Student Travel Support for the 51st International Symposium on Computer Architecture (ISCA)
- Award number: 2409279
- Fiscal year: 2024
- Amount: $498,000
- Award type: Standard Grant
Collaborative Research: SHF: Small: Towards Robust Deep Learning Computing on GPUs
- Award number: 2114514
- Fiscal year: 2021
- Amount: $498,000
- Award type: Standard Grant
NSF Student Travel Support for the 5th Career Workshop for Women and Minorities in Computer Architecture
- Award number: 1946220
- Fiscal year: 2019
- Amount: $498,000
- Award type: Standard Grant
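Similar National Natural Science Foundation of China grants
Study of the regulatory mechanism of targeted starch modification by high-quality BE mutant enzymes constructed on the basis of amylopectin building blocks
- Award number: 31771933
- Year approved: 2017
- Amount: CNY 600,000
- Award type: General Program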
Similar overseas grants
CC* Networking Infrastructure: Building a Scalable and Polymorphic Cyberinfrastructure for Diverse Research and Education Needs at Illinois State University
- Award number: 2346712
- Fiscal year: 2024
- Amount: $498,000
- Award type: Standard Grant
CAS: Degradable Polyacrylates From Natural and Scalable Building Blocks
- Award number: 2348679
- Fiscal year: 2024
- Amount: $498,000
- Award type: Standard Grant
Development and demonstration of Automated Rapid Thermal Performance Assessments (RaThPAs) for scalable, accurate assessment of building fabric
- Award number: 10073283
- Fiscal year: 2023
- Amount: $498,000
- Award type: Collaborative R&D
Making the Building Blocks of Early Math Scalable, Accessible, and Viable for All Young Children and Their Teachers
- Award number: 2300606
- Fiscal year: 2023
- Amount: $498,000
- Award type: Continuing Grant
AutoEPC - Scalable, Accurate, Automated Building Fabric Assessment
- Award number: 10074665
- Fiscal year: 2023
- Amount: $498,000
- Award type: Collaborative R&D
Efficient and Scalable Design of Resource Allocation and Orchestration Solutions Towards Building Beyond 5G Networks
- Award number: RGPIN-2020-06622
- Fiscal year: 2022
- Amount: $498,000
- Award type: Discovery Grants Program - Individual
Evolving Hands: Building Workflows and Scalable Practices for Handwriting Recognition and Text Encoding in Cultural Institutions
- Award number: AH/W005360/1
- Fiscal year: 2022
- Amount: $498,000
- Award type: Research Grant
Building Scalable and Integrated Pathways to Industry 4.0 for Regional SMEs
- Award number: CCMOB-2021-00065
- Fiscal year: 2022
- Amount: $498,000
- Award type: CCI Mobilize Grants
21EBTA. Bioengineering iLUNGs - Building scalable, integrated, multicellular and personalised human in vitro LUNGs
- Award number: BB/W014564/1
- Fiscal year: 2022
- Amount: $498,000
- Award type: Research Grant
Efficient and Scalable Design of Resource Allocation and Orchestration Solutions Towards Building Beyond 5G Networks
- Award number: RGPIN-2020-06622
- Fiscal year: 2021
- Amount: $498,000
- Award type: Discovery Grants Program - Individual