SPX: CISIT: Computing In Situ and In Memory for Hierarchical Numerical Algorithms
Basic information
- Award number: 1725743
- Amount: $800,000
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-10-01 to 2020-09-30
- Status: Completed
Project abstract
High performance computing holds enormous promise for revolutionizing science, technology, and everyday life through modeling and simulation, statistical inference, and artificial intelligence. Despite numerous successes in software and hardware technologies, energy efficiency barriers have become a major hurdle towards more powerful computers, from mobile devices all the way to supercomputers. The originally natural separation between the memory subsystem and the central processing unit (CPU) of a computer has emerged as one of the main reasons for energy inefficiency. Data movement between the memory and the CPU requires orders of magnitude more energy than the computations themselves. To address these challenges, this project will consider novel architectural design paradigms and algorithms that aim to blur the traditional boundaries between separated memory and computation subsystems and, by distributing computations to be performed directly in the memory or as part of the memory data transfers, achieve order-of-magnitude gains in energy efficiency and performance. This project will investigate such novel approaches in the context of a class of methods in computational mathematics that appear at the core of many problems in computational science, large-scale data analytics, and machine learning. Specifically, this project will focus on data-driven rather than compute-driven co-design of algorithms and architectures for the construction, approximation, and factorization of hierarchical matrices. The end goal of the project is the design of a novel architecture, CISIT (for "Computing In Situ and In Transit"), that specifically aims to accelerate both computation and data movement in the context of hierarchical matrices.
CISIT will uniquely combine traditional general-purpose CPU and GPU cores with: (1) acceleration of core algorithmic primitives using custom hardware; (2) in-situ computing capabilities that will comprise both processing in or near main memory as well as computing within on-chip caches and memory close to the cores; (3) novel in-transit compute capabilities that will enable cutting down on and in many cases completely eliminating unnecessary roundtrip data transfers by processing of data transparently as it is transferred between main memory and local compute cores across the cache hierarchies. Upon success, CISIT will influence future architectural implementations. Along with the research activities, an educational and dissemination program will be designed to communicate the results of this work to both students and researchers, as well as a more general audience of computational and application scientists.
Project outcomes
Journal articles (15)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Off-Chip Congestion Management for GPU-based Non-Uniform Processing-in-Memory Networks
- DOI: 10.1109/pdp50117.2020.00050
- Published: 2020
- Impact factor: 0
- Authors: Punniyamurthy, Kishore; Gerstlauer, Andreas
- Corresponding author: Gerstlauer, Andreas
Cacheline Utilization-Aware Link Traffic Compression for Modular GPUs
- DOI: 10.1109/vlsid49098.2020.00041
- Published: 2020
- Impact factor: 0
- Authors: Punniyamurthy, Kishore; Das, Shomit; Gerstlauer, Andreas
- Corresponding author: Gerstlauer, Andreas
CLAIRE: A Distributed-Memory Solver for Constrained Large Deformation Diffeomorphic Image Registration
- DOI: 10.1137/18m1207818
- Published: 2019
- Impact factor: 0
- Authors: Mang A; Gholami A; Davatzikos C; Biros G
- Corresponding author: Biros G
HALO: A Hierarchical Memory Access Locality Modeling Technique For Memory System Explorations
- DOI: 10.1145/3205289.3205323
- Published: 2018-06
- Impact factor: 0
- Authors: Reena Panda; L. John
- Corresponding authors: Reena Panda; L. John
A Case for Granularity Aware Page Migration
- DOI: 10.1145/3205289.3208064
- Published: 2018-06
- Impact factor: 0
- Authors: Jee Ho Ryoo; L. John; Arkaprava Basu
- Corresponding authors: Jee Ho Ryoo; L. John; Arkaprava Basu
Other publications by George Biros
Performance Characterization of Python Runtimes for Multi-device Task Parallel Programming
- DOI: 10.1007/s10766-025-00788-1
- Published: 2025-03-18
- Impact factor: 0.900
- Authors: William Ruys; Hochan Lee; Bozhi You; Shreya Talati; Jaeyoung Park; James Almgren-Bell; Yineng Yan; Milinda Fernando; Mattan Erez; Milos Gligoric; Martin Burtscher; Christopher J. Rossbach; Keshav Pingali; George Biros
- Corresponding author: George Biros
Other grants by George Biros
CDS&E: AI-RHEO: Learning coarse-graining of complex fluids
- Award number: 2204226
- Fiscal year: 2022
- Amount: $800,000
- Award type: Standard Grant
SHF: Small: Algorithms and Software for Scalable Kernel Methods
- Award number: 1817048
- Fiscal year: 2018
- Amount: $800,000
- Award type: Standard Grant
XPS: DSD: A2MA - Algorithms and Architectures for Multiresolution Applications
- Award number: 1337393
- Fiscal year: 2013
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: Petascale Algorithms for Particulate Flows
- Award number: 1341290
- Fiscal year: 2012
- Amount: $800,000
- Award type: Continuing Grant
Collaborative Research: SI2-SSE: Software for integral equation solvers on manycore and heterogeneous architectures
- Award number: 1203182
- Fiscal year: 2012
- Amount: $800,000
- Award type: Standard Grant
CDI Type II/Collaborative Research: Ultra-high Resolution Dynamic Earth Models through Joint Inversion of Seismic and Geodynamic Data
- Award number: 1209203
- Fiscal year: 2011
- Amount: $800,000
- Award type: Standard Grant
CDI Type II/Collaborative Research: Ultra-high Resolution Dynamic Earth Models through Joint Inversion of Seismic and Geodynamic Data
- Award number: 1029022
- Fiscal year: 2010
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: SI2-SSE: Software for integral equation solvers on manycore and heterogeneous architectures
- Award number: 1047980
- Fiscal year: 2010
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: DDDAS-TMRP: MIPS: A Real-Time Measurement Inversion Prediction Steering Framework for Hazardous Events
- Award number: 0929947
- Fiscal year: 2009
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: Petascale Algorithms for Particulate Flows
- Award number: 0923710
- Fiscal year: 2009
- Amount: $800,000
- Award type: Continuing Grant