Compiler Optimizations for RTM-based computing systems

Project abstract

Computing systems have been evolving rapidly since the end of Dennard scaling and in the face of the current limitations of CMOS technologies. In addition to new computing paradigms, several new memory technologies are being proposed to replace or augment traditional random access memories (RAM). Among them, racetrack memories (RTMs) are a promising non-volatile memory technology that offers the density of hard-disk drives with a latency somewhere between that of static RAM (SRAM) and dynamic RAM (DRAM). A fundamental difference of RTMs is that they store multiple bits sequentially per access transistor, as opposed to one bit in SRAM and DRAM. This makes the latency and energy needed to access data dependent on where the bits are located in the sequential bit stream, creating a new kind of spatial locality in which the distance between memory offsets must be minimized to improve performance and save energy. While compilers have targeted temporal and spatial locality in the classical sense, there are no established theories or algorithms to handle the sequential nature of RTMs. This project proposes novel compiler analyses and optimizations for RTM-based computing systems, focusing on the concrete case of nested loop programs from the domains of linear algebra, machine learning and physics simulations. We propose extensions to polyhedral compilers to analyze profitable memory access patterns and transform the program by changing the data layout and the operation schedule. The main goal of these transformations is to produce a semantics-preserving memory access trace in which the distances between consecutive accesses are minimized. We then leverage the higher-level semantics in domain-specific languages (DSLs) for tensor expressions, which map naturally to nested loop programs. DSLs offer more degrees of freedom for optimization, since the data layout can be chosen more freely and known algebraic properties of operators enable coarser-grained transformations.
Optimizations in this project will target not only performance and energy consumption, but also the trade-off between these standard metrics and the capacity offered by RTMs. We expect this project to lay the groundwork for future compilers for RTM-based systems and to provide valuable system-level feedback to computer architects and perhaps materials scientists.
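
To make the notion of distance between consecutive accesses concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the project itself: a single track with one access port, one variable per track position, and a cost of one shift per position moved. The names `shift_cost` and `best_placement` are hypothetical, and the exhaustive search is only a stand-in for a real layout optimization, usable for a handful of variables.

```python
from itertools import permutations
from typing import Dict, Sequence


def shift_cost(trace: Sequence[str], placement: Dict[str, int], start: int = 0) -> int:
    """Total shifts needed to serve `trace` on one track with one port.

    Assumption: moving the port alignment from position p to position q
    costs |p - q| shifts.
    """
    cost = 0
    position = start
    for var in trace:
        target = placement[var]
        cost += abs(target - position)
        position = target
    return cost


def best_placement(trace: Sequence[str]) -> Dict[str, int]:
    """Exhaustively search all layouts; feasible only for very few variables."""
    variables = sorted(set(trace))
    candidates = (
        {var: pos for pos, var in enumerate(order)}
        for order in permutations(variables)
    )
    return min(candidates, key=lambda placement: shift_cost(trace, placement))


# Hypothetical access trace over three variables.
trace = ["a", "c", "a", "c", "b", "a", "c"]
naive = {"a": 0, "b": 1, "c": 2}   # declaration order
tuned = {"a": 0, "c": 1, "b": 2}   # frequently paired variables adjacent

print(shift_cost(trace, naive))    # 10
print(shift_cost(trace, tuned))    # 7
print(best_placement(trace))
```

In this toy trace, placing `a` and `c` next to each other reduces the shift count from 10 to 7 without changing the order of accesses. Reordering the accesses themselves, where program semantics allow it, would be the schedule-side counterpart of the same idea.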

Project outcomes

Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants of Professor Dr.-Ing. Jeronimo Castrillon

TraceSymm: Trace analysis and Symmetry theory for improved application mapping onto manycores
  • Grant number:
    366764507
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
    Research Grants
OpenPME: Open Particle Mesh Environment for Systems Biology
  • Grant number:
    350008342
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
    Research Grants
Interferences in Design Methodology for High-performance Multi-Core Platforms
  • Grant number:
    505744711
  • Fiscal year:
  • Funding amount:
    --
  • Project category:
    Research Grants
Balancing computations in in-memory nonvolatile heterogeneous systems
  • Grant number:
    502388442
  • Fiscal year:
  • Funding amount:
    --
  • Project category:
    Priority Programmes

Similar international grants

CAREER: Scalable Physics-Inspired Ising Computing for Combinatorial Optimizations
  • Grant number:
    2340453
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Collaborative Research: Scalable Circuit theoretic Framework for Large Grid Simulations and Optimizations: from Combined T&D Planning to Electromagnetic Transients
  • Grant number:
    2330195
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Collaborative Research: Scalable Circuit theoretic Framework for Large Grid Simulations and Optimizations: from Combined T&D Planning to Electromagnetic Transients
  • Grant number:
    2330196
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project category:
    Standard Grant
Collaborative Research: CNS Core: Medium: Reconfigurable Kernel Datapaths with Adaptive Optimizations
  • Grant number:
    2345339
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Standard Grant
SBIR Phase I: Trajectory Optimizations and Learned Foliage Manipulation to Accelerate Throughput in Automated Strawberry Harvesting
  • Grant number:
    2322402
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Standard Grant
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
  • Grant number:
    2318628
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Standard Grant
Robust Optimizations For Equity-Linked Products
  • Grant number:
    RGPIN-2020-06821
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Discovery Grants Program - Individual
Applied Harmonic Analysis Methods for Non-Convex Optimizations and Low-Rank Matrix Analysis
  • Grant number:
    2108900
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Standard Grant
CAREER: SHF: Chiplet-Package Co-Optimizations for 2.5D Heterogeneous SoCs with Low-Overhead IOs
  • Grant number:
    2047388
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Continuing Grant
Collaborative Research: CNS Core: Medium: Reconfigurable Kernel Datapaths with Adaptive Optimizations
  • Grant number:
    2105868
  • Fiscal year:
    2021
  • Funding amount:
    --
  • Project category:
    Standard Grant