Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
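Basic information
- Award number: 2328975
- Principal investigator:
- Amount: $99,700
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Project period: 2024-01-01 to 2026-12-31
- Project status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract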
In traditional Von Neumann computing systems, a significant bottleneck arises because the data transfer speed to and from the computing units has considerably fallen behind capacity, processing speed, and efficiency. To mitigate this bottleneck by bridging the gap between storage and computation, many innovative storage technologies have been introduced, along with near- and in-memory processing solutions designed for both emerging and traditional memory systems. Nonetheless, a considerable challenge remains: the prototyping and characterization of actual fabricated systems, especially those encompassing both mature technologies and cutting-edge technologies. To overcome this challenge, this project develops a cutting-edge Retunable and Reconfigurable Acceleration Platform (R3AP) based on emerging racetrack memory, leveraging a device-architecture-application co-design approach. The standout features of R3AP include its ability to function as a reconfigurable logic, a processing-in-memory (PIM) accelerator, and a high-density memory storage. It is retunable, meaning it can operate with bit-wise, integer, and floating-point precision, and can simulate analog-like storage and processing. R3AP effectively mitigates data movement inefficiencies while offering domain-specific acceleration and adaptability. With its dense, reliable, energy-efficient, and ultra-low latency computational capability, R3AP has the potential to revolutionize the storage and processing capabilities of future computing systems, such as those in Internet of Things (IoT) and Cyber-Physical Systems (CPS). It can also be applied to high-performance and cloud computing systems. The project's findings are shared through publications, workshops, design contests, tutorials, industrial courses, and technology transfer activities. Educational resources and outreach activity plans are made available on the project website, and software artifacts are released on GitHub.
To realize R3AP, the project comprises a series of interrelated research tasks spanning multiple system layers. At the device level, the project integrates the voltage-controlled skyrmion motion mechanism with the industrial-grade 8-inch wafer magnetic tunneling junction stack and demonstrates a fully functional Skyrmion racetrack memory (SRTM), including the formation, shifting, and detection of the skyrmion stream. Additionally, it evaluates the performance of SRTM, focusing on aspects such as write-error-rate, shift-error-rate, read-error-rate, operation speed, and energy consumption. It also addresses and mitigates non-idealities, such as the pinning effect, and goes on to develop and demonstrate CMOS-integrated SRTM. On the architecture and circuit layers, the project involves the creation of a mutable lookup table, compute, and memory unit. This unit performs like multi-context Field-Programmable Gate Array (FPGA) logic, parallel PIM logic, massively parallel accumulators, and analog-like storage and compute structures, leveraging the unique properties of SRTM. This layer ensures high-speed memory access from a hierarchy consisting of banks, subarrays, tiles, etc., and further adds links via configurable switch boxes and a mesh-based network-on-chip to enable data movement operations for PIM that would otherwise be challenging. At the application layer, the project develops novel modeling, analysis, design space exploration, and runtime adjustment techniques to exploit the high degree of reconfigurability provided by R3AP. The goal is to adapt future IoT and CPS applications to changing environments and requirements, optimize resource usage, withstand external disturbances, and enhance overall system performance, resilience, and sustainability. Across all these layers, the project develops a scalable computer-aided design (CAD) flow. This involves a multi-level intermediate representation-based compilation flow, which can compile high-level description languages such as PyTorch and C/C++ into binaries for the R3AP device. This flow uses a multi-level hierarchy including front-end, middle-end, and back-end compilation of the designs, and abstracts various optimization and management problems to a suitable level for efficient resolution.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Mimi Xie
Enabling Reliable, Efficient, and Secure Computing for Energy Harvesting Powered IoT Devices
- DOI:
- Published: 2019-09
- Journal:
- Impact factor: 0
- Authors: Mimi Xie
- Corresponding author: Mimi Xie
Memory-aware Efficient Deep Learning Mechanism for IoT Devices
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Jishnu Banerjee; S. Islam; Wei Wei; Chen Pan; Dakai Zhu; Mimi Xie
- Corresponding author: Mimi Xie
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
- DOI: 10.1109/dac56929.2023.10247716
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Shaoyi Huang; Bowen Lei; Dongkuan Xu; Hongwu Peng; Yue Sun; Mimi Xie; Caiwen Ding
- Corresponding author: Caiwen Ding
Autotile: Autonomous Task-tiling for Deep Inference on Battery-less Embedded System
- DOI: 10.1145/3649476.3658798
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Jishnu Banerjee; Sahidul Islam; Wei Wei; Chen Pan; Mimi Xie
- Corresponding author: Mimi Xie
An Intermittent OTA Approach to Update the DL Weights on Energy Harvesting Devices
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Wei Wei; Sahidul Islam; Jishnu Banerjee; Shangli Zhou; Chen Pan; Caiwen Ding; Mimi Xie
- Corresponding author: Mimi Xie
Other grants by Mimi Xie
SCC-PG: Bridge: An AI-Enabled Platform to Support Coordinated Care for Children with Autism
- Award number: 2306596
- Fiscal year: 2023
- Amount: $99,700
- Project type: Standard Grant
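Similar NSFC grants
Research and application of key technologies for ultra-precision machining and inspection of complex electronic products
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Synthetic-biology-based optimization of animal chassis breeds and pilot-scale application research
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Clinical study using integrated omics to explore Bixie Fenqing San combined with chemotherapy for advanced pancreatic cancer
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Anti-lung-cancer effects and mechanisms of a multi-target preparation of extracts from Murraya (Jiulixiang) and other herbs
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Clinical study of a platelet-raising formula for primary immune thrombocytopenia
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Value of microwave thermotherapy at the Baliao acupoints in treating overactive bladder in women
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Clinical study of traditional Chinese medicine syndrome-based treatment of diabetic retinopathy, based on the miR-455-5p-mediated oxidative stress mechanism
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Evaluation of active components and extraction process of Yigong San based on UPLC-Q-TOF-MS/MS analysis
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Efficacy and safety of non-invasive electroacupuncture in children with spastic diplegic cerebral palsy: a randomized, single-blind, prospective cohort study
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project
Comparative study of spring-pressure manipulation and extracorporeal shock wave therapy for lateral epicondylitis of the humerus
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Project type: Provincial/municipal-level project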
Similar overseas grants
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328973
- Fiscal year: 2024
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328972
- Fiscal year: 2024
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Award number: 2328974
- Fiscal year: 2024
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: Indium selenides based back end of line neuromorphic accelerators
- Award number: 2328741
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: Interconnects with Co-Designed Materials, Topology, and Wire Architecture
- Award number: 2328906
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: Interconnects with Co-Designed Materials, Topology, and Wire Architecture
- Award number: 2328908
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: Collaborative Optically Disaggregated Arrays of Extreme-MIMO Radio Units (CODAeMIMO)
- Award number: 2328947
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
FuSe/Collaborative Research: Heterogeneous Integration in Power Electronics for High-Performance Computing (HIPE-HPC)
- Award number: 2329063
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: High-throughput Discovery of Phase Change Materials for Co-designed Electronic and Optical Computational Devices (PHACEO)
- Award number: 2329087
- Fiscal year: 2023
- Amount: $99,700
- Project type: Continuing Grant
Collaborative Research: FuSe: Monolithic 3D Integration (M3D) of 2D Materials-Based CFET Logic Elements towards Advanced Microelectronics
- Award number: 2329189
- Fiscal year: 2023
- Amount: $99,700
- Project type: Standard Grant