FoMR: DeepFetch: Compact Deep Learning based Prefetcher on Configurable Hardware
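Basic Information
- Award number: 1912680
- Principal investigator:
- Amount: $200,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-10-01 to 2022-09-30
- Status: Completed
- Source:
- Keywords:
Abstract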
Fast computer processors, tensor processing units, hardware accelerators, and heterogeneous architectures have enabled large-scale speed-ups in computational power, but memory speeds have not kept pace. Memory performance has therefore become the bottleneck in many applications that rely on heavy memory access. Several emerging memory technologies, such as 3D-stacked dynamic random access memory (3D-DRAM) and non-volatile memory, attempt to address memory bottlenecks from a hardware perspective, but they involve tradeoffs among bandwidth, power, latency, and cost. Rather than redesigning existing algorithms to suit a specific memory technology, this project will develop a machine learning-based approach that automatically learns access patterns, which can then be used to optimally prefetch data. Specifically, highly compact long short-term memory (LSTM) models will serve as the centerpiece of the prefetcher for predicting memory accesses. Through novel model compression techniques, hierarchical memory modeling, and dedicated hardware, this project will overcome barriers to fully exploiting machine learning and emerging hardware to improve prefetching.
Successful completion of this project will lead to improved memory performance for applications including signal processing, computer vision, and language processing.
A practical LSTM-based prefetcher implementation on hardware must deal with several challenges that this project will address: (i) training a small model (to enable fast inference) on large traces so that it is highly accurate in predicting memory accesses for multiple applications; (ii) compressing the model to ensure real-time inference; (iii) retraining the model online, on demand, to learn application-specific models, which requires fast learning from small amounts of data; (iv) making prefetching decisions in real time based on the model's predictions and uncertainty, that is, "what", "when", and "where" to prefetch, which also requires careful modeling of the target memory hierarchy; (vi) deciding in real time, based on the predictions, whether reordering data (dynamic data layout) can improve latency and make future prefetches more effective; (vii) mapping the prediction and decision-making framework onto limited available configurable hardware, ensuring low-latency training and high-throughput prefetching with small area and power.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
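As a rough illustration of challenge (iv), the sketch below gates a prefetch on the confidence of a predicted address delta. It is not the project's method: the `DeltaPredictor` class, the `prefetch_decision` function, and the `threshold` value are hypothetical names chosen here, and a simple frequency table stands in for the compact LSTM described in the abstract.

```python
from collections import Counter, defaultdict


class DeltaPredictor:
    """Hypothetical stand-in for a learned model (not an LSTM).

    It counts which address delta followed each previous delta in a trace.
    """

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, addresses):
        # Convert the address trace to deltas between consecutive accesses.
        deltas = [b - a for a, b in zip(addresses, addresses[1:])]
        for prev, nxt in zip(deltas, deltas[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, prev_delta):
        """Return (most frequent next delta, its empirical probability)."""
        seen = self.counts.get(prev_delta)
        if not seen:
            return None, 0.0
        delta, hits = seen.most_common(1)[0]
        return delta, hits / sum(seen.values())


def prefetch_decision(predictor, last_addr, prev_delta, threshold=0.6):
    """Return the address to prefetch, or None if confidence is too low."""
    delta, confidence = predictor.predict(prev_delta)
    if delta is None or confidence < threshold:
        return None  # too uncertain: skip the prefetch to save bandwidth
    return last_addr + delta


# Usage: a mostly strided trace with one irregular access.
predictor = DeltaPredictor()
predictor.train([0, 64, 128, 136, 200, 264])
candidate = prefetch_decision(predictor, last_addr=264, prev_delta=64)
```

Raising `threshold` trades coverage for accuracy: fewer prefetches are issued, but each one is more likely to be useful.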
Project Outcomes
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
RAOP: Recurrent Neural Network Augmented Offset Prefetcher
- DOI: 10.1145/3422575.3422807
- Published: 2020-09
- Journal:
- Impact factor: 0
- Authors: Pengmiao Zhang; Ajitesh Srivastava; Benjamin Brooks; R. Kannan; V. Prasanna
- Corresponding authors: Pengmiao Zhang; Ajitesh Srivastava; Benjamin Brooks; R. Kannan; V. Prasanna
SHARP: Software Hint-Assisted Memory Access Prediction for Graph Analytics
- DOI: 10.1109/hpec55821.2022.9926307
- Published: 2022-09
- Journal:
- Impact factor: 0
- Authors: Pengmiao Zhang; R. Kannan; Xiangzhi Tong; Anant V. Nori; V. Prasanna
- Corresponding authors: Pengmiao Zhang; R. Kannan; Xiangzhi Tong; Anant V. Nori; V. Prasanna
ReSemble: reinforced ensemble framework for data prefetching
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Zhang, Pengmiao; Kannan, Rajgopal; Srivastava, Ajitesh; Nori, Anant V.; Prasanna, Viktor K.
- Corresponding author: Prasanna, Viktor K.
TransforMAP: Transformer for Memory Access Prediction
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Zhang, Pengmiao; Srivastava, Ajitesh; Kannan, Rajgopal; Nori, Anant V.; Prasanna, Viktor K.
- Corresponding author: Prasanna, Viktor K.
MemMAP: Compact and Generalizable Meta-LSTM Models for Memory Access Prediction
- DOI: 10.1007/978-3-030-47436-2_5
- Published: 2020-04-17
- Journal:
- Impact factor: 0
- Authors: Srivastava A; Wang TY; Zhang P; De Rose CA; Kannan R; Prasanna VK
- Corresponding author: Prasanna VK
Other publications by Viktor Prasanna
Accelerating Deep Neural Network guided MCTS using Adaptive Parallelism
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Yuan Meng; Qian Wang; Tianxin Zu; Viktor Prasanna
- Corresponding author: Viktor Prasanna
PEARL: Enabling Portable, Productive, and High-Performance Deep Reinforcement Learning using Heterogeneous Platforms
- DOI: 10.1145/3649153.3649193
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Yuan Meng; Michael Kinsner; Deshanand Singh; Mahesh Iyer; Viktor Prasanna
- Corresponding author: Viktor Prasanna
Accelerating GNN Training on CPU+Multi-FPGA Heterogeneous Platform
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Yi-Chien Lin; Bingyi Zhang; Viktor Prasanna
- Corresponding author: Viktor Prasanna
Guest Editorial: Computing Frontiers
- DOI: 10.1007/s10766-013-0240-2
- Published: 2013-01-31
- Journal:
- Impact factor: 0.900
- Authors: Calin Cascaval; Pedro Trancoso; Viktor Prasanna
- Corresponding author: Viktor Prasanna
Other grants held by Viktor Prasanna
IUCRC Phase I University of Southern California: Center for Intelligent Distributed Embedded Applications and Systems (IDEAS)
- Award number: 2231662
- Fiscal year: 2023
- Amount: $200,000
- Award type: Continuing Grant
Elements: Portable Library for Homomorphic Encrypted Machine Learning on FPGA Accelerated Cloud Cyberinfrastructure
- Award number: 2311870
- Fiscal year: 2023
- Amount: $200,000
- Award type: Standard Grant
OAC Core: Scalable Graph ML on Distributed Heterogeneous Systems
- Award number: 2209563
- Fiscal year: 2022
- Amount: $200,000
- Award type: Standard Grant
SaTC: CORE: Small: Accelerating Privacy Preserving Deep Learning for Real-time Secure Applications
- Award number: 2104264
- Fiscal year: 2021
- Amount: $200,000
- Award type: Standard Grant
Collaborative Research:PPoSS:Planning: Streamware - A Scalable Framework for Accelerating Streaming Data Science
- Award number: 2119816
- Fiscal year: 2021
- Amount: $200,000
- Award type: Standard Grant
RAPID: ReCOVER: Accurate Predictions and Resource Allocation for COVID-19 Epidemic Response
- Award number: 2027007
- Fiscal year: 2020
- Amount: $200,000
- Award type: Standard Grant
CNS Core: Small: AccelRITE: Accelerating ReInforcemenT Learning based AI at the Edge Using FPGAs
- Award number: 2009057
- Fiscal year: 2020
- Amount: $200,000
- Award type: Standard Grant
OAC Core: Small: Scalable Graph Analytics on Emerging Cloud Infrastructure
- Award number: 1911229
- Fiscal year: 2019
- Amount: $200,000
- Award type: Standard Grant
CNS: CSR: Small: Exploiting 3D Memory for Energy-Efficient Memory-Driven Computing
- Award number: 1643351
- Fiscal year: 2016
- Amount: $200,000
- Award type: Standard Grant
EAGER: Safer Connected Communities Through Integrated Data-driven Modeling, Learning, and Optimization
- Award number: 1637372
- Fiscal year: 2016
- Amount: $200,000
- Award type: Standard Grant