
CSR: Small: Cache-Coherent Accelerators for Efficient Persistent Memory Programming


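Basic information

Award number: 2245999
Principal investigator: Ryan Stutsman
Amount: approx. $592,900
Institution: University of Utah
Institution country: United States
Award type: Standard Grant
Fiscal year: 2023
Funding country: United States
Period: 2023-10-01 to 2026-09-30
Status: Not yet concluded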

Source: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2245999&HistoricalAwards=false
Keywords: CSR Small Cache Coherent Accelerators

Project abstract

Persistent memory (PM) is a new class of computer storage that upends the model that computer systems have used for more than half a century. Unlike conventional storage devices, PM can be accessed by CPUs as if it were memory. Accessing storage this way can be implemented in hardware instead of software, so it can be accessed more quickly and efficiently than conventional storage devices. Even so, it persists across faults and power failures. This proposal seeks to convert existing software to use this faster PM-based storage without requiring programmers to change their code while providing safety during crashes and power failures. Unlike existing approaches, the proposed approach does this using emerging commercially available hardware; hence, it is fast and efficient since it does not reintroduce software to the CPU storage access path.

The improved computer system memory performance, efficiency, and capacity that PM provides when combined with this project's accelerated crash consistency will have broad benefits to many private and public sector applications. Many of the costs that virtually all database systems introduce to survive crashes and power failures can be mitigated by the proposed approach. Furthermore, applications that process and modify massive data sets in real-time including data center applications, social networks, and machine learning training and inference over changing data sets can all benefit from improved scale, performance, and efficiency. Hence, this work can help accelerate code in the data center applications that are used by billions of users daily.

This project will also carry out several outreach and educational activities along with specific collaborations to encourage industry adoption including a tutorial, a new graduate seminar, new modules for graduate and undergraduate courses, and a new course lab assignment. Additionally, the PI will host two incoming undergraduate students from an underrepresented group for research rotations as well as host two undergraduates through the NSF REU program. The resulting tools, framework, and accelerator code will be developed in the open under a permissive license to support use and development both in academia and industry, and it will be packaged for easy use and deployment on open platforms.

In more detail, PM support in recent CPU architectures allows CPUs to access and manipulate massive data sets, lowering data access times from 10s of microseconds to 100s of nanoseconds. However, system crashes in the middle of modifying persistent data structures can lead to inconsistencies that are difficult or impossible to repair; thus, today PM-based data structures still place software on the path to storage access to provide extra steps for crash consistency. The key insight of this proposal is that the interposition needed on PM data accesses for crash consistency can be done fully in hardware without any changes to existing CPU architectures by using newly emerging cache-coherent accelerators and field-programmable gate arrays (FPGAs). Furthermore, it can be done with existing, off-the-shelf code for data structures that were designed without PM in mind. In the proposed approach, applications interact with PM through a hardware FPGA, which carefully controls how changes are propagated to PM to provide crash consistency. Since this interposition is in hardware it is efficient, which helps realize the full performance potential of PM's direct load/store interface. Also, this new approach works well with CPUs' cache coherence protocols, so CPUs can cache PM data more aggressively than is safe with direct PM access; in turn, this makes the proposed approach faster than direct load/store access to PM in many cases. Finally, the proposed work includes using this cache-coherent accelerator to provide replicated, fault-tolerant PM, and it includes new approaches to hiding PM and remote memory access times by implementing new, intelligent prefetching policies in hardware without CPU changes.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
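As a rough illustration of the "extra steps for crash consistency" that the abstract says software must perform today, the following toy sketch simulates a crash in the middle of a two-field update, first without protection and then with a simple undo log. It is not the project's FPGA-based mechanism: undo logging is a commonly used software technique substituted here for illustration, a plain dict stands in for persistent memory, and every name (`SimulatedCrash`, `transfer_unsafe`, `transfer_logged`, `recover`) is hypothetical.

```python
class SimulatedCrash(Exception):
    """Stands in for a power failure or system crash (hypothetical)."""


def transfer_unsafe(pm, src, dst, amount, crash_midway=False):
    # Two separate stores; a crash between them leaves pm inconsistent.
    pm[src] -= amount
    if crash_midway:
        raise SimulatedCrash()
    pm[dst] += amount


def transfer_logged(pm, log, src, dst, amount, crash_midway=False):
    # Extra step 1: record the old values before modifying anything.
    log.append({src: pm[src], dst: pm[dst]})
    pm[src] -= amount
    if crash_midway:
        raise SimulatedCrash()
    pm[dst] += amount
    # Extra step 2: commit by discarding the undo record.
    log.pop()


def recover(pm, log):
    # After a crash, roll back every update that never committed.
    while log:
        pm.update(log.pop())


if __name__ == "__main__":
    pm = {"a": 100, "b": 0}
    try:
        transfer_unsafe(pm, "a", "b", 30, crash_midway=True)
    except SimulatedCrash:
        pass
    print("unsafe after crash:", pm)

    pm, log = {"a": 100, "b": 0}, []
    try:
        transfer_logged(pm, log, "a", "b", 30, crash_midway=True)
    except SimulatedCrash:
        pass
    recover(pm, log)
    print("logged after crash and recovery:", pm)
```

In this sketch the logging and commit steps run in software on every update, which is the kind of overhead the abstract proposes to move into a cache-coherent accelerator.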


Project outcomes

Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0


Other publications by Ryan Stutsman

To Lock, Swap, or Elide: On the Interplay of Hardware Transactional Memory and Lock-Free Indexing

DOI: (not listed)
Published: 2015
Venue: Proceedings of the VLDB Endowment
Impact factor: 2.5
Authors: Darko Makreshanski; Justin J. Levandoski; Ryan Stutsman
Corresponding author: Ryan Stutsman

Durability and crash recovery in distributed in-memory storage systems

DOI: (not listed)
Published: 2013
Venue: (not listed)
Impact factor: 0
Authors: Ryan Stutsman
Corresponding author: Ryan Stutsman

Hybrid network clusters using common gameplay for massively multiplayer online games

DOI: 10.1145/3235765.3235785
Published: 2018
Venue: Proceedings of the 13th International Conference on the Foundations of Digital Games
Impact factor: 0
Authors: Jared N. Plumb; S. Kasera; Ryan Stutsman
Corresponding author: Ryan Stutsman