CAREER: Towards Efficient In-storage Indexing
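Basic information
- Award number: 2338457
- Principal investigator: Janki Bhimani
- Award amount: $615,500
- Institution:
- Institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Project period: 2024-07-01 to 2029-06-30
- Project status: Active
- Source:
- Keywords: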
Project abstract
Data indexing plays a crucial role in numerous modern technologies, including search engines, big data analytics, file systems, and databases. In this context, in-storage indexing devices (ISIDs) have emerged to enhance the functionalities of storage devices, leading to improved performance, efficiency, and cost-effective data processing. By storing index information alongside the data it indexes within the same storage device, ISIDs offer several advantages over traditional indexing methods. These advantages include reducing data movement, improving access speed, minimizing network impact, enabling efficient data management, and freeing host computing for critical tasks.

To design efficient ISIDs, several challenges need to be addressed. Firstly, there is a need for low-cost and open-source research platforms to facilitate the reproduction and comparison of research work, promoting quick adoption of ISID advancements. Secondly, integrating the fragmented advancements of individual ISID components is crucial to capture their holistic impacts and interactions effectively. Thirdly, addressing diverse workload requests, interference in multi-tenant environments, and data distribution considerations requires new research methods for overall operation optimization.

This CAREER research project aims to overcome these research challenges and promote the adoption of ISIDs, contributing to the advancement of storage systems. This project will explore and develop innovative methods to unleash the full potential of ISIDs in modern data management systems. By addressing the core challenges, the project seeks to revolutionize data storage systems and make significant contributions to the field of storage technology. This project will share the findings with undergraduate and graduate students through computer science programs and open up career opportunities to female students, underrepresented minorities, and first-generation college students. This project will disseminate the proposed techniques into the industry and foster technology transfer through new industrial collaborations. The developed infrastructure will be available to the research community through a web-based portal.

This research makes significant empirical contributions to the ISID design and development space by addressing major challenges posed by in-storage indexing. Specifically, it advances the state of knowledge by investigating the following questions:
(1) How can we design and develop new ISID models that accurately capture the behavior of internal modules, such as the index manager, request handler, data access parallelism, index-induced wear leveling, and garbage collection? These insights will enable scientific design advancements and detailed tradeoff analysis for ISIDs.
(2) How can we develop efficient dynamic model calibration techniques using coarse measurements to parameterize queuing models that accurately capture burstiness and variability in ISIDs?
(3) How can we emulate index manager delays using different data structures and sizes and utilize black-box and gray-box calibration techniques to identify ground truth for ISIDs?
(4) How can we design a new re-configurable indexing architecture and index cache that ensures deterministic tail latency, low-overhead prefetching and eviction, and improved membership checking through object signatures and ML-based feature learning in ISIDs?
(5) How can we design tenant-local eviction policies that consider the effect of allocating space for index and data, recognizing the dependencies between them for efficient data access in ISIDs?
(6) How can we minimize log-checking overhead and avoid in-storage hash computations while exploring the trade-off between consistency and performance by allowing read-only tenants to bypass the log and access their own consistent index in ISIDs?
(7) Does capacity variance, which gracefully reduces ISID capacity as flash pages become bad, provide a better alternative to wear-leveling for ISIDs?

Throughout the project, the PI will facilitate the connection of the proposed research with the contents and concepts of several courses on Systems at FIU.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Janki Bhimani
CSR: Small: Learning and Management in Tiered Memory Systems
- Award number: 2323100
- Fiscal year: 2023
- Award amount: $615,500
- Award type: Standard Grant
Collaborative Research: CNS core: OAC core: Small: New Techniques for I/O Behavior Modeling and Persistent Storage Device Configuration
- Award number: 2008324
- Fiscal year: 2020
- Award amount: $615,500
- Award type: Standard Grant
Similar overseas grants
CAREER: Towards highly efficient UV emitters with lattice engineered substrates
- Award number: 2338683
- Fiscal year: 2024
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Green Functions as a Service: Towards Sustainable and Efficient Distributed Computing Infrastructure
- Award number: 2340722
- Fiscal year: 2024
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards 3D Omnidirectional and Efficient Wireless Power Transfer with Controlled 2D Near-Field Coil Array
- Award number: 2338697
- Fiscal year: 2024
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient Cryptography for Next Generation Applications
- Award number: 2402031
- Fiscal year: 2023
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient and Scalable Zero-Knowledge Proofs
- Award number: 2401481
- Fiscal year: 2023
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient and Scalable Zero-Knowledge Proofs
- Award number: 2144625
- Fiscal year: 2022
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient and Fast Hierarchical Federated Learning in Heterogeneous Wireless Edge Networks
- Award number: 2145031
- Fiscal year: 2022
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Elastic Security with Safe and Efficient Network Security Function Virtualization
- Award number: 2129164
- Fiscal year: 2021
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient Accelerated Cloud Data Centers
- Award number: 2047521
- Fiscal year: 2021
- Award amount: $615,500
- Award type: Continuing Grant
CAREER: Towards a Principled Framework for Resilient, Data Efficient and Scalable Reinforcement Learning for Control
- Award number: 2045783
- Fiscal year: 2021
- Award amount: $615,500
- Award type: Continuing Grant