BIGDATA: F: Collaborative Research: Optimizing Log-Structured-Merge-Based Big Data Management Systems
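Basic information
- Award number: 1838222
- Principal investigator:
- Amount: $1.3581 million
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-01-01 to 2024-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract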
Modern big data management systems support fast read and write operations based on the unique identifier (key) of a record. That is, they are fast when inserting key-value pairs, and given a key they quickly return the value associated with that key. To do so, most such systems rely on a Log-Structured-Merge Tree (LSM) structure that batches writes together before writing them to persistent storage. This project will study how to efficiently support more sophisticated operations on LSM-based storage systems, that is, operations that do not simply specify the key of a record. Examples of such operations include searching for records based instead on their location or time. By optimizing the storage and management of big data, this project has the potential to cut the storage costs and energy consumption in data centers. Further, the successful completion of this work will allow users to manage more data with the existing hardware infrastructure, which is critical given the new wave of big data being generated by sensors and the Internet-of-Things. The project will capitalize on the student diversity at two Hispanic Serving Institutions, and thus broaden the participation of under-represented groups in the research process.

To support richer data modeling and querying capabilities on top of LSM key-value stores, this project will develop novel LSM indexing and access algorithms to support query plans that utilize both primary and secondary LSM components. In addition, it will design and evaluate flow control policies to dampen or eliminate the notoriously bursty data ingestion behavior that LSM-based storage structures exhibit. It will also study how to automatically and dynamically change LSM compaction policies and parameters based on the query workload. Data-semantics-aware compaction techniques will also be studied. The project will additionally develop novel LSM-aware query optimization techniques; the LSM storage layer is currently treated as a black box by most query optimizers. The planned methods will be deployed and evaluated on the open source Apache AsterixDB system.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
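As a rough illustration of the write batching and compaction that the abstract describes, the following is a minimal sketch of an LSM-style key-value store. It is not taken from the project or from Apache AsterixDB: the class name `TinyLSM`, the `memtable_limit` threshold, and the use of in-memory lists as a stand-in for persistent storage are all assumptions made for this example.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM store: an in-memory buffer plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # in-memory write buffer
        self.runs = []                        # flushed sorted runs, newest last

    def put(self, key, value):
        # Writes only touch memory; a full buffer is flushed as one batch.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Stand-in for writing one sorted batch to persistent storage.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the buffer, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.runs:  # oldest to newest
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

In this sketch, lookups by key use binary search within each sorted run, whereas a search on any other attribute (such as a timestamp stored in the value) would have to scan every run. That gap is the kind of non-key operation the abstract says the project targets.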
Project outcomes
Journal articles (32)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
SpiderWeb: A Spatial Data Generator on the Web
- DOI: 10.1145/3397536.3422351
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Katiyar, Puloma; Vu, Tin; Migliorini, Sara; Belussi, Alberto; Eldawy, Ahmed
- Corresponding author: Eldawy, Ahmed
Spatial data generators
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Vu, Tin; Migliorini, Sara; Eldawy, Ahmed; Belussi, Alberto
- Corresponding author: Belussi, Alberto
A Learned Query Optimizer for Spatial Join
- DOI: 10.1145/3474717.3484217
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Vu, Tin; Belussi, Alberto; Migliorini, Sara; Eldawy, Ahmed
- Corresponding author: Eldawy, Ahmed
Raptor: Large Scale Analysis of Big Raster and Vector Data
- DOI: 10.14778/3352063.3352107
- Published: 2019-08-01
- Journal:
- Impact factor: 2.5
- Authors: Singla, Samriddhi; Eldawy, Ahmed; Mokbel, Mohamed F.
- Corresponding author: Mokbel, Mohamed F.
Efficient local locking for massively multithreaded in-memory hash-based operators
- DOI: 10.1007/s00778-020-00642-5
- Published: 2021-02
- Journal:
- Impact factor: 0
- Authors: Bashar Romanous; Skyler Windh; Ildar Absalyamov; Prerna Budhkar; R. Halstead; W. Najjar; V. Tsotras
- Corresponding author: Bashar Romanous; Skyler Windh; Ildar Absalyamov; Prerna Budhkar; R. Halstead; W. Najjar; V. Tsotras
Other grants by Evangelos Christidis
III: Small: Rethinking the Data Organization and Lifecycle in LSM Storage Systems
- Award number: 2227669
- Fiscal year: 2023
- Award amount: $1.3581 million
- Project type: Standard Grant
III: Medium: Efficient Collaborative Perception over Controllable Agent Networks
- Award number: 1901379
- Fiscal year: 2019
- Award amount: $1.3581 million
- Project type: Continuing Grant
EAGER: Joint Modeling and Querying of Social Media and Video Data
- Award number: 1746031
- Fiscal year: 2017
- Award amount: $1.3581 million
- Project type: Standard Grant
CAREER: A Collaborative Adaptive Data Sharing Platform
- Award number: 1216007
- Fiscal year: 2011
- Award amount: $1.3581 million
- Project type: Continuing Grant
III-CXT-Small: Information Discovery on Domain Data Graphs
- Award number: 1216032
- Fiscal year: 2011
- Award amount: $1.3581 million
- Project type: Standard Grant
CAREER: A Collaborative Adaptive Data Sharing Platform
- Award number: 0952347
- Fiscal year: 2010
- Award amount: $1.3581 million
- Project type: Continuing Grant
III: Travel Support for US-Based Students to Attend the 2009 IEEE International Conference on Data Mining (ICDM 2009)
- Award number: 0949134
- Fiscal year: 2009
- Award amount: $1.3581 million
- Project type: Standard Grant
III-CXT-Small: Information Discovery on Domain Data Graphs
- Award number: 0811922
- Fiscal year: 2008
- Award amount: $1.3581 million
- Project type: Standard Grant
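Similar NSFC (National Natural Science Foundation of China) grants
Research on the relationships among team human capital hierarchy types, team collaboration processes, and team effectiveness outcomes in the digital-intelligence context
- Award number: 72372084
- Approval year: 2023
- Award amount: CNY 400,000
- Project type: General Program
Research on collaboration models and performance improvement strategies for online medical teams
- Award number: 72371111
- Approval year: 2023
- Award amount: CNY 410,000
- Project type: General Program
Research on interaction control methods for collaborative robots in contact-based human-robot cooperative tasks
- Award number: 62373044
- Approval year: 2023
- Award amount: CNY 500,000
- Project type: General Program
Research on key technologies for a digital-twin-based, human-robot collaborative intelligent surgical robot for craniomaxillofacial surgery
- Award number: 82372548
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program
Research on the mechanism by which A-type crystalline resistant starch regulates cooperative butyrate production by gut bacteria
- Award number: 32302064
- Approval year: 2023
- Award amount: CNY 300,000
- Project type: Young Scientists Fund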
Similar overseas grants
BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Brain Big Data Mining
- Award number: 2348159
- Fiscal year: 2023
- Award amount: $1.3581 million
- Project type: Standard Grant
BIGDATA: IA: Collaborative Research: Intelligent Solutions for Navigating Big Data from the Arctic and Antarctic
- Award number: 2308649
- Fiscal year: 2022
- Award amount: $1.3581 million
- Project type: Standard Grant
BigData:IA:Collaborative Research: TIMES: A tensor factorization platform for spatio-temporal data
- Award number: 2034479
- Fiscal year: 2020
- Award amount: $1.3581 million
- Project type: Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
- Award number: 2027516
- Fiscal year: 2020
- Award amount: $1.3581 million
- Project type: Standard Grant
BIGDATA: F: Collaborative Research: Practical Analysis of Large-Scale Data with Lyme Disease Case Study
- Award number: 1934319
- Fiscal year: 2019
- Award amount: $1.3581 million
- Project type: Standard Grant