III: Small: Automatic Database Management System Tuning Through Large-scale Machine Learning

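Basic Information

  • Award number:
    1423210
  • Principal investigator:
  • Amount:
    $499,700
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Period:
    2014-08-01 to 2018-07-31
  • Project status:
    Completed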

Project Abstract

The ability to collect, process, and analyze large amounts of data is paramount for extrapolating new knowledge in business, scientific, and medical applications. Database management systems (DBMSs) are the critical component of modern "Big Data" applications because they are the central repository for all of this information. But tuning a DBMS to perform well has historically been difficult because these systems have hundreds of configuration "knobs" that control everything in the system, such as the amount of memory to use and how often data is written. Getting these settings wrong will prevent the system from answering questions about data in a reasonable amount of time, or even cause it to lose data. Many organizations resort to hiring experts to configure these knobs, but this is prohibitively expensive. Personnel cost is estimated to be almost 50% of the total ownership cost of a DBMS, and many administrators spend nearly a quarter of their time on these tuning activities. Furthermore, as databases grow in both size and complexity, optimizing a DBMS to meet the needs of new applications has surpassed the abilities of even the best human experts. Thus, the goal of this proposal is to develop the foundation and corresponding practical techniques for the automatic configuration of DBMSs by using machine learning on large-scale collections of historical performance data. Our approach will differ from previous work in that we seek to reduce the time needed to train the algorithms that tune the DBMS for each application by relying on knowledge gained from previous tuning efforts. The results from this work will allow anyone to deploy a DBMS that can handle large amounts of data and more complex workloads without any expertise in database administration.
Achieving good performance in a database management system (DBMS) is non-trivial because DBMSs are complex systems with many tunable options that control nearly all aspects of their runtime operation. Getting this tuning right is critical for modern high-volume and high-throughput workloads, as the performance gains can be significant. As such, many organizations resort to hiring an expensive database administrator to manually tune their DBMS. But the size and complexity of databases have now surpassed the abilities of even the best human experts. Hence, we plan to develop automatic techniques for tuning and optimizing DBMS configurations for a broad class of application workloads. We will explore the foundations of using machine learning to scale DBMSs for larger data sets, thereby removing a major impediment to deriving the full benefits of data-driven decision-making applications. The crux of our approach is to map an arbitrary application's workload to features of one or more canonical benchmarks that best represent the workload's properties, and then to collect performance data from the DBMS using that benchmark. This data is then used to train models that will allow us to identify the dependencies between knobs and their effects on the DBMS. From this, the models will select a near-optimal knob setting for the application. This differs from earlier work, which focused on optimizing a single DBMS installation in isolation and was unable to leverage knowledge gained from previous tuning efforts. Our approach will not require the user to generate a large sample data set of (potentially expensive) experiments to derive the proper configuration.
For further information, see the project website: http://oltpbenchmark.com
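The workflow described in the abstract (match a workload to a canonical benchmark, then reuse performance data gathered on that benchmark to pick knob settings) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the project: the three feature dimensions, the feature values, the knob names `buffer_pool_mb` and `flush_interval_ms`, and the throughput numbers are all made up. The sketch also swaps in a simple "best configuration observed so far" lookup where the abstract calls for trained models, so it shows only the reuse idea, not the learning step.

```python
import math

# Hypothetical workload features (illustrative values only):
# (fraction of reads, fraction of table scanned per query,
#  fraction of multi-table queries)
BENCHMARK_FEATURES = {
    "tpcc": (0.35, 0.20, 0.60),
    "ycsb": (0.95, 0.05, 0.00),
    "tpch": (1.00, 0.90, 0.80),
}

# Hypothetical history from earlier tuning runs:
# benchmark name -> list of (knob settings, measured throughput)
HISTORY = {
    "tpcc": [
        ({"buffer_pool_mb": 512, "flush_interval_ms": 100}, 1800.0),
        ({"buffer_pool_mb": 2048, "flush_interval_ms": 500}, 2600.0),
    ],
    "ycsb": [
        ({"buffer_pool_mb": 1024, "flush_interval_ms": 100}, 9000.0),
        ({"buffer_pool_mb": 4096, "flush_interval_ms": 1000}, 12500.0),
    ],
    "tpch": [
        ({"buffer_pool_mb": 8192, "flush_interval_ms": 1000}, 40.0),
    ],
}


def nearest_benchmark(workload_features, benchmarks):
    """Return the benchmark whose feature vector is closest (Euclidean)."""
    return min(
        benchmarks,
        key=lambda name: math.dist(workload_features, benchmarks[name]),
    )


def recommend_knobs(workload_features, benchmarks, history):
    """Map the workload to a benchmark, then reuse that benchmark's
    best-performing configuration from past observations.

    This lookup is a stand-in for the trained models in the abstract.
    """
    name = nearest_benchmark(workload_features, benchmarks)
    knobs, _throughput = max(history[name], key=lambda obs: obs[1])
    return name, knobs
```

Calling `recommend_knobs((0.90, 0.10, 0.05), BENCHMARK_FEATURES, HISTORY)` matches a read-heavy workload to the closest benchmark and returns the highest-throughput settings recorded for it. The point of the design is that no new experiments are run for the target application; only previously collected data is consulted.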

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Andrew Pavlo

On Scalable Transaction Execution in Partitioned Main Memory Database Management Systems
  • DOI:
  • Year published:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Andrew Pavlo
  • Corresponding author:
    Andrew Pavlo
Non-Volatile Memory Database Management Systems
NULLS!: Revisiting Null Representation in Modern Columnar Formats
Database architectures for modern hardware: report from Dagstuhl Seminar 18251
  • DOI:
  • Year published:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    P. Boncz;G. Graefe;Bingsheng He;K. Sattler;Philippe Bonnet;A. Kemper;Viktor Leis;Justin J. Levandoski;S. Manegold;Danica Porobic;Caetano Sauer;Carsten Binnig;Andrew Crotty;Alex Galakatos;Tim Kraska;E. Z. The;Thomas Leich;Thilo Pionteck;Gunter Saake;Olaf Spinczyk;Andreas Becher;Lekshmi B.G;David Broneske;Tobias Drewes;B. Gurumurthy;K. Meyer;Jürgen Teich;Juan A. Colmenares;Gage Eads;S. Hofmeyr;Sarah Bird;Miquel Moretó;David Chou;Brian Gluzman;Eric Roman;D. B. Bartolini;Nitesh Mor;K. Asanović;John D Kubiatowicz. 2013;Daniel Lemire;Andrew Pavlo;A. Nica
  • Corresponding author:
    A. Nica
Enterprise Database Applications and the Cloud: A Difficult Road Ahead

Other grants by Andrew Pavlo

CAREER: Self-Driving Database Management Systems
  • Award number:
    1846158
  • Fiscal year:
    2019
  • Award amount:
    $499,700
  • Project type:
    Continuing Grant
SPX: Collaborative Research: Distributed Database Management with Logical Leases and Hardware Transactional Memory
  • Award number:
    1822933
  • Fiscal year:
    2018
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
III: Small: Non-Invasive Real-Time Analytics in Database Systems using Holistic Query Compilation
  • Award number:
    1718582
  • Fiscal year:
    2017
  • Award amount:
    $499,700
  • Project type:
    Continuing Grant
XPS: FULL: DSD: Collaborative Research: Moving the Abyss: Database Management on Future 1000-core Processors
  • Award number:
    1438955
  • Fiscal year:
    2014
  • Award amount:
    $499,700
  • Project type:
    Standard Grant

Similar grants from the National Natural Science Foundation of China

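Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain deposition
  • Award number:
  • Year approved:
    2024
  • Award amount:
    CNY 0
  • Project type:
    Provincial/municipal project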
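Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
  • Award number:
    n/a
  • Year approved:
    2022
  • Award amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project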
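Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
  • Award number:
    32000033
  • Year approved:
    2020
  • Award amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund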
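Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
  • Award number:
    31972324
  • Year approved:
    2019
  • Award amount:
    CNY 580,000
  • Project type:
    General Program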
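Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
  • Award number:
    81900988
  • Year approved:
    2019
  • Award amount:
    CNY 210,000
  • Project type:
    Young Scientists Fund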
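Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
  • Award number:
    31870821
  • Year approved:
    2018
  • Award amount:
    CNY 560,000
  • Project type:
    General Program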
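Molecular mechanism of pigeon milk secretion, analyzed by small RNA sequencing
  • Award number:
    31802058
  • Year approved:
    2018
  • Award amount:
    CNY 260,000
  • Project type:
    Young Scientists Fund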
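Pathogenic mechanism of Rice grassy stunt virus regulated by small RNA-mediated DNA methylation
  • Award number:
    31772128
  • Year approved:
    2017
  • Award amount:
    CNY 600,000
  • Project type:
    General Program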
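Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
  • Award number:
    81704176
  • Year approved:
    2017
  • Award amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund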
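Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and their modulation of disease resistance
  • Award number:
    91640114
  • Year approved:
    2016
  • Award amount:
    CNY 850,000
  • Project type:
    Major Research Plan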

Similar overseas grants

SaTC: CORE: Small: Automatic Exploits Detection and Mitigation for Industrial Control System Protocols
  • Award number:
    2345563
  • Fiscal year:
    2023
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
SaTC: CORE: Small: Automatic Identification of Privilege-guard Variables for Data-only Attacks and Defenses
  • Award number:
    2247652
  • Fiscal year:
    2023
  • Award amount:
    $499,700
  • Project type:
    Continuing Grant
SaTC: CORE: Small: Automatic Detection and Repair of Side Channel Vulnerabilities in Software Code
  • Award number:
    2245344
  • Fiscal year:
    2023
  • Award amount:
    $499,700
  • Project type:
    Continuing Grant
SaTC: CORE: Small: Sound Automatic Exploit Generation
  • Award number:
    2234257
  • Fiscal year:
    2023
  • Award amount:
    $499,700
  • Project type:
    Continuing Grant
CCF: SHF: Small: Self-Adaptive Interference-Avoiding Wireless Receiver Hardware through Real-Time Learning-Based Automatic Optimization of Power-Efficient Integrated Circuits
  • Award number:
    2218845
  • Fiscal year:
    2022
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
NSF-BSF: SHF: Small: Efficient, Automatic, and Trustworthy Smart Contract Verification
  • Award number:
    2110397
  • Fiscal year:
    2021
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
SaTC: CORE: Small: Automatic Exploits Detection and Mitigation for Industrial Control System Protocols
  • Award number:
    2051621
  • Fiscal year:
    2021
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
RI: Small: Accelerating Machine Learning via Randomized Automatic Differentiation
  • Award number:
    2007278
  • Fiscal year:
    2020
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
SHF: Small: Automatic, adaptive and massive parallel data processing on GPU/RDMA clusters in both synchronous and asynchronous modes
  • Award number:
    2005884
  • Fiscal year:
    2020
  • Award amount:
    $499,700
  • Project type:
    Standard Grant
Adaptive control for agile automatic welding in small batches
  • Award number:
    530116-2018
  • Fiscal year:
    2020
  • Award amount:
    $499,700
  • Project type:
    Collaborative Research and Development Grants