III: Medium: Learning-based Synthesis of Data Processing Engines

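Basic Information

  • Award number:
    1900933
  • Principal investigator:
  • Amount:
    $1.2 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-09-01 to 2023-08-31
  • Project status:
    Completed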

Project Abstract

Modern data-processing systems are designed to be general-purpose, in that they can handle a wide variety of applications and data. Unfortunately, this general-purpose nature causes these systems to achieve suboptimal performance for every single application and user. Rather, technical compromises have to be made to support a wide range of use cases, often leading to orders-of-magnitude worse performance than what a highly customized system would be able to achieve. At the same time, developing a database system from scratch for each individual application and user is neither economical nor practical. The goal of this project is to explore how machine learning can be used to automatically customize a database system for a specific application or user to achieve so-called 'instance-optimality'. If successful, this project will transform the way that the modern database systems underpinning the Internet and many enterprise computing systems are built, resulting in systems with much better performance or systems that are able to process large datasets using much less hardware than current systems. Concretely, the project investigates to what extent learned models can automatically instance-optimize the various components of a large-scale data processing system: 1) data indexing, where a model can predict the location of a key in a database; 2) algorithms, including sorting and joins, where a model can predict where in a sorted list a record should go, or where joining tuples are in another relation; 3) optimizers, where a model can predict the optimal plan to use for processing queries on data; and 4) storage layout, where a model can predict the optimal layout of data for a particular query workload.
This raises a number of intellectually deep questions, including what types of models work best, what theoretical guarantees we can give about the performance of these models, how such generated systems will compare to hand-tuned systems, how such systems can exploit new hardware such as TPUs/GPUs, and how program synthesis will work with such modelled data, advancing the fields of databases, machine learning, and program modeling and synthesis. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
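To make the first component (data indexing) more concrete, the following is a minimal sketch of a learned index, assuming numeric keys held in a sorted in-memory list. It is not the project's implementation: the class name `LinearLearnedIndex`, the single least-squares linear model, and the error-bounded fallback search are all illustrative assumptions. The model predicts a key's position, and the recorded worst-case prediction error bounds a small binary search that corrects the guess.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index (hypothetical): a linear model maps key -> position."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys; it bounds the
        # search window so that lookups of stored keys are always exact.
        self.max_err = max(
            abs(i - self._predict(k)) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted list, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the error window to valid list bounds.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search only inside the window around the model's guess.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108])
position = index.lookup(23)
```

The closer the key distribution is to the model's shape, the smaller `max_err` becomes and the fewer comparisons a lookup needs; on poorly fitting data the window grows toward a full binary search, which is one reason the abstract's question about which model types work best matters.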

Project Outcomes

Journal articles (15)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
SNARF: A Learning-Enhanced Range Filter
  • DOI:
    10.14778/3529337.3529347
  • Publication date:
    2022-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Kapil Vaidya;Tim Kraska;Subarna Chatterjee;Eric R. Knorr;M. Mitzenmacher;Stratos Idreos
  • Corresponding author(s):
    Kapil Vaidya;Tim Kraska;Subarna Chatterjee;Eric R. Knorr;M. Mitzenmacher;Stratos Idreos
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
  • DOI:
  • Publication date:
    2021-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Keyulu Xu;Mozhi Zhang;S. Jegelka;Kenji Kawaguchi
  • Corresponding author(s):
    Keyulu Xu;Mozhi Zhang;S. Jegelka;Kenji Kawaguchi
TreeLine: An Update-In-Place Key-Value Store for Modern Storage
  • DOI:
    10.14778/3561261.3561270
  • Publication date:
    2022-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Geoffrey X. Yu;Markos Markakis;Andreas Kipf;P. Larson;U. F. Minhas;Tim Kraska
  • Corresponding author(s):
    Geoffrey X. Yu;Markos Markakis;Andreas Kipf;P. Larson;U. F. Minhas;Tim Kraska
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
  • DOI:
  • Publication date:
    2020-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Keyulu Xu;Jingling Li;Mozhi Zhang;S. Du;K. Kawarabayashi;S. Jegelka
  • Corresponding author(s):
    Keyulu Xu;Jingling Li;Mozhi Zhang;S. Du;K. Kawarabayashi;S. Jegelka
SageDB: An Instance-Optimized Data Analytics System
  • DOI:
    10.14778/3565838.3565857
  • Publication date:
    2022-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jialin Ding;Ryan Marcus;Andreas Kipf;Vikram Nathan;Aniruddha Nrusimha;Kapil Vaidya;Alexander van Renen;Tim Kraska
  • Corresponding author(s):
    Jialin Ding;Ryan Marcus;Andreas Kipf;Vikram Nathan;Aniruddha Nrusimha;Kapil Vaidya;Alexander van Renen;Tim Kraska

Other publications by Tim Kraska

Building Database Applications in the Cloud
  • DOI:
    10.3929/ethz-a-006007449
  • Publication date:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tim Kraska
  • Corresponding author(s):
    Tim Kraska
Towards a Benchmark for the Cloud
  • DOI:
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Carsten Binnig;Donald Kossmann;Tim Kraska;Simon Losing
  • Corresponding author(s):
    Simon Losing
Safe Visual Data Exploration
  • DOI:
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zheguang Zhao;Emanuel Zgraggen;L. Stefani;Carsten Binnig;E. Upfal;Tim Kraska
  • Corresponding author(s):
    Tim Kraska
Self-Organizing Data Containers
Making the Case for Query-by-Voice with EchoQuery

Other grants by Tim Kraska

III: Medium: Quantifying the Unknown Unknowns for Data Integration
  • Award number:
    2033792
  • Fiscal year:
    2020
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative: A Licensing Model and Ecosystem for Data Sharing
  • Award number:
    1947440
  • Fiscal year:
    2019
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Medium: Quantifying the Unknown Unknowns for Data Integration
  • Award number:
    1562657
  • Fiscal year:
    2016
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative: A Licensing Model and Ecosystem for Data Sharing
  • Award number:
    1636698
  • Fiscal year:
    2016
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
CAREER: Query Compilation Techniques for Complex Analytics on Enterprise Clusters
  • Award number:
    1453171
  • Fiscal year:
    2015
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant

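Similar grants from the National Natural Science Foundation of China (NSFC)

Study of plasmon-enhanced optical response in composite low-dimensional topological materials
  • Award number:
    12374288
  • Approval year:
    2023
  • Funding amount:
    CNY 520,000
  • Project type:
    General Program
The missing medium-sized enterprises from the perspective of managerial markets and intervention in the division of labor: stylized facts, underlying mechanisms, and optimization paths
  • Award number:
    72374217
  • Approval year:
    2023
  • Funding amount:
    CNY 410,000
  • Project type:
    General Program
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
  • Award number:
    12371432
  • Approval year:
    2023
  • Funding amount:
    CNY 435,000
  • Project type:
    General Program
Dark matter distribution near intermediate-mass black holes and gravitational-wave echo detection in their IMRI systems
  • Award number:
    12365008
  • Approval year:
    2023
  • Funding amount:
    CNY 320,000
  • Project type:
    Regional Science Fund Project
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
  • Award number:
    42305004
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund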

Similar overseas grants

III: Medium: Collaborative Research: Integrating Large-Scale Machine Learning and Edge Computing for Collaborative Autonomous Vehicles
  • Award number:
    2348169
  • Fiscal year:
    2023
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
Collaborative Research: III: Medium: New Machine Learning Empowered Nanoinformatics System for Advancing Nanomaterial Design
  • Award number:
    2347592
  • Fiscal year:
    2023
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Medium: Linear Algebra Operators in Databases to Support Analytic and Machine-Learning Workloads
  • Award number:
    2312991
  • Fiscal year:
    2023
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Medium: Advancing Deep Learning for Inverse Modeling
  • Award number:
    2313174
  • Fiscal year:
    2023
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: III: Medium: VirtualLab: Integrating Deep Graph Learning and Causal Inference for Multi-Agent Dynamical Systems
  • Award number:
    2312501
  • Fiscal year:
    2023
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant