III: Medium: Linear Algebra Operators in Databases to Support Analytic and Machine-Learning Workloads


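Basic Information

  • Award number:
    2312991
  • Principal investigator:
  • Amount:
    $1.0163 million
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Start and end dates:
    2023-07-01 to 2027-06-30
  • Project status:
    Ongoing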

Project Abstract

Machine-learning tools have become ubiquitous in modern information systems. The data inputs used by these tools often originate from relational databases. The data outputs generated by those tools are often stored in databases, where they can be used for subsequent data analysis. Typically, however, the learning process itself is performed outside the database system. This project investigates the opportunity for performing more of the machine learning work within the database itself, avoiding expensive (and often redundant) data export and import. In partnership with researchers from Relational-AI and Microsoft, the Columbia University team will design and build two interacting open-source systems named MARQUE and ZORK. These systems will make data analysis more efficient and effective for database-resident information. Improved efficiency will lead to faster, more cost-effective machine learning, and executing ML within the DBMS will reduce operational complexity and benefit from DBMS features such as scalability, access control, and data management. Ultimately, this work will broaden the adoption of machine learning technologies in a wide range of data-intensive disciplines.
MARQUE will be a database management system that supports machine learning primitives such as linear algebra operations within the context of a query processing engine. The system will efficiently compile SQL queries using embedded machine learning models within the database, combining state-of-the-art query processing techniques with highly engineered linear algebra algorithms. MARQUE will allow components of the machine-learning pipeline itself to be formulated as in-database operations, avoiding unnecessary data copying. Conventional SQL analytic queries that can be reformulated using extensions of operators like matrix multiplication can be optimized to use efficient execution plans involving specialized algorithms for such operators.
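The correspondence between SQL analytic queries and matrix multiplication mentioned above can be illustrated with a small sketch. It is not MARQUE's design, which the abstract does not describe; it only shows, using Python's standard `sqlite3` module, that a join followed by a grouped sum computes a matrix product when matrices are stored as (row, column, value) triples. The table names `A` and `B`, the column names, and the helper `matmul_sql` are assumptions made for this illustration.

```python
import sqlite3


def matmul_sql(a, b):
    """Multiply two dense matrices (lists of rows) with a SQL join + aggregate.

    Hypothetical helper for illustration: each matrix is stored as a
    relation of (row index, column index, value) triples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (i INTEGER, k INTEGER, v REAL)")
    conn.execute("CREATE TABLE B (k INTEGER, j INTEGER, v REAL)")
    conn.executemany(
        "INSERT INTO A VALUES (?, ?, ?)",
        [(i, k, v) for i, row in enumerate(a) for k, v in enumerate(row)],
    )
    conn.executemany(
        "INSERT INTO B VALUES (?, ?, ?)",
        [(k, j, v) for k, row in enumerate(b) for j, v in enumerate(row)],
    )
    # C[i][j] = sum over k of A[i][k] * B[k][j]:
    # the join matches the shared index k, the GROUP BY sums it away.
    rows = conn.execute(
        "SELECT A.i, B.j, SUM(A.v * B.v) "
        "FROM A JOIN B ON A.k = B.k "
        "GROUP BY A.i, B.j"
    ).fetchall()
    conn.close()

    c = [[0.0] * len(b[0]) for _ in range(len(a))]
    for i, j, v in rows:
        c[i][j] = v
    return c


if __name__ == "__main__":
    print(matmul_sql([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

A general-purpose engine evaluates this query with an ordinary join and hash or sort aggregation. The abstract's point is that an optimizer which recognizes the pattern could instead dispatch it to a specialized linear algebra algorithm.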
To further support in-database machine learning, the project investigators will build ZORK, a system to support machine learning at scale that will make use of the infrastructure provided by MARQUE. ZORK will scale to very large datasets by processing factorized representations of the data rather than explicitly materializing large joins. This project will develop new and innovative query processing techniques for queries involving both conventional relational operators and generalized linear algebra operators. Tight integration will facilitate query optimization within and between operators. Using this system, a range of machine learning techniques will be developed that operate entirely within the database management system, avoiding data export and simplifying concerns such as data privacy administration.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
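As a rough illustration of why factorized processing can avoid materializing a large join, consider an aggregate of the kind that arises when building statistics for a linear model: the sum of `x * y` over the join of two relations on a shared key. The sketch below is not ZORK's algorithm, which the abstract does not specify; the relation layout and both function names are assumptions. It compares a version that enumerates the join with one that pushes the sum down to each relation first.

```python
from collections import defaultdict


def sum_xy_materialized(r, s):
    """Sum x*y over the join of r(a, x) and s(a, y), enumerating every
    joined pair (cost proportional to the size of the join)."""
    total = 0.0
    for a_r, x in r:
        for a_s, y in s:
            if a_r == a_s:
                total += x * y
    return total


def sum_xy_factorized(r, s):
    """Same aggregate without building the join: for each key a,
    sum(x*y over pairs) = (sum of x in r for a) * (sum of y in s for a)."""
    sum_x = defaultdict(float)
    sum_y = defaultdict(float)
    for a, x in r:
        sum_x[a] += x
    for a, y in s:
        sum_y[a] += y
    # Only keys present in both relations contribute to the join.
    return sum(sum_x[a] * sum_y[a] for a in sum_x if a in sum_y)


if __name__ == "__main__":
    r = [(1, 2.0), (1, 3.0), (2, 4.0)]
    s = [(1, 10.0), (1, 20.0), (2, 5.0), (3, 7.0)]
    print(sum_xy_materialized(r, s), sum_xy_factorized(r, s))
```

The factorized version makes one pass over each input, so its cost grows with the input sizes rather than with the join size, which can be much larger when keys repeat.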

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Kenneth Ross

Enhancing cold spray coatings: Microstructural dynamics and performance attributes of Inconel 625 with chromium carbide incorporation for hydropower applications
  • DOI:
    10.1016/j.surfcoat.2025.131932
  • Publication date:
    2025-03-15
  • Journal:
  • Impact factor:
    6.100
  • Authors:
    Mayur Pole;Abhinav Srivastava;Julian Escobar;Joshua Silverstein;Bharat Gwalani;Kenneth Ross;Christopher Smith
  • Corresponding author:
    Christopher Smith
Special issue: best papers of VLDB 2008
  • DOI:
    10.1007/s00778-009-0173-y
  • Publication date:
    2009-12-08
  • Journal:
  • Impact factor:
    3.800
  • Authors:
    Peter Buneman;Volker Markl;Beng Chin Ooi;Kenneth Ross
  • Corresponding author:
    Kenneth Ross
Parallel Prefix Sum with SIMD
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wangda Zhang;Yanbin Wang;Kenneth Ross
  • Corresponding author:
    Kenneth Ross
Using Packet Processing Object Modules Interchangeably as Stand-Alone Programs or “Multi-app” Components
Arthrogryposis and congenital absence of the anterior cruciate ligament: A case report
  • DOI:
    10.1016/j.knee.2008.08.004
  • Publication date:
    2009-01-01
  • Journal:
  • Impact factor:
  • Authors:
    Kenny Kwan;Kenneth Ross
  • Corresponding author:
    Kenneth Ross

Other grants of Kenneth Ross

III: Small: Bringing database query optimization to data intensive applications
  • Award number:
    2008295
  • Fiscal year:
    2020
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
Evolutionary Genomics of a Supergene Implicated in Social Evolution
  • Award number:
    1354479
  • Fiscal year:
    2014
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant
III: Small: Database Algorithms for Modern CPU Memory Hierarchies
  • Award number:
    1422488
  • Fiscal year:
    2014
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
III: Small: Database Processing on GPUs
  • Award number:
    1218222
  • Fiscal year:
    2012
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant
Research and Education Activities at ACM SIGMOD/PODS 2013
  • Award number:
    1246690
  • Fiscal year:
    2012
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
EAGER: Rapid Updates and Snapshot-Based Queries Using Multicore Processors
  • Award number:
    1049898
  • Fiscal year:
    2010
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
Collaborative Research: Speciation and Evolution of Fire Ants - An Integrated Population Genetic, Phylogenetic, and Ecological Approach
  • Award number:
    1020652
  • Fiscal year:
    2010
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
III: Small: Avoiding Contention on Multicore Machines
  • Award number:
    0915956
  • Fiscal year:
    2009
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
Cache-Aware Database Systems on Modern Multithreading Processors
  • Award number:
    0534389
  • Fiscal year:
    2006
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant
Database Query Processing in Main Memory
  • Award number:
    0120939
  • Fiscal year:
    2001
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant

Similar overseas grants

AF: Medium: Collaborative Research: Beyond Sparsity: Refined Measures of Complexity for Linear Algebra
  • Award number:
    1763481
  • Fiscal year:
    2018
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant
AF: Medium: Collaborative Research: Beyond Sparsity: Refined Measures of Complexity for Linear Algebra
  • Award number:
    1763315
  • Fiscal year:
    2018
  • Award amount:
    $1.0163 million
  • Award type:
    Continuing Grant
Non-adiabatic dynamics of electronically excited linear polyenes: learning from small and medium-sized for long polyenes and their biological function
  • Award number:
    239673056
  • Fiscal year:
    2013
  • Award amount:
    $1.0163 million
  • Award type:
    Research Grants
SBIR Phase I: Non-linear Mechanical System for Creating Propulsive Thrust in a Fluid Medium
  • Award number:
    1315383
  • Fiscal year:
    2013
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
Systematic linear-response calculations of medium-heavy unstable nuclei in a nuclear energy-density-functional approach
  • Award number:
    23740223
  • Fiscal year:
    2011
  • Award amount:
    $1.0163 million
  • Award type:
    Grant-in-Aid for Young Scientists (B)
AF: Medium: Taming Massive Data with Sub-Linear Algorithms
  • Award number:
    1065125
  • Fiscal year:
    2011
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
AF: Medium: Collaborative Research: Approximate Computational Geometry via Controlled Linear Perturbation
  • Award number:
    0904832
  • Fiscal year:
    2009
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
AF: Medium: Collaborative Research: Approximate Computational Geometry via Controlled Linear Perturbation
  • Award number:
    0904707
  • Fiscal year:
    2009
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
RI-Medium: Collaborative Research: Graph Cut Algorithms for Linear Inverse Systems
  • Award number:
    0803705
  • Fiscal year:
    2008
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant
RI-Medium: Collaborative Research: Graph Cut Algorithms for Linear Inverse Systems
  • Award number:
    0803444
  • Fiscal year:
    2008
  • Award amount:
    $1.0163 million
  • Award type:
    Standard Grant