III: Medium: Massively Parallel Data Analytics on Heterogeneous Architectures

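Basic Information

  • Award number:
    1763434
  • Principal investigator:
  • Amount:
    $1.2 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-06-01 to 2023-05-31
  • Project status:
    Completed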

Project Abstract

The rise of connected devices and the Internet has led to unprecedented growth in the volumes of data that computers must store and process. This growth coincides with a growing demand for immediate answers and interactive analytics. Increasingly, companies need real-time reports of sales, network traffic, anomalies, and other business trends. Internet-connected devices, like cars and industrial plants, demand even more real-time analysis of data. These trends mean database and data analytics platforms must deliver ever-faster performance from machines, a fact that has driven the dramatic interest in scalable multi-node processing systems, like Hadoop and Spark, which distribute the processing of large data sets across clusters of machines. Unfortunately, because of the way these platforms are engineered, they utilize the hardware resources on each node very poorly, often yielding single-node throughput that is thousands of times lower than what the raw hardware is capable of. In this project, an orthogonal direction will be pursued: a system called Proteus will be built to use hardware to the fullest extent possible, with a focus on a scalable design that fully utilizes all available computing resources. If successful, this project will have broad impact because databases and data-intensive parallel computing systems are used by millions of enterprises around the world, both on-site and in computing clouds; optimized implementations of these systems that better exploit hardware will improve response times and reduce hardware and energy costs, resulting in billions of dollars of cost savings.

Proteus will parallelize across many cores on a single processor, as well as take advantage of many-core systems such as GPUs and Intel's Xeon Phi. In addition, Proteus will be able to exploit large, diverse clusters of hardware, but the aim is to do so without giving up this efficiency, rather than accepting inefficiency as a given of distributed computing. To do this, research in the Proteus project will focus on four key areas: (1) Developing optimized implementations of individual database algorithms, such as top-k sorts, sequential scans, random lookups, and graph and machine learning algorithms, for GPUs and CPUs. (2) Building cost models that predict the performance of these algorithms on heterogeneous architectures. (3) Developing intermediate languages that abstract the details of the underlying hardware, hiding the nuances of these different platforms without giving up performance. (4) Building an optimizer that uses the cost models to place query plans onto a heterogeneous mix of hardware, to obtain the best overall performance for each plan.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
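
Areas (2) and (4) of the abstract describe a cost model that predicts operator performance on each kind of hardware and an optimizer that uses those predictions for placement. The following is a minimal sketch of that general idea, not the Proteus implementation: the `Device` type, the bandwidth-only cost formula, the function names, and all numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device description used by the toy cost model."""
    name: str
    bandwidth_gbps: float  # assumed sustained memory bandwidth, GB/s
    transfer_gbps: float   # assumed host-to-device link speed, GB/s (0 = data already local)


def estimated_seconds(device: Device, input_gb: float, passes: int) -> float:
    """Toy bandwidth-bound cost: move the input to the device once,
    then stream it through memory `passes` times."""
    transfer = input_gb / device.transfer_gbps if device.transfer_gbps > 0 else 0.0
    compute = passes * input_gb / device.bandwidth_gbps
    return transfer + compute


def place(operators: dict, devices: list) -> dict:
    """Assign each operator to the device with the lowest estimated cost.
    Operators are costed independently; a real optimizer would also
    account for data movement between consecutive operators."""
    plan = {}
    for op_name, (input_gb, passes) in operators.items():
        best = min(devices, key=lambda d: estimated_seconds(d, input_gb, passes))
        plan[op_name] = best.name
    return plan


if __name__ == "__main__":
    # Made-up hardware figures, for illustration only.
    devices = [Device("cpu", 50.0, 0.0), Device("gpu", 800.0, 12.0)]
    # Each operator: (input size in GB, number of passes over the data).
    operators = {"scan": (10.0, 1), "sort": (10.0, 20)}
    print(place(operators, devices))
```

With these made-up figures, the single-pass scan stays on the CPU because the transfer cost dominates, while the multi-pass sort is placed on the GPU because its higher memory bandwidth outweighs the one-time transfer.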

Project Outcomes

Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Taming the Zoo: The Unified GraphIt Compiler Framework for Novel Architectures
Optimizing ordered graph algorithms with GraphIt
Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration
A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics
Compiling Graph Applications for GPUs with GraphIt

Other publications by Samuel Madden

MEDLINE/PubMed
  • DOI:
    10.1007/978-0-387-39940-9_3039
  • Publication date:
    2004
  • Journal:
  • Impact factor:
    3.8
  • Authors:
    Cornelia Caragea;V. Honavar;P. Boncz;P. Larson;S. Dietrich;Gonzalo Navarro;Bhavani Thuraisingham;Yan Luo;Ouri E. Wolfson;S. Beitzel;Eric C. Jensen;Ophir Frieder;Christian S. Jensen;N. Tradisauskas;Ethan V. Munson;A. Wun;K. Goda;Stephen E. Fienberg;Jiashun Jin;Guimei Liu;Nick Craswell;T. Pedersen;Cesare Pautasso;M. Moro;S. Manegold;B. Carminati;Marina Blanton;Sara Bouchenak;Noël de Palma;Wei Tang;Christoph Quix;M. Jeusfeld;R. K. Pon;David J. Buttler;W. Meng;P. Zezula;Michal Batko;Vlastislav Dohnal;J. Domingo;Denilson Barbosa;Ioana Manolescu;Jeffrey Xu Yu;Emmanuel Cecchet;Vivien Quéma;Xifeng Yan;G. Santucci;D. Zeinalipour;Panos K. Chrysanthis;Amol Deshpande;Carlos Guestrin;Samuel Madden;Carson Kai;R. H. Güting;Amarnath Gupta;Heng Tao Shen;G. Weikum;Ramesh Jain;Jeffrey Xu Yu;Paolo Ciaccia;K. Candan;M. Sapino;C. Meghini;F. Sebastiani;U. Straccia;F. Nack;V. S. Subrahmanian;Maria Vanina Martinez;D. Reforgiato;T. Westerveld;M. Sebillo;G. Vitiello;Maria De Marsico;K. Voruganti;C. Parent;S. Spaccapietra;Christelle Vangenot;Esteban Zimányi;Prasan Roy;S. Sudarshan;E. Puppo;Peer Kröger;Matthias Renz;H. Schuldt;Solmaz Kolahi;A. Unwin;W. Cellary
  • Corresponding author:
    W. Cellary
Cabernet: A Content Delivery Network for Moving Vehicles
  • DOI:
  • Publication date:
    2008
  • Journal:
  • Impact factor:
    0
  • Authors:
    Jakob Eriksson;H. Balakrishnan;Samuel Madden
  • Corresponding author:
    Samuel Madden
Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools
  • DOI:
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Matthew Perron;Raul Castro Fernandez;David DeWitt;Michael Cafarella;Samuel Madden
  • Corresponding author:
    Samuel Madden
Performant almost-latch-free data structures using epoch protection in more depth
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Tianyu Li;Badrish Chandramouli;Samuel Madden
  • Corresponding author:
    Samuel Madden
Research contributions of Mike Stonebraker: an overview
  • DOI:
    10.1145/3226595.3226612
  • Publication date:
    2018-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Samuel Madden
  • Corresponding author:
    Samuel Madden

Other grants held by Samuel Madden

Collaborative Research: Elements: A Self-tuning Anomaly Detection Service
  • Award number:
    2103799
  • Fiscal year:
    2021
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative: A Licensing Model and Ecosystem for Data Sharing
  • Award number:
    1636766
  • Fiscal year:
    2016
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Medium: Collaborative Research: DataHub - A Collaborative Dataset Management Platform for Data Science
  • Award number:
    1513443
  • Fiscal year:
    2015
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
ACM SIGMOD 2012 Student Programming Contest: A Multidimensional Indexing System
  • Award number:
    1235666
  • Fiscal year:
    2012
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Medium: Scalable and Secure Database as a Service
  • Award number:
    1065219
  • Fiscal year:
    2011
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
SIGMOD 2011 Programming Contest
  • Award number:
    1129526
  • Fiscal year:
    2011
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
III: Large: Collaborative Research: SciDB - An Array Oriented Data Management System for Massive Scale Scientific Data
  • Award number:
    1111371
  • Fiscal year:
    2011
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
2010 SIGMOD Programming Contest
  • Award number:
    1037986
  • Fiscal year:
    2010
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: A Comparative Study of Approaches to Cluster-Based Large Scale Data Analysis
  • Award number:
    0844013
  • Fiscal year:
    2009
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
2009 SIGMOD Programming Contest
  • Award number:
    0848727
  • Fiscal year:
    2008
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant

Similar overseas grants

RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
  • Award number:
    2327438
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
  • Award number:
    2344489
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
  • Award number:
    2402836
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
  • Award number:
    2402851
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Continuing Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
  • Award number:
    2403122
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: SHF: Medium: Differentiable Hardware Synthesis
  • Award number:
    2403134
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
  • Award number:
    2321102
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Transforming the Molecular Science Research Workforce through Integration of Programming in University Curricula
  • Award number:
    2321045
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
  • Award number:
    2321103
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant
Collaborative Research: CPS: Medium: Automating Complex Therapeutic Loops with Conflicts in Medical Cyber-Physical Systems
  • Award number:
    2322534
  • Fiscal year:
    2024
  • Funding amount:
    $1.2 million
  • Project type:
    Standard Grant