Collaborative Research: A Comparative Study of Approaches to Cluster-Based Large Scale Data Analysis
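Basic Information
- Award number: 0844013
- Principal investigator:
- Amount: $151.2K
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2009
- Funding country: United States
- Period: 2009-02-01 to 2012-01-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract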
The goal of this research project is to understand the tradeoffs between the MapReduce and parallel DBMS approaches to performing large-scale data analysis over large clusters of computers, and to bring together ideas from both communities. Both MapReduce and parallel database systems provide scalable data processing over hundreds to thousands of nodes. Both provide a stylized, high-level programming environment that allows users to efficiently filter and combine datasets while masking much of the complexity of parallelizing computation over a cluster. But they differ in substantial ways as well, such as their approaches to dealing with fault tolerance, their data modeling requirements, their query flexibility, and their ability to function in a heterogeneous processing environment.

This multi-university team of researchers is investigating the effect of these differences on the performance and scalability of the two approaches. The research team is running a set of experiments that compare an open source MapReduce implementation (Hadoop) to two commercial parallel database systems (DB2 and Vertica) on a benchmark that includes a range of tasks designed to assess the tradeoffs between the two approaches. The research team is seeking to understand which differences between the two approaches to large-scale data analysis are fundamental tradeoffs, and which can be combined inside a single solution, so that ideas from one community can benefit the other.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Samuel Madden
MEDLINE/PubMed
- DOI: 10.1007/978-0-387-39940-9_3039
- Year published: 2004
- Journal:
- Impact factor: 3.8
- Authors: Cornelia Caragea; V. Honavar; P. Boncz; P. Larson; S. Dietrich; Gonzalo Navarro; Bhavani Thuraisingham; Yan Luo; Ouri E. Wolfson; S. Beitzel; Eric C. Jensen; Ophir Frieder; Christian S. Jensen; N. Tradisauskas; Ethan V. Munson; A. Wun; K. Goda; Stephen E. Fienberg; Jiashun Jin; Guimei Liu; Nick Craswell; T. Pedersen; Cesare Pautasso; M. Moro; S. Manegold; B. Carminati; Marina Blanton; Sara Bouchenak; Noël de Palma; Wei Tang; Christoph Quix; M. Jeusfeld; R. K. Pon; David J. Buttler; W. Meng; P. Zezula; Michal Batko; Vlastislav Dohnal; J. Domingo; Denilson Barbosa; Ioana Manolescu; Jeffrey Xu Yu; Emmanuel Cecchet; Vivien Quéma; Xifeng Yan; G. Santucci; D. Zeinalipour; Panos K. Chrysanthis; Amol Deshpande; Carlos Guestrin; Samuel Madden; Carson Kai; R. H. Güting; Amarnath Gupta; Heng Tao Shen; G. Weikum; Ramesh Jain; Jeffrey Xu Yu; Paolo Ciaccia; K. Candan; M. Sapino; C. Meghini; F. Sebastiani; U. Straccia; F. Nack; V. S. Subrahmanian; Maria Vanina Martinez; D. Reforgiato; T. Westerveld; M. Sebillo; G. Vitiello; Maria De Marsico; K. Voruganti; C. Parent; S. Spaccapietra; Christelle Vangenot; Esteban Zimányi; Prasan Roy; S. Sudarshan; E. Puppo; Peer Kröger; Matthias Renz; H. Schuldt; Solmaz Kolahi; A. Unwin; W. Cellary
- Corresponding author: W. Cellary
Cabernet: A Content Delivery Network for Moving Vehicles
- DOI:
- Year published: 2008
- Journal:
- Impact factor: 0
- Authors: Jakob Eriksson; H. Balakrishnan; Samuel Madden
- Corresponding author: Samuel Madden
Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools
- DOI:
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Matthew Perron; Raul Castro Fernandez; David DeWitt; Michael Cafarella; Samuel Madden
- Corresponding author: Samuel Madden
Performant almost-latch-free data structures using epoch protection in more depth
- DOI:
- Year published: 2024
- Journal:
- Impact factor: 0
- Authors: Tianyu Li; Badrish Chandramouli; Samuel Madden
- Corresponding author: Samuel Madden
Research contributions of Mike Stonebraker: an overview
- DOI: 10.1145/3226595.3226612
- Date published: 2018-12
- Journal:
- Impact factor: 0
- Authors: Samuel Madden
- Corresponding author: Samuel Madden
Other grants awarded to Samuel Madden
Collaborative Research: Elements: A Self-tuning Anomaly Detection Service
- Award number: 2103799
- Fiscal year: 2021
- Amount: $151.2K
- Project type: Standard Grant
III: Medium: Massively Parallel Data Analytics on Heterogeneous Architectures
- Award number: 1763434
- Fiscal year: 2018
- Amount: $151.2K
- Project type: Continuing Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative: A Licensing Model and Ecosystem for Data Sharing
- Award number: 1636766
- Fiscal year: 2016
- Amount: $151.2K
- Project type: Standard Grant
III: Medium: Collaborative Research: DataHub - A Collaborative Dataset Management Platform for Data Science
- Award number: 1513443
- Fiscal year: 2015
- Amount: $151.2K
- Project type: Continuing Grant
ACM SIGMOD 2012 Student Programming Contest: A Multidimensional Indexing System
- Award number: 1235666
- Fiscal year: 2012
- Amount: $151.2K
- Project type: Standard Grant
III: Medium: Scalable and Secure Database as a Service
- Award number: 1065219
- Fiscal year: 2011
- Amount: $151.2K
- Project type: Continuing Grant
SIGMOD 2011 Programming Contest
- Award number: 1129526
- Fiscal year: 2011
- Amount: $151.2K
- Project type: Standard Grant
III: Large: Collaborative Research: SciDB - An Array Oriented Data Management System for Massive Scale Scientific Data
- Award number: 1111371
- Fiscal year: 2011
- Amount: $151.2K
- Project type: Standard Grant
2010 SIGMOD Programming Contest
- Award number: 1037986
- Fiscal year: 2010
- Amount: $151.2K
- Project type: Standard Grant
2009 SIGMOD Programming Contest
- Award number: 0848727
- Fiscal year: 2008
- Amount: $151.2K
- Project type: Standard Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Amount: CNY 0.0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research (细胞研究)
- Award number: 30824808
- Year approved: 2008
- Amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Amount: CNY 450,000
- Project type: General Program
Similar overseas grants
Collaborative Research: How to manipulate a plant? Testing for conserved effectors and plant responses in gall induction and growth using a multi-species comparative approach.
- Award number: 2305880
- Fiscal year: 2023
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: Ecologies of Participation in Island Karst Science and Conservation: A Comparative Multimethods Approach
- Award number: 2236152
- Fiscal year: 2023
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: Ecologies of Participation in Island Karst Science and Conservation: A Comparative Multimethods Approach
- Award number: 2236151
- Fiscal year: 2023
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: RESEARCH-PGR: Comparative genomics of the capitulum: deciphering the molecular basis of a key floral innovation
- Award number: 2214473
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: Comparative genomics and physiology to discover integrated mechanisms that support phenotypic plasticity
- Award number: 2200320
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Continuing Grant
Collaborative Research: RESEARCH-PGR: Comparative genomics of the capitulum: deciphering the molecular basis of a key floral innovation
- Award number: 2214472
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: RESEARCH-PGR: Comparative genomics of the capitulum: deciphering the molecular basis of a key floral innovation
- Award number: 2214474
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: Comparative genomics and physiology to discover integrated mechanisms that support phenotypic plasticity
- Award number: 2200319
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: RUI: Dynamic Learning in Comparative Courts: A Cross-National Analysis of Judicial Decision Making in Canada, the United States, and the United Kingdom
- Award number: 2325460
- Fiscal year: 2022
- Amount: $151.2K
- Project type: Standard Grant
Collaborative Research: RUI: Comparative analysis of endocytic trafficking during cell division
- Award number: 2052517
- Fiscal year: 2021
- Amount: $151.2K
- Project type: Standard Grant