THOR: A New Programming Model for Data Analysis and Mining


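Basic information

  • Award number:
    0833136
  • Principal investigator:
  • Amount:
    $686,600
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2008
  • Funding country:
    United States
  • Start and end dates:
    2008-09-01 to 2012-08-31
  • Project status:
    Completed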

Project abstract

Nearly every discipline in science, engineering, and finance now needs to make new discoveries through the analysis of massive datasets. Yet creating the necessary scalable analysis tools for modern high-performance parallel hardware is daunting, because it requires developing efficient algorithms and then parallelizing and tuning them for real systems. This research project aims to address this problem by developing a new domain-specific programming model, called the Tree-based High-Order Reduce (THOR) model, that enables rapid automatic implementation of customized parallel data analysis and mining tasks with minimal coding effort.

The key enabling insight behind THOR is the generalized n-body problem (GNP) theory, a mathematical formalism that unifies the expression of seemingly disparate statistical data analysis tasks, including n-point correlation, hierarchical clustering, k-nearest neighbors classification, and kernel density estimation, among numerous others. As its name suggests, a GNP generalizes the classical n-body problem from physics to a much broader class of problems. Most importantly, the GNP form permits the development of asymptotically fast solutions, e.g., generalized versions of the fast multipole method. The THOR model lets the data analyst specify a GNP, from which the THOR program generator can automatically produce a highly tuned parallel implementation. In short, this project aims to show how a programming model that is bound to an appropriately high-level mathematical formalism, while retaining the simplicity of a model like MapReduce, can lead to both scalable data analysis algorithms and their efficient implementation.
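To make the unifying idea in the abstract concrete, here is a minimal Python sketch of how two of the listed tasks, kernel density estimation and nearest-neighbor search, fit a single "kernel plus reduction" pattern. This is not the THOR API, which the abstract does not describe: `gnp_reduce`, `kde`, and `nearest_neighbor` are hypothetical names, the data are one-dimensional, and the evaluation is brute force (quadratic in the number of points) rather than the tree-based, asymptotically fast approach the project targets.

```python
import math
from functools import reduce
from operator import add


def gnp_reduce(queries, references, kernel, combine, init):
    """Hypothetical brute-force two-point GNP: for each query q, fold
    kernel(q, r) over every reference r using the operator `combine`."""
    return [
        reduce(combine, (kernel(q, r) for r in references), init)
        for q in queries
    ]


def kde(queries, references, bandwidth):
    """Gaussian kernel density estimate: the reduction is a sum."""
    norm = 1.0 / (len(references) * bandwidth * math.sqrt(2.0 * math.pi))

    def kernel(q, r):
        z = (q - r) / bandwidth
        return norm * math.exp(-0.5 * z * z)

    return gnp_reduce(queries, references, kernel, add, 0.0)


def nearest_neighbor(queries, references):
    """1-nearest neighbor: the reduction keeps the smallest distance."""

    def kernel(q, r):
        return (abs(q - r), r)

    def closer(best, candidate):
        # Compare on distance only, so the placeholder None is never compared.
        return candidate if candidate[0] < best[0] else best

    pairs = gnp_reduce(queries, references, kernel, closer, (math.inf, None))
    return [r for _, r in pairs]


if __name__ == "__main__":
    refs = [0.0, 1.0, 3.0]
    print(nearest_neighbor([0.1, 2.9], refs))  # [0.0, 3.0]
    print(kde([0.5], refs, bandwidth=1.0))
```

Only the kernel and the reduction operator differ between the two tasks; the loop structure is shared. That shared structure is what a program generator could, in principle, replace with a tree-based traversal.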

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Richard Vuduc

Multifidelity Memory System Simulation in SST
Modeling the Power Variability of Core Speed Scaling on Homogeneous Multicore Systems
  • DOI:
    10.1155/2017/8686971
  • Publication date:
    2017-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zhihui Du;Rong Ge;Victor W. Lee;Richard Vuduc;David A. Bader;Ligang He
  • Corresponding author:
    Ligang He
Sparse Matrix-Vector Multiplication on Multicore and Accelerators
  • DOI:
  • Publication date:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    Samuel Williams;Nathan Bell;Jee Whan Choi;Michael Garland;L. Oliker;Richard Vuduc
  • Corresponding author:
    Richard Vuduc

Other grants by Richard Vuduc

XPS: FULL: DSD: A Parallel Tensor Infrastructure (ParTI!) for Data Analysis
  • Award number:
    1533768
  • Fiscal year:
    2015
  • Award amount:
    $686,600
  • Project type:
    Standard Grant
SHF: Small: How Much Execution Time, Energy, And Power Does an Algorithm Need?
  • Award number:
    1422935
  • Fiscal year:
    2014
  • Award amount:
    $686,600
  • Project type:
    Standard Grant
SHF: Small: Locating and Explaining Faults in Concurrent Software
  • Award number:
    1116210
  • Fiscal year:
    2011
  • Award amount:
    $686,600
  • Project type:
    Standard Grant
CAREER: Autotuning Foundations for Exascale Computing
  • Award number:
    0953100
  • Fiscal year:
    2010
  • Award amount:
    $686,600
  • Project type:
    Continuing Grant

Similar overseas grants

SHF: SMALL: A New Semantics for Type-Level Programming in Haskell
  • Award number:
    2345580
  • Fiscal year:
    2024
  • Award amount:
    $686,600
  • Project type:
    Standard Grant
A New Histone H3 Modification Regulates Epigenetic Programming and Gene Expression in Breast Cancer
  • Award number:
    10607954
  • Fiscal year:
    2022
  • Award amount:
    $686,600
  • Project type:
2022 Mixed Integer Programming Workshop Poster Session and Computational Competition; New Brunswick, New Jersey; May 24-26, 2022
  • Award number:
    2211222
  • Fiscal year:
    2022
  • Award amount:
    $686,600
  • Project type:
    Standard Grant
Studying and Developing New Text-based Programming Environments for Novices
  • Award number:
    572490-2022
  • Fiscal year:
    2022
  • Award amount:
    $686,600
  • Project type:
    University Undergraduate Student Research Awards
Programming synthetic cells as new therapeutic vectors
  • Award number:
    2827500
  • Fiscal year:
    2022
  • Award amount:
    $686,600
  • Project type:
    Studentship
New Programming Language and Runtime System
  • Award number:
    537903-2018
  • Fiscal year:
    2021
  • Award amount:
    $686,600
  • Project type:
    Collaborative Research and Development Grants
Designing new low-level systems programming languages suitable for verification - pKVM
  • Award number:
    2744391
  • Fiscal year:
    2020
  • Award amount:
    $686,600
  • Project type:
    Studentship
New Programming Language and Runtime System
  • Award number:
    537903-2018
  • Fiscal year:
    2020
  • Award amount:
    $686,600
  • Project type:
    Collaborative Research and Development Grants
New development of spatio-temporal molecular programming
  • Award number:
    20H00618
  • Fiscal year:
    2020
  • Award amount:
    $686,600
  • Project type:
    Grant-in-Aid for Scientific Research (A)
New Programming Language and Runtime System
  • Award number:
    537903-2018
  • Fiscal year:
    2019
  • Award amount:
    $686,600
  • Project type:
    Collaborative Research and Development Grants