EAGER: Efficient Algorithms for Dimensionality Reduction and Clustering Using Disk-Based Matrices

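Basic Information

  • Award number:
    0937562
  • Principal investigator:
  • Amount:
    $100,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Start and end dates:
    2009-07-15 to 2011-06-30
  • Project status:
    Completed

Project Abstract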

Carlos Ordonez

1. Research and Education Proposal

Computing linear Gaussian models on large data sets is characterized by heavy matrix manipulation and iterative methods with slow convergence. Efficiency problems worsen because large data sets are stored on and retrieved from disk, and in a database system the models themselves are manipulated on disk as well. Despite the importance of these models, little research has attempted to adapt this large family of Gaussian models to exploit database systems techniques. This proposal studies how to improve algorithms for linear Gaussian models to analyze large, high-dimensional data sets by manipulating matrices on secondary storage (i.e., disk) while using a small amount of primary storage (i.e., RAM). The models studied include maximum likelihood factor analysis for dimensionality reduction and mixtures of Gaussian distributions for clustering.

The educational component of this proposal involves two main activities. The first is to develop a plan to expose disadvantaged and minority high school students to data mining research and practice in order to encourage them to study computer science. The second involves enhancing current research and teaching of data mining at the University of Houston.

2. Intellectual Merit

This research project requires discovering common algorithmic principles for performing incremental matrix computations for a family of statistical models; understanding how to summarize large data sets while preserving the statistical properties required by multiple models; and proposing new database techniques, tailored for such models, that can perform efficient matrix manipulation on secondary storage. Incremental computations are difficult to attain because methods for linear Gaussian models require iterations over the entire data set. Summarization requires transforming complex matrix equations while accounting for high dimensionality, large data set size, and numerical stability, and while preserving model accuracy. Developing matrix optimizations that combine primary and secondary storage is quite different from optimizing a matrix algorithm that works only in primary storage.

This research requires mathematical knowledge to generalize, optimize, and transform the computation of linear Gaussian models. It also requires database systems expertise in how to organize and index diverse matrices on secondary storage for efficient reading and writing.

3. Broader Impact

This proposal will have a broad impact on the analysis of large, complex, high-dimensional scientific data sets and on enhancing database systems with incremental model computation capabilities. We plan to apply and test our proposed algorithms and techniques on scientific data sets, including geographical, medical, and biological data sets, among others.
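The idea of summarizing a disk-resident data set while preserving the statistics a Gaussian model needs can be illustrated with a small sketch. The abstract does not specify the summary used, so the choice below is an assumption: one sequential pass keeps only the row count n, the vector of sums L, and the matrix of sums of products Q, from which the mean and covariance follow. The function names (`summarize`, `mean_and_covariance`, `demo`) and the CSV layout are hypothetical.

```python
import csv
import os
import tempfile


def summarize(path, d):
    """One sequential pass over a CSV file of d-dimensional points.

    Assumed summary (not taken from the abstract): n = row count,
    L = per-dimension sums, Q = d x d sums of cross-products.
    Only these O(d^2) values are held in memory, never the data set.
    """
    n = 0
    L = [0.0] * d
    Q = [[0.0] * d for _ in range(d)]
    with open(path, newline="") as f:
        for row in csv.reader(f):
            x = [float(v) for v in row]
            n += 1
            for a in range(d):
                L[a] += x[a]
                for b in range(d):
                    Q[a][b] += x[a] * x[b]
    return n, L, Q


def mean_and_covariance(n, L, Q):
    """Derive the mean vector and population covariance from n, L, Q."""
    d = len(L)
    mean = [L[a] / n for a in range(d)]
    cov = [[Q[a][b] / n - mean[a] * mean[b] for b in range(d)]
           for a in range(d)]
    return mean, cov


def demo():
    """Write a tiny data set to disk, summarize it, and return the model inputs."""
    rows = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        n, L, Q = summarize(path, d=2)
    finally:
        os.remove(path)
    return mean_and_covariance(n, L, Q)
```

Because n, L, and Q are plain sums, summaries of separate disk blocks can be added together, which is one way an incremental computation might be organized. The formula Q/n minus the outer product of the means can lose precision when values are large relative to their spread, which is the kind of numerical stability concern the abstract mentions.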

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Carlos Ordonez

Growing a FLOWER: Building a Diagram Unifying Flow and ER Notation for Data Science
  • DOI:
  • Publication date:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Carlos Ordonez;Robin Varghese;Nguyen Phan;Wojciech Macyna
  • Corresponding author:
    Wojciech Macyna
Role of Polymeric Coating of Gold Nanoparticles in their Transport through Natural Barriers
  • DOI:
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Carlos Ordonez;Shingo Tanaka;Naoko Watanabe;Tamotsu Kozaki
  • Corresponding author:
    Naoko Watanabe and Tamotsu Kozaki
Migration of PEG-Functionalized Model Gold Nanoparticles in Natural Barriers
  • DOI:
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Carlos Ordonez;Shingo Tanaka;Naoko Watanabe;Tamotsu Kozaki
  • Corresponding author:
    Naoko Watanabe and Tamotsu Kozaki
Complicated intra-abdominal infections in a worldwide context: an observational prospective study (CIAOW Study)
  • DOI:
    10.1186/1749-7922-8-1
  • Publication date:
    2013-01-03
  • Journal:
  • Impact factor:
    5.800
  • Authors:
    Massimo Sartelli;Fausto Catena;Luca Ansaloni;Ernest Moore;Mark Malangoni;George Velmahos;Raul Coimbra;Kaoru Koike;Ari Leppaniemi;Walter Biffl;Zsolt Balogh;Cino Bendinelli;Sanjay Gupta;Yoram Kluger;Ferdinando Agresta;Salomone Di Saverio;Gregorio Tugnoli;Elio Jovine;Carlos Ordonez;Carlos Augusto Gomes;Gerson Alves Pereira;Kuo-Ching Yuan;Miklosh Bala;Miroslav P Peev;Yunfeng Cui;Sanjay Marwah;Sanoop Zachariah;Boris Sakakushev;Victor Kong;Adamu Ahmed;Ashraf Abbas;Ricardo Alessandro Teixeira Gonsaga;Gianluca Guercioni;Nereo Vettoretto;Elia Poiasina;Offir Ben-Ishay;Rafael Díaz-Nieto;Damien Massalou;Matej Skrovina;Ihor Gerych;Goran Augustin;Jakub Kenig;Vladimir Khokha;Cristian Tranà;Kenneth Yuh Yen Kok;Alain Chichom Mefire;Jae Gil Lee;Suk-Kyung Hong;Helmut Alfredo Segovia Lohse;Wagih Ghnnam;Alfredo Verni;Varut Lohsiriwat;Boonying Siribumrungwong;Alberto Tavares;Gianluca Baiocchi;Koray Das;Julien Jarry;Maurice Zida;Norio Sato;Kiyoshi Murata;Tomohisa Shoko;Takayuki Irahara;Ahmed O Hamedelneel;Noel Naidoo;Abdul Rashid Kayode Adesunkanmi;Yoshiro Kobe;AK Attri;Rajeev Sharma;Federico Coccolini;Tamer El Zalabany;Khalid Al Khalifa;Juan Sanjuan;Rita Barnabé;Wataru Ishii
  • Corresponding author:
    Wataru Ishii
Dépendances fonctionnelles et requêtes skyline multidimensionnelles
Functional dependencies and multidimensional skyline queries
  • DOI:
    10.3166/isi.20.5.9-26
  • Publication date:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    N. Hanusse;Patrick Kamnang Wanko;Sofiane Maabout;Carlos Ordonez
  • Corresponding author:
    Carlos Ordonez

Other grants by Carlos Ordonez

Equilibria of Two Relaxed Plasma Species With One Species Confined by the Space Charge of the Other Species
  • Award number:
    1803047
  • Fiscal year:
    2018
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
Collaborative Research: Experimental and Theoretical Study of the Plasma Physics of Antihydrogen Generation and Trapping
  • Award number:
    1500427
  • Fiscal year:
    2015
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
Collaborative Research: Experimental and Theoretical Study of the Plasma Physics of Antihydrogen Generation and Trapping
  • Award number:
    1202428
  • Fiscal year:
    2012
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
III: Small-Collaborative: Efficient Bayesian Model Computation for Large and High Dimensional Data Sets
  • Award number:
    0914861
  • Fiscal year:
    2009
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
Pan-American Institute of Science and Technology
  • Award number:
    0936560
  • Fiscal year:
    2009
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
Collaborative Research: Conformal Quantum Mechanics---From the Near-Horizon Characterization of Black Hole Thermodynamics to Renormalization and Path Integral Techniques
  • Award number:
    0602301
  • Fiscal year:
    2006
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
Collaborative Research: Renormalization, Path Integrals, and Applications of Conformal Quantum Mechanics, Singular Potentials, and Quantum Field Theory
  • Award number:
    0308435
  • Fiscal year:
    2003
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
Development of Plasma Theory in Support of the Quest to Form and Trap Cold Neutral Antimatter
  • Award number:
    0244444
  • Fiscal year:
    2003
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
Plasma Overlap Physics in Nested Penning Traps and the Quest to Confine Neutral Antimatter
  • Award number:
    0099617
  • Fiscal year:
    2001
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
Plasma Overlap Physics in Nested-Well Traps and the Quest for Cold Antihydrogen
  • Award number:
    9876921
  • Fiscal year:
    1999
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Award number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Award number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
CAREER: A Theoretical Exploration of Efficient and Accurate Clustering Algorithms
  • Award number:
    2337832
  • Fiscal year:
    2024
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
LEAPS-MPS: Fast and Efficient Novel Algorithms for MHD Flow Ensembles
  • Award number:
    2425308
  • Fiscal year:
    2024
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
CIF: Small: Theory and Algorithms for Efficient and Large-Scale Monte Carlo Tree Search
  • Award number:
    2327013
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
ATD: Efficient and Effective Algorithms for Detection of Anomalies in High-dimensional Spatiotemporal Data with Large Amounts of Missing Data
  • Award number:
    2318925
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
CAREER: Computation-efficient Algorithms for Grid-scale Energy Storage Control, Bidding, and Integration Analysis
  • Award number:
    2239046
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
CRII: CIF: Sequential Decision-Making Algorithms for Efficient Subset Selection in Multi-Armed Bandits and Optimization of Black-Box Functions
  • Award number:
    2246187
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Graph Mining and Network Science with Differential Privacy: Efficient Algorithms and Fundamental Limits
  • Award number:
    2317192
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant
Collaborative Research: SaTC: CORE: Medium: Graph Mining and Network Science with Differential Privacy: Efficient Algorithms and Fundamental Limits
  • Award number:
    2317194
  • Fiscal year:
    2023
  • Funding amount:
    $100,000
  • Project type:
    Continuing Grant