Collaborative Research: Randomized Algorithms in Linear Algebra and Numerical Evaluations on Massive Datasets

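Basic Information

  • Award number:
    1008983
  • Principal investigator:
  • Amount:
    $220,400
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2010
  • Funding country:
    United States
  • Project period:
    2010-10-01 to 2015-07-31
  • Project status:
    Completed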

Project Abstract

Data matrices have structural properties that present challenges and opportunities for both the Numerical Linear Algebra (NLA) community and the Theory of Algorithms (ToA) community. Matrix factorizations, such as the eigendecomposition, the rank-revealing QR factorization, and the Singular Value Decomposition, have been widely used for information retrieval. Historically, matrix factorizations have been of central interest in NLA since one can use them to express a problem in such a way that it can be solved more easily. ToA, on the other hand, has recently addressed the computation of such decompositions from a sampling perspective. The two approaches are complementary. However, thus far, the two communities have not worked closely together to integrate them. More often than not, each community is only cursorily aware of developments in the other community. Defining significant research directions that both communities can work on, and applying the resulting linear algebraic algorithms to data analysis problems (among others), will lead to important breakthroughs. The main objective of this proposal is to bridge the existing gap and bring together NLA and ToA researchers to promote cross-fertilization of ideas that could have immediate and long-term impact on data analysis. Towards that end, the PIs will work on a set of prototypical research problems that can significantly benefit from ideas and research in both NLA and ToA. These problems range from approximating the singular values and vectors of a matrix by element-wise sampling to random-projection-based algorithms for least-squares problems and the design of randomized algorithms for the widely used non-negative matrix factorization.
This proposal seeks to explore the complementary perspectives that the Numerical Linear Algebra (NLA) and Theory of Algorithms (ToA) communities bring to linear algebra and matrix computations. This is a timely quest, motivated by technological developments over the last two decades that permit the automatic generation of large datasets. Such datasets are often modeled as matrices. The proposed work will serve as a demonstration project on the fruitfulness of collaboration between the NLA and the ToA communities on problems that are of common interest. It is expected that the proposed research will demonstrate commonality in the two approaches, as well as highlight the advantages of the dual perspective. Through outreach activities, the PIs hope to motivate even more researchers to undertake similar investigations on related topics. The proposed algorithms will be numerically evaluated on a suite of matrices from application domains that the PIs have been working on over the past few years, to better understand their properties and to demonstrate their potential in dealing with modern, massive datasets. More specifically, the PIs will test the proposed strategies on population genetics data in order to infer the ancestry of individuals, as well as on gene expression data in order to investigate hypotheses that correlate genes and diseases. As such, we expect that the developed algorithms will impact the areas of linear algebra, randomized algorithms, information retrieval and data mining, as well as bioinformatics. Finally, in order to disseminate the proposed research, the PIs intend to organize workshops (following the example of the Workshops on Algorithms for Modern Massive Datasets in 2006, 2008, and 2010; the PIs were co-organizers of these workshops) and working group meetings, and will disseminate their research via blogs and articles intended for broader audiences.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Petros Drineas

Neuropathology-based approach reveals novel Alzheimer's Disease genes and highlights female-specific pathways and causal links to disrupted lipid metabolism: insights into a vicious cycle
  • DOI:
    10.1186/s40478-024-01909-6
  • Publication date:
    2025-01-04
  • Journal:
  • Impact factor:
    5.700
  • Authors:
    Yin Jin;Apostolia Topaloudi;Sudhanshu Shekhar;Guangxin Chen;Alicia Nicole Scott;Bryce David Colon;Petros Drineas;Chris Rochet;Peristera Paschou
  • Corresponding author:
    Peristera Paschou
A randomized least squares solver for terabyte-sized dense overdetermined systems
  • DOI:
    10.1016/j.jocs.2016.09.007
  • Publication date:
    2019-09-01
  • Journal:
  • Impact factor:
  • Authors:
    Chander Iyer;Haim Avron;Georgios Kollias;Yves Ineichen;Christopher Carothers;Petros Drineas
  • Corresponding author:
    Petros Drineas

Other grants by Petros Drineas

NSF-BSF: AF: Collaborative Research: Small: Randomized preconditioning of iterative processes: Theory and practice
  • Award number:
    2209509
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152687
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
CCF-BSF: AF: Small: Collaborative Research: Practice-Friendly Theory and Algorithms for Linear Regression Problems
  • Award number:
    1814041
  • Fiscal year:
    2018
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
FRG: Collaborative Research: Randomization as a Resource for Rapid Prototyping
  • Award number:
    1760353
  • Fiscal year:
    2018
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
III: Small: Novel Statistical Data Analysis Approaches for Mining Human Genetics Datasets
  • Award number:
    1715202
  • Fiscal year:
    2017
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
BIGDATA: F: DKA: Collaborative Research: Randomized Numerical Linear Algebra (RandNLA) for multi-linear and non-linear data
  • Award number:
    1661760
  • Fiscal year:
    2016
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
III: Small: Fast and Efficient Algorithms for Matrix Decompositions and Applications to Human Genetics
  • Award number:
    1661756
  • Fiscal year:
    2016
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
BIGDATA: F: DKA: Collaborative Research: Randomized Numerical Linear Algebra (RandNLA) for multi-linear and non-linear data
  • Award number:
    1447283
  • Fiscal year:
    2014
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
III: Small: Fast and Efficient Algorithms for Matrix Decompositions and Applications to Human Genetics
  • Award number:
    1319280
  • Fiscal year:
    2013
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
AF: Small: Fast and Efficient Randomized Algorithms for Solving Laplacian Systems of Linear Equations and Sparse Least Squares Problems
  • Award number:
    1016501
  • Fiscal year:
    2010
  • Award amount:
    $220,400
  • Project type:
    Standard Grant

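Similar NSFC grants

Research on Quantum Field Theory without a Lagrangian Description
  • Award number:
    24ZR1403900
  • Approval year:
    2024
  • Award amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Cell Research
  • Award number:
    31224802
  • Approval year:
    2012
  • Award amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Award number:
    31024804
  • Approval year:
    2010
  • Award amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research (细胞研究)
  • Award number:
    30824808
  • Approval year:
    2008
  • Award amount:
    CNY 240,000
  • Project type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Award number:
    10774081
  • Approval year:
    2007
  • Award amount:
    CNY 450,000
  • Project type:
    General Program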

Similar overseas grants

Collaborative Research: Randomized Feature Methods for Modeling and Dynamics: Theory and Algorithms
  • Award number:
    2331033
  • Fiscal year:
    2023
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Elements: A Cyberlaboratory for Randomized Numerical Linear Algebra
  • Award number:
    2309445
  • Fiscal year:
    2023
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Elements: A Cyberlaboratory for Randomized Numerical Linear Algebra
  • Award number:
    2309446
  • Fiscal year:
    2023
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152661
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
NSF-BSF: AF: Collaborative Research: Small: Randomized preconditioning of iterative processes: Theory and practice
  • Award number:
    2209510
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152704
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Randomized Feature Methods for Modeling and Dynamics: Theory and Algorithms
  • Award number:
    2208339
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
NSF-BSF: AF: Collaborative Research: Small: Randomized preconditioning of iterative processes: Theory and practice
  • Award number:
    2209509
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152687
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Standard Grant
Collaborative Research: Algorithms for Optimal Adaptive Enrichment Design in Randomized Trial
  • Award number:
    2230795
  • Fiscal year:
    2022
  • Award amount:
    $220,400
  • Project type:
    Continuing Grant