CAREER: Recursive Distributed Matrix and Tensor Decompositions on Neural Engines

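Basic Information

  • Award number:
    2146509
  • Principal investigator:
  • Amount:
    $528,700
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Period:
    2022-03-01 to 2027-02-28
  • Project status:
    Ongoing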

Project Abstract

Matrix and tensor decompositions are among the most important building blocks of scientific computing and are increasingly important in data-centric computing and machine-learning models. Progress is being held back by the lack of software and algorithms that can efficiently handle large data sets and exploit the ubiquitous availability of neural engines. Legacy distributed matrix packages based on complex data distribution schemes not only add friction to adoption in new areas but also impede the exploration of cutting-edge algorithms at scale. Exciting new algorithms that are synergistic with neural engines, such as randomized linear algebra, structured matrix computation, and advanced eigendecompositions, remain unexplored, ad hoc, or hard to use for non-experts in numerical analysis. Powerful new architectures (neural engines) promise orders-of-magnitude performance and energy benefits but remain a challenge to use outside of neural networks. This proposal aims to create a unified software system to achieve high-performance, scalable, distributed matrix and tensor decompositions on neural engines through concerted research and development.

This project addresses three research thrusts to achieve its goals.

A) In contrast to conventional arithmetic-centric algorithm design, this research focuses on communication-efficient algorithm variants. A central challenge in realizing the proposed goals is the avoidance and management of data movement. Computation on neural engines has become extremely fast, while data movement latency and bandwidth lag far behind, and the gap is widening.

B) Incorporation of neural engines into state-of-the-art numerical algorithms. Recent numerical analysis has seen exciting developments in randomized algorithms, low-precision direct decomposition as a preconditioner, and novel polar decomposition-based spectral divide-and-conquer methods for eigensystems. These developments are not only exciting in themselves; they also have the potential to exploit neural engines especially well and to blend naturally with communication-centric algorithms.

C) Exploration of the Universal Distributed Array (UDA), a new data structure based on a multi-dimensional cyclic data distribution scheme, to achieve load balancing, scalability, and unified support for all matrix and tensor decompositions. This proposal extends the cyclic data-distribution scheme to support communication-efficient algorithms, including recursive algorithms (thanks to flexible alignment), and to multiple dimensions to support tensor decompositions. The project will develop efficient, scalable, and easy-to-use communication and computational primitives on distributed neural engines and will include the most useful matrix/tensor decomposition algorithms as a composable and extensible library.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
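
As a rough illustration of the multi-dimensional cyclic data distribution idea named in thrust C, the sketch below maps each element of a small 3-way tensor to a process on a 3-D process grid using the textbook block-cyclic rule. It is not the UDA design itself, which the abstract does not specify; the function names `cyclic_owner` and `local_index`, the tensor shape, the block sizes, and the grid are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import product


def cyclic_owner(index, block, grid):
    """Process-grid coordinate owning a global element index (block-cyclic rule)."""
    return tuple((i // b) % p for i, b, p in zip(index, block, grid))


def local_index(index, block, grid):
    """Position of a global element inside its owner's local storage."""
    return tuple((i // (b * p)) * b + i % b for i, b, p in zip(index, block, grid))


# Hypothetical setup: an 8 x 8 x 4 tensor, 2 x 2 x 1 blocks, 2 x 2 x 2 process grid.
shape, block, grid = (8, 8, 4), (2, 2, 1), (2, 2, 2)

# Count how many elements of the full tensor each process owns.
full_load = Counter(
    cyclic_owner(idx, block, grid) for idx in product(*map(range, shape))
)

# Count ownership for the trailing sub-tensor [4:8, 4:8, :], the kind of
# sub-problem a recursive decomposition would descend into.
sub_load = Counter(
    cyclic_owner(idx, block, grid)
    for idx in product(range(4, 8), range(4, 8), range(4))
)

print(sorted(full_load.items()))
print(sorted(sub_load.items()))
```

In this particular configuration every one of the eight processes owns the same number of elements, both for the full tensor and for the trailing sub-tensor, which is the load-balancing property that makes cyclic layouts attractive for recursive algorithms. With a plain blocked (non-cyclic) layout, the same trailing sub-tensor would sit on only a subset of the processes.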

Project Outcomes

Journal papers (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Fast Symmetric Eigenvalue Decomposition via WY Representation on Tensor Core
  • DOI:
    10.1145/3572848.3577516
  • Year published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zhang, Shaoshuai;Shah, Ruchi;Ootomo, Hiroyuki;Yokota, Rio;Wu, Panruo
  • Corresponding author:
    Wu, Panruo

Other publications by Panruo Wu

Extending checksum-based ABFT to tolerate soft errors online in iterative methods
Investigating half precision arithmetic to accelerate dense linear system solvers
High Accuracy Matrix Computations on Neural Engines: A Study of QR Factorization and its Applications
Silent Data Corruption Resilient Two-sided Matrix Factorizations
Basic Linear Algebra Operations on TensorCore GPU
  • DOI:
  • Year published:
    2020
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shaoshuai Zhang;Vivek Karihaloo;Panruo Wu
  • Corresponding author:
    Panruo Wu

Similar Overseas Grants

Recursive Inequalities in Applied Proof Theory
  • Award number:
    2889781
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Studentship
Dynamic location equilibrium problem with the recursive structure of multi-entity in disaster prone areas
  • Award number:
    23KJ0771
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Grant-in-Aid for JSPS Fellows
Collaborative Research: Bayesian Residual Learning and Random Recursive Partitioning Methods for Gaussian Process Modeling
  • Award number:
    2348163
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Standard Grant
Progress of Recursive Utility Maximization Theory and Its Applications
  • Award number:
    23K01450
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Portfolio of compositions: Creating electroacoustic works through the sonification of recursive neural networks, and exploring the creative use of in
  • Award number:
    2886370
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Studentship
CAREER: Statistical Learning with Recursive Partitioning: Algorithms, Accuracy, and Applications
  • Award number:
    2239448
  • Fiscal year:
    2023
  • Funding amount:
    $528,700
  • Project type:
    Continuing Grant
Recursive Partitioning Methods for Life History Processes
  • Award number:
    RGPIN-2016-04396
  • Fiscal year:
    2022
  • Funding amount:
    $528,700
  • Project type:
    Discovery Grants Program - Individual
CAREER: Reinforcement Learning for Recursive Markov Decision Processes and Beyond
  • Award number:
    2146563
  • Fiscal year:
    2022
  • Funding amount:
    $528,700
  • Project type:
    Continuing Grant
Collaborative Research: Bayesian Residual Learning and Random Recursive Partitioning Methods for Gaussian Process Modeling
  • Award number:
    2152999
  • Fiscal year:
    2022
  • Funding amount:
    $528,700
  • Project type:
    Standard Grant
Elements: CRISPS: Cell-Centric Recursive Image Similarity Projection Searching
  • Award number:
    2209135
  • Fiscal year:
    2022
  • Funding amount:
    $528,700
  • Project type:
    Standard Grant