III: Medium: High-Performance Factorization Tools for Constrained and Hidden Tensor Models
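Basic information
- Award number: 1704074
- Principal investigator:
- Amount: $1.2 million
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-09-01 to 2023-08-31
- Status: Completed
- Source:
- Keywords:
Project abstract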
Tensors generalize matrices to higher dimensions (called modes) and are designed to model multi-way data. Tensor factorization algorithms analyze such multi-way data to uncover relations between the different modes that can be used both to gain insights and to predict unknown aspects of the underlying system or process. For example, medical diagnosis and treatment records can be modeled via a four-mode tensor whose modes correspond to patients, physicians, diagnoses, and treatments; its factorization can provide insights into the co-occurrence of medical conditions, treatment approaches, and any treatment differences based on the physician, and can identify potential instances of medical fraud. This project's research is designed to address current limitations of tensor analysis by developing new theory and algorithms, high-performance scalable parallel formulations of the various computational kernels used by these algorithms, and a flexible open source software toolkit that can be used to perform constrained and hidden tensor factorization of very large and sparse multi-way datasets. The success of this project will allow researchers to leverage the power of multi-way "Big Data" analysis to solve various problems in diverse application domains such as healthcare, medical imaging, cybersecurity, social and behavioral sciences, and e-commerce. At the same time, the project will provide data science training to the students involved by combining cutting-edge data and signal analytics, data mining, and high-performance computing.
Constrained matrix and tensor factorization techniques are widely used for dimensionality reduction, clustering, and estimation in machine learning, signal processing, and many other areas of science and engineering. Unconstrained matrix and tensor factorization algorithms are relatively mature, but their constrained counterparts lag in speed, scalability, and flexibility. In many applications (e.g., medical imaging and recommender systems), instead of observing the actual entries of a tensor, we observe a limited number of linear combinations (e.g., partial sums) of these entries and need to identify the tensor's latent factors from these measurements. Being able to identify the latent factors directly from linear measurements, which we refer to as hidden tensor factorization, has important advantages in terms of complexity, memory footprint, and the ability to handle very large data sets. Developing open source high-performance parallel tools for constrained and hidden tensor factorization on both shared- and distributed-memory systems will significantly enhance the ability to analyze very large multi-way data. The research will evolve along two synergistic thrusts. First, it will develop new theory and algorithms for constrained and hidden tensor factorization by (i) building fast first-order (FFO) and fast stochastic first-order (FSFO) constrained tensor decomposition algorithms that strike favorable trade-offs between simplicity, scalability, and speed of convergence, and (ii) tackling important identifiability and algorithmic issues related to hidden tensor factorization. Second, it will undertake a multi-pronged effort to develop high-performance parallel formulations of the computational kernels used in constrained and unconstrained tensor factorization and in hidden tensor factorization, and to develop a high-performance tensor factorization software toolbox. The release of this toolbox will enable researchers and practitioners to scale up not only the size of the data but also the variety of constraints and types of data they can analyze. The research will involve students who will be trained in data science, combining cutting-edge signal and data analytics, data mining, and high-performance computing.
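To make the idea of a constrained factorization concrete, the following is a minimal sketch of a nonnegativity-constrained CP (canonical polyadic) decomposition of a small dense four-mode tensor, shaped like the patients × physicians × diagnoses × treatments example in the abstract. It is an illustration only and not the project's toolkit: the function names (`mttkrp`, `cp_reconstruct`, `nonneg_cp`, `rel_error`), the tensor sizes, and the synthetic data are all assumptions, and it uses plain multiplicative updates rather than the FFO/FSFO algorithms the abstract refers to.

```python
import numpy as np

def cp_reconstruct(factors):
    """Rebuild a dense tensor from CP factor matrices (one per mode)."""
    rank = factors[0].shape[1]
    kr = np.ones((1, rank))
    for F in factors:
        # Column-wise Khatri-Rao product; the later mode varies fastest.
        kr = (kr[:, None, :] * F[None, :, :]).reshape(-1, rank)
    return kr.sum(axis=1).reshape([F.shape[0] for F in factors])

def mttkrp(X, factors, mode):
    """Matricized tensor times Khatri-Rao product for one mode (dense X)."""
    rank = factors[0].shape[1]
    # Unfold X along `mode`; the remaining modes keep their original order.
    Xn = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
    kr = np.ones((1, rank))
    for m, F in enumerate(factors):
        if m != mode:
            kr = (kr[:, None, :] * F[None, :, :]).reshape(-1, rank)
    return Xn @ kr

def nonneg_cp(X, rank, n_iter=200, seed=0, eps=1e-12):
    """Nonnegative CP via multiplicative updates (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for n in range(X.ndim):
            # Hadamard product of the Gram matrices of all other factors.
            gram = np.ones((rank, rank))
            for m, F in enumerate(factors):
                if m != n:
                    gram *= F.T @ F
            numer = mttkrp(X, factors, n)
            denom = factors[n] @ gram + eps
            # Multiplying by a nonnegative ratio keeps every entry >= 0.
            factors[n] *= numer / denom
    return factors

def rel_error(X, factors):
    """Relative Frobenius-norm reconstruction error."""
    return np.linalg.norm(X - cp_reconstruct(factors)) / np.linalg.norm(X)

# Synthetic 4-mode tensor: patients x physicians x diagnoses x treatments.
rng = np.random.default_rng(42)
shape, rank = (6, 5, 4, 3), 2
truth = [rng.random((dim, rank)) for dim in shape]
X = cp_reconstruct(truth)

start = nonneg_cp(X, rank, n_iter=0)     # random initialization only
fitted = nonneg_cp(X, rank, n_iter=200)  # same initialization, then updates
err_before = rel_error(X, start)
err_after = rel_error(X, fitted)
print(f"relative error: {err_before:.3f} -> {err_after:.3f}")
```

The `mttkrp` step is the kind of computational kernel the abstract singles out for parallelization. This sketch forms the full Khatri-Rao product in memory, which is only workable for tiny dense tensors; large sparse data would need a sparse formulation of that kernel.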
Project outputs
Journal articles (34)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Multi-Set Low-Rank Factorizations With Shared and Unshared Components
- DOI: 10.1109/tsp.2020.3020408
- Published: 2020
- Journal:
- Impact factor: 5.4
- Authors: Sorensen, Mikael; Sidiropoulos, Nicholas D.
- Corresponding author: Sidiropoulos, Nicholas D.
Statistical Learning Using Hierarchical Modeling of Probability Tensors
- DOI: 10.1109/dsw.2019.8755580
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Amiridi, Magda; Kargas, Nikos; Sidiropoulos, Nicholas D.
- Corresponding author: Sidiropoulos, Nicholas D.
Hyperspectral Super-Resolution Via Coupled Tensor Factorization: Identifiability and Algorithms
- DOI: 10.1109/icassp.2018.8462525
- Published: 2018-04
- Journal:
- Impact factor: 0
- Authors: Charilaos I. Kanatsoulis; Xiao Fu; N. Sidiropoulos; Wing-Kin Ma
- Corresponding author: Charilaos I. Kanatsoulis; Xiao Fu; N. Sidiropoulos; Wing-Kin Ma
Prema: Principled Tensor Data Recovery From Multiple Aggregated Views
- DOI: 10.1109/jstsp.2021.3056918
- Published: 2021
- Journal:
- Impact factor: 7.5
- Authors: Almutairi, Faisal M.; Kanatsoulis, Charilaos I.; Sidiropoulos, Nicholas D.
- Corresponding author: Sidiropoulos, Nicholas D.
STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization
- DOI: 10.1609/aaai.v35i6.16615
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Nikos Kargas, Cheng Qian
- Corresponding author: Nikos Kargas, Cheng Qian
Other publications by George Karypis
A knowledge graph of clinical trials (CTKG)
- DOI: 10.1038/s41598-022-08454-z
- Published: 2022-03-18
- Journal:
- Impact factor: 3.900
- Authors: Ziqi Chen; Bo Peng; Vassilis N. Ioannidis; Mufei Li; George Karypis; Xia Ning
- Corresponding author: Xia Ning
Predicting the Performance of Randomized Parallel Search: An Application to Robot Motion Planning
- DOI: 10.1023/a:1026283627113
- Published: 2003-09-01
- Journal:
- Impact factor: 2.800
- Authors: Daniel J. Challou; Maria Gini; Vipin Kumar; George Karypis
- Corresponding author: George Karypis
Out-of-core coherent closed quasi-clique mining from large dense graph databases
- DOI: 10.1145/1242524.1242530
- Published: 2007-06
- Journal:
- Impact factor: 0
- Authors: Jianyong Wang; Zhiping Zeng; George Karypis; Lizhu Zhou
- Corresponding author: Lizhu Zhou
Grade prediction with models specific to students and courses
- DOI: 10.1007/s41060-016-0024-z
- Published: 2016-09-22
- Journal:
- Impact factor: 2.800
- Authors: Agoritsa Polyzou; George Karypis
- Corresponding author: George Karypis
Efficient identification of Tanimoto nearest neighbors
- DOI: 10.1007/s41060-017-0064-z
- Published: 2017-08-02
- Journal:
- Impact factor: 2.800
- Authors: David C. Anastasiu; George Karypis
- Corresponding author: George Karypis
Other grants by George Karypis
REU Site: Computational Methods for Discovery Driven by Big Data
- Award number: 1757916
- Fiscal year: 2018
- Amount: $1.2 million
- Award type: Standard Grant
PFI:AIR - TT: Automated Out-of-Core Execution of Parallel Message-Passing Applications
- Award number: 1414153
- Fiscal year: 2014
- Amount: $1.2 million
- Award type: Standard Grant
BIGDATA: IA: DKA: Collaborative Research: Learning Data Analytics: Providing Actionable Insights to Increase College Student Success
- Award number: 1447788
- Fiscal year: 2014
- Amount: $1.2 million
- Award type: Continuing Grant
SI2-SSE: Software Infrastructure For Partitioning Sparse Graphs on Existing and Emerging Computer Architectures
- Award number: 1048018
- Fiscal year: 2010
- Amount: $1.2 million
- Award type: Standard Grant
III: Medium: Collaborative Research: Computational Methods to Advance Chemical Genetics by Bridging Chemical and Biological Spaces
- Award number: 0905220
- Fiscal year: 2009
- Amount: $1.2 million
- Award type: Continuing Grant
SEI: Virtual Screening Algorithms for Bioactive Compounds Based on Frequent Substructures
- Award number: 0431135
- Fiscal year: 2004
- Amount: $1.2 million
- Award type: Standard Grant
ITR/NGS: Graph Partitioning Algorithms for Complex Problems & Architectures
- Award number: 0312828
- Fiscal year: 2003
- Amount: $1.2 million
- Award type: Standard Grant
CAREER: Scalable Algorithms for Knowledge Discovery in Scientific Data Sets
- Award number: 0133464
- Fiscal year: 2002
- Amount: $1.2 million
- Award type: Continuing Grant
CISE Research Instrumentation: Cluster Computing for Knowledge Discovery in Diverse Data Sets
- Award number: 9986042
- Fiscal year: 2000
- Amount: $1.2 million
- Award type: Standard Grant
Multi-Constraint, Multi-Objective Graph Partitioning
- Award number: 9972519
- Fiscal year: 1999
- Amount: $1.2 million
- Award type: Standard Grant
Similar overseas grants
Collaborative Research: SHF: Medium: Enabling Graphics Processing Unit Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402804
- Fiscal year: 2024
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402806
- Fiscal year: 2024
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: SHF: Medium: Enabling GPU Performance Simulation for Large-Scale Workloads with Lightweight Simulation Methods
- Award number: 2402805
- Fiscal year: 2024
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: SHF: Medium: High-Performance, Verified Accelerator Programming
- Award number: 2313024
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters
- Award number: 2312689
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Continuing Grant
Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters
- Award number: 2401244
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Continuing Grant
Collaborative Research: CPS: Medium: Co-Designed Control and Scheduling Adaptation for Assured Cyber-Physical System Safety and Performance
- Award number: 2229290
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: NeTS: Medium: An Integrated Multi-Time Scale Approach to High-Performance, Intelligent, and Secure O-RAN based NextG
- Award number: 2312447
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: SHF: Medium: A hardware-software co-design approach for high-performance in-memory analytic data processing
- Award number: 2312741
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant
Collaborative Research: SHF: Medium: High-Performance, Verified Accelerator Programming
- Award number: 2313023
- Fiscal year: 2023
- Amount: $1.2 million
- Award type: Standard Grant