SI2-SSE: Collaborative Research: High Performance Low Rank Approximation for Scalable Data Analytics
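Basic information
- Award number: 1642385
- Principal investigator:
- Amount: $167,700
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-11-01 to 2021-10-31
- Project status: Completed
- Source:
- Keywords:
Project abstract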
Big Data analytics is at the core of discovery in areas as varied as medical informatics, business analytics, national security, and materials science. This project aims to model some of the key data analytics problems and to design, verify, and deploy scalable methods for knowledge extraction. The algorithms developed will be able to handle data sets of extreme sizes and will be deployable on advanced computer hardware. The goal is to realize orders-of-magnitude improvements over existing data analytics technologies by developing algorithms that are robust to incompleteness, noise, ambiguity, and high dimension in the data. A particular focus will be on parallel and distributed algorithms that can efficiently solve large problems and produce accurate solutions. The proposed research and software development will allow domain experts to tackle Big Data sets requiring large parallel systems. The improved performance will enable fast and scalable data analysis across applications, from social network analysis for studying citizens' attitudes toward sustainability-related issues to computational marketing techniques that refine customers' shopping experiences. The proposed work will help bridge the gap between the computational science and data analytics ecosystems, two fields that stand to make great advancements from cross-fertilization. The education and outreach plan includes graduate course creation, engagement of under-represented groups via both undergraduate and graduate research experiences, and community-building efforts through the organization of workshops and mini-symposia.
With the advent of internet-scale data, the data mining and machine learning community has adopted Nonnegative Matrix Factorization (NMF) for performing numerous tasks such as topic modeling, background separation from video data, hyper-spectral imaging, web-scale clustering, and community detection. The goals of this proposal are to develop efficient parallel algorithms for computing nonnegative matrix and tensor factorizations (NMF and NTF) and their variants using a unified framework, and to produce a software package called Parallel Low-rank Approximation with Nonnegative Constraints (PLANCK) that delivers the high performance, flexibility, and scalability necessary to tackle the ever-growing size of today's data sets. The algorithms will be generalized to NTF problems, extending the class of algorithms we can efficiently parallelize; our software framework will allow end-users to use and extend our techniques. Rather than developing separate software for each problem domain and mathematical technique, flexibility will be achieved by characterizing nearly all of the current NMF and NTF algorithms in the context of a block coordinate descent framework. Using this framework, the shared computational kernels, which usually extend run times, can be separated from the algorithm-specific computations. Finally, the usability and practicality of the proposed software will be maintained by being application driven, by establishing collaborations with early end-users, and by incrementally generalizing the framework in terms of both algorithms and problems.
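The separation the abstract describes can be illustrated with a small serial sketch. It is not PLANCK's API: the function names (`nmf`, `update_mu`, `update_hals`), the toy matrix, and the pure-Python list-of-rows representation are all assumptions made for illustration. The driver computes the kernels shared by many NMF algorithms (the Gram matrix and the matrix product against the data) and hands them to an interchangeable update rule; two well-known rules, multiplicative updates and HALS, are plugged in.

```python
import random

def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def update_mu(G, P, H, eps=1e-12):
    """Algorithm-specific step: multiplicative update H <- H * P / (G H)."""
    GH = matmul(G, H)
    return [[H[i][j] * P[i][j] / (GH[i][j] + eps) for j in range(len(H[0]))]
            for i in range(len(H))]

def update_hals(G, P, H, eps=1e-12):
    """Algorithm-specific step: HALS, one row of H at a time, clipped at zero."""
    H = [row[:] for row in H]
    k, n = len(H), len(H[0])
    for i in range(k):
        for j in range(n):
            gh = sum(G[i][l] * H[l][j] for l in range(k))
            H[i][j] = max(0.0, H[i][j] + (P[i][j] - gh) / (G[i][i] + eps))
    return H

def nmf(A, k, update, iters=200, seed=0):
    """Alternate between the two factor blocks so that A is approximated by W H."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() for _ in range(k)] for _ in range(m)]
    H = [[rng.random() for _ in range(n)] for _ in range(k)]
    At = transpose(A)
    for _ in range(iters):
        # Shared kernels for the H block: Gram matrix W^T W and product W^T A.
        Wt = transpose(W)
        H = update(matmul(Wt, W), matmul(Wt, A), H)
        # Shared kernels for the W block, via the transposed problem:
        # Gram matrix H H^T and product H A^T.
        Ht = transpose(H)
        W = transpose(update(matmul(H, Ht), matmul(H, At), transpose(W)))
    return W, H

def rel_error(A, W, H):
    """Relative Frobenius-norm error of the approximation W H."""
    WH = matmul(W, H)
    num = sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb))
    den = sum(a * a for ra in A for a in ra)
    return (num / den) ** 0.5

# Toy nonnegative matrix of rank 2 (illustrative data, not from the project).
A = [[1.0, 2.0, 0.0, 1.0, 3.0],
     [0.0, 1.0, 2.0, 1.0, 1.0],
     [1.0, 3.0, 2.0, 2.0, 4.0],
     [2.0, 5.0, 2.0, 3.0, 7.0]]

W_mu, H_mu = nmf(A, 2, update_mu)
W_hals, H_hals = nmf(A, 2, update_hals)
err_mu = rel_error(A, W_mu, H_mu)
err_hals = rel_error(A, W_hals, H_hals)
print("MU relative error:  ", err_mu)
print("HALS relative error:", err_hals)
```

In this sketch only `matmul` touches the full data matrix, so in a distributed setting that is the part that would need parallel treatment, while the update rules work on small k-by-k and k-by-n quantities and can be swapped without changing the driver.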
Project outcomes
Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization
- DOI: 10.1109/tkde.2017.2767592
- Publication date: 2018-03-01
- Journal:
- Impact factor: 8.9
- Authors: Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun
- Corresponding author: Park, Haesun
Shared-memory parallelization of MTTKRP for dense tensors
- DOI: 10.1145/3178487.3178522
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Hayashi, Koby; Ballard, Grey; Jiang, Yujie; Tobia, Michael J.
- Corresponding author: Tobia, Michael J.
Communication Lower Bounds for Matricized Tensor Times Khatri-Rao Product
- DOI: 10.1109/ipdps.2018.00065
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Ballard, Grey; Knight, Nicholas; Rouse, Kathryn
- Corresponding author: Rouse, Kathryn
Parallel Hierarchical Clustering using Rank-Two Nonnegative Matrix Factorization
- DOI: 10.1109/hipc50609.2020.00028
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: Manning, Lawton; Ballard, Grey; Kannan, Ramakrishnan; Park, Haesun
- Corresponding author: Park, Haesun
Parallel Nonnegative CP Decomposition of Dense Tensors
- DOI: 10.1109/hipc.2018.00012
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Ballard, Grey; Hayashi, Koby; Kannan, Ramakrishnan
- Corresponding author: Kannan, Ramakrishnan
Other publications by Grey Ballard
Communication Lower Bounds and Optimal Algorithms for Multiple Tensor-Times-Matrix Computation
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 1.5
- Authors: Hussam Al Daas; Grey Ballard; L. Grigori; Suraj Kumar; Kathryn Rouse
- Corresponding author: Kathryn Rouse
Avoiding Communication in Successive Band Reduction
- DOI: 10.1145/2686877
- Publication date: 2015
- Journal:
- Impact factor: 0
- Authors: Grey Ballard; J. Demmel; Nicholas Knight
- Corresponding author: Nicholas Knight
Communication-Avoiding Parallel Strassen: Implementation and performance
- DOI: 10.1109/sc.2012.33
- Publication date: 2012
- Journal:
- Impact factor: 0
- Authors: Benjamin Lipshitz; Grey Ballard; J. Demmel; O. Schwartz
- Corresponding author: O. Schwartz
GentenMPI: Distributed Memory Sparse Tensor Decomposition.
- DOI: 10.2172/1656940
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: K. Devine; Grey Ballard
- Corresponding author: Grey Ballard
A 3D Parallel Algorithm for QR Decomposition
- DOI: 10.1145/3210377.3210415
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Grey Ballard; J. Demmel; L. Grigori; M. Jacquelin; Nicholas Knight
- Corresponding author: Nicholas Knight
Other grants held by Grey Ballard
Collaborative Research: OAC Core: Robust, Scalable, and Practical Low-Rank Approximation
- Award number: 2106920
- Fiscal year: 2021
- Amount: $167,700
- Project type: Standard Grant
CAREER: Communication-Avoiding Tensor Decomposition Algorithms
- Award number: 1942892
- Fiscal year: 2020
- Amount: $167,700
- Project type: Continuing Grant
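Similar grants from the National Natural Science Foundation of China
Mechanism by which the Streptococcus pyogenes secreted esterase Sse inhibits LC3-associated phagocytosis to promote invasion
- Award number:
- Approval year: 2022
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Structural simulation of the Cu2ZnSn(SSe)4/CdS interfacial transition layer in solar cells and elimination of its defect states
- Award number:
- Approval year: 2022
- Amount: CNY 550,000
- Project type: General Program
First-principles study of doping to achieve stable weak n-type character in the surface layer of Cu2ZnSn(SSe)4 absorbers
- Award number: 12004100
- Approval year: 2020
- Amount: CNY 240,000
- Project type: Young Scientists Fund
Research on an SSE-based evaluation index system for information security assurance in aviation information systems
- Award number: 60776808
- Approval year: 2007
- Amount: CNY 190,000
- Project type: Joint Funds Program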
Similar overseas grants
Collaborative Research: SI2-SSE: WRENCH: A Simulation Workbench for Scientific Workflow Users, Developers, and Researchers
- Award number: 1642369
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
SI2-SSE: Collaborative Research: Integrated Tools for DNA Nanostructure Design and Simulation
- Award number: 1740212
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
Collaborative Research: NSCI: SI2-SSE: Time Stepping and Exchange-Correlation Modules for Massively Parallel Real-Time Time-Dependent DFT
- Award number: 1740219
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
SI2-SSE: Collaborative Research: Integrated Tools for DNA Nanostructure Design and Simulation
- Award number: 1740282
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
Collaborative Research: SI2-SSE: An open source multi-physics platform to advance fundamental understanding of plasma physics and enable impactful application of plasma systems
- Award number: 1740300
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
SI2-SSE: Collaborative Research: Software Framework for Strongly Correlated Materials: from DFT to DMFT
- Award number: 1740112
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
SI2-SSE: Collaborative Research: A Sustainable Future for the Glue Multi-Dimensional Linked Data Visualization Package
- Award number: 1740229
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
SI2-SSE: Collaborative Research: Software Framework for Strongly Correlated Materials: from DFT to DMFT
- Award number: 1740111
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
Collaborative Proposal: SI2-SSE: An open source multi-physics platform to advance fundamental understanding of plasma physics and enable impactful application of plasma systems
- Award number: 1740310
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant
Collaborative Research: SI2-SSE: WRENCH: A Simulation Workbench for Scientific Workflow Users, Developers, and Researchers
- Award number: 1642335
- Fiscal year: 2017
- Amount: $167,700
- Project type: Standard Grant