PPoSS: Planning: Cross-Layer Design for Cost-Effective HPC in the Cloud
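Basic Information
- Award number: 2028929
- Principal investigator:
- Amount: $250,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-10-01 to 2022-09-30
- Project status: Completed
- Source:
- Keywords:
Project Abstract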
Many high-performance computing (HPC) applications of national importance (e.g., nuclear simulations, climate modeling, drug discovery, epidemiology, and finance) process enormous datasets and have significant resource demands and strict performance/accuracy/power constraints. Ever-changing hardware elements (e.g., emerging new compute elements) and systems software (continuous fixes to operating systems, compilers and runtime systems) make hosting such HPC applications in locally-managed compute platforms increasingly less attractive. A promising alternate approach is to host these applications in the cloud. However, making legacy HPC applications cloud-ready and identifying the best blend of cloud services for a given application are significant challenges that need to be addressed. In this project, a holistic, cross-layer approach is taken to address the problem of securely mounting such HPC applications in the cloud with high efficiency, low cost, and good performance. A key distinguishing aspect of this project is that it combines both compile-time and run-time innovations and makes contributions to both client and cloud-provider sides. This project spans the following five complementary thrusts, all of which are made challenging by the increasing complexity and scale of the HPC applications of interest, and by the complexity of cloud service offerings and application service-level objectives: (i) characterizing HPC application behavior on myriad cloud infrastructural options; (ii) compiler support for HPC application cloudization; (iii) novel programming language support -- Object-as-a-Service (OaaS); (iv) workload placement and scheduling support; and (v) systems software support for PaaS/SaaS on heterogeneous hardware. The ultimate goal of this project is to devise systematic methodologies for mapping HPC applications to different types of services (spanning IaaS, SaaS, FaaS, OaaS) in multi/hybrid-cloud. 
This research facilitates improvements in the costs of running HPC applications. This project also enables easy transitioning of HPC applications from one cloud to another and provides data for cloud architecture designers to tune their systems better for current and future HPC workloads. In addition to its technical contributions, this project involves various educational and outreach activities as well. In particular, a new graduate curriculum for cloud computing focusing on HPC applications is created and freely disseminated. Finally, the code being developed and experimental results collected are documented and open-sourced. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
SHOWAR: Right-Sizing And Efficient Scheduling of Microservices
- DOI: 10.1145/3472883.3486999
- Publication date: 2021-11
- Journal:
- Impact factor: 0
- Authors: A. F. Baarzi; G. Kesidis
- Corresponding authors: A. F. Baarzi; G. Kesidis
Splice: An Automated Framework for Cost-and Performance-Aware Blending of Cloud Services
- DOI: 10.1109/ccgrid54584.2022.00021
- Publication date: 2022-05
- Journal:
- Impact factor: 0
- Authors: Myungjun Son; S. Mohanty; Jashwant Raj Gunasekaran; Aman Jain; M. Kandemir; G. Kesidis; B. Urgaonkar
- Corresponding authors: Myungjun Son; S. Mohanty; Jashwant Raj Gunasekaran; Aman Jain; M. Kandemir; G. Kesidis; B. Urgaonkar
Other publications by Mahmut Kandemir
Particle simulation on the Cell BE architecture
- DOI: 10.1007/s10586-011-0169-4
- Publication date: 2011-07-27
- Journal:
- Impact factor: 4.100
- Authors: Betul Demiroz; Haluk R. Topcuoglu; Mahmut Kandemir; Oguz Tosun
- Corresponding author: Oguz Tosun
A case for core-assisted bottleneck acceleration in GPUs
- DOI:
- Publication date: 2015
- Journal:
- Impact factor: 0
- Authors: Nandita Vijaykumar; Gennady Pekhimenko; Adwait Jog; A. Bhowmick; Rachata Ausavarungnirun; Chita R. Das; Mahmut Kandemir; T. Mowry; O. Mutlu
- Corresponding author: O. Mutlu
Optimizing Leakage Energy Consumption in Cache Bitlines
- DOI: 10.1007/s10617-005-5345-4
- Publication date: 2004-03-01
- Journal:
- Impact factor: 0.900
- Authors: Soontae Kim; Narayanan Vijaykrishnan; Mahmut Kandemir; Mary Jane Irwin
- Corresponding author: Mary Jane Irwin
Time-constrained optimization of multi-AUV cooperative mine detection
- DOI: 10.1109/oceans.2008.5151971
- Publication date: 2008
- Journal:
- Impact factor: 0
- Authors: R. Prins; Mahmut Kandemir
- Corresponding author: Mahmut Kandemir
An I/O-Conscious Tiling Strategy for Disk-Resident Data Sets
- DOI: 10.1023/a:1014156327748
- Publication date: 2002-01-01
- Journal:
- Impact factor: 2.700
- Authors: Mahmut Kandemir; Alok Choudhary; J. Ramanujam
- Corresponding author: J. Ramanujam
Other grants by Mahmut Kandemir
Collaborative Research: CNS Core: Small: Resource-efficient, Strongly Consistent Replication for the Cloud
- Award number: 2149389
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
SaTC: CORE: Small: Automatic Software Patching against Microarchitectural Attacks
- Award number: 1956032
- Fiscal year: 2020
- Amount: $250,000
- Project type: Standard Grant
SHF: Small: Characterizing and Optimizing 3D NAND Flash
- Award number: 1908793
- Fiscal year: 2019
- Amount: $250,000
- Project type: Standard Grant
Frameworks: Re-Engineering Galaxy for Performance, Scalability and Energy Efficiency
- Award number: 1931531
- Fiscal year: 2019
- Amount: $250,000
- Project type: Standard Grant
XPS: FULL: A Fresh Look at Near Data Computing: Coordinated Data and Computation Government
- Award number: 1629129
- Fiscal year: 2016
- Amount: $250,000
- Project type: Standard Grant
CSR: Medium: Collaborative Research: Enabling GPUs as First-Class Computing Engines
- Award number: 1409095
- Fiscal year: 2014
- Amount: $250,000
- Project type: Continuing Grant
XPS: FULL:CCA: Extracting Scalable Parallelism by Relaxing the Contracts across the System Stack
- Award number: 1439021
- Fiscal year: 2014
- Amount: $250,000
- Project type: Standard Grant
SHF: Medium: Breaking the Physical Divide between Computation and NAND-Flash Storage
- Award number: 1302557
- Fiscal year: 2013
- Amount: $250,000
- Project type: Continuing Grant
SHF: Medium: Automatic Control Driven Resource Management in Chip Multiprocessors
- Award number: 0963839
- Fiscal year: 2010
- Amount: $250,000
- Project type: Continuing Grant
Collaborative Research: Adaptive Techniques for Achieving End-to-End QoS in the I/O Stack on Petascale Multiprocessors
- Award number: 0937949
- Fiscal year: 2009
- Amount: $250,000
- Project type: Standard Grant
Similar overseas grants
Cross-border energy planning for just transition in Southeast Asia
- Award number: 24K20970
- Fiscal year: 2024
- Amount: $250,000
- Project type: Grant-in-Aid for Early-Career Scientists
INTERoperable cloud-based solution for cross-vector planning and management of Positive Energy Districts
- Award number: 10098586
- Fiscal year: 2024
- Amount: $250,000
- Project type: EU-Funded
Changing primary care capacity in Canada (4C): A cross-provincial mixed methods study to inform workforce planning
- Award number: 488915
- Fiscal year: 2023
- Amount: $250,000
- Project type: Operating Grants
Collaborative Research: PPoSS: Planning: Cross-layer Coordination and Optimization for Scalable and Sparse Tensor Networks (CROSS)
- Award number: 2217028
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
Collaborative Research: PPoSS: Planning: Cross-layer Coordination and Optimization for Scalable and Sparse Tensor Networks (CROSS)
- Award number: 2217086
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
Collaborative Research: PPoSS: Planning: Cross-layer Coordination and Optimization for Scalable and Sparse Tensor Networks (CROSS)
- Award number: 2247309
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
Collaborative Research: PPoSS: Planning: Cross-layer Coordination and Optimization for Scalable and Sparse Tensor Networks (CROSS)
- Award number: 2217010
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
Collaborative Research: PPoSS: Planning: Cross-layer Coordination and Optimization for Scalable and Sparse Tensor Networks (CROSS)
- Award number: 2217020
- Fiscal year: 2022
- Amount: $250,000
- Project type: Standard Grant
Sustainable cross-border energy planning by using participatory multi-criteria evaluation in the Thai-Laos electricity system
- Award number: 21K17923
- Fiscal year: 2021
- Amount: $250,000
- Project type: Grant-in-Aid for Early-Career Scientists
PPoSS: Planning: A Cross-Layer Approach to Accelerate Large-Scale Graph Computations on Distributed Platforms
- Award number: 2028861
- Fiscal year: 2020
- Amount: $250,000
- Project type: Standard Grant