Collaborative Research: CNS Core: Small: HARMONIA: New Methods for Colocating Multiple QoS-Sensitive Jobs
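Basic Information
- Award number: 2124908
- Principal investigator:
- Amount: $218,900
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-10-01 to 2025-03-31
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract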
Data centers and high performance computing (HPC) systems are the backbone of modern computing, supporting services that range from Web search and email to video streaming, file sharing, social media platforms, and scientific computing. Contemporary data center job schedulers employ conservative resource sharing strategies among applications co-running on the same physical server. These strategies are conservative to ensure that the tight latency requirements of latency-critical applications are met; however, this conservatism leads to significant underutilization of expensive computing resources, which increases both capital and operational expenses. HARMONIA proposes a family of unconventional resource management strategies that build on the principles of Bayesian Optimization (BO), introduce innovations to BO itself, and demonstrate its usefulness for data center resource management. HARMONIA will capture the impact of resource allocation on application performance using BO-based learning models, then partition the shared resources and adjust hardware/software knobs accordingly to maximize both the performance of individual applications and system utilization. To achieve practicality and scalability, HARMONIA employs a pool of lightweight, approximately accurate online learning models instead of a single heavyweight, fully accurate model. The HARMONIA runtime framework places incoming applications and co-locates them with existing applications in a dynamic, efficient, and non-intrusive manner. Outcomes of this project will influence the operations of modern data centers, which serve a variety of workloads including short-running latency-critical applications (e.g., machine learning inference, web search queries, microservices) and long-running throughput-oriented workloads (e.g., scientific simulations).
Improving the utilization of large-scale data centers and HPC systems will lead to cost savings and a lower carbon footprint. Planned educational and outreach activities for HARMONIA include enhancing graduate coursework and introducing a new monthly podcast on "concepts in computer systems" to better engage and prepare high school students. All developed tools, software artifacts, and measured datasets will be made available to the research community to further enhance the project outcomes and their impact. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal papers (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
SLO-aware Virtual Rebalancing for Edge Stream Processing
- DOI: 10.1109/ic2e52221.2021.00027
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Kang, Peng; Lama, Palden; Khan, Samee U.
- Corresponding author: Khan, Samee U.
IceBreaker: warming serverless functions better with heterogeneity
- DOI: 10.1145/3503222.3507750
- Published: 2022-02
- Journal:
- Impact factor: 0
- Authors: Rohan Basu Roy; Tirthak Patel; Devesh Tiwari
- Corresponding authors: Rohan Basu Roy; Tirthak Patel; Devesh Tiwari
Other Grants by Samee Khan
REU Site: Intelligent Edge Computing Systems
- Award number: 2348711
- Fiscal year: 2024
- Amount: $218,900
- Award type: Standard Grant
FET: Medium: A Quantum Computing Based Approach to Undirected Generative Machine Learning Models
- Award number: 2211841
- Fiscal year: 2022
- Amount: $218,900
- Award type: Continuing Grant
Workshop on Quantum Computing, Information, Science, and Engineering
- Award number: 2202377
- Fiscal year: 2022
- Amount: $218,900
- Award type: Standard Grant
Travel: NSF Student Travel Grant for 2022 IEEE Cloud Summit
- Award number: 2243579
- Fiscal year: 2022
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: PPoSS: Planning: Software Stack for Scalable Heterogeneous NISQ Cluster
- Award number: 2216898
- Fiscal year: 2022
- Amount: $218,900
- Award type: Standard Grant
EAGER: From Theory to Practice of Elastic Interval Runtime Schedulers
- Award number: 2135439
- Fiscal year: 2021
- Amount: $218,900
- Award type: Standard Grant
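Similar NSFC Grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Award type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research (细胞研究)
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Award type: General Program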
Similar Overseas Grants
Collaborative Research: CNS Core: Medium: Reconfigurable Kernel Datapaths with Adaptive Optimizations
- Award number: 2345339
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)
- Award number: 2230945
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: NSF-AoF: CNS Core: Small: Towards Scalable and AI-based Solutions for Beyond-5G Radio Access Networks
- Award number: 2225578
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Medium: Movement of Computation and Data in Splitkernel-disaggregated, Data-intensive Systems
- Award number: 2406598
- Fiscal year: 2023
- Amount: $218,900
- Award type: Continuing Grant
Collaborative Research: CNS Core: Small: SmartSight: an AI-Based Computing Platform to Assist Blind and Visually Impaired People
- Award number: 2418188
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: Creating An Extensible Internet Through Interposition
- Award number: 2242503
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: Adaptive Smart Surfaces for Wireless Channel Morphing to Enable Full Multiplexing and Multi-user Gains
- Award number: 2343959
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: Efficient Ways to Enlarge Practical DNA Storage Capacity by Integrating Bio-Computer Technologies
- Award number: 2343863
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CNS Core: Small: A Compilation System for Mapping Deep Learning Models to Tensorized Instructions (DELITE)
- Award number: 2341378
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant
Collaborative Research: CISE-MSI: RCBP-RF: CNS: ESD4CDaT - Efficient System Design for Cancer Detection and Treatment
- Award number: 2318573
- Fiscal year: 2023
- Amount: $218,900
- Award type: Standard Grant