CSR: NeTS: Small: In-Network Resource Management for Rack-Scale Computers
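Basic Information
- Award number: 1813487
- Principal investigator:
- Amount: $500,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-10-01 to 2022-10-31
- Status: Completed
- Source:
- Keywords:
Project Abstract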
Datacenters are critical infrastructure supporting modern Internet services such as search, social networking, and e-commerce. Rack-scale computers are emerging to fundamentally change how datacenters are designed, built, and managed. Rack-scale computers disaggregate the resources in each rack of servers into separate pools and organize them at the rack level. Such resource disaggregation can enable fine-grained resource allocation and increase resource utilization. Resource management is essential for rack-scale computers to fully realize these benefits. Yet the densely packed resources and the rise of millisecond-scale and microsecond-scale tasks place unprecedented throughput and latency requirements on the resource manager. Today's server-based solutions fall short of meeting these requirements. This project investigates a new architecture that leverages the power and flexibility of new-generation programmable switches for resource management in rack-scale computers. This project explores the boundary of in-network computing. While networks are traditionally designed for packet forwarding, this project exploits the capability of new-generation programmable switches to realize application-level functionality that goes beyond traditional packet processing. This project uses in-network resource management to exemplify how networks and systems can be deeply integrated and co-designed for next-generation rack-scale computers. This project will not only improve resource management for rack-scale computers in practice, but also provide new architectural and theoretical insights into computer system designs and principles. While the new architecture directly benefits from switch hardware for high performance, the core challenge is to realize generic resource management policies with limited switch functionality and resources.
To address this challenge, this project will exploit compact data structures to efficiently store resource consumption and utilization information with minimal switch resources, and leverage randomized and approximation algorithms to design lightweight mechanisms for the switch data plane to make near-optimal resource management decisions. A prototype system will be implemented with commodity servers and switches and evaluated with microbenchmarks and end-to-end system experiments under a wide range of workloads. This project will provide extensive research, training, and educational opportunities for both undergraduate and graduate students, and actively engage women and minorities, with research projects and new course materials. This project will release open-source software for other researchers to leverage and reproduce the results and for industry to adopt the solutions in practice. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
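As a rough illustration of the approach the abstract describes, the sketch below pairs one compact data structure with one randomized decision rule. The abstract does not name specific structures or algorithms, so both choices here are assumptions: a Count-Min sketch stands in for "compact data structures", and power-of-two-choices stands in for "randomized algorithms". The names `CountMinSketch`, `pick_server`, and `simulate` are hypothetical, and the code runs as ordinary Python rather than on a switch data plane.

```python
import hashlib
import random


class CountMinSketch:
    """Fixed-size counter grid; with non-negative updates it never underestimates a key's total."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Derive one hash per row by salting the key with the row number.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, amount=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += amount

    def estimate(self, key):
        # The smallest counter across rows is the least polluted by collisions.
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))


def pick_server(sketch, servers, rng):
    """Power-of-two-choices: sample two distinct servers, keep the one with lower estimated load."""
    a, b = rng.sample(servers, 2)
    return a if sketch.estimate(a) <= sketch.estimate(b) else b


def simulate(num_servers=8, num_tasks=1000, seed=0):
    """Assign tasks one by one, tracking load only through the sketch (hypothetical workload)."""
    rng = random.Random(seed)
    servers = [f"server-{i}" for i in range(num_servers)]
    sketch = CountMinSketch()
    true_load = {s: 0 for s in servers}
    for _ in range(num_tasks):
        target = pick_server(sketch, servers, rng)
        sketch.add(target)
        true_load[target] += 1
    return sketch, true_load
```

Calling `simulate()` returns the sketch together with exact per-server counts, so the estimates can be compared against ground truth. The sketch only ever increments; a real resource manager would also need to account for task completion, which this illustration leaves out.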
Project Outcomes
Journal papers (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Pegasus: Tolerating Skewed Workloads in Distributed Storage with In-Network Coherence Directories
- DOI:
- Published: 2020-10
- Journal:
- Impact factor: 0
- Authors: Jialin Li; J. Nelson; Ellis Michael; Xin Jin; Dan R. K. Ports
- Corresponding authors: Jialin Li; J. Nelson; Ellis Michael; Xin Jin; Dan R. K. Ports
Harmonia: Near-Linear Scalability for Replicated Storage with In-Network Conflict Detection
- DOI: 10.14778/3368289.3368301
- Published: 2019-04
- Journal:
- Impact factor: 0
- Authors: Hang Zhu; Zhihao Bai; Jialin Li; Ellis Michael; Dan R. K. Ports; I. Stoica; Xin Jin
- Corresponding authors: Hang Zhu; Zhihao Bai; Jialin Li; Ellis Michael; Dan R. K. Ports; I. Stoica; Xin Jin
NetLock: Fast, Centralized Lock Management Using Programmable Switches
- DOI: 10.1145/3387514.3405857
- Published: 2020-07
- Journal:
- Impact factor: 0
- Authors: Zhuolong Yu; Yiwen Zhang; V. Braverman; Mosharaf Chowdhury; Xin Jin
- Corresponding authors: Zhuolong Yu; Yiwen Zhang; V. Braverman; Mosharaf Chowdhury; Xin Jin
Is Network the Bottleneck of Distributed Training?
- DOI: 10.1145/3405671.3405810
- Published: 2020-06
- Journal:
- Impact factor: 0
- Authors: Zhen Zhang; Chaokun Chang; Haibin Lin; Yida Wang; R. Arora; Xin Jin
- Corresponding authors: Zhen Zhang; Chaokun Chang; Haibin Lin; Yida Wang; R. Arora; Xin Jin
Multitenancy for Fast and Programmable Networks in the Cloud
- DOI:
- Published: 2020-06
- Journal:
- Impact factor: 0
- Authors: Tao Wang; Hang Zhu; Fabian Ruffy; Xin Jin; Anirudh Sivaraman; Dan R. K. Ports; Aurojit Panda
- Corresponding authors: Tao Wang; Hang Zhu; Fabian Ruffy; Xin Jin; Anirudh Sivaraman; Dan R. K. Ports; Aurojit Panda
Other publications by Vladimir Braverman
Metric k-median clustering in insertion-only streams
- DOI: 10.1016/j.dam.2021.07.025
- Published: 2021-12-15
- Journal:
- Impact factor:
- Authors: Vladimir Braverman; Harry Lang; Keith Levin; Yevgeniy Rudoy
- Corresponding author: Yevgeniy Rudoy
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
- DOI: 10.48550/arxiv.2310.08391
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Jingfeng Wu; Difan Zou; Zixiang Chen; Vladimir Braverman; Quanquan Gu; Peter L. Bartlett
- Corresponding author: Peter L. Bartlett
Private Data Stream Analysis for Universal Symmetric Norm Estimation
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Vladimir Braverman; Joel Manning; Zhiwei Steven Wu; Samson Zhou
- Corresponding author: Samson Zhou
Other grants held by Vladimir Braverman
Collaborative Research: CNS: Medium: Scalable Learning from Distributed Data for Wireless Network Management
- Award number: 2333887
- Fiscal year: 2022
- Amount: $500,000
- Project type: Continuing Grant
CSR: NeTS: Small: In-Network Resource Management for Rack-Scale Computers
- Award number: 2244870
- Fiscal year: 2022
- Amount: $500,000
- Project type: Standard Grant
CAREER: New Methods for Central Streaming Problems
- Award number: 2244899
- Fiscal year: 2022
- Amount: $500,000
- Project type: Continuing Grant
Collaborative Research: CNS: Medium: Scalable Learning from Distributed Data for Wireless Network Management
- Award number: 2107239
- Fiscal year: 2021
- Amount: $500,000
- Project type: Continuing Grant
CAREER: New Methods for Central Streaming Problems
- Award number: 1652257
- Fiscal year: 2017
- Amount: $500,000
- Project type: Continuing Grant
EAGER: Universal Sketches for Network Monitoring
- Award number: 1650041
- Fiscal year: 2016
- Amount: $500,000
- Project type: Standard Grant
BIGDATA: F: DKA: Collaborative Research: Clustering Algorithms for Data Streams
- Award number: 1447639
- Fiscal year: 2014
- Amount: $500,000
- Project type: Standard Grant
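Similar NSFC (National Natural Science Foundation of China) grants
Mechanism by which the Huayu Tongluo (stasis-resolving, collateral-unblocking) method regulates neurological recovery after cerebral ischemia through SATB1/JUNB-mediated "amino acid metabolic network-microglial polarization"
- Award number: 82374172
- Approval year: 2023
- Amount: CNY 490,000
- Project type: General Program
Screening and mechanism of action of small-molecule drugs that protect against glaucomatous visual damage by regulating endoplasmic reticulum proteostasis
- Award number: 82373849
- Approval year: 2023
- Amount: CNY 490,000
- Project type: General Program
Small-molecule fluorescent probes for endoplasmic reticulum-mitochondria interactions and their application in drug evaluation
- Award number: 22367022
- Approval year: 2023
- Amount: CNY 320,000
- Project type: Regional Science Fund Program
Design, synthesis, and radioprotective effects of endoplasmic reticulum-targeted seleno small-molecule derivatives
- Award number: 82304074
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Mechanism by which microglial extracellular traps promote inflammatory demyelination in multiple sclerosis by upregulating the cGAS-STING pathway to induce Th1 cell differentiation
- Award number: 82301530
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund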
Similar overseas grants
NeTS: Small: ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates
- Award number: 2331111
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
NeTS: Small: NSF-DST: Modernizing Underground Mining Operations with Millimeter-Wave Imaging and Networking
- Award number: 2342833
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
Collaborative Research: NeTS: Small: A Privacy-Aware Human-Centered QoE Assessment Framework for Immersive Videos
- Award number: 2343619
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
NSF-AoF: NeTS: Small: Local 6G Connectivity: Controlled, Resilient, and Secure (6G-ConCoRSe)
- Award number: 2326599
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant
NeTS: Small: Revisiting Network Algorithmics using the CRAM Model
- Award number: 2333587
- Fiscal year: 2024
- Amount: $500,000
- Project type: Standard Grant