PPoSS: Planning: CP2: Towards Systems Correctness Checkability and Performance Predictability at Scale
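Basic Information
- Award number: 2028427
- Principal investigator:
- Amount: $248,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-09-01 to 2022-08-31
- Status: Completed
- Source:
- Keywords: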
Project Abstract
As a critical backend for many of today's applications and services, large-scale distributed systems must be highly reliable. In the last couple of years the field has witnessed a phenomenal scale of deployment: Google is known to run clusters with thousands of machines each, Apple deploys over 100,000 database machines, and Netflix runs tens of database clusters with 500 nodes each. This new era of cloud-scale distributed systems has given birth to a new class of faults: scalability faults, whose symptoms surface in large-scale deployments but not necessarily in small- or medium-scale deployments. The CP2 project is proposed to solve the problem of correctness checkability and performance predictability of systems at extreme scale. Specifically, the project will analyze over 500 real-world scalability faults in over a dozen large-scale systems, develop a single-machine scale-checking framework that allows developers to test large distributed code on one or a few machines, and provide groundwork for compute- and I/O-performance predictability of large-scale jobs on both existing and future architectures. These tasks will advance debugging, testing, learning, and prediction methods on both traditional and emerging hardware platforms and ultimately lead to correct-by-construction development methods. The CP2 project will have impact in multiple disciplines, including systems (cloud/datacenter systems reliability), programming languages/compilers (new static/dynamic analysis techniques), architecture (compute/storage prediction for heterogeneous hardware), algorithms (the use of learning methods), and high-performance computing (benchmarking of HPC systems/applications).
In terms of societal benefits, the CP2 project addresses paramount issues mentioned in the NSF Strategic Plan for 2018-2022. More specifically, society increasingly depends on complicated systems that are products of human ingenuity, including ecosystems of large and complex software with millions of lines of code running on thousands of machines. CP2 will address the challenges of understanding and predicting the behavior of such systems. Furthermore, as society’s reliance on complex systems grows, learning about their robustness and understanding how to strengthen them are of increasing importance. In terms of education, the CP2 project offers unique hands-on research and education with cutting-edge systems technology, in which students will be trained to operate software on a large number of machines and analyze its performance and correctness. The results of the CP2 project will be released through the classic medium of publication, through the development of numerous software artifacts which will be open-sourced, and finally through collaboration with various industry partners to help shape the next generation of large-scale systems.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Haryadi Gunawi
Collaborative Research: PPoSS: LARGE: ScaleStuds: Foundations for Correctness Checkability and Performance Predictability of Systems at Scale
- Award number: 2119184
- Fiscal year: 2021
- Amount: $248,000
- Award type: Continuing Grant
USENIX FAST 2017 NSF Student Travel Support
- Award number: 1727380
- Fiscal year: 2017
- Amount: $248,000
- Award type: Standard Grant
CSR: Medium: Combating Distributed Concurrency Bugs in Cloud Systems
- Award number: 1563956
- Fiscal year: 2016
- Amount: $248,000
- Award type: Continuing Grant
CSR: Small: BreezeFS: File System Transformation for Cloud and Multistore Era
- Award number: 1526304
- Fiscal year: 2015
- Amount: $248,000
- Award type: Standard Grant
CAREER: DrCloud: Drill-Ready Cloud Computing
- Award number: 1350499
- Fiscal year: 2014
- Amount: $248,000
- Award type: Continuing Grant
XPS: CLCCA: LigHTS: Lagging-Hardware Tolerant Systems
- Award number: 1336580
- Fiscal year: 2013
- Amount: $248,000
- Award type: Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
- Award number: 1321958
- Fiscal year: 2012
- Amount: $248,000
- Award type: Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
- Award number: 1016924
- Fiscal year: 2010
- Amount: $248,000
- Award type: Standard Grant
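Similar NSFC Grants
Growth mechanisms, spatial performance, and planning strategies of innovation corridors: a case study of the Yangtze River Delta region
- Award number: 52378045
- Approval year: 2023
- Amount: 500,000 CNY
- Award type: General Program
Spatial differentiation mechanisms and planning regulation of rural settlements: a case study of the Zhejiang region
- Award number: 52378067
- Approval year: 2023
- Amount: 500,000 CNY
- Award type: General Program
Simultaneous exploration and coverage planning for UAVs in confined underground spaces
- Award number: 62303249
- Approval year: 2023
- Amount: 300,000 CNY
- Award type: Young Scientists Fund
Macro-micro coordinated motion planning and active-passive compliance control of a flexible variable-stiffness parallel actuator for polishing robots
- Award number: 52305016
- Approval year: 2023
- Amount: 300,000 CNY
- Award type: Young Scientists Fund
Principles of intelligent patient-specific wear-function planning for knee replacement
- Award number: 52375207
- Approval year: 2023
- Amount: 500,000 CNY
- Award type: General Program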
Similar Overseas Grants
HoloSurge: Multimodal 3D Holographic tool and real-time Guidance System with point-of-care diagnostics for surgical planning and interventions on liver and pancreatic cancers
- Award number: 10103131
- Fiscal year: 2024
- Amount: $248,000
- Award type: EU-Funded
Planning Grant: Developing capacity to attract diverse students to the geosciences: A public relations framework
- Award number: 2326816
- Fiscal year: 2024
- Amount: $248,000
- Award type: Standard Grant
Planning: FIRE-PLAN: Building Wildland Fire Science Capacity in Alaska Through The University of Alaska Fairbanks Rural Campuses
- Award number: 2333423
- Fiscal year: 2024
- Amount: $248,000
- Award type: Standard Grant
Collaborative Research: Planning: FIRE-PLAN: High-Spatiotemporal-Resolution Sensing and Digital Twin to Advance Wildland Fire Science
- Award number: 2335568
- Fiscal year: 2024
- Amount: $248,000
- Award type: Standard Grant
Collaborative Research: Planning: FIRE-PLAN: High-Spatiotemporal-Resolution Sensing and Digital Twin to Advance Wildland Fire Science
- Award number: 2335569
- Fiscal year: 2024
- Amount: $248,000
- Award type: Standard Grant