CSR: Medium:Collaborative Research:Holistic, Cross-Site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures

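Basic Information

  • Award number:
    1514256
  • Principal investigator:
  • Amount:
    $282,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Period:
    2015-08-01 to 2020-07-31
  • Status:
    Completed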

Project Abstract

Large-scale shared hosting infrastructures such as multi-tenant cloud computing systems have become increasingly popular by allowing users to lease resources on demand in a cost-effective way. As multiple tenants may share computing resources, hosting infrastructures are complex systems and prone to various system anomalies. Although software developers often perform rigorous offline testing, many subtle bugs only manifest themselves during large-scale production runs. Many anomalies, such as those where the system does not crash but fails to behave as expected, are hard to reproduce and diagnose using existing techniques. Existing system anomaly diagnosis work can be broadly classified into two categories: 1) black-box schemes, which do not require source code and are suitable for online, production-site diagnosis, and 2) white-box schemes, which require source code and expensive code instrumentation and are suitable for offline, development-site diagnosis. Although white-box schemes provide fine-grained diagnosis, large-scale production hosting infrastructures are reluctant to adopt them due to their high overhead and intrusive system recording approaches.
The overarching objective of this project is to explore an innovative cross-site system anomaly debugging approach that intelligently integrates production-site black-box diagnosis with development-site white-box debugging into a more powerful hosting infrastructure debugging framework. This project will develop techniques for development-site, offline white-box debugging that take the production-site fault inference results as guidance to find the exact anomaly causes. The project will focus on diagnosing non-crashing system anomalies (e.g., performance degradation, service outage, software hang, unexpected halt) that are common in real-world hosting infrastructures but are difficult to debug using existing techniques.
Techniques developed in this project will have a significant impact on the robustness of real-world hosting infrastructures. The PIs will develop new course modules on hosting infrastructure debugging for the graduate and undergraduate classes they regularly teach, and will develop programming courseware based on the research prototypes developed in this project. The PIs will serve as role models and use a set of outreach activities to recruit more female students to pursue systems research. The PIs will disseminate their results and collected data broadly through publication and technology transfer. Developed software artifacts and experimental datasets will be released for public use.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Shan Lu

Mutual fund net flows in China: A co-holding network perspective
  • DOI:
    10.3389/fphy.2023.1142905
  • Published:
    2023-02
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Yue Ma;Shan Lu;Jichang Zhao
  • Corresponding author:
    Jichang Zhao
Protective immunity induced by rotavirus DNA vaccines.
  • DOI:
    10.1016/s0264-410x(96)00272-1
  • Published:
    1997
  • Journal:
  • Impact factor:
    5.5
  • Authors:
    Shing;E. F. Fynan;E. F. Fynan;H. Robinson;Shan Lu;H. Greenberg;J. Santoro;J. Herrmann
  • Corresponding author:
    J. Herrmann
Optimal vegetation index for assessing leaf water potential using reflectance factors from the adaxial and abaxial surfaces
Proceedings of the 8th Workshop on Programming Languages and Operating Systems
Synergistic enhancement of immunogenicity and protection in mice against Schistosoma japonicum with codon optimization and electroporation delivery of SjTPI DNA vaccines.
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    5.5
  • Authors:
    Yin;F. Lu;Yang Dai;Xiaoting Wang;Jian;Song Zhao;Chun Zhang;Hui Zhang;Shan Lu;Shixia Wang
  • Corresponding author:
    Shixia Wang

Other grants by Shan Lu

CSR: Medium: Improving the Interface between Machine Learning and Software Systems
  • Award number:
    2313190
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Standard Grant
NSF Student Travel Grant for 2020 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Award number:
    1936025
  • Fiscal year:
    2020
  • Award amount:
    $282,000
  • Award type:
    Standard Grant
CNS Core: Medium: Accurate Anytime Learning for Energy and Timeliness in Software Systems
  • Award number:
    1956180
  • Fiscal year:
    2020
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Student Travel Support for 2016 USENIX Annual Technical Conference
  • Award number:
    1632170
  • Fiscal year:
    2016
  • Award amount:
    $282,000
  • Award type:
    Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
  • Award number:
    1546543
  • Fiscal year:
    2015
  • Award amount:
    $282,000
  • Award type:
    Standard Grant
CAREER: Combating Performance Bugs in Software Systems
  • Award number:
    1514189
  • Fiscal year:
    2014
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
XPS: FULL: CCA: Production-Run Failure Recovery Based Approach to Reliable Parallel Software
  • Award number:
    1439091
  • Fiscal year:
    2014
  • Award amount:
    $282,000
  • Award type:
    Standard Grant
CAREER: Combating Performance Bugs in Software Systems
  • Award number:
    1054616
  • Fiscal year:
    2011
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Fighting Concurrency Bugs through Effect-Oriented Approaches
  • Award number:
    1018180
  • Fiscal year:
    2010
  • Award amount:
    $282,000
  • Award type:
    Standard Grant

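Similar NSFC (National Natural Science Foundation of China) grants

Study of plasmon-enhanced optical response in composite low-dimensional topological materials
  • Award number:
    12374288
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Award type:
    General Program
The missing medium-sized enterprises from the perspective of managed markets and intervention in the division of labor: stylized facts, underlying mechanisms, and optimization paths
  • Award number:
    72374217
  • Approval year:
    2023
  • Award amount:
    CNY 410,000
  • Award type:
    General Program
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
  • Award number:
    12371432
  • Approval year:
    2023
  • Award amount:
    CNY 435,000
  • Award type:
    General Program
Dark matter distribution near intermediate-mass black holes and detection of gravitational-wave echoes from their IMRI systems
  • Award number:
    12365008
  • Approval year:
    2023
  • Award amount:
    CNY 320,000
  • Award type:
    Regional Science Fund Project
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
  • Award number:
    42305004
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund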

Similar overseas grants

Collaborative Research: CSR: Medium: Scaling Secure Serverless Computing on Heterogeneous Datacenters
  • Award number:
    2312206
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Collaborative Research: CSR: Medium: Architecting GPUs for Practical Homomorphic Encryption-based Computing
  • Award number:
    2312276
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters
  • Award number:
    2312689
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters
  • Award number:
    2401244
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant
Collaborative Research: CSR: Medium: Scaling Secure Serverless Computing on Heterogeneous Datacenters
  • Award number:
    2312207
  • Fiscal year:
    2023
  • Award amount:
    $282,000
  • Award type:
    Continuing Grant