CI-NEW: Collaborative Research: Computer System Failure Data Repository to Enable Data-Driven Dependability

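Basic information

  • Award number:
    1513197
  • Principal investigator:
  • Amount:
    $763,300
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Project period:
    2015-07-01 to 2019-06-30
  • Project status:
    Completed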

Project abstract

Dependability has become a requisite property for many of the computer systems that surround us or work behind the scenes to support our personal and professional lives. Computer systems researchers and practitioners, working together, have made heroic progress in building and deploying dependable systems. However, the overwhelming majority of this work is not based on real, publicly available failure data. No open failure data repository exists for any recent computing infrastructure that is large enough, diverse enough, and accompanied by enough information about the infrastructure and the applications that run on it. This project will address this pressing need.

The research team appreciates that this effort is challenging on many levels. Failure data are considered sensitive and are usually disclosed only to a small, trusted subset of the people at an organization. As part of a current one-year planning grant, the team has collected specific requirements for the repository from a wide audience, collected failure and usage data from the largest centrally managed computing cluster at Purdue, and performed preliminary analysis to reveal the workload usage patterns. The goal of this full-scale project is to collect data from a variety of computational infrastructure at the two participating universities and from several of the NSF-funded large cyberinfrastructure projects.

The project will collect, curate, and present public failure data of large-scale computing systems in a repository called FRESCO. The data sets will include static information, dynamic information about the workloads, and failure information for both planned and unplanned outages. The data collection from production machines will have to obey several practical constraints: no changes to the workload, little performance perturbation, and minimal changes to the operating system. Further, the data have to be sanitized to remove sensitive information and processed to make them interpretable by a broad group of researchers. This project will also provide analysis tools to answer certain commonly occurring questions, such as the correlation between workload and failure and the performance implications of using one library over another, as well as an intuitive graphical front end that will allow people to explore the data sets and download the relevant ones.

Widespread use of the data and the associated analysis tools will give computer systems researchers an unprecedented ability to do data-driven research and offer computing infrastructure providers an analytics-driven capability to run more efficient, reliable infrastructures.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Saurabh Bagchi

Intrusion detection in voice over IP environments
  • DOI:
    10.1007/s10207-008-0071-0
  • Published:
    2008-12-16
  • Journal:
  • Impact factor:
    3.200
  • Authors:
    Yu-Sung Wu; Vinita Apte; Saurabh Bagchi; Sachin Garg; Navjot Singh
  • Corresponding author:
    Navjot Singh
Erratum to: ‘MicroRNA target prediction using thermodynamic and sequence curves’
  • DOI:
    10.1186/s12864-016-2367-1
  • Published:
    2016-03-09
  • Journal:
  • Impact factor:
    3.700
  • Authors:
    Asish Ghoshal; Raghavendran Shankar; Saurabh Bagchi; Ananth Grama; Somali Chaterji
  • Corresponding author:
    Somali Chaterji
A Survey Article on Wormhole Attack Detection and Security in Wireless Sensor Networks
  • DOI:
    10.5120/ijca2017915666
  • Published:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gaurav Tejpal; Sonal Sharma; Khalil; Issa; Saurabh Bagchi; N. Shroff; S. Krishnamurthy
  • Corresponding author:
    S. Krishnamurthy
Reliable and Efficient Distributed Checkpointing System for Grid Environments
  • DOI:
    10.1007/s10723-014-9297-4
  • Published:
    2014-05-20
  • Journal:
  • Impact factor:
    2.900
  • Authors:
    Tanzima Zerin Islam; Saurabh Bagchi; Rudolf Eigenmann
  • Corresponding author:
    Rudolf Eigenmann

Other grants by Saurabh Bagchi

NSF Workshop on State-of-the-Art and Challenges in Resilience
  • Award number:
    2140139
  • Fiscal year:
    2021
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CCRI: ENS: Collaborative Research: Open Computer System Usage Repository and Analytics Engine
  • Award number:
    2016704
  • Fiscal year:
    2020
  • Amount:
    $763,300
  • Award type:
    Standard Grant
NSF Workshop on State-of-the-Art and Challenges in Resilience
  • Award number:
    1845192
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CSR: Small: Diagnosing Performance and Correctness Errors in Parallel Applications at Large Scales
  • Award number:
    1527262
  • Fiscal year:
    2015
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CI-P: Computer System Failure Data Repository to Enable Data-Driven Dependability Research
  • Award number:
    1405906
  • Fiscal year:
    2014
  • Amount:
    $763,300
  • Award type:
    Standard Grant
NeTS: Medium: Collaborative Research: Tango: Performance and Fault Management in Cellular Networks through Device-Network Cooperation
  • Award number:
    1409506
  • Fiscal year:
    2014
  • Amount:
    $763,300
  • Award type:
    Continuing Grant
Travel Grants for Attending the 29th IEEE Symposium on Reliable Distributed Systems (SRDS)
  • Award number:
    1047647
  • Fiscal year:
    2010
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CSR: Small: Monitoring for Error Detection in Today's High Throughput Applications
  • Award number:
    0916337
  • Fiscal year:
    2009
  • Amount:
    $763,300
  • Award type:
    Standard Grant
NeTS-NOSS: Robust Sensor Network Architecture through Neighborhood Monitoring and Isolation
  • Award number:
    0626830
  • Fiscal year:
    2006
  • Amount:
    $763,300
  • Award type:
    Standard Grant
Sensors: Smart RF Antennas for Reliable and Real-Time Sensor Networks
  • Award number:
    0330016
  • Fiscal year:
    2003
  • Amount:
    $763,300
  • Award type:
    Standard Grant

Similar overseas grants

CI-New: Collaborative Research: Developing an Open Networked Airborne Computing Platform
  • Award number:
    1953048
  • Fiscal year:
    2019
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-New: Collaborative Research: Extensible, Software Enabled Unmanned Aerial Vehicles
  • Award number:
    1823230
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Continuing Grant
CI-New: Collaborative Research: An Open Platform for Internet Routing Experiments
  • Award number:
    1835252
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-New: Collaborative Research: NJR: A Normalized Java Resource
  • Award number:
    1823227
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-New: Collaborative Research: NJR: A Normalized Java Resource
  • Award number:
    1823360
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-NEW: Collaborative Research: Constructing a Community-Wide Software Architecture Infrastructure
  • Award number:
    1823214
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-NEW: Collaborative Research: Constructing a Community-Wide Software Architecture Infrastructure
  • Award number:
    1823262
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-NEW: Collaborative Research: Constructing a Community-Wide Software Architecture Infrastructure
  • Award number:
    1823354
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CRI: CI-NEW: Collaborative Research: Constructing a Community-Wide Software Architecture Infrastructure
  • Award number:
    1823246
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant
CI-New: Collaborative Research: An Infrastructure that Combines Eye Tracking into Integrated Development Environments to Study Software Development and Program Comprehension
  • Award number:
    1855753
  • Fiscal year:
    2018
  • Amount:
    $763,300
  • Award type:
    Standard Grant