Toward Automatic Failure Diagnosis in the Cloud
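Basic information
- Grant number: 435805-2013
- Principal investigator:
- Amount: $14,600
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2015
- Funding country: Canada
- Period: 2015-01-01 to 2016-12-31
- Status: Completed
- Source:
- Keywords: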
Project abstract
As software systems have grown in size, complexity, and cost, it has become increasingly difficult to deliver bullet-proof software, resulting in many software failures in production environments. This problem is further exacerbated by the trend toward cloud computing, where it is even harder to make systems failure-free as they become more complex and distributed.
Once a failure occurs in these production systems (i.e., users cannot receive the expected service), vendors need to troubleshoot it as quickly as possible, since every minute of downtime is costly. For example, it is estimated that Amazon loses 2 million dollars for every hour of downtime. Such downtime is even more catastrophic for mission-critical software services. For example, the 2003 Northeast blackout, caused by a software bug, left over 10 million people in Ontario and 45 million in the U.S. without power for at least 7 hours. An investigation report later cited "failure to provide effective real-time diagnostic support" as one of the main reasons behind such great damage. Finally, a prolonged troubleshooting process frustrates users and significantly erodes the vendor's reputation.
The goal of this proposed research is to expedite the troubleshooting of failures in cloud-based distributed systems. In particular, my research consists of three progressive thrusts that together will automate the diagnosis of such failures and significantly reduce the downtime of cloud-based distributed systems.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yuan, Ding
Understanding divergent substrate stereoselectivity in the isothiourea-catalysed conjugate addition of cyclic α-substituted β-ketoesters to α,β-unsaturated aryl esters.
- DOI: 10.1039/d3sc05470e
- Published: 2023-12-13
- Journal:
- Impact factor: 8.4
- Authors: Yuan, Ding; Goodfellow, Alister S.; Kasten, Kevin; Duan, Zhuan; Kang, Tengfei; Cordes, David B.; Mckay, Aidan P.; Buhl, Michael; Boyce, Gregory R.; Smith, Andrew D.
- Corresponding author: Smith, Andrew D.
Relief Effects of Icariin on Inflammation-Induced Decrease of Tight Junctions in Intestinal Epithelial Cells.
- DOI: 10.3389/fphar.2022.903762
- Published: 2022
- Journal:
- Impact factor: 5.6
- Authors: Li, Yanli; Liu, Jie; Pongkorpsakol, Pawin; Xiong, Zhengguo; Li, Li; Jiang, Xuemei; Zhao, Haixia; Yuan, Ding; Zhang, Changcheng; Guo, Yuhui; Dun, Yaoyan
- Corresponding author: Dun, Yaoyan
Adaptive complementary filter using fuzzy logic and simultaneous perturbation stochastic approximation algorithm
- DOI: 10.1016/j.measurement.2012.01.011
- Published: 2012-06-01
- Journal:
- Impact factor: 5.6
- Authors: Shen, Xiaowei; Yao, Minli; Yuan, Ding
- Corresponding author: Yuan, Ding
Successful surgical management of a ruptured popliteal artery aneurysm with acute common peroneal nerve neuropathy: A rare case
- DOI: 10.1177/1708538120950870
- Published: 2020-08-24
- Journal:
- Impact factor: 1.1
- Authors: Wang, Tiehao; Zhao, Jichun; Yuan, Ding
- Corresponding author: Yuan, Ding
Lithium ion battery separator with improved performance via side-by-side bicomponent electrospinning of PVDF-HFP/PI followed by 3D thermal crosslinking
- DOI: 10.1016/j.jpowsour.2020.228123
- Published: 2020-06-15
- Journal:
- Impact factor: 9.2
- Authors: Cai, Ming; Yuan, Ding; Ning, Xin
- Corresponding author: Ning, Xin
Other grants by Yuan, Ding
Fully Automated Software Logging
- Grant number: RGPIN-2018-04932
- Fiscal year: 2022
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Fully Automated Software Logging
- Grant number: RGPIN-2018-04932
- Fiscal year: 2021
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Efficient log data compression and analytics system
- Grant number: 570524-2021
- Fiscal year: 2021
- Amount: $14,600
- Program: Alliance Grants
Fully Automated Software Logging
- Grant number: RGPIN-2018-04932
- Fiscal year: 2020
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Fully Automated Software Logging
- Grant number: RGPIN-2018-04932
- Fiscal year: 2019
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Fully Automated Software Logging
- Grant number: RGPIN-2018-04932
- Fiscal year: 2018
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Similar overseas grants
Automatic Failure Localization and Diagnosis for Cloud Computing Applications
- Grant number: 511196-2017
- Fiscal year: 2017
- Amount: $14,600
- Program: Engage Grants Program
Tail Risk Assessment for Serious Failure of RC Bridges with Automatic Design Drawing Restoration System
- Grant number: 17H04932
- Fiscal year: 2017
- Amount: $14,600
- Program: Grant-in-Aid for Young Scientists (A)
Toward Automatic Failure Diagnosis in the Cloud
- Grant number: 435805-2013
- Fiscal year: 2017
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Toward Automatic Failure Diagnosis in the Cloud
- Grant number: 435805-2013
- Fiscal year: 2016
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Machine learning modeling of automatic detection of failure modes in IMRT patient-specific QA
- Grant number: 16K19226
- Fiscal year: 2016
- Amount: $14,600
- Program: Grant-in-Aid for Young Scientists (B)
Elucidation of mechanism of self-excited Pressure Vibration using an Automatic Pressure-Reducing Valve and fatigue failure of pipeline and establishment of preventive measures
- Grant number: 15K07650
- Fiscal year: 2015
- Amount: $14,600
- Program: Grant-in-Aid for Scientific Research (C)
Toward Automatic Failure Diagnosis in the Cloud
- Grant number: 435805-2013
- Fiscal year: 2014
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Toward Automatic Failure Diagnosis in the Cloud
- Grant number: 435805-2013
- Fiscal year: 2013
- Amount: $14,600
- Program: Discovery Grants Program - Individual
Studies on the efficacy of automatic micro bubble test on the evaluation of respiratory failure in ARDS
- Grant number: 20592126
- Fiscal year: 2008
- Amount: $14,600
- Program: Grant-in-Aid for Scientific Research (C)
Study on Extraction Method of Failure Signal and Automatic Generation Method of Feature Parameters
- Grant number: 10650148
- Fiscal year: 1998
- Amount: $14,600
- Program: Grant-in-Aid for Scientific Research (C)