XPS: FULL: CCA: Production-Run Failure Recovery Based Approach to Reliable Parallel Software

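Basic information

  • Award number:
    1439091
  • Principal investigator:
  • Amount:
    $750,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Period:
    2014-08-01 to 2018-07-31
  • Project status:
    Completed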

Project abstract

Concurrency bugs are a severe threat to system reliability in the multi-core era. Approaches to handling concurrency bugs and improving the reliability of production-run parallel software are sorely needed. This project aims to create a new parallel computing paradigm. The intellectual merit is that the project will pioneer treating run-time failure recovery as the default for parallel programs and will reshape every aspect of parallel-program development and maintenance. The project's broader significance and importance are that it will help lower the costs of software development, in-house testing, failure diagnosis, and bug repair, broadly benefiting society through better-performing parallel software.

Specifically, the proposed framework will include five components: (1) a feather-weight run-time recovery framework that utilizes natural program idempotence to recover from concurrency-bug failures; (2) a new code-development system that guides developers to write software with improved recoverability; (3) a new in-house testing system, where the testing focus is shifted towards hard-to-recover code; (4) a new run-time monitoring system that leverages on-demand monitoring for run-time recovery; (5) a new off-line failure diagnosis system that leverages the feedback from recovery for failure diagnosis and fixing. These five components will work together to significantly improve the reliability and lower the development cost of parallel software.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Shan Lu

The Research of Enterprise Informatization Upgrade Investment Resource Allocation
Design of a sector bowtie nano-rectenna for optical power and infrared detection
  • DOI:
    10.1007/s11467-015-0508-7
  • Published:
    2015-10
  • Journal:
  • Impact factor:
    7.5
  • Authors:
    Kai Wang; Haifeng Hu; Shan Lu; Lingju Guo; Tao He
  • Corresponding author:
    Tao He
Microbacterium chengjingii sp. nov. and Microbacterium fandaimingii sp. nov., isolated from bat faeces of Hipposideros and Rousettus species.
Generalized construction of signature code for multiple-access adder channel
Decoding for non-binary signature code

Other grants by Shan Lu

CSR: Medium: Improving the Interface between Machine Learning and Software Systems
  • Award number:
    2313190
  • Fiscal year:
    2023
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
NSF Student Travel Grant for 2020 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Award number:
    1936025
  • Fiscal year:
    2020
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CNS Core: Medium: Accurate Anytime Learning for Energy and Timeliness in Software Systems
  • Award number:
    1956180
  • Fiscal year:
    2020
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant
Student Travel Support for 2016 USENIX Annual Technical Conference
  • Award number:
    1632170
  • Fiscal year:
    2016
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CSR: Medium: Collaborative Research: Holistic, Cross-Site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures
  • Award number:
    1514256
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
  • Award number:
    1546543
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CAREER: Combating Performance Bugs in Software Systems
  • Award number:
    1514189
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant
CAREER: Combating Performance Bugs in Software Systems
  • Award number:
    1054616
  • Fiscal year:
    2011
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant
Fighting Concurrency Bugs through Effect-Oriented Approaches
  • Award number:
    1018180
  • Fiscal year:
    2010
  • Award amount:
    $750,000
  • Project type:
    Standard Grant

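Similar NSFC grants

Doping effects and thin-film noise characteristics of cobalt-based full-Heusler alloys
  • Award number:
    51871067
  • Approval year:
    2018
  • Award amount:
    CNY 600,000
  • Project type:
    General Program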

Similar overseas grants

XPS: FULL: CCA: Collaborative Research: SPARTA: a Stream-based Processor And Run-Time Architecture
  • Award number:
    1547036
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Cymric: A Flexible Processor-Near-Memory System Architecture
  • Award number:
    1533767
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: Automatically Scalable Computation
  • Award number:
    1533663
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: Automatically Scalable Computation
  • Award number:
    1533737
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: NUMB: Exploiting Non-Uniform Memory Bandwidth for Computational Science
  • Award number:
    1533885
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: Full: CCA: Enhancing Scalability and Energy Efficiency in Extreme-Scale Parallel Systems through Application-Aware Communication Reduction
  • Award number:
    1438286
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: CASH: Cost-aware Adaptation of Software and Hardware
  • Award number:
    1439156
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: SPARTA: a Stream-based Processor And Run-Time Architecture
  • Award number:
    1439165
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: Automatically Scalable Computation
  • Award number:
    1438983
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
XPS: FULL: CCA: Collaborative Research: SPARTA: a Stream-based Processor And Run-Time Architecture
  • Award number:
    1439097
  • Fiscal year:
    2014
  • Award amount:
    $750,000
  • Project type:
    Standard Grant