DIADEM: debugging made dependable and measurable
Basic information
- Grant number: EP/W012308/1
- Principal investigator:
- Amount: $413,900
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2022
- Funding country: United Kingdom
- Duration: 2022 to (no data)
- Project status: ongoing
- Source:
- Keywords:
Project abstract
Software quality is increasingly critical to most of humanity. Bugs in software exact a huge annual toll, financially and even in human life. To eliminate bugs, developers depend crucially on their tools, and tools for interactive debugging are vital: they alone provide a high-level (source-level) view of a running (binary) program, enabling programmers to 'see the bug' as it occurs in the program running in front of them. However, debugging infrastructure is notoriously unreliable, as it works only if various metadata are complete and correct. If not, the programmer sees a partial or incorrect view, which may be useless or actively misleading.

These problems occur often in popular languages (e.g. C, C++, Rust, Go), owing to a tension between debuggability and optimisation. Debugging in these languages works via compiler-generated /metadata/ describing how binary (executable) code relates to source (human-written) code. Metadata generation is 'best-effort', and optimisation frequently introduces flaws -- but simply disabling optimisations is seldom an option. Programmers rely on optimisations to relieve them of much hand-tuning; without them, code may run tens of times slower. Furthermore, some bugs appear only in optimised code, owing to undefined behaviour (underspecification) in the source language.

This problem is extremely challenging. At its heart, writing compiler optimisations that preserve debuggability demands extra effort on what are already intricate code transformations ('passes'). In practice, corners are cut, leaving the output metadata approximate. To be acceptable to pass authors, improvements must reshape the effort/reward curve without increasing the task's baseline complexity.
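To make the tension concrete, the sketch below models, in deliberately simplified form, the kind of address-to-source-line mapping that debug metadata (e.g. a DWARF-style line table) provides. All names, addresses, and line numbers here are hypothetical illustrations, not taken from any real compiler's output; the point is only to show how an optimisation pass that fails to maintain an entry leaves a gap where the debugger can no longer relate machine state back to source.

```python
# Hypothetical model of compiler-generated line-table metadata: each
# entry maps a half-open range of binary addresses to a source line.
# Values are illustrative only, not from any real compiler.
line_table = [
    # (start_addr, end_addr, source_line)
    (0x1000, 0x1008, 12),
    (0x1008, 0x1010, 13),
    # Suppose an optimisation pass merged the instructions for line 14
    # without updating the metadata: 0x1010-0x1020 is now unmapped.
    (0x1020, 0x1028, 15),
]

def source_line_for(addr):
    """Return the source line for a binary address, or None when the
    metadata has a gap (the debugger can show only raw machine state)."""
    for start, end, line in line_table:
        if start <= addr < end:
            return line
    return None

print(source_line_for(0x100C))  # inside a mapped range -> 13
print(source_line_for(0x1014))  # inside the gap left by the pass -> None
```

A debugger stopping at `0x1014` in this model has no source position to report, which is exactly the 'partial or incorrect view' the abstract describes.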
Unlike performance, debugging so far lacks quantitative benchmarks, so compiler authors have not prioritised or competed on debuggability. Existing techniques amount to interaction-based testing of debugging, often with features for narrowing down which passes introduced a flaw. This is haphazard, since exploring all metadata subsumes the already-hard problem of achieving full-coverage tests (to 'drive' the debugger over all program locations). We propose instead to analyse metadata as an artifact in its own right. This means that instead of tests that interact with a single concrete execution through a debugger, we must devise a custom systematic, symbolic method for exploring the compiled code, evaluating the correctness of metadata in a mathematical manner. Unlike haphazard testing, this promises systematic measurement of lost coverage and correctness; the latter can (we hypothesise) be automated using recent advances in formal specification of source languages, namely /executable semantics/, as a replacement for current manual practices. This idea of parallel source- and binary-level exploration also suggests a radical approach: post-hoc synthesis of metadata, relieving the compiler of generating it at all. The idea builds on successful work on neighbouring problems (translation validation and decompilation).

The project will proceed by practical methods, experimenting on a real production compiler (LLVM). It will build novel tools embeddable into existing compiler-testing workflows, both to diagnose compiler bugs and to quantify the improvement from fixing them. It will empirically explore the abstractions and helpers used internally in compilers, to devise designs making them measurably more debug-preserving. Finally, it will build a novel tool exploring the radical idea of synthesising high-quality metadata post hoc, outside the compiler, and will develop metrics allowing quantitative comparison against traditional approaches.
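The 'systematic measurement of lost coverage' mentioned above can be sketched as a simple metric. The following is a minimal, hypothetical illustration (function name, addresses, and ranges are all invented for the example): given the instruction addresses of a compiled function and the address ranges over which the metadata describes where some variable lives, compute the fraction of addresses at which a debugger could actually recover that variable.

```python
# Hypothetical coverage metric: the fraction of a function's instruction
# addresses at which the debug metadata describes a variable's location.
# All addresses and ranges below are illustrative, not real compiler output.

def location_coverage(instruction_addrs, location_ranges):
    """location_ranges: half-open (start, end) address ranges where the
    variable's location is described. Returns the covered fraction in [0, 1]."""
    covered = sum(
        1
        for addr in instruction_addrs
        if any(start <= addr < end for start, end in location_ranges)
    )
    return covered / len(instruction_addrs)

addrs = list(range(0x2000, 0x2010))            # 16 instruction addresses
ranges_for_x = [(0x2000, 0x2004), (0x2008, 0x2010)]  # where 'x' is described
print(location_coverage(addrs, ranges_for_x))  # 12 of 16 covered -> 0.75
```

A score below 1.0 quantifies exactly how much debuggability an optimising compilation lost for that variable, which is the kind of number that would let compiler authors compete on debuggability the way they already do on performance.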
The beneficiaries sit at many levels: compiler authors, software developers at large, and the general public who use or depend on the affected software.
Project outcomes
- Journal papers: 1
- Monographs: 0
- Research awards: 0
- Conference papers: 0
- Patents: 0
Accurate Coverage Metrics for Compiler-Generated Debugging Information
- DOI: 10.1145/3640537.3641578
- Published: 2024
- Journal:
- Impact factor: 0
- Author: Stinnett J
- Corresponding author: Stinnett J
Other works by Stephen Kell
Rethinking software connectors
- DOI: 10.1145/1294917.1294918
- Published: 2007-09
- Journal:
- Impact factor: 0
- Author: Stephen Kell
- Corresponding author: Stephen Kell

Black-box composition of mismatched software components
- DOI:
- Published: 2012
- Journal:
- Impact factor: 0
- Author: Stephen Kell
- Corresponding author: Stephen Kell

Convivial design heuristics for software systems
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Author: Stephen Kell
- Corresponding author: Stephen Kell

Reliable and fast DWARF-based stack unwinding
- DOI: 10.1145/3360572
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: T. Bastian; Stephen Kell; Francesco Zappa Nardelli
- Corresponding author: Francesco Zappa Nardelli

A Survey of Practical Software Adaptation Techniques
- DOI: 10.3217/jucs-014-13-2110
- Published: 2008
- Journal:
- Impact factor: 0
- Author: Stephen Kell
- Corresponding author: Stephen Kell
Similar international grants
CAREER: FET: A Top-down Compilation Infrastructure for Optimization and Debugging in the Noisy Intermediate Scale Quantum (NISQ) era
- Grant number: 2421059
- Fiscal year: 2024
- Amount: $413,900
- Category: Continuing Grant

CAREER: Advancing Neural Testing and Debugging of Software
- Grant number: 2238045
- Fiscal year: 2023
- Amount: $413,900
- Category: Continuing Grant

An Individual Investigator Development Plan to Improve Undergraduate Debugging Skills and Mindset
- Grant number: 2321255
- Fiscal year: 2023
- Amount: $413,900
- Category: Standard Grant

Utilizing Artificial Intelligence to Improve the Testing and Debugging of Concurrent Software
- Grant number: RGPIN-2018-06588
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Individual

Testing and Debugging Machine Learning-based Autonomous Systems
- Grant number: RGPIN-2020-04035
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Individual

Inferring rich input structure for software debugging and defence
- Grant number: RGPIN-2020-06394
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Individual

Testing, Debugging and Repairing Machine Learning Software at the System Level
- Grant number: RGPAS-2021-00034
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Accelerator Supplements

Monitoring and Debugging of High Performance Distributed Heterogeneous Cloud Applications
- Grant number: 554158-2020
- Fiscal year: 2022
- Amount: $413,900
- Category: Alliance Grants

Testing, Debugging and Repairing Machine Learning Software at the System Level
- Grant number: RGPIN-2021-02549
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Individual

Reinventing the tuning and debugging tools for multi-thousand cores computer systems
- Grant number: RGPIN-2017-05634
- Fiscal year: 2022
- Amount: $413,900
- Category: Discovery Grants Program - Individual