Test FLARE (Test Flakiness Automated Reproduction and Explanation)
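Basic information
- Grant number: EP/X024539/1
- Principal investigator:
- Funding amount: $693,500
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2023
- Funding country: United Kingdom
- Start and end dates: 2023 to (no data)
- Project status: Ongoing
- Source:
- Keywords:
Project abstract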
The cost of software failures is a huge burden on the worldwide economy, estimated to be at least £1.3 trillion in 2017. Consequently, software testing, a vital defence against failures, accounts for a large proportion of software development effort and cost. Flaky tests are a particular strain on resources allocated to software development, because they intermittently pass and fail without changes to tests or project code, with often maddening, non-obvious causes. Flaky tests are tests that fundamentally do not always tell the truth: they can fail when code is working, and pass when it isn't. Because developers can no longer trust the results of their tests, they are unable to gain confidence that software is working correctly, potentially exposing end-users to the consequences of software failures. Flaky tests are a common occurrence in industry, significantly disrupting software development, even for companies with the greatest amount of resources to tackle them, such as Microsoft, Facebook, and Google.
A test can produce different pass/fail (i.e., flaky) outcomes because of differing, unpredicted ways that the execution environment in which it runs interacts with its behaviour and/or the code that it tests. For instance, a machine may be experiencing a heavy concurrent task load, causing it to execute tests slowly, sometimes triggering timeouts in the code under test, and sometimes not. Or network access on the testing infrastructure may be erratic, meaning the availability of network resources may be compromised. Or the logic of a program under test may depend on the time and date. These are just a few real examples of the different ways in which tests can be flaky. For some environmental conditions, the test passes, but in an alternative context, the same test fails.
To remove flaky test behaviour, a developer has to modify test code or the code that it tests to control for aspects of its execution environment; i.e., the potential sources of its intermittent behaviour. But to accurately assess the differences in code execution behaviour and the places in the code that need to be changed, a developer must be able to reliably reproduce the differing pass/fail test outcomes. However, this not only involves recreating the environmental conditions that lead to the flaky behaviour, but also figuring out exactly what the environmental conditions were that caused the flakiness in the first place. Solving these issues and reproducing flaky tests manually can be extremely challenging for developers, since the environmental conditions concerned (a) are intermittent; and (b) may be unrelated to anything the test is actually checking, and/or far-removed from the code being tested. Existing research techniques are insufficient for addressing these problems, and despite developer incentives for removing flakiness, Google, for instance, reports an astonishing one in seven tests as flaky.
What the Test FLARE Project Will Do: The Test FLARE project will develop and empirically evaluate techniques capable of (1) automatically reproducing flaky behaviour that is due to the execution environment. It will also provide developers with (2) automated, human-readable explanations that help them further understand the reasons for the flaky behaviour.
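As a minimal sketch of the time-and-date dependence mentioned in the abstract, the Python example below shows a test whose outcome depends on the day it runs, and a variant that controls for that aspect of the execution environment. It does not represent the Test FLARE project's techniques, which the abstract does not detail. The function `discount_rate`, its weekend rule, and both test functions are hypothetical names invented for illustration.

```python
import datetime
from typing import Optional


def discount_rate(today: Optional[datetime.date] = None) -> float:
    """Return 0.2 on weekends and 0.0 otherwise (hypothetical business rule).

    When `today` is omitted, the function reads the system clock. That hidden
    dependency on the execution environment is what makes a naive test flaky.
    """
    if today is None:
        today = datetime.date.today()
    # date.weekday(): Monday is 0, Sunday is 6
    return 0.2 if today.weekday() >= 5 else 0.0


def flaky_test() -> bool:
    # Outcome depends on when the test runs: it passes Monday to Friday and
    # fails on Saturday and Sunday, with no change to test or project code.
    return discount_rate() == 0.0


def deterministic_test() -> bool:
    # Controls the environment by pinning the date for both branches.
    saturday = datetime.date(2023, 7, 1)  # a Saturday
    monday = datetime.date(2023, 7, 3)    # a Monday
    return discount_rate(saturday) == 0.2 and discount_rate(monday) == 0.0
```

Here `deterministic_test()` gives the same result on any day, whereas `flaky_test()` only reveals its failure when run at a weekend, which is the kind of environmental condition a developer would otherwise have to identify and recreate by hand.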
Project outputs
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Philip McMinn
RE-PRESENT: Automatic Repair of Presentation Failures in Web Applications
- Grant number: EP/T015764/1
- Fiscal year: 2020
- Funding amount: $693,500
- Project category: Research Grant
RE-COST: REducing the Cost of Oracles for Software Testing
- Grant number: EP/I010386/1
- Fiscal year: 2011
- Funding amount: $693,500
- Project category: Research Grant
Automated Discovery of Emergent Misbehaviour
- Grant number: EP/G009600/1
- Fiscal year: 2009
- Funding amount: $693,500
- Project category: Research Grant
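Similar grants from the National Natural Science Foundation of China
Mechanism by which the "flare-up" effect of long-acting GnRHa inhibits follicular development in adenomyosis patients via the AMPK pathway
- Grant number: 81801418
- Approval year: 2018
- Funding amount: CNY 210,000
- Project category: Young Scientists Fund project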
Similar overseas grants
Identification of the immune flare signature in autoantibody-mediated neuroimmune diseases (自己抗体介在性神経免疫疾患におけるimmune flare signatureの同定)
- Grant number: 24K10660
- Fiscal year: 2024
- Funding amount: $693,500
- Project category: Grant-in-Aid for Scientific Research (C)
Predicting disease flare and treatment response in inflammatory bowel disease
- Grant number: MR/X023656/1
- Fiscal year: 2024
- Funding amount: $693,500
- Project category: Fellowship
Simulations and multi-vantage-point observations of solar flare energetic electrons
- Grant number: 2888138
- Fiscal year: 2023
- Funding amount: $693,500
- Project category: Studentship
SHINE: Understanding the Relationships of Photospheric Vector Magnetic Field Parameters in Solar Flare Occurrences using Graph-based Machine Learning Models
- Grant number: 2301397
- Fiscal year: 2023
- Funding amount: $693,500
- Project category: Standard Grant
Every Datapoint Counts: Atmosphere-aided Flare Studies in the Rubin era
- Grant number: 2308016
- Fiscal year: 2023
- Funding amount: $693,500
- Project category: Standard Grant
SBIR Phase I: Artificial Intelligence (AI) chatbot providing flare-up support for patients with endometriosis
- Grant number: 2304436
- Fiscal year: 2023
- Funding amount: $693,500
- Project category: Standard Grant
A Deep (Learning) Dive into Solar Active Region Evolution and Flare Production
- Grant number: 2878047
- Fiscal year: 2023
- Funding amount: $693,500
- Project category: Studentship
What causes low back pain to flare: Has a major opportunity to understand back pain been missed?
- Grant number: 10709521
- Fiscal year: 2022
- Funding amount: $693,500
- Project category:
Collaborative Research: DKIST Critical Science: Study of Flare Producing Active Regions with Highest Resolution Observations and Data-based Magnetohydrodynamics (MHD) Modeling
- Grant number: 2204385
- Fiscal year: 2022
- Funding amount: $693,500
- Project category: Standard Grant
Study on the magnetic field evolution of solar flare kernels with high temporal cadence imaging system
- Grant number: 22K03687
- Fiscal year: 2022
- Funding amount: $693,500
- Project category: Grant-in-Aid for Scientific Research (C)