
Improving the Quality Assurance of Machine-Learning Software Applications

Basic information

Grant number:
RGPIN-2019-06956
Principal investigator:
Khomh, Foutse
Amount:
$29,900
Host institution:
École Polytechnique de Montréal
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2021
Funding country:
Canada
Duration:
2021-01-01 to 2022-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=739206
Keywords:
Improving Quality Assurance Machine Learning

Project abstract

Machine learning (ML) is increasingly deployed in large-scale and critical systems thanks to recent breakthroughs in deep learning and reinforcement learning. We now use software applications powered by ML in critical aspects of our daily lives, from finance and energy to health and transportation. The economic benefits of Machine-Learning Software Applications (MLSA), and of Artificial Intelligence (AI) in general, are forecast to surpass USD 8.81 billion by 2022. However, assuring the quality of MLSA remains very challenging, as evidenced by the recent deadly incident caused by the $47-million Michigan Integrated Data Automated System (MiDAS), or by the Uber self-driving car that ran into a pedestrian even though the car's sensors had detected her presence. The MLSA running Uber's car reportedly treated the detection of the pedestrian as a "false positive". The main reason quality is difficult to ensure in MLSA is the shift in development paradigm induced by ML and AI. Traditionally, software systems are constructed deductively, by writing down the rules that govern the system's behavior as program code. With ML, however, these rules are inferred from training data (i.e., the requirements are generated inductively). This paradigm shift makes it difficult to reason about the behavior of software systems with ML components, resulting in systems that are intrinsically challenging to test and verify. A defect in an MLSA may come from its training data, program code, execution environment, or third-party frameworks (e.g., TensorFlow). In addition, ML models must be retrained and evolved constantly to cope with, for example, changes in users' behavior, model drift, or adversarial interactions; hence the need to architect them in a way that minimizes the cost of these frequent model changes to their overall maintenance and evolution.
Existing software development techniques must be revisited and adapted to this new reality. The goal of this research program is to develop techniques and tools to support quality assurance activities for MLSA, given that they do not have (complete) specifications, or even source code corresponding to some of their critical behaviors (some MLSA rely on proprietary third-party libraries, such as the Intel Math Kernel Library, for many critical operations). Through this research program, my students and I will identify good development practices, as well as bad practices that can impede the maintenance and reliability of MLSA. I will also develop techniques and tools to help developers detect and correct errors in MLSA at both the design and implementation levels.
