Mining, Intelligence and Automation in Tackling Machine-Learning Bugs

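Basic Information

  • Grant number:
    RGPIN-2021-03236
  • Principal investigator:
  • Amount:
    $21,100
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed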

Project Abstract

Motivation and Objectives: According to Fortune Business Insights, Machine Learning (ML) software is projected to reach a global market value of $117 billion by 2027. It has found applications in major areas including healthcare, transportation, and business analytics. However, the technology is yet to mature and can be unreliable. Bugs in ML software are highly complex, are hard to solve because of their data-driven, non-deterministic nature, and can have deadly consequences (e.g., the fatal crash of Uber's self-driving car). The standard procedure for correcting bugs is labour-intensive and inefficient, taking up ~50% of a developer's time. The majority of this time is spent finding and understanding the faulty code before making actual code changes. To date, many approaches have been designed to find and solve bugs in traditional software, but most are not accurate enough and are inadequate for ML software. My research program aims to design intelligent frameworks for understanding, finding, and reproducing bugs in ML software.

Research Plan and Methodology: This proposal encompasses three complementary activities: (1) advancing the current understanding of ML bugs, (2) finding bugs in ML software, and (3) reproducing the identified bugs for a reliable diagnosis. First, we will construct a large dataset from ML applications found on GitHub to systematically study the characteristics and central challenges of ML bugs and to analyze the effectiveness of traditional debugging solutions on these bugs. Second, we will design an intelligent framework that can (a) detect faulty components using intelligent Information Retrieval methods, (b) detect the faulty code within these components using their static properties and dynamic behaviours, and (c) complement these results with meaningful explanations (e.g., type of bug). Third, we will design an intelligent framework that can (a) help a developer understand how a bug might be triggered, and (b) deliver appropriate test cases to reproduce the identified bugs in ML software using reinforcement learning and a technology sandbox.

Novelty and Expected Significance: This research program has three novel aspects: (a) intelligent debugging support for ML software, (b) extension of developers' cognitive abilities with machine intelligence, and (c) enrichment of tools' results with complementary information. It will advance the current state of research on cost-effective debugging and will also benefit parallel practices such as change management. My research will also produce tools intended for adoption by industry, for example through my collaborations with Mozilla Corporation and Canadian software companies. By supporting developers in solving ML bugs efficiently and by providing high-quality training to students in an area of acute need, this program will assist in the development of safe, reliable machine-learning software and contribute significantly to the Canadian economy.
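
To make activity (a) of the second step more concrete, the sketch below ranks source components against the text of a bug report. It is only an illustration under stated assumptions: it uses plain TF-IDF weighting with cosine similarity, a common Information Retrieval baseline, and the abstract does not say that this is the technique the program will adopt. The function names (`tokenize`, `tf_idf_vectors`, `cosine`, `rank_components`) and the sample file names and texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_vectors(documents):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency (an assumed variant).
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_components(bug_report, components):
    """Return (component name, score) pairs, most similar to the report first."""
    names = list(components)
    vectors = tf_idf_vectors([bug_report] + [components[n] for n in names])
    query, docs = vectors[0], vectors[1:]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, docs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical components of a small ML project.
    sample_components = {
        "data_loader.py": "def load_batch(path): read csv rows and normalize features",
        "trainer.py": "def train(model, loss): gradient step; loss becomes nan when learning rate too high",
        "plot.py": "def draw_curve(history): plot accuracy curve",
    }
    report = "Training loss becomes NaN after a few gradient steps"
    for name, score in rank_components(report, sample_components):
        print(f"{name}: {score:.3f}")
```

A purely lexical ranking like this only narrows the search to candidate components. The plan's later steps, which use static properties, dynamic behaviour, and explanations, go beyond what this sketch covers.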

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants held by Rahman, MohammadMasudur

Mining, Intelligence and Automation in Tackling Machine-Learning Bugs
  • Grant number:
    DGECR-2021-00141
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Discovery Launch Supplement
Mining, Intelligence and Automation in Tackling Machine-Learning Bugs
  • Grant number:
    RGPIN-2021-03236
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Discovery Grants Program - Individual

Similar Overseas Grants

Artificial intelligence coupled to automation for accelerated medicine design
  • Grant number:
    EP/Z533038/1
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Program type:
    Research Grant
ADA: Remote Healthcare Monitoring powered by Network Intelligence and Automation
  • Grant number:
    10098971
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Program type:
    Collaborative R&D
Cyber Graph-to-Text: AI automation for Threat Intelligence, made accessible to all
  • Grant number:
    10052569
  • Fiscal year:
    2023
  • Funding amount:
    $21,100
  • Program type:
    Investment Accelerator
SBIR Phase II: A Cryo-EM Automation and Intelligence Platform for Drug Discovery
  • Grant number:
    2135832
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Program type:
    Cooperative Agreement
Automation and Optimization of High Dose-Rate Gynaecological Brachytherapy with Machine Learning and Artificial Intelligence
  • Grant number:
    546757-2020
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Program type:
    Postgraduate Scholarships - Doctoral
SBIR Phase II: Advanced Artificial Intelligence for Robotic E-Commerce Pick-and-Pack Automation
  • Grant number:
    2111915
  • Fiscal year:
    2022
  • Funding amount:
    $21,100
  • Program type:
    Cooperative Agreement
Mining, Intelligence and Automation in Tackling Machine-Learning Bugs
  • Grant number:
    DGECR-2021-00141
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Discovery Launch Supplement
Mining, Intelligence and Automation in Tackling Machine-Learning Bugs
  • Grant number:
    RGPIN-2021-03236
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Discovery Grants Program - Individual
SBIR Phase I: Artificial Intelligence Enhanced Design Automation for General Engineering Systems
  • Grant number:
    2055030
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Standard Grant
Automation of Neuro-Linguistic Programming methods using Artificial Intelligence to interpret unconscious thoughts and values for personal development
  • Grant number:
    10011174
  • Fiscal year:
    2021
  • Funding amount:
    $21,100
  • Program type:
    Feasibility Studies