SHF: Small: Practical Analyses and Safe Transformations for Imperative Deep Learning Programs

Basic Information

Award number:
2200343
Principal investigator:
Raffi Khatchadourian
Amount:
$600,000
Institution:
CUNY Hunter College
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Period:
2022-06-01 to 2025-05-31
Status:
Not yet concluded

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2200343&HistoricalAwards=false
Keywords:
SHF Small Practical Analyses Safe

Project Abstract

Learning often occurs by pattern recognition. Software systems learn by using algorithms to recognize patterns, draw inferences from existing data, and apply those inferences to previously unseen data. Deep Learning (DL) is a kind of machine-learning algorithm inspired by the neural networks of human brains. A DL model learns decision logic from a large set of examples. A classic application is image processing, where a model may learn to recognize particular images through training with many sample images. While software systems that incorporate DL models involve large amounts of data, they still have to be efficient and responsive. This project is expected to increase DL system robustness, reliability, and scalability, positively impacting computer vision, autonomous driving, medicine, and extremism identification. Tools developed as a result of this project are also expected to democratize the Artificial Intelligence workforce, as they will assist data scientists and software engineers of varying proficiencies in writing quality DL code. Such tools can potentially contribute to a diverse, globally competitive STEM workforce and increase US economic competitiveness. This project will also promote software engineering concepts in machine learning by augmenting and creating several undergraduate and graduate courses. Dissemination will occur through publicly distributing datasets, papers, open-source software, and Open Educational Resources.

DL frameworks increasingly make various tradeoffs to balance the often competing requirements of reliability, usability, and generality. Popular DL frameworks have historically embraced graph-based, deferred execution-style (low-level) Application Programming Interfaces (APIs). While efficient, (legacy) systems using such interfaces are cumbersome, error-prone, and difficult to debug, maintain, and port. In contrast, (modern) eager execution-style DL APIs, which facilitate higher-level, imperative, and Object-Oriented (Python) programs that are easier to debug, less error-prone, and more extensible, have consequently emerged at the expense of run-time performance. Though hybrid approaches aim to bridge the two paradigms, they necessitate a non-trivial amount of technical metadata and exhibit several limitations and known issues on the use of native program constructs. This project is expected to contribute practical analyses and safe transformations for modern imperative and Object-Oriented DL programs that markedly improve their reliability and scalability. First, various software engineering artifacts will be mined for bug fixes, (manual) refactorings (semantics-preserving source-to-source program transformations), and missed opportunities in efficiently executing imperative DL code. Then, novel analyses and refactorings will be formulated for automatically (i) migrating legacy, deferred execution-style DL code to more robust imperative DL code and (ii) specifying how otherwise eagerly-executed imperative DL code should be reliably and efficiently executed as graphs at run-time. Finally, novel analyses will be designed for detecting performance bottlenecks and semantic errors associated with graph-based execution of imperative, otherwise eagerly-executed DL code. This contribution is significant because it fills the void of techniques, methodologies, and tools for effectively developing (and evolving long-lived) trustworthy and efficient DL systems that pervasively use imperative and Object-Oriented DL programming.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (5)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study

DOI:
10.1145/3524842.3528455
Published:
2022-01
Venue:
2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
Impact factor:
0
Authors:
Tatiana Castro Vélez;Raffi Khatchadourian;M. Bagherzadeh;A. Raja
Corresponding authors:
Tatiana Castro Vélez;Raffi Khatchadourian;M. Bagherzadeh;A. Raja

How many mutex bugs can a simple analysis find in Go programs?

DOI:
Published:
2022
Venue:
Annual Conference of the Japanese Society for Software Science and Technology
Impact factor:
0
Authors:
Fumi Takeuchi;Hidehiko Masuhara;Raffi Khatchadourian;Youyou Cong
Corresponding authors:
Fumi Takeuchi;Hidehiko Masuhara;Raffi Khatchadourian;Youyou Cong

A Tool for Rejuvenating Feature Logging Levels via Git Histories and Degree of Interest
