
CAREER: Trustworthy Machine Learning from Untrusted Models


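Basic Information

Award number:
2405136
Principal investigator:
Ting Wang
Amount:
approximately $509,900
Host institution:
SUNY at Stony Brook
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2023
Funding country:
United States
Period:
2023-11-01 to 2024-09-30
Status:
Completed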

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2405136&HistoricalAwards=false
Keywords:
CAREER Trustworthy Machine Learning Untrusted

Project Abstract

Many of today's machine learning (ML)-based systems are not built from scratch, but are "composed" from an array of pre-trained, third-party models. Paralleling other forms of software reuse, reusing models can both speed up and simplify the development of ML-based systems. However, a lack of standardization, regulation, and verification of third-party ML models raises security concerns. In particular, ML models are subject to adversarial attacks in which third-party attackers or model providers themselves might embed hidden behaviors that are triggered by pre-specified inputs. This project aims at understanding the security threats incurred by reusing third-party models as building blocks of ML systems and developing tools to help developers mitigate such threats throughout the lifecycle of ML systems. Outcomes from the project will improve ML security in applications from self-driving cars to authentication in the short term while promoting more principled practices of building and operating ML systems in the long run.

One major type of threat incurred by reusing third-party models is model reuse attacks, in which maliciously crafted models ("adversarial models") force host ML systems to malfunction on targeted inputs ("triggers") in a highly predictable manner. This project develops rigorous yet practical methods to proactively detect and remediate such backdoor vulnerabilities. First, it will empirically and analytically investigate the necessary conditions and invariant patterns of model reuse attacks. Second, leveraging these insights, it will develop a chain of mitigation tools that detect potential backdoors, pinpoint triggers, and provide mechanisms to fortify adversarial models against these attacks. Third, it will establish a unified theory of adversarial models and adversarial inputs to deepen more general understanding of adversarial ML. Finally, it will implement all the proposed techniques and system designs in the form of a prototype testbed, which provides a unique research facility for investigating a range of attack and defense techniques. New theories and techniques developed in this project will be integrated into undergraduate and graduate education and used to raise public awareness of the importance of ML security.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
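
The abstract describes adversarial models that behave normally until a pre-specified trigger appears in the input. The toy sketch below illustrates that idea only; it is not the project's method or tooling. Every name in it (`clean_extractor`, `backdoored_extractor`, `host_classifier`, `stamp_trigger`, `flip_rate`, `TRIGGER`) is hypothetical, and the check assumes the trigger pattern is already known, which a real detector could not assume.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical trigger (an assumption for illustration):
# the last two input values both equal 1.0.
TRIGGER = (1.0, 1.0)


def clean_extractor(x: Vector) -> float:
    """Benign reused component: returns the mean of the input values."""
    return sum(x) / len(x)


def backdoored_extractor(x: Vector) -> float:
    """Matches clean_extractor except on inputs that carry the trigger."""
    if tuple(x[-2:]) == TRIGGER:
        return 1.0  # hidden behavior: force a high feature value
    return clean_extractor(x)


def host_classifier(feature: float) -> str:
    """Host system built on top of the reused extractor."""
    return "accept" if feature > 0.5 else "reject"


def stamp_trigger(x: Vector) -> Vector:
    """Return a copy of x with its last two values overwritten by the trigger."""
    return x[:-2] + list(TRIGGER)


def flip_rate(extractor: Callable[[Vector], float], inputs: List[Vector]) -> float:
    """Fraction of inputs whose host decision changes once the trigger is stamped."""
    flips = sum(
        host_classifier(extractor(x)) != host_classifier(extractor(stamp_trigger(x)))
        for x in inputs
    )
    return flips / len(inputs)


samples = [
    [0.0, 0.1, 0.0, 0.2, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.0, 0.2, 0.0],
    [0.2, 0.1, 0.1, 0.0, 0.0, 0.0],
]

clean_rate = flip_rate(clean_extractor, samples)
backdoor_rate = flip_rate(backdoored_extractor, samples)
print(clean_rate, backdoor_rate)
```

On these samples the benign extractor's decisions do not change when the pattern is stamped, while the backdoored one flips every decision. That gap is the "highly predictable" malfunction the abstract refers to, shown here in its simplest possible form.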
