权益分类	功能权益	普通用户	{{item.name}}会员
{{category.name}}	{{benefitItem.name}}

Collaborative Research: SLES: Verifying and Enforcing Safety Constraints in AI-based Sequential Generation

合作研究：SLES：验证和执行基于人工智能的顺序生成中的安全约束

基本信息

批准号：
2331966
负责人：
Cho-Jui Hsieh
金额：
$ 54万
依托单位：
University of California-Los Angeles
依托单位国家：
美国
项目类别：
Standard Grant
财政年份：
2023
资助国家：
美国
起止时间：
2023-10-01 至 2026-09-30
项目状态：
未结题

来源：
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2331966&HistoricalAwards=false
关键词：
Collaborative Research SLES Verifying Enforcing

项目摘要

Artificial intelligence (AI) has achieved transformative impacts on various complex real-world challenges. Among its applications, sequential data are prevalent in many critical usages of AI when it directly engages with its users. Self-driving cars rely on AI to process sequences of sensor data from cameras and radars, and make a sequence of real-time decisions to ensure safe driving. Healthcare monitoring systems use AI to analyze sequences of patient health data, such as blood pressure, heart rate, and others, to detect anomalies and predict potential health issues. Chatbots utilize AI to understand natural language and generate safe, fair, and appropriate text responses as sequences of words and sentences. The sequential data produced by AI make its behavior hard to characterize because of the complex dependencies within the sequence, and a careless application of AI in these scenarios may lead to harmful consequences, such as a collision of an autonomous vehicle or the generation of biased or toxic texts. This project aims to study the safety of AI under scenarios with sequential data, provide assurance for its behavior in mission-critical environments, and ensure AI-based sequential generation can adhere to safety constraints and social norms. Ultimately, this research will help with reducing unexpected AI failures, preventing bias and discrimination in AI technologies, aligning AI systems with human values and societal norms, and building up public trust for AI-enabled applications.The technical contributions of this project consist of three thrusts. The first thrust develops a formal verification framework for assuring the safety of AI models for sequential generation tasks with rigorous mathematical guarantees. It includes a series of innovative verification algorithms for bound propagation and branch-and-bound for general non-linear functions involved in sequential generation models. These new verification methods will be integrated into the alpha-beta-CROWN neural network verifier, a well-known open-source toolbox developed by investigators. The second thrust involves training and inference algorithms that ensure sequential generation models comply with specified safety constraints, with a unique probabilistic framework that decomposes a safety constraint into action-level components and enforces them at each generation step. This approach can be integrated with model training to improve the safety performance of sequential generation models using posterior regularization techniques. Lastly, the third thrust aims to integrate the formal verification and constrained generation components above and apply them to three important real-world applications: safety of text generation, safety and stability of controlled systems, and robust AI-generated text detectors. This project will also result in tools to the broader AI community, including the alpha-beta-CROWN neural network verifier, and the shared data and benchmarks developed to evaluate the safety of sequential generation models.This project is supported by a partnership with the NSF and Open Philanthropy.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

人工智能（AI）对各种复杂的现实世界挑战产生了变革性的影响。在其应用程序中，当人工智能直接与用户互动时，顺序数据在人工智能的许多关键用途中普遍存在。自动驾驶汽车依靠人工智能处理来自摄像头和雷达的传感器数据序列，并做出一系列实时决策，以确保安全驾驶。医疗监控系统使用AI来分析患者健康数据序列，如血压、心率等，以检测异常并预测潜在的健康问题。聊天机器人利用人工智能来理解自然语言，并生成安全，公平和适当的文本响应作为单词和句子的序列。人工智能产生的序列数据使其行为难以描述，因为序列中存在复杂的依赖关系，在这些场景中不小心应用人工智能可能会导致有害的后果，例如自动驾驶汽车的碰撞或产生有偏见或有毒的文本。该项目旨在研究人工智能在序列数据场景下的安全性，为其在关键任务环境中的行为提供保证，并确保基于人工智能的序列生成能够遵守安全约束和社会规范。最终，这项研究将有助于减少意外的人工智能失败，防止人工智能技术中的偏见和歧视，使人工智能系统与人类价值观和社会规范保持一致，并建立公众对人工智能应用程序的信任。第一个推力开发了一个正式的验证框架，以确保具有严格数学保证的顺序生成任务的AI模型的安全性。它包括一系列创新的验证算法，用于顺序生成模型中涉及的一般非线性函数的边界传播和分支定界。这些新的验证方法将被集成到alpha-beta-CROWN神经网络验证器中，这是一个由研究人员开发的著名开源工具箱。第二个推力涉及训练和推理算法，确保顺序生成模型符合指定的安全约束，具有独特的概率框架，将安全约束分解为动作级组件并在每个生成步骤中执行它们。这种方法可以与模型训练相结合，使用后验正则化技术来提高序贯生成模型的安全性能。最后，第三个目标是整合上述形式化验证和约束生成组件，并将其应用于三个重要的现实应用：文本生成的安全性，受控系统的安全性和稳定性，以及强大的人工智能生成的文本检测器。该项目还将为更广泛的人工智能社区提供工具，包括alpha-beta-CROWN神经网络验证器，以及为评估连续发电模型的安全性而开发的共享数据和基准。该项目得到了与NSF和开放慈善事业的合作伙伴关系的支持。该奖项反映了NSF的法定使命，并被认为值得通过使用基金会的智力价值和更广泛的评估来支持。影响审查标准。