CRII: SaTC: Automatic Generation of API to Natural Language Data Type Mappings for Developer and End User Privacy Risk Mitigation


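Basic Information

Award number:
1948244
Principal investigator:
Rocky Slavin
Amount:
$175,000
Institution:
University of Texas at San Antonio
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Start and end dates:
2020-03-15 to 2024-02-29
Project status:
Completed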

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1948244&HistoricalAwards=false
Keywords:
CRII SaTC Automatic Generation API

Project Abstract

Since the advent of the smartphone, a growing share of the population has gained access to Internet-connected software applications (apps). This, coupled with the various sensors available on mobile devices, makes the general public highly susceptible to privacy risks, as sensitive information (e.g., location, camera images, biometrics) may be leaked to the Internet. To help users make informed decisions about the potential privacy risks of using apps, regulators increasingly require app developers to include privacy policies communicating what information is collected or shared and how that information is used. However, even when such privacy policies are present, users must trust app developers to adhere to the promises therein. Furthermore, developers are accountable for adhering to their policies and must be confident that their privacy policies accurately represent their practices. This project aims to assist both developers and general app users in verifying the alignment of privacy policies and the apps they represent by producing an automated process for linking the semantics of the language used in privacy policies with the code used to produce the apps themselves. Furthermore, the project will use this framework to build tools that let end users and developers benefit directly from this work.

The research project aims to produce an automated process for generating mappings between code-level APIs and natural language data types using machine learning. The resulting mappings will be used in developer and end user tools to identify and help mitigate potential privacy leakage during development and app usage. Currently, detecting misalignment between privacy policies and app code requires manually generating mappings from code-level Application Program Interface (API) methods to privacy-oriented natural language data types. Even for small app categories, this process can require a human to review thousands of methods and hundreds of annotations, creating potential for inaccuracies due to fatigue and incomplete domain knowledge. APIs also change as methods are introduced and deprecated, resulting in outdated mappings. These problems make it difficult to apply the framework in practice as the environment continually evolves.

This project will address these challenges through two contributions. First, machine learning will be applied to the mapping generation process to produce an automated, scalable method for generating code-phrase mappings for APIs as needed. This will allow for misalignment detection for API levels, methods, and app categories beyond those built in previous contributions. This automated approach will make use of state-of-the-art pre-trained language models to detect semantic similarity between API documentation and the natural language data types used in privacy policies. Second, the mappings produced by the automated model will be applied to practical developer and end user tools to enable informed decisions about privacy risk mitigation. The PoliDroid tool suite will be developed, including a developer-oriented integrated development environment (IDE) plugin that detects potential unintended privacy leaks based on a privacy policy, and a real-time misalignment detection tool for end users.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
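The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: a simple bag-of-words cosine similarity stands in for the pre-trained language model, the one-line API descriptions are written for this example rather than quoted from official documentation, and the names `API_DOCS`, `DATA_TYPES`, `embed`, and `map_apis_to_data_types` as well as the 0.3 threshold are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Illustrative inputs only: the descriptions below were written for this
# sketch and are not quoted from official API documentation.
API_DOCS = {
    "LocationManager.getLastKnownLocation": "Returns the last known location of the device",
    "TelephonyManager.getDeviceId": "Returns the unique identifier of the device",
    "Camera.takePicture": "Captures images from the camera",
}

# Hypothetical data type phrases of the kind found in privacy policies.
DATA_TYPES = ["device location", "device identifier", "camera images"]


def embed(text):
    """Stand-in for a language-model embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_apis_to_data_types(api_docs, data_types, threshold=0.3):
    """Map each API method to its most similar data type, or None if no
    data type reaches the (assumed) similarity threshold."""
    type_vectors = {t: embed(t) for t in data_types}
    mapping = {}
    for method, doc in api_docs.items():
        doc_vec = embed(doc)
        best_type, best_score = max(
            ((t, cosine(doc_vec, v)) for t, v in type_vectors.items()),
            key=lambda pair: pair[1],
        )
        mapping[method] = best_type if best_score >= threshold else None
    return mapping


MAPPING = map_apis_to_data_types(API_DOCS, DATA_TYPES)
for method, data_type in MAPPING.items():
    print(f"{method} -> {data_type}")
```

Swapping `embed` for sentence embeddings from a pre-trained model would leave the rest of the sketch unchanged, since only the vector representation differs. The threshold keeps unrelated methods unmapped instead of forcing every method onto some data type.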


Project Outcomes

Journal papers (4)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Analyzing privacy policies through syntax-driven semantic analysis of information types

DOI:
10.1016/j.infsof.2021.106608
Publication date:
2021-05-12
Journal:
INFORMATION AND SOFTWARE TECHNOLOGY
Impact factor:
3.9
Authors:
Hosseini, Mitra Bokaei;Breaux, Travis D.;Wang, Xiaoyin
Corresponding author:
Wang, Xiaoyin

ConDySTA: Context-Aware Dynamic Supplement to Static Taint Analysis

DOI:
10.1109/sp40001.2021.00040
Publication date:
2021-05
Journal:
2021 IEEE Symposium on Security and Privacy (SP)
Impact factor:
0
Authors:
Xueling Zhang;Xiaoyin Wang;Rocky Slavin;Jianwei Niu
Corresponding author:
Xueling Zhang;Xiaoyin Wang;Rocky Slavin;Jianwei Niu

DAISY: Dynamic-Analysis-Induced Source Discovery for Sensitive Data

DOI:
10.1145/3569936
Publication date:
2022-10
Journal:
ACM Transactions on Software Engineering and Methodology
Impact factor:
4.4
Authors:
Xueling Zhang;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux;Xiaoyin Wang
Corresponding author:
Xueling Zhang;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux;Xiaoyin Wang

Ambiguity and Generality in Natural Language Privacy Policies

DOI:
10.1109/re51729.2021.00014
Publication date:
2021-09
Journal:
2021 IEEE 29th International Requirements Engineering Conference (RE)
Impact factor:
0
Authors:
M. Hosseini;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux
Corresponding author:
M. Hosseini;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux

Other publications by Rocky Slavin

Rethinking Security Requirements in RE Research Technical Report


DOI:
Publication date:
2014
Journal:
Impact factor:
0
Authors:
Hanan Hibshi;Rocky Slavin;Jianwei Niu;T. Breaux
Corresponding author:
T. Breaux

PVDetector: A Detector of Privacy-Policy Violations for Android Apps


DOI:
10.1145/2897073.2897720
Publication date:
2016
Journal:
2016 IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft)
Impact factor:
0
Authors:
Rocky Slavin;Xiaoyin Wang;M. Hosseini;James Hester;R. Krishnan;Jaspreet Bhatia;T. Breaux;Jianwei Niu
Corresponding author:
Jianwei Niu