CRII: SaTC: Automatic Generation of API to Natural Language Data Type Mappings for Developer and End User Privacy Risk Mitigation

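Basic information

  • Award number:
    1948244
  • Principal investigator:
  • Amount:
    $175,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Period:
    2020-03-15 to 2024-02-29
  • Project status:
    Completed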

Project abstract

Since the advent of the smartphone, an increasing share of the population has gained access to Internet-accessible software applications (apps). This, coupled with the various sensors available on mobile devices, makes the general public highly susceptible to privacy risks, as sensitive information (e.g., location, camera images, biometrics) may be leaked to the Internet. To help users make informed decisions about the potential privacy risks of using apps, regulators increasingly require app developers to include privacy policies communicating what information is collected or shared and how that information is used. However, even when such privacy policies are present, trust must be put in the app developers to adhere to the promises therein. Furthermore, developers are accountable for their adherence to their policies and must be confident that their privacy policies accurately represent their practices. This project aims to assist both developers and general app users in verifying the alignment of privacy policies and the apps they represent by producing an automated process for linking the semantics of language used in privacy policies with the code used to produce the apps themselves. Furthermore, the project will use this framework to generate tools for end users and developers to directly benefit from this work.

The research project aims to produce an automated process for generating mappings between code-level APIs and natural language data types using machine learning. The resulting mappings will be utilized in developer and end user tools to identify and help mitigate potential privacy leakage during development and app usage. The current state of misalignment detection between privacy policies and app code requires the manual generation of mappings from code-level Application Program Interface (API) methods to privacy-oriented natural language data types. Even for small app categories, this process can require a human to review thousands of methods and hundreds of annotations, resulting in potential for inaccuracies due to fatigue and incomplete domain knowledge. APIs also change as methods are introduced and deprecated, resulting in outdated mappings. These problems make it difficult to apply the framework practically as the environment continually evolves. This project will address these challenges through two contributions. First, machine learning will be applied to the mapping generation process to produce an automated, scalable method for generating code-phrase mappings for APIs as needed. This will allow for misalignment detection for API levels, methods, and app categories beyond those built in previous contributions. This automated approach will make use of state-of-the-art pre-trained language models to detect semantic similarity between API documentation and natural language data types used in privacy policies. Second, the resulting mappings from the automated model will be applied to practical developer and end user tools to enable informed decisions for privacy risk mitigation. The PoliDroid tool suite will be developed, including a developer-oriented integrated development environment plugin that detects potential unintended privacy leaks based on a privacy policy, and a real-time misalignment detection tool for end users.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Analyzing privacy policies through syntax-driven semantic analysis of information types
  • DOI:
    10.1016/j.infsof.2021.106608
  • Published:
    2021-05-12
  • Journal:
  • Impact factor:
    3.9
  • Authors:
    Hosseini, Mitra Bokaei;Breaux, Travis D.;Wang, Xiaoyin
  • Corresponding author:
    Wang, Xiaoyin
ConDySTA: Context-Aware Dynamic Supplement to Static Taint Analysis
  • DOI:
    10.1109/sp40001.2021.00040
  • Published:
    2021-05
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xueling Zhang;Xiaoyin Wang;Rocky Slavin;Jianwei Niu
  • Corresponding author:
    Xueling Zhang;Xiaoyin Wang;Rocky Slavin;Jianwei Niu
DAISY: Dynamic-Analysis-Induced Source Discovery for Sensitive Data
  • DOI:
    10.1145/3569936
  • Published:
    2022-10
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Xueling Zhang;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux;Xiaoyin Wang
  • Corresponding author:
    Xueling Zhang;John Heaps;Rocky Slavin;Jianwei Niu;T. Breaux;Xiaoyin Wang
Ambiguity and Generality in Natural Language Privacy Policies

Other publications by Rocky Slavin

Rethinking Security Requirements in RE Research Technical Report
  • DOI:
  • Published:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hanan Hibshi;Rocky Slavin;Jianwei Niu;T. Breaux
  • Corresponding author:
    T. Breaux
PVDetector: A Detector of Privacy-Policy Violations for Android Apps
Protocol-agnostic IoT Device Classification on Encrypted Traffic Using Link-Level Flows
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gabriel A. Morales;Adam Bienek;Patrick Jenkins;Rocky Slavin
  • Corresponding author:
    Rocky Slavin

Similar overseas grants

SaTC: CORE: Small: Automatic Exploits Detection and Mitigation for Industrial Control System Protocols
  • Award number:
    2345563
  • Fiscal year:
    2023
  • Funding amount:
    $175,000
  • Project category:
    Standard Grant
SaTC: CORE: Small: Automatic Identification of Privilege-guard Variables for Data-only Attacks and Defenses
  • Award number:
    2247652
  • Fiscal year:
    2023
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Small: Automatic Detection and Repair of Side Channel Vulnerabilities in Software Code
  • Award number:
    2245344
  • Fiscal year:
    2023
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Small: Sound Automatic Exploit Generation
  • Award number:
    2234257
  • Fiscal year:
    2023
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Small: Automatic Exploits Detection and Mitigation for Industrial Control System Protocols
  • Award number:
    2051621
  • Fiscal year:
    2021
  • Funding amount:
    $175,000
  • Project category:
    Standard Grant
SaTC: CORE: Small: Automatic Software Patching against Microarchitectural Attacks
  • Award number:
    1956032
  • Fiscal year:
    2020
  • Funding amount:
    $175,000
  • Project category:
    Standard Grant
SaTC: CORE: Medium: Collaborative: Understanding and Discovering Illicit Online Business Through Automatic Analysis of Online Text Traces
  • Award number:
    1850725
  • Fiscal year:
    2018
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Medium: Collaborative: Understanding and Discovering Illicit Online Business Through Automatic Analysis of Online Text Traces
  • Award number:
    1801432
  • Fiscal year:
    2018
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Medium: Collaborative: Understanding and Discovering Illicit Online Business Through Automatic Analysis of Online Text Traces
  • Award number:
    1801365
  • Fiscal year:
    2018
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant
SaTC: CORE: Medium: Collaborative: Understanding and Discovering Illicit Online Business Through Automatic Analysis of Online Text Traces
  • Award number:
    1801652
  • Fiscal year:
    2018
  • Funding amount:
    $175,000
  • Project category:
    Continuing Grant