Collaborative Research: SaTC: CORE: Small: Differentially Private Data Synthesis: Practical Algorithms and Statistical Foundations

协作研究:SaTC:核心:小型:差分隐私数据合成:实用算法和统计基础

基本信息

  • 批准号:
    2247794
  • 负责人:
  • 金额:
    $ 30万
  • 依托单位:
  • 依托单位国家:
    美国
  • 项目类别:
    Continuing Grant
  • 财政年份:
    2023
  • 资助国家:
    美国
  • 起止时间:
    2023-07-15 至 2026-06-30
  • 项目状态:
    未结题

项目摘要

Data collected by organizations and agencies are a key resource in today’s information age and fuel a significant part of today's economy. However, the disclosure of those data poses serious threats to individual privacy. One important approach to using data while protecting privacy is differential private data synthesis (DPDS). That is, given as input a private dataset, one uses a differentially private algorithm to generate synthetic datasets that are “similar” to the input dataset. While DPDS has received much attention in recent years, our understanding on this topic remains limited. This project takes a multi-disciplinary approach to advance our scientific understanding as well as improve practice techniques for DPDS. More specifically, this project’s novelties are as follows. First, it systematically explores the design space in marginal-based DPDS algorithms that have been proven to be effective in NIST competitions on DPDS, while also taking insights from data synthesis techniques developed in similar fields (often not satisfying DP). Second, it develops statistical theories that both are motivated by the empirical performances of DPDS algorithms, and guide the empirical research of these algorithms. The project’s broader significance and importance are as follows. We are in the information economy. Data of all kinds, such as online interaction, medical sensor data, genomic data, and location data are being collected. Practical techniques that enable use of these data while protecting individual privacy are crucially needed and will greatly enhance the value of such data. Users will gain from increased control of their private information, and society as a whole will benefit from deriving maximal benefit from aggregated data. PIs plan to jointly develop and teach a graduate-level course on synthetic data based on the existing research in this area as well as research results from this project, and involve undergraduate students in research. This project has two thrusts. The first thrust aims to develop new marginal-based DPDS algorithms that improve upon the state-of-art in empirical evaluations. The tasks include: perform an in-depth study of the “marginal-to-dataset” problem (how to synthesize a dataset when given a set of marginals); develop and evaluate new approaches for handling numerical attributes; and develop adaptive and automated techniques for selecting marginals so that dataset synthesized with them captures as much useful information from the input dataset as possible. The second thrust complements the empirical research in the first thrust, and aims to develop statistical theory for high dimensional marginal-based data synthesis algorithms, and also a general learning theory framework to evaluate the utility of synthetic data in downstream tasks. The two thrusts are highly complementary and support each other. The experimental study in Thrust 1 will provide insights and directions for theoretical studies in Thrust 2, which will help explain the experimental findings as well as guide additional experimental studies.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
组织和机构收集的数据是当今信息时代的关键资源,也是当今经济的重要组成部分。然而,这些数据的披露对个人隐私构成了严重威胁。在保护隐私的同时使用数据的一个重要方法是差分私有数据合成(DPDS)。也就是说,给定一个私有数据集作为输入,使用差分私有算法生成与输入数据集“相似”的合成数据集。虽然近年来DPDS受到了广泛的关注,但我们对这一主题的了解仍然有限。该项目采用多学科的方法来提高我们对DPDS的科学认识,并改进实践技术。更具体地说,这个项目的新奇之处如下。首先,它系统地探索了基于边缘的DPDS算法的设计空间,这些算法已被证明在NIST的DPDS竞赛中是有效的,同时也从类似领域开发的数据合成技术中获得了见解(通常不满足DP)。其次,发展了由DPDS算法的经验性能驱动的统计理论,并指导了这些算法的实证研究。该项目更广泛的意义和重要性如下。我们处在信息经济时代。各种各样的数据,如在线交互、医疗传感器数据、基因组数据和位置数据正在被收集。在保护个人隐私的同时能够使用这些数据的实用技术是至关重要的,并且将大大提高这些数据的价值。用户将从加强对其私人信息的控制中获益,整个社会将从汇总数据中获得最大利益。pi计划根据该领域的现有研究以及本项目的研究成果,共同开发和教授研究生水平的综合数据课程,并让本科生参与研究。这个项目有两个重点。第一个重点是开发新的基于边际的DPDS算法,以改进经验评估的最新水平。任务包括:对“边际到数据集”问题(如何在给定一组边际时合成数据集)进行深入研究;开发和评估处理数字属性的新方法;并开发自适应和自动选择边缘的技术,以便与它们合成的数据集从输入数据集中捕获尽可能多的有用信息。第二部分是对第一部分实证研究的补充,旨在为高维边缘数据合成算法发展统计理论,并为评估合成数据在下游任务中的效用提供一个通用的学习理论框架。这两个重点是高度互补和相互支持的。推力1的实验研究将为推力2的理论研究提供见解和方向,有助于解释实验结果,并指导后续的实验研究。该奖项反映了美国国家科学基金会的法定使命,并通过使用基金会的知识价值和更广泛的影响审查标准进行评估,被认为值得支持。

项目成果

期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

Ninghui Li其他文献

PURE: A Framework for Analyzing Proximity-based Contact Tracing Protocols
PURE:用于分析基于接近度的接触追踪协议的框架
  • DOI:
  • 发表时间:
    2020
  • 期刊:
  • 影响因子:
    16.6
  • 作者:
    F. Cicala;Weicheng Wang;Tianhao Wang;Ninghui Li;E. Bertino;F. Liang;Yang Yang
  • 通讯作者:
    Yang Yang
Fisher Information as a Utility Metric for Frequency Estimation under Local Differential Privacy
Fisher信息作为本地差分隐私下频率估计的效用度量
  • DOI:
  • 发表时间:
    2022
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Milan Lopuhaä;B. Škorić;Ninghui Li
  • 通讯作者:
    Ninghui Li
A formal semantics for P3P
P3P 的形式化语义
  • DOI:
  • 发表时间:
    2004
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Ting Yu;Ninghui Li;A. Antón
  • 通讯作者:
    A. Antón
Anonymizing Network Traces with Temporal Pseudonym Consistency
通过时间假名一致性对网络跟踪进行匿名化
Sensornet
传感器网
  • DOI:
  • 发表时间:
    2009
  • 期刊:
  • 影响因子:
    0
  • 作者:
    Rodney Topor;Kenneth Salem;Amarnath Gupta;K. Goda;John F. Gehrke;N. Palmer;Mohamed Sharaf;Alexandros Labrinidis;J. Roddick;Ariel Fuxman;Renée J. Miller;Wang;Anastasios Kementsietsidis;Philippe Bonnet;D. Shasha;Ronald Peikert;Bertram Ludäscher;S. Bowers;T. McPhillips;Harald Naumann;K. Voruganti;J. Domingo;Ben Carterette;Panagiotis G. Ipeirotis;Marcelo Arenas;Y. Manolopoulos;Y. Theodoridis;V. Tsotras;B. Carminati;Jan Jurjens;Eduardo B. Fernandez;Murat Kantarcıoǧlu;Jaideep Vaidya;Indrakshi Ray;Athena Vakali;Cristina Sirangelo;E. Pitoura;Himanshu Gupta;Surajit Chaudhuri;G. Weikum;Ulf Leser;David W. Embley;Fausto Giunchiglia;P. Shvaiko;Mikalai Yatskevich;Edward Y. Chang;Christine Parent;S. Spaccapietra;E. Zimányi;G. Anadiotis;S. Kotoulas;Ronny Siebes;Grigoris Antoniou;D. Plexousakis;J. Bailey;François Bry;Tim Furche;Sebastian Schaffert;David Martin;Gregory D. Speegle;Krithi Ramamritham;P. Chrysanthis;Kai;Stéphane Bressan;S. Abiteboul;D. Suciu;G. Dobbie;Tok Wang Ling;Sugato Basu;Ramesh Govindan;Michael H. Böhlen;C. S. Jensen;Jianyong Wang;K. Vidyasankar;A. Chan;Serge Mankovski;S. Elnikety;P. Valduriez;Yannis Velegrakis;Mario A. Nascimento;Michael Huggett;Andrew U. Frank;Yanchun Zhang;Guandong Xu;R. Snodgrass;Alan Fekete;Marcus Herzog;Konstantinos Morfonios;Y. Ioannidis;E. Wohlstadter;M. Matera;F. Schwagereit;Steffen Staab;Keir Fraser;Jingren Zhou;M. Mokbel;Walid G. Aref;Mirella M. Moro;Markus Schneider;Panos Kalnis;Gabriel Ghinita;Michael F. Goodchild;Shashi Shekhar;James Kang;Vijayaprasath Gandhi;Nikos Mamoulis;Betsy George;Michel Scholl;Agnès Voisard;Ralf Hartmut Güting;Yufei Tao;Dimitris Papadias;Peter Revesz;G. Kollios;E. Frentzos;Apostolos N. Papadopoulos;Bernhard Thalheim;Jovan Pehcevski;Benjamin Piwowarski;S. Theodoridis;Konstantinos Koutroumbas;George Karabatis;Don Chamberlin;Philip A. Bernstein;Michael H. Böhlen;J. Gamper;Ping Li;Kazimierz Subieta;S. Harizopoulos;Ethan Zhang;Yi Zhang;Theodore Johnson;Hans;S. Fienberg;Jiashun Jin;Radu Sion;C. Paice;Nikos Hardavellas;Ippokratis Pandis;Edie M. Rasmussen;Hiroshi Yoshida;G. Graefe;Bernd Reiner;Karl Hahn;K. Wada;T. Risch;Jiawei Han;Bolin Ding;Lukasz Golab;Michael Stonebraker;Bibudh Lahiri;Srikanta Tirthapura;Erik Vee;Yanif Ahmad;U. Çetintemel;Mitch Cherniack;S. Zdonik;Mariano P. Consens;M. Lalmas;R. Baeza;D. Hiemstra;Peer Krögerand;Arthur Zimek;Nick Craswell;Carson Kai;Maxime Crochemore;Thierry Lecroq;Arie Shoshani;Jimmy Lin;Hwanjo Yu;David B. Lomet;H. Hinterberger;Ninghui Li;Phillip B. Gibbons;Mouna Kacimi;Thomas Neumann
  • 通讯作者:
    Thomas Neumann

Ninghui Li的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('Ninghui Li', 18)}}的其他基金

Collaborative Proposal: SaTC: Frontiers: Center for Distributed Confidential Computing (CDCC)
协作提案:SaTC:前沿:分布式机密计算中心 (CDCC)
  • 批准号:
    2207204
  • 财政年份:
    2022
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
SaTC: CORE: Medium: Collaborative: User-Centered Deployment of Differential Privacy
SaTC:核心:媒介:协作:以用户为中心的差异隐私部署
  • 批准号:
    1931443
  • 财政年份:
    2020
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
RAPID: Collaborative: PPSRC: Privacy-Preserving Self-Reporting for COVID-19
RAPID:协作:PPSRC:COVID-19 隐私保护自我报告
  • 批准号:
    2034235
  • 财政年份:
    2020
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
SaTC: CORE: Improving Password Ecosystem: A Holistic Approach
SaTC:核心:改进密码生态系统:整体方法
  • 批准号:
    1704587
  • 财政年份:
    2017
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
EAGER: Bridging The Gap between Theory and Practice in Data Privacy
EAGER:弥合数据隐私理论与实践之间的差距
  • 批准号:
    1640374
  • 财政年份:
    2016
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
TWC SBE: Medium: Collaborative: User-Centric Risk Communication and Control on Mobile Devices
TWC SBE:媒介:协作:移动设备上以用户为中心的风险沟通和控制
  • 批准号:
    1314688
  • 财政年份:
    2013
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
TC: Small: Provably Private Microdata Publishing
TC:小型:可证明的私人微数据出版
  • 批准号:
    1116991
  • 财政年份:
    2011
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
CCS Workshops Organization Supplement
CCS 研讨会组织补充
  • 批准号:
    1054001
  • 财政年份:
    2010
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
TC:Medium: Collaborative Research: Towards Formal, Risk Aware Authorization
TC:中:协作研究:迈向正式的、具有风险意识的授权
  • 批准号:
    0963715
  • 财政年份:
    2010
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
TC:Medium:Collaborative Research:Techniques to Retrofit Legacy Code
TC:中:协作研究:改造遗留代码的技术
  • 批准号:
    0905442
  • 财政年份:
    2009
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant

相似国自然基金

Research on Quantum Field Theory without a Lagrangian Description
  • 批准号:
    24ZR1403900
  • 批准年份:
    2024
  • 资助金额:
    0.0 万元
  • 项目类别:
    省市级项目
Cell Research
  • 批准号:
    31224802
  • 批准年份:
    2012
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Cell Research
  • 批准号:
    31024804
  • 批准年份:
    2010
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Cell Research (细胞研究)
  • 批准号:
    30824808
  • 批准年份:
    2008
  • 资助金额:
    24.0 万元
  • 项目类别:
    专项基金项目
Research on the Rapid Growth Mechanism of KDP Crystal
  • 批准号:
    10774081
  • 批准年份:
    2007
  • 资助金额:
    45.0 万元
  • 项目类别:
    面上项目

相似海外基金

Collaborative Research: SaTC: CORE: Medium: Differentially Private SQL with flexible privacy modeling, machine-checked system design, and accuracy optimization
协作研究:SaTC:核心:中:具有灵活隐私建模、机器检查系统设计和准确性优化的差异化私有 SQL
  • 批准号:
    2317232
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: SaTC: CORE: Medium: Using Intelligent Conversational Agents to Empower Adolescents to be Resilient Against Cybergrooming
合作研究:SaTC:核心:中:使用智能会话代理使青少年能够抵御网络诱骗
  • 批准号:
    2330940
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: NSF-BSF: SaTC: CORE: Small: Detecting malware with machine learning models efficiently and reliably
协作研究:NSF-BSF:SaTC:核心:小型:利用机器学习模型高效可靠地检测恶意软件
  • 批准号:
    2338301
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: SaTC: CORE: Medium: Differentially Private SQL with flexible privacy modeling, machine-checked system design, and accuracy optimization
协作研究:SaTC:核心:中:具有灵活隐私建模、机器检查系统设计和准确性优化的差异化私有 SQL
  • 批准号:
    2317233
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: NSF-BSF: SaTC: CORE: Small: Detecting malware with machine learning models efficiently and reliably
协作研究:NSF-BSF:SaTC:核心:小型:利用机器学习模型高效可靠地检测恶意软件
  • 批准号:
    2338302
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: SaTC: CORE: Medium: Using Intelligent Conversational Agents to Empower Adolescents to be Resilient Against Cybergrooming
合作研究:SaTC:核心:中:使用智能会话代理使青少年能够抵御网络诱骗
  • 批准号:
    2330941
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Continuing Grant
Collaborative Research: SaTC: CORE: Small: Towards Secure and Trustworthy Tree Models
协作研究:SaTC:核心:小型:迈向安全可信的树模型
  • 批准号:
    2413046
  • 财政年份:
    2024
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
Collaborative Research: SaTC: EDU: Adversarial Malware Analysis - An Artificial Intelligence Driven Hands-On Curriculum for Next Generation Cyber Security Workforce
协作研究:SaTC:EDU:对抗性恶意软件分析 - 下一代网络安全劳动力的人工智能驱动实践课程
  • 批准号:
    2230609
  • 财政年份:
    2023
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
Collaborative Research: SaTC: EDU: RoCCeM: Bringing Robotics, Cybersecurity and Computer Science to the Middled School Classroom
合作研究:SaTC:EDU:RoCCeM:将机器人、网络安全和计算机科学带入中学课堂
  • 批准号:
    2312057
  • 财政年份:
    2023
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
Collaborative Research: SaTC: CORE: Medium: Understanding the Impact of Privacy Interventions on the Online Publishing Ecosystem
协作研究:SaTC:核心:媒介:了解隐私干预对在线出版生态系统的影响
  • 批准号:
    2237329
  • 财政年份:
    2023
  • 资助金额:
    $ 30万
  • 项目类别:
    Standard Grant
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了