Collaborative Research: Record Linkage and Privacy-Preserving Methods for Big Data
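Basic Information
- Award number: 1534412
- Principal investigator:
- Amount: $265.6K
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-09-15 to 2018-08-31
- Project status: Completed
- Source:
- Keywords: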
Project Abstract
This research project will develop sound statistical and machine learning techniques for preserving privacy with linked data. The study of social entities and their patterns of behavior is a crucial topic in the social sciences. Research in this area has been invigorated by the growth of the modern information infrastructure, the ease of data collection and storage, and the development of novel computational data analysis techniques. However, in many application areas, relevant and sensitive information is commonly spread across multiple databases. Data analysis is inherently impossible without merging databases, but merging increases the risk of a privacy violation. This research will address the problem of how to perform valid statistical inference in the presence of multiple data sources, data sharing, and privacy in the age of "big data." The investigators' new modeling construct for inference and uncertainty quantification will contribute to both statistics and the many disciplines for which statistics is a principal tool. The methods will have a wide range of applications in the social, economic, and behavioral sciences, including medicine, genetics, official statistics, and human rights violations. The investigators will collaborate with a postdoctoral researcher and with graduate and undergraduate students. The statistical methods will be encapsulated in open-source software packages, allowing off-the-shelf use by practitioners while facilitating more detailed control and extensions.
This interdisciplinary research project will improve upon methods in record linkage and privacy using state-of-the-art techniques from statistics and machine learning. Record linkage is the process of merging possibly noisy databases with the goal of removing duplicate entries. Privacy-preserving record linkage (PPRL) tries to identify records that refer to the same entities from multiple databases without compromising the privacy of the entities represented by these records. The research will focus on three aims: (1) development of new Bayesian methods for PPRL, where the error can be propagated exactly across the entire linkage process and into statistical inference, including new privacy measures to capture the tradeoff between utility and the risk to any individual in a linked database; (2) development of new robust methods for realizing synthetic data releases post-linkage with guarantees under differential privacy and its relaxations, to address additional layers of privacy and support broader data sharing; and (3) exploration of "big data" methods such as variational inference to address the scalability and latent cluster exchangeability issues that arise in linkage and privacy, such that the new methods can scale to multiple and large databases. The new methods will be scalable, will assess uncertainty throughout the entire linkage and privacy process, and can be evaluated using Bayesian disclosure risk and Bayesian differential privacy. The project is supported by the Methodology, Measurement, and Statistics Program and a consortium of federal statistical agencies as part of a joint activity to support research on survey and statistical methodology.
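To make the abstract's definition of record linkage concrete, the following is a minimal sketch of linking noisy duplicate records. It is not the investigators' Bayesian method: it swaps in a naive deterministic rule (string similarity above a threshold plus an exact match on birth year) and provides no privacy protection or uncertainty quantification. The record fields, the `threshold` value, and the function names `similarity` and `link_records` are assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(records, threshold=0.85):
    """Cluster records that likely refer to the same entity.

    `records` is a list of (name, birth_year) tuples (a hypothetical schema).
    Two records are linked when their birth years match exactly and their
    names are at least `threshold` similar. Returns a sorted list of
    clusters, each a list of record indices.
    """
    # Union-find structure so that links are transitive across records.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Compare every pair; real systems use blocking to avoid this O(n^2) scan.
    for i, j in combinations(range(len(records)), 2):
        (name_i, year_i), (name_j, year_j) = records[i], records[j]
        if year_i == year_j and similarity(name_i, name_j) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data: two noisy duplicates of each person, plus one distinct person
# who shares a name but not a birth year.
records = [
    ("Jane Smith", 1980),
    ("Jane Smyth", 1980),
    ("John Doe", 1975),
    ("Jon Doe", 1975),
    ("Jane Smith", 1992),
]
clusters = link_records(records)
print(clusters)
```

A hard threshold like this produces a single linkage with no measure of how uncertain each match is. Propagating that uncertainty into later inference is the gap the abstract's first aim describes.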
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Rebecca Steorts
CAREER: Scalable Record Linkage through the Microclustering Property
- Award number: 1652431
- Fiscal year: 2017
- Funding amount: $265.6K
- Project type: Continuing Grant
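Similar NSFC Grants
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Funding amount: CNY 0
- Project type: Provincial/Municipal Project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Funding amount: CNY 240,000
- Project type: Special Fund Project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Funding amount: CNY 450,000
- Project type: General Program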
Similar Overseas Grants
Collaborative Research: Examining Pyrotechnology and Ecosystem Change in the Archaeological Record
- Award number: 2413996
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: RUI: An undergraduate cohort thermochronology research and mentorship experience investigating the thermo-tectonic record of the northern Klamath Mountains
- Award number: 2242862
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: RUI: An undergraduate cohort thermochronology research and mentorship experience investigating the thermo-tectonic record of the northern Klamath Mountains
- Award number: 2242861
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: A 50,000-year continuous record of the Indian Summer Monsoon from Loktak Lake, NE India
- Award number: 2303253
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: A 50,000-year continuous record of the Indian Summer Monsoon from Loktak Lake, NE India
- Award number: 2303255
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: A 50,000-year continuous record of the Indian Summer Monsoon from Loktak Lake, NE India
- Award number: 2303254
- Fiscal year: 2023
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: Reconstructing the missing record of late Proterozoic tectonism along the western margin of Laurentia using deep-time thermochronology
- Award number: 2140481
- Fiscal year: 2022
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: Do subduction-complex metamorphic rocks record the thermal evolution of a subduction zone or periods of anomalous tectonic activity? Baja California
- Award number: 2127229
- Fiscal year: 2022
- Funding amount: $265.6K
- Project type: Standard Grant
Collaborative Research: Sustaining The Utqiagvik Aerosol Record of Decades (STUARD)
- Award number: 2127737
- Fiscal year: 2022
- Funding amount: $265.6K
- Project type: Continuing Grant
Collaborative Research: Sustaining The Utqiaġvik Aerosol Record of Decades (STUARD)
- Award number: 2127733
- Fiscal year: 2022
- Funding amount: $265.6K
- Project type: Continuing Grant


















