Advancing Record Linkage Research: Optimal Linkage Decisions and Propagating Linkage Uncertainty

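Basic information

  • Award number:
    1852841
  • Principal investigator:
  • Amount:
    $150,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-09-01 to 2022-08-31
  • Project status:
    Completed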

Project abstract

This project will advance research on record linkage. It is increasingly common to find complementary information on individuals scattered across multiple data sources. To take full advantage of these data sources, researchers need to be able to link information on the same individuals. In many applications, however, there are no unique identifiers of the individuals in the datafiles. This makes it difficult to recognize which records correspond to the same individuals. Statistical methodology will be developed for creating merged datafiles and for improving analyses of the linked data. These data linkages will allow richer data analyses and potentially substitute for or facilitate new data collection efforts. Researchers across disciplines will benefit from being able to use statistically rigorous procedures to merge datasets and to carry out analyses with linked data. The ability to create and analyze richer datasets will facilitate understanding of policy options in important areas such as education and health, thus furthering societal interests. A graduate student will be trained as part of this project, and the techniques will be made available as part of free software packages along with tutorials.
This project will use the output of probabilistic record linkage procedures to develop rigorous statistical methodology for creating merged datafiles. Coherent approaches for propagating linkage uncertainty into subsequent analyses will be explored. To create merged datasets, the investigator will derive an estimator of the true linkage of the datafiles. A loss function will be developed through which researchers will be able to give different weights to different types of linkage errors. The linkage estimator will be derived by minimizing the expected value of the researcher-defined loss function. The point estimators will also include the option of "abstaining" from linking records for which the correct links are highly uncertain.
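The idea of choosing a linkage by minimizing expected loss, with an option to abstain, can be illustrated with a deliberately simplified sketch. It is not the project's estimator: it treats each record pair independently (ignoring one-to-one matching constraints), assumes a posterior match probability is already available for the pair, and uses hypothetical cost names (`c_false_link`, `c_missed_link`, `c_abstain`) that do not come from the abstract.

```python
def decide(p_match: float,
           c_false_link: float = 1.0,
           c_missed_link: float = 1.0,
           c_abstain: float = 0.2) -> str:
    """Pick the action with the smallest expected loss for one record pair.

    p_match is an assumed posterior probability that the pair is a true match.
    The three costs are hypothetical, researcher-chosen weights.
    """
    expected_loss = {
        # Linking is wrong when the pair is not a match.
        "link": (1.0 - p_match) * c_false_link,
        # Not linking is wrong when the pair is a match.
        "no_link": p_match * c_missed_link,
        # Abstaining costs a fixed amount regardless of the truth.
        "abstain": c_abstain,
    }
    return min(expected_loss, key=expected_loss.get)


if __name__ == "__main__":
    for p in (0.95, 0.50, 0.02):
        print(p, decide(p))
```

With these default costs, pairs whose match probability is close to 0 or 1 receive a firm decision, while ambiguous pairs are set aside. Raising `c_abstain` makes the rule abstain less often.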
To perform statistical analyses with merged data, the investigator will explore procedures in which researchers carry out the statistical analysis they are interested in for each of several plausible linkages of the data and then combine the output from these analyses. The procedures will be validated theoretically, via simulation studies, and using real data analyses.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
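The abstract does not say how the analyses run on several plausible linkages would be combined. As one conceivable illustration only, the sketch below applies the combining rules familiar from multiple imputation (Rubin's rules) to a point estimate and its variance obtained from each linkage. Using this rule here is an assumption, and the function name `combine` is hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def combine(estimates: Sequence[float],
            variances: Sequence[float]) -> Tuple[float, float]:
    """Combine per-linkage results with multiple-imputation-style rules.

    estimates[i] and variances[i] are the point estimate and its estimated
    variance from the analysis run on the i-th plausible linkage.
    Returns (combined estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two linkages and matching lengths")
    q_bar = mean(estimates)        # average of the point estimates
    within = mean(variances)       # average within-linkage variance
    between = variance(estimates)  # sample variance across linkages (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


if __name__ == "__main__":
    print(combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The between-linkage term is what carries linkage uncertainty into the final variance: if every plausible linkage gave the same estimate, the total variance would reduce to the average within-linkage variance.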

Project outputs

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Discussion of ‘A Unified Framework for De-Duplication and Population Size Estimation’ by Tancredi, Steorts, and Liseo.
  • DOI:
  • Publication year:
    2020
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Sadinle, Mauricio
  • Corresponding author:
    Sadinle, Mauricio
The Central Role of the Identifying Assumption in Population Size Estimation
  • DOI:
  • Publication year:
    2023
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Aleshin-Guendel, Serge;Sadinle, Mauricio;Wakefield, Jon
  • Corresponding author:
    Wakefield, Jon
Multifile Partitioning for Record Linkage and Duplicate Detection
Discussion of ‘Multiple-Systems Analysis for the Quantification of Modern Slavery: Classical and Bayesian Approaches’ by Bernard Silverman

Other publications by Mauricio Sadinle Garcia-Ruiz

Similar overseas grants

Occupational exposure to ionizing radiation and the impacts on cancer incidence, and mortality: a record linkage cohort study of nearly one million workers in the Canadian National Dose Registry
  • Award number:
    480070
  • Fiscal year:
    2023
  • Amount awarded:
    $150,000
  • Project type:
    Operating Grants
Multifile probabilistic record linkage for drug overdose surveillance and public health action
  • Award number:
    10039949
  • Fiscal year:
    2020
  • Amount awarded:
    $150,000
  • Project type:
Multifile probabilistic record linkage for drug overdose surveillance and public health action
  • Award number:
    10200740
  • Fiscal year:
    2020
  • Amount awarded:
    $150,000
  • Project type:
Developing realistic attack models for privacy preserving record linkage and algorithms to prevent such attacks
  • Award number:
    407023611
  • Fiscal year:
    2019
  • Amount awarded:
    $150,000
  • Project type:
    Research Grants
Parent-Offspring Record Linkage for Population-Based Chronic Disease Research: Planning for an International Consortium Meeting
  • Award number:
    412144
  • Fiscal year:
    2019
  • Amount awarded:
    $150,000
  • Project type:
    Miscellaneous Programs
Record Linkage Across Heterogeneous Data Sources
  • Award number:
    RGPIN-2014-05304
  • Fiscal year:
    2019
  • Amount awarded:
    $150,000
  • Project type:
    Discovery Grants Program - Individual
Familial Histories of Comorbid Conditions for Predicting Fracture Risk using Population-Based Record Linkage
  • Award number:
    383533
  • Fiscal year:
    2018
  • Amount awarded:
    $150,000
  • Project type:
    Studentship Programs
Leveraging record linkage for single-indication medications to boost recruitment in Psychiatric and Pharmaco-Genetics
  • Award number:
    nhmrc : GNT1138514
  • Fiscal year:
    2018
  • Amount awarded:
    $150,000
  • Project type:
    Project Grants
Record Linkage Across Heterogeneous Data Sources
  • Award number:
    RGPIN-2014-05304
  • Fiscal year:
    2018
  • Amount awarded:
    $150,000
  • Project type:
    Discovery Grants Program - Individual