Improving Probabilistic Record Linkage and Subsequent Inference
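Basic information
- Award number: 1824555
- Principal investigator:
- Award amount: $204,600
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-10-01 to 2020-09-30
- Project status: Completed
- Source:
- Keywords:
Project abstract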
This research project will develop methods for linking records across databases in the absence of unique identifiers such as Social Security numbers and for making inference using the linked data files. Record linkage is a perennial and challenging problem across the social sciences, with important applications in areas such as demography, economics, public health, and official statistics. Plummeting costs of new forms of data collection and storage and the proliferation of "big data" have increased the need for merging such databases as researchers and statistical agencies struggle to integrate carefully curated datasets with messy and incomplete data from historical, administrative, and commercial sources. The methods developed in this project will facilitate the successful integration of different data sources, thus generating new resources for future research. These combined data sources may also provide some alternatives to expensive survey data collection in an era of declining response rates. Freely available software will be developed and stored in a public repository.
The increasing desire to deploy probabilistic record linkage has spurred significant research into various components of the process, such as how to compare records, how to reduce the number of record comparisons to keep the problem computationally feasible, how to quantify the weight of evidence for or against a link between records, and how to ultimately generate a merged database. Often these components are studied in isolation from each other and from the ultimate goal of making inferences using the merged files. This research project will take a more holistic view of the record linkage process in order to advance the state of the art. The project has two primary goals. The first goal is to develop new models for record linkage that incorporate the impact of preprocessing methods that reduce the total number of record pairs to be evaluated. While widely deployed and well motivated, these methods have effects on subsequent modeling that are not well understood. The second goal is to enhance understanding of uncertainty and error throughout the process and to develop imputation methods for propagating error due to uncertain record links and other missing data, such as item nonresponse in a survey. These methods will be designed with an eye toward large applications that require new computational approaches. The project is supported by the Methodology, Measurement, and Statistics Program and a consortium of federal statistical agencies as part of a joint activity to support research on survey and statistical methodology.
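The components named in the abstract (comparing records, reducing the number of comparisons, and weighing evidence for or against a link) can be illustrated with a minimal sketch. It is not the project's method or software. It assumes the classical Fellegi-Sunter weighting scheme, a simple blocking rule on the first letter of the last name, exact-match field comparisons, and made-up m- and u-probabilities. All records, field names, and function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy files with no shared unique identifier.
FILE_A = [
    {"id": "a1", "first": "maria", "last": "lopez", "year": 1980},
    {"id": "a2", "first": "john", "last": "smith", "year": 1975},
]
FILE_B = [
    {"id": "b1", "first": "maria", "last": "lopez", "year": 1980},
    {"id": "b2", "first": "jon", "last": "smith", "year": 1975},
    {"id": "b3", "first": "lee", "last": "chan", "year": 1990},
]

# Assumed probabilities (illustrative values, not estimated from data):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_PROB = {"first": 0.90, "last": 0.95, "year": 0.85}
U_PROB = {"first": 0.05, "last": 0.02, "year": 0.10}


def block_key(record):
    """Blocking key (an assumption): first letter of the last name."""
    return record["last"][0].upper()


def candidate_pairs(file_a, file_b):
    """Yield only pairs that share a blocking key, not the full cross product."""
    blocks = defaultdict(list)
    for rec_b in file_b:
        blocks[block_key(rec_b)].append(rec_b)
    for rec_a in file_a:
        for rec_b in blocks.get(block_key(rec_a), []):
            yield rec_a, rec_b


def match_weight(rec_a, rec_b):
    """Sum of per-field log2 likelihood ratios (Fellegi-Sunter style)."""
    weight = 0.0
    for field, m in M_PROB.items():
        u = U_PROB[field]
        if rec_a[field] == rec_b[field]:
            weight += math.log2(m / u)  # agreement is evidence for a link
        else:
            weight += math.log2((1 - m) / (1 - u))  # disagreement is evidence against
    return weight


if __name__ == "__main__":
    for rec_a, rec_b in candidate_pairs(FILE_A, FILE_B):
        print(rec_a["id"], rec_b["id"], round(match_weight(rec_a, rec_b), 2))
```

In this toy setup, blocking discards pairs before they are ever scored, so a true match whose blocking key differs between files (for example, a misspelled last name) would never be evaluated. That is one way a preprocessing step can affect later modeling, which is the kind of interaction the first project goal describes.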
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Jared Murray
Cervical Spine Alignment in Helmeted Skiers and Snowboarders With Suspected Head and Neck Injuries: Comparison of Lateral C-spine Radiographs Before and After Helmet Removal and Implications for Ski Patrol Transport
- DOI: 10.1016/j.wem.2017.03.009
- Publication date: 2017-09-01
- Journal:
- Impact factor: 1.2
- Authors: Jared Murray; David A. Rust
- Corresponding author: David A. Rust
Other grants by Jared Murray
CAREER: Bayesian Tree Models for Next-Generation Studies in the Behavioral and Social Sciences
- Award number: 2046896
- Fiscal year: 2021
- Award amount: $204,600
- Award type: Continuing Grant
Improving Probabilistic Record Linkage and Subsequent Inference
- Award number: 1631970
- Fiscal year: 2016
- Award amount: $204,600
- Award type: Standard Grant
Similar overseas grants
New approaches to training deep probabilistic models
- Award number: 2613115
- Fiscal year: 2025
- Award amount: $204,600
- Award type: Studentship
Probabilistic Inference Based Utility Evaluation and Path Generation for Active Autonomous Exploration of USVs in Unknown Confined Marine Environments
- Award number: EP/Y000862/1
- Fiscal year: 2024
- Award amount: $204,600
- Award type: Research Grant
ProbAI: A Hub for the Mathematical and Computational Foundations of Probabilistic AI
- Award number: EP/Y028783/1
- Fiscal year: 2024
- Award amount: $204,600
- Award type: Research Grant
Towards the next generation probabilistic flood forecasting system for the UK
- Award number: 2907694
- Fiscal year: 2024
- Award amount: $204,600
- Award type: Studentship
Understanding conscious and unconscious learning of probabilistic information
- Award number: 24K16877
- Fiscal year: 2024
- Award amount: $204,600
- Award type: Grant-in-Aid for Early-Career Scientists
Probabilistic arrival time prediction algorithm using a-priori knowledge and machine learning to enable sustainable air traffic management
- Award number: 24K07723
- Fiscal year: 2024
- Award amount: $204,600
- Award type: Grant-in-Aid for Scientific Research (C)
Collaborative Research: SHF: Medium: Verifying Deep Neural Networks with Spintronic Probabilistic Computers
- Award number: 2311295
- Fiscal year: 2023
- Award amount: $204,600
- Award type: Continuing Grant
CAREER: Set-Systems: Probabilistic, Geometric and Extremal Perspectives
- Award number: 2237138
- Fiscal year: 2023
- Award amount: $204,600
- Award type: Continuing Grant
RAPID/Collaborative Research: Advancing Probabilistic Fault Displacement Hazard Assessments by Collecting Perishable Data from the 2023 Turkiye Earthquake Sequence
- Award number: 2330152
- Fiscal year: 2023
- Award amount: $204,600
- Award type: Standard Grant
Probabilistic models of zeta-functions and applications to number theory
- Award number: 22KJ2747
- Fiscal year: 2023
- Award amount: $204,600
- Award type: Grant-in-Aid for JSPS Fellows