Privacy-preserving methods and tools for handling missing data in distributed health data networks
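Basic information
- Grant number: 9364071
- Principal investigator:
- Amount: $598,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-08 to 2021-06-30
- Project status: Completed
- Source: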
- Keywords: Address, Algorithms, Area, Client, Clinical Data, Clinical Research, Collaborations, Communication, Computer software, Data, Databases, Disclosure, Docking, Electronic Health Record, Environment, Goals, Health, Healthcare Systems, Hospitals, Individual, Informatics, Institution, Institutional Policy, Insurance, Learning, Legal, Methodology, Methods, Obesity, Patients, Pattern, Privacy, Privatization, Probability, Provider, Research, Research Personnel, Secure, Security, Site, System, Weight, biomedical informatics, cost, data sharing, distributed data, electronic data, health care service, health data, improved, interest, open source, patient privacy, precision medicine, profiles in patients, public trust, repository, scale up, simulation, statistics, tool, user-friendly, web interface
Project abstract
PROJECT SUMMARY
Distributed health data networks (DHDNs) that leverage electronic health records (EHRs) (e.g., eMerge,
pSCANNER, PEDSnet) have drawn substantial interest in recent years because they a) eliminate the need to
create, maintain, and secure access to central data repositories, b) minimize the need to disclose protected
health information outside the data-owning entity, and c) mitigate many security, proprietary, legal, and privacy
concerns. Missing data are ubiquitous and present analytical challenges in DHDNs. However, very little
research has addressed missing data in such settings. When applied to a distributed
environment, the current state-of-the-art approaches for handling missing data require pooling raw data into a
central repository before analysis and hence require individual-level data sharing, which may not be feasible
for a number of reasons, including institutional policies prohibiting such sharing, high regulatory hurdles, public
privacy concerns, and costs/overhead of moving massive amounts of data. A large body of research has
demonstrated that, given some background information about an individual, such as data from EHRs, an
adversary can learn sensitive information about that individual from “de-identified” data, and that improper
disclosure of individual-level data may have serious implications. The proposed research will address the
challenges associated with handling missing data in distributed analysis and fill a crucial methodology gap. We
propose the following specific aims: 1) develop privacy-preserving distributed methods for handling missing
data in horizontally partitioned data; 2) develop privacy-preserving distributed methods for handling missing
data in vertically partitioned data; 3) develop a user-friendly toolkit to allow researchers to handle missing data
for distributed analysis in health data networks; and 4) evaluate and validate the methods and toolkit using the
UCSD obesity patient data prepared for pSCANNER and data from PEDSnet, in addition to simulated data.
The proposed approaches will enable the use of data across multiple sites and will not require pooling patient-level
data into a central repository. They can be scaled up to handle massive amounts of data in DHDNs, because
the decomposed computation can be parallelized across all participating parties. The results of our study will
significantly advance the state-of-the-art in missing data methodology for DHDNs. The privacy-preserving
software toolkit will enable researchers to use more complete data in their research by leveraging information
from multiple sites without compromising patient privacy, and will help lower regulatory and other hurdles to
collaboration across multiple institutions and build public trust. As such, it will encourage more institutions
and healthcare systems to become part of a clinical data research network and more patients to participate in
clinical studies, which will improve the validity, robustness and generalizability of research findings and offer
substantial benefits in areas including, but not limited to, precision medicine and informatics practice.
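
As a rough illustration of what "decomposed computation" over horizontally partitioned data can look like, the sketch below has each site share only a count and a sum of its observed values, from which a network-wide mean is computed and then used to fill missing entries locally. This is not the project's method: mean imputation is a deliberately naive stand-in, the function names (`local_summary`, `global_mean`, `impute_locally`) and the toy values are hypothetical, and sharing summaries from very small sites can itself leak information.

```python
from typing import List, Optional, Tuple

# One site's values for a single variable; None marks a missing entry.
SiteData = List[Optional[float]]


def local_summary(values: SiteData) -> Tuple[int, float]:
    """Run at each site: return (count, sum) of the observed values only."""
    observed = [v for v in values if v is not None]
    return len(observed), sum(observed)


def global_mean(summaries: List[Tuple[int, float]]) -> float:
    """Run at the coordinator: combine per-site summaries into one mean."""
    n = sum(count for count, _ in summaries)
    total = sum(subtotal for _, subtotal in summaries)
    if n == 0:
        raise ValueError("no observed values at any site")
    return total / n


def impute_locally(values: SiteData, fill: float) -> List[float]:
    """Run at each site: replace missing entries with the shared mean."""
    return [fill if v is None else v for v in values]


# Hypothetical toy data for three sites (illustrative values only).
sites: List[SiteData] = [
    [30.0, None, 34.0],
    [None, 28.0],
    [32.0, 36.0, None, 32.0],
]

summaries = [local_summary(s) for s in sites]      # only these leave a site
mean = global_mean(summaries)                      # computed centrally
completed = [impute_locally(s, mean) for s in sites]
```

Only the per-site `(count, sum)` pairs cross institutional boundaries here; the patient-level lists never leave the site that holds them, and each site's step can run in parallel with the others.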
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Qi Long
Statistical Modeling of Alzheimer's Disease Progression Integrating Brain Imaging and -Omics Data
- Grant number: 10457208
- Fiscal year: 2021
- Funding amount: $598,500
- Project category:
Statistical Modeling of Alzheimer's Disease Progression Integrating Brain Imaging and -Omics Data
- Grant number: 10579286
- Fiscal year: 2021
- Funding amount: $598,500
- Project category:
Statistical Modeling of Alzheimer's Disease Progression Integrating Brain Imaging and -Omics Data
- Grant number: 10359718
- Fiscal year: 2021
- Funding amount: $598,500
- Project category:
A comparative analysis of human and canine iNKT cells for ACT
- Grant number: 10287095
- Fiscal year: 2017
- Funding amount: $598,500
- Project category:
Coordinating Center for Canine Immunotherapy Trials and Correlative Studies
- Grant number: 10255532
- Fiscal year: 2017
- Funding amount: $598,500
- Project category:
Coordinating Center for Canine Immunotherapy Trials and Correlative Studies
- Grant number: 10260668
- Fiscal year: 2017
- Funding amount: $598,500
- Project category:
Coordinating Center for Canine Immunotherapy Trials and Correlative Studies
- Grant number: 10247892
- Fiscal year: 2017
- Funding amount: $598,500
- Project category:
Statistical Methods for Causal Inference in Observational Studies
- Grant number: 8870561
- Fiscal year: 2015
- Funding amount: $598,500
- Project category:
Evaluating Prediction Models for Cancer Endpoints Subject to Dependent Censoring
- Grant number: 8443616
- Fiscal year: 2013
- Funding amount: $598,500
- Project category:
Similar overseas grants
Approximate algorithms and architectures for area efficient system design
- Grant number: LP170100311
- Fiscal year: 2018
- Funding amount: $598,500
- Project category: Linkage Projects
AMPS: Rank Minimization Algorithms for Wide-Area Phasor Measurement Data Processing
- Grant number: 1736326
- Fiscal year: 2017
- Funding amount: $598,500
- Project category: Standard Grant
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2017
- Funding amount: $598,500
- Project category: Discovery Grants Program - Individual
Rigorous simulation of speckle fields caused by large area rough surfaces using fast algorithms based on higher order boundary element methods
- Grant number: 375876714
- Fiscal year: 2017
- Funding amount: $598,500
- Project category: Research Grants
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2016
- Funding amount: $598,500
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2015
- Funding amount: $598,500
- Project category: Discovery Grants Program - Individual
Low Power, Area Efficient, High Speed Algorithms and Architectures for Computer Arithmetic, Pattern Recognition and Cryptosystems
- Grant number: 1686-2013
- Fiscal year: 2014
- Funding amount: $598,500
- Project category: Discovery Grants Program - Individual
AREA: Optimizing gene expression with mRNA free energy modeling and algorithms
- Grant number: 8689532
- Fiscal year: 2014
- Funding amount: $598,500
- Project category:
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Monitoring of Power Systems
- Grant number: 1329780
- Fiscal year: 2013
- Funding amount: $598,500
- Project category: Standard Grant
CPS: Synergy: Collaborative Research: Distributed Asynchronous Algorithms and Software Systems for Wide-Area Monitoring of Power Systems
- Grant number: 1329745
- Fiscal year: 2013
- Funding amount: $598,500
- Project category: Standard Grant