A Risk Management Framework for Identifiability in Genomics Research


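Basic Information

  • Grant number:
    9754854
  • Principal investigator:
  • Amount:
    $240,600
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2012
  • Funding country:
    United States
  • Project period:
    2012-09-21 to 2021-07-31
  • Project status:
    Completed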

Project Abstract

The past decade has witnessed numerous demonstrations that genomic data can be traced back to the corresponding named individuals. These attacks exploit various collections, including the NIH Database of Genotypes and Phenotypes (dbGaP), the 1000 Genomes Project, and the Beacon Project of the Global Alliance for Genomics and Health, and are often reported in the popular media. At the same time, research conducted in the first phase of this grant (from 2012-2016) showed that such re-identification attacks often represent worst-case, non-generalizable scenarios. Specifically, it was shown that these attacks often focus on the possibility of attack - and not its probability given the wide range of factors often at play in practice. By focusing on the possible, such investigations can lead policy makers to believe that de-identification is a useless activity. However, our research showed that de-identification is only one part of a larger strategy of deterrents that can be used to manage risk. By intelligently combining de-identification with other technical risk mitigation approaches (e.g., controlled access) and societal constructs (e.g., data use agreements and penalties), genomic data sharing solutions can be developed with appropriate levels of risk and utility for scientists and society. While our research laid the foundation for managing identification risk in genomic data sharing, significant questions remain regarding its translation into practical guidance. In particular, risk management models must be specialized to the type of data that is shared, the types of penalties (or punishments) available, and the costs of adopting and administering deterrence mechanisms.
Thus, in the second phase of this research project, we propose to augment risk-based re-identification management frameworks to model and assess the deterrence approaches invoked by existing repositories, such as dbGaP (which holds a collection of smaller historical datasets from completed studies), as well as emerging initiatives, such as the Precision Medicine Initiative. This project will pursue three specific aims, designed to work in harmony, but at the same time sufficiently independent that if one fails, the research will still yield fruitful risk management guidance for genomic databases: 1) Develop game theoretic models to assess re-identification attacks at different levels of detail in genomic data sharing (e.g., aggregate summaries of the proportion of variants in case vs. control groups in association studies); 2) Characterize and measure the costs associated with common re-identification deterrence approaches for genomic data (e.g., physical investigatory reviews and virtual audits of IT system use); and 3) Optimize the parameterization of a deterrence policy (e.g., the amount of damages for violation of a data use agreement or the amount of time to withhold data from an attacker/investigator) given the expected value of genomic data. We will evaluate these approaches with a large repository of de-identified genomic and electronic medical records in use at a large academic medical center, datasets hosted at two federal repositories, and a web system that presents summary statistics from a cohort of 9000 participants.
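
The third aim (choosing a deterrence parameter, such as damages for violating a data use agreement, relative to the expected value of the data) can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the project's actual model: the `Scenario` fields, the single rational attacker, and the linear payoff are all hypothetical choices made for illustration, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical inputs; none of these values come from the project."""
    data_value: float    # attacker's gain from a successful re-identification
    success_prob: float  # probability that an attack re-identifies its target
    detect_prob: float   # probability that a violation is detected and penalized
    attack_cost: float   # attacker's cost of mounting the attack


def attacker_payoff(s: Scenario, penalty: float) -> float:
    """Expected payoff to an attacker who chooses to attack (assumed linear)."""
    return s.success_prob * s.data_value - s.detect_prob * penalty - s.attack_cost


def attacks(s: Scenario, penalty: float) -> bool:
    """Assume a rational attacker acts only on a strictly positive payoff."""
    return attacker_payoff(s, penalty) > 0


def minimal_deterrent_penalty(s: Scenario) -> float:
    """Smallest penalty at which attacking is no longer strictly profitable."""
    if s.detect_prob <= 0:
        raise ValueError("a penalty cannot deter if violations are never detected")
    gain = s.success_prob * s.data_value - s.attack_cost
    return max(0.0, gain / s.detect_prob)


if __name__ == "__main__":
    s = Scenario(data_value=1000.0, success_prob=0.5,
                 detect_prob=0.25, attack_cost=100.0)
    penalty = minimal_deterrent_penalty(s)
    print(penalty, attacks(s, penalty), attacks(s, penalty - 1.0))
```

In this toy setting the required penalty grows as detection becomes less likely, which is one way to see why the abstract treats audit costs (aim 2) and penalty levels (aim 3) as linked design choices.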

Project Outcomes

Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robust Transparency Against Model Inversion Attacks.
Integrating linear optimization with structural modeling to increase HIV neutralization breadth.
  • DOI:
    10.1371/journal.pcbi.1005999
  • Published:
    2018
  • Journal:
  • Impact factor:
    4.3
  • Authors:
    Sevy, Alexander M; Panda, Swetasudha; Crowe Jr, James E; Meiler, Jens; Vorobeychik, Yevgeniy
  • Corresponding author:
    Vorobeychik, Yevgeniy

Other publications by Bradley A. Malin

Dataset Representativeness and Downstream Task Fairness
  • DOI:
  • Published:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Victor A. Borza;Andrew Estornell;Chien;Bradley A. Malin;Yevgeniy Vorobeychik
  • Corresponding author:
    Yevgeniy Vorobeychik
APPLICATIONS OF HOMOMORPHIC ENCRYPTION
  • DOI:
  • Published:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    David Archer;Lily Chen;Jung Hee Cheon;Ran Gilad;Roger A. Hallman;Zhicong Huang;Xiaoqian Jiang;R. Kumaresan;Bradley A. Malin;Heidi Sofia;Yongsoo Song;Shuang Wang
  • Corresponding author:
    Shuang Wang
Protecting Genomic Sequence Anonymity with Generalization Lattices
Optimizing word embeddings for small datasets: a case study on patient portal messages from breast cancer patients
  • DOI:
    10.1038/s41598-024-66319-z
  • Published:
    2024-07-12
  • Journal:
  • Impact factor:
    3.900
  • Authors:
    Qingyuan Song;Congning Ni;Jeremy L. Warner;Qingxia Chen;Lijun Song;S. Trent Rosenbloom;Bradley A. Malin;Zhijun Yin
  • Corresponding author:
    Zhijun Yin
Computational strategic recruitment for representation and coverage studied in the All of Us Research Program
  • DOI:
    10.1038/s41746-025-01804-x
  • Published:
    2025-07-03
  • Journal:
  • Impact factor:
    15.100
  • Authors:
    Victor A. Borza;Qingxia Chen;Ellen W. Clayton;Murat Kantarcioglu;Lina Sulieman;Yevgeniy Vorobeychik;Bradley A. Malin
  • Corresponding author:
    Bradley A. Malin

Other grants by Bradley A. Malin

Ethics Core (FABRIC)
  • Grant number:
    10662376
  • Fiscal year:
    2023
  • Funding amount:
    $240,600
  • Project type:
Ethics Core (FABRIC)
  • Grant number:
    10473062
  • Fiscal year:
    2022
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    8695427
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    9301793
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    9193769
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    8548389
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    9360125
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    8341447
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
A Risk Management Framework for Identifiability in Genomics Research
  • Grant number:
    8915734
  • Fiscal year:
    2012
  • Funding amount:
    $240,600
  • Project type:
Automated Detection of Anomalous Accesses to Electronic Health Records
  • Grant number:
    8882547
  • Fiscal year:
    2009
  • Funding amount:
    $240,600
  • Project type:

Similar overseas grants

Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
  • Grant number:
    MR/S03398X/2
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
  • Grant number:
    EP/Y001486/1
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
  • Grant number:
    2338423
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
  • Grant number:
    MR/X03657X/1
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
  • Grant number:
    2348066
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
  • Grant number:
    AH/Z505481/1
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10107647
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    EU-Funded
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
  • Grant number:
    2341402
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Standard Grant
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
  • Grant number:
    10106221
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
  • Grant number:
    AH/Z505341/1
  • Fiscal year:
    2024
  • Funding amount:
    $240,600
  • Project type:
    Research Grant