A Risk Management Framework for Identifiability in Genomics Research
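Basic Information
- Grant number: 9754854
- Principal investigator:
- Amount: $240,600
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2012
- Funding country: United States
- Project period: 2012-09-21 to 2021-07-31
- Project status: Completed
- Source: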
- Keywords: Academic Medical Centers; Address; Adopted; Agreement; Back; Behavior; Case Study; Collection; Computerized Medical Record; Contracts; Control Groups; Cost Measures; Data; Data Aggregation; Data Collection; Data Security; Data Set; Databases; Detection; Foundations; Funding; Genome; Genomic Segment; Genomic approach; Genomics; Grant; Health; Individual; Information Systems; Intelligence; Internet; Investigation; Knowledge; Lead; Length; Life; Link; Measures; Medical center; Modeling; Motivation; Names; Paper; Participant; Peer Review; Phase; Play; Policies; Policy Maker; Precision Medicine Initiative; Privacy; Probability; Process; Progress Reports; Publications; Published Database; Publishing; Punishment; Records; Reporting; Research; Research Personnel; Research Project Grants; Resources; Risk; Risk Assessment; Risk Management; Scientist; Services; Societies; System; Theoretical model; Time; Translations; United States National Institutes of Health; Variant; Work; base; cohort; cost; data management; data sharing; database of Genotypes and Phenotypes; design; genomic data; human subject; phenome; phenotypic data; precision medicine; repository; risk mitigation; social; socioeconomics; statistics; theories; tool; virtual
Project Abstract
The past decade has witnessed numerous demonstrations that genomic data can be traced back to the
corresponding named individuals. These attacks exploit various collections, including the NIH Database of
Genotypes and Phenotypes (dbGaP), the 1000 Genomes Project, and the Beacon Project of the Global Alliance
for Genomics and Health, and are often reported in the popular media. At the same time, research conducted
in the first phase of this grant (2012-2016) showed that such re-identification attacks often represent
worst-case, non-generalizable scenarios. Specifically, it was shown that these attacks often focus on the possibility of
attack - and not its probability given the wide range of factors often at play in practice. By focusing on the
possible, such investigations can lead policy makers to believe that de-identification is a useless activity.
However, our research showed that de-identification is only one part of a larger strategy of deterrents that can
be used to manage risk. By intelligently combining de-identification with other technical risk mitigation
approaches (e.g., controlled access) and societal constructs (e.g., data use agreements and penalties), genomic
data sharing solutions can be developed with appropriate levels of risk and utility for scientists and society. While
our research laid the foundation for managing identification risk in genomic data sharing, significant questions
remain regarding its translation into practical guidance. In particular, risk management models must be
specialized to the type of data that is shared, the types of penalties (or punishments) available, and the costs of
adopting and administering deterrence mechanisms. Thus, in the second phase of this research project, we
propose to augment risk-based re-identification management frameworks to model and assess the deterrence
approaches invoked by existing repositories, such as dbGaP (which holds a collection of smaller historical
datasets from completed studies), as well as emerging initiatives, such as the Precision Medicine Initiative. This
project will pursue three specific aims, designed to work in harmony, but at the same time sufficiently independent
that if one fails, the research will still yield fruitful risk management guidance for genomic databases: 1) Develop
game theoretic models to assess re-identification attacks at different levels of detail in genomic data sharing
(e.g., aggregate summaries of the proportion of variants in case vs. control groups in association studies); 2)
Characterize and measure the costs associated with common re-identification deterrence approaches for
genomic data (e.g., physical investigatory reviews and virtual audits of IT system use); and 3) Optimize the
parameterization of a deterrence policy (e.g., the amount of damages for violation of a data use agreement or
the amount of time to withhold data from an attacker/investigator) given the expected value of genomic data.
We will evaluate these approaches with a large repository of de-identified genomic and electronic medical
records in use at a large academic medical center, datasets hosted at two federal repositories, and a web system
that presents summary statistics from a cohort of 9000 participants.
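
The third aim (choosing a penalty given the expected value of the data) can be illustrated with a deliberately simple sketch. It is not the project's model: it assumes a single rational, risk-neutral attacker who attacks only when the expected payoff is positive, and a penalty that is applied whenever the attempt is detected. The class `AttackScenario`, its fields, and both function names are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackScenario:
    """Hypothetical parameters for one re-identification attempt."""
    gain: float          # attacker's payoff if re-identification succeeds
    success_prob: float  # probability the attack re-identifies the target
    attack_cost: float   # attacker's cost of mounting the attack
    detect_prob: float   # probability the violation is detected (e.g., by an audit)


def attacker_expected_payoff(s: AttackScenario, penalty: float) -> float:
    """Expected payoff of attacking when detected violations incur `penalty`."""
    return s.success_prob * s.gain - s.attack_cost - s.detect_prob * penalty


def minimal_deterring_penalty(s: AttackScenario) -> float:
    """Smallest penalty at which attacking is no better than not attacking.

    Not attacking is assumed to yield a payoff of 0, so the threshold solves
    success_prob * gain - attack_cost - detect_prob * penalty = 0.
    """
    if s.detect_prob <= 0:
        raise ValueError("a penalty cannot deter when detection probability is zero")
    threshold = (s.success_prob * s.gain - s.attack_cost) / s.detect_prob
    # If the attack is already unprofitable, no penalty is needed.
    return max(0.0, threshold)


if __name__ == "__main__":
    scenario = AttackScenario(gain=1000.0, success_prob=0.5,
                              attack_cost=100.0, detect_prob=0.25)
    print(minimal_deterring_penalty(scenario))
```

Under these assumptions the required penalty falls as detection becomes more likely, which is one way to see why the abstract treats audit costs (aim 2) and penalty levels (aim 3) as linked design choices.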
Project Outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Robust Transparency Against Model Inversion Attacks.
- DOI: 10.1109/tdsc.2020.3019508
- Published: 2021-09
- Journal:
- Impact factor: 7.3
- Authors: Alufaisan Y; Kantarcioglu M; Zhou Y
- Corresponding author: Zhou Y
Integrating linear optimization with structural modeling to increase HIV neutralization breadth.
- DOI: 10.1371/journal.pcbi.1005999
- Published: 2018
- Journal:
- Impact factor: 4.3
- Authors: Sevy, Alexander M; Panda, Swetasudha; Crowe Jr, James E; Meiler, Jens; Vorobeychik, Yevgeniy
- Corresponding author: Vorobeychik, Yevgeniy
Other publications by Bradley A. Malin
Dataset Representativeness and Downstream Task Fairness
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Victor A. Borza; Andrew Estornell; Chien; Bradley A. Malin; Yevgeniy Vorobeychik
- Corresponding author: Yevgeniy Vorobeychik
APPLICATIONS OF HOMOMORPHIC ENCRYPTION
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: David Archer; Lily Chen; Jung Hee Cheon; Ran Gilad; Roger A. Hallman; Zhicong Huang; Xiaoqian Jiang; R. Kumaresan; Bradley A. Malin; Heidi Sofia; Yongsoo Song; Shuang Wang
- Corresponding author: Shuang Wang
Protecting Genomic Sequence Anonymity with Generalization Lattices
- DOI: 10.1055/s-0038-1634025
- Published: 2005
- Journal:
- Impact factor: 1.7
- Authors: Bradley A. Malin
- Corresponding author: Bradley A. Malin
Optimizing word embeddings for small datasets: a case study on patient portal messages from breast cancer patients
- DOI: 10.1038/s41598-024-66319-z
- Published: 2024-07-12
- Journal:
- Impact factor: 3.900
- Authors: Qingyuan Song; Congning Ni; Jeremy L. Warner; Qingxia Chen; Lijun Song; S. Trent Rosenbloom; Bradley A. Malin; Zhijun Yin
- Corresponding author: Zhijun Yin
Computational strategic recruitment for representation and coverage studied in the All of Us Research Program
- DOI: 10.1038/s41746-025-01804-x
- Published: 2025-07-03
- Journal:
- Impact factor: 15.100
- Authors: Victor A. Borza; Qingxia Chen; Ellen W. Clayton; Murat Kantarcioglu; Lina Sulieman; Yevgeniy Vorobeychik; Bradley A. Malin
- Corresponding author: Bradley A. Malin
Other grants by Bradley A. Malin
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 8695427
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 9301793
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 9193769
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 9360125
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 8548389
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 8915734
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
A Risk Management Framework for Identifiability in Genomics Research
- Grant number: 8341447
- Fiscal year: 2012
- Funding amount: $240,600
- Project category:
Automated Detection of Anomalous Accesses to Electronic Health Records
- Grant number: 8882547
- Fiscal year: 2009
- Funding amount: $240,600
- Project category:
Similar overseas grants
Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
- Grant number: MR/S03398X/2
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
- Grant number: EP/Y001486/1
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
- Grant number: 2338423
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
- Grant number: MR/X03657X/1
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Standard Grant
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Grant number: 2341402
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
- Grant number: AH/Z505481/1
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: EU-Funded
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
- Grant number: AH/Z505341/1
- Fiscal year: 2024
- Funding amount: $240,600
- Project category: Research Grant
