A novel platform for synthetic generation and statistical obfuscation of tabular clinical data, simulated images, and machine-generated text

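Basic information

  • Grant number:
    10696488
  • Principal investigator:
  • Amount:
    $324,600
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2023
  • Funding country:
    United States
  • Project period:
    2023-09-15 to 2024-09-14
  • Project status:
    Completed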

Project summary

Data is a critical and highly valuable commodity that drives meaningful change in our society, especially when it pertains to patient care and biomedical research. Currently, institutions pay inordinate sums to increase, regain, and complement their data panels. As an extra burden, data legislation and privacy protection regulations introduce barriers to forming effective partnerships between business, clinical, research, and educational organizations. As a result, approximately 80% of medical data today cannot be readily shared because it contains personal, protected, or sensitive information, and it remains unstructured and untapped after it is created. There is a growing and urgent unmet need for technology solutions that balance the interests of research and commercial organizations by supporting flexible general-purpose analytics while guaranteeing privacy protection. There are no effective mechanisms to enable data sharing without risking either inappropriate release of sensitive information or degradation of the information content. The few currently available protocols and algorithms for modeling, processing, interrogating, and ultimately sharing large sensitive data (e.g., thousands to millions of records with thousands of heterogeneous features) all share significant limitations, and their practical use still lags behind research progress. Two major unmet needs in the data sharing industry are i) the inability to return de-identified clones of the raw data, and ii) the lack of the scalability that production deployments require.
GrayRain, LLC is an early-stage Software-as-a-Service company developing a novel platform for statistical obfuscation and de-identification of sensitive structured information (numerical and categorical tabular data) and unstructured information (e.g., clinical text, doctors’ and nurses’ notes, and clinical images such as MRI and PET). The core of GrayRain’s technology is the novel patented statistical obfuscation algorithm, DataSifter.
The technology proposed in this STTR Phase I application will significantly increase the number of secure data transactions in the healthcare sector and beyond, enabling data sharing with fully controllable risk of identification of any sensitive information, including, but not limited to, PHI (personal health information), demographic information, or socioeconomic status. GrayRain’s technology is able to produce de-identified clones of raw tabular data, addressing a major limitation encountered across existing data anonymization protocols. As for scalability, the main goal of this STTR Phase I is to establish the feasibility of GrayRain’s technology to accurately and efficiently de-identify and share large-scale complex EHR data repositories with a controlled risk of disclosing protected or personal health information.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Ronak Shetty

Similar overseas grants

DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $324,600
  • Project type:
    Continuing Grant