Modeling the Incompleteness and Biases of Health Data

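Basic Information

  • Grant number:
    10581658
  • Principal investigator:
  • Amount:
    $307,500
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Project period:
    2020-06-01 to 2025-03-31
  • Project status:
    Not yet concluded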

Project Abstract

Researchers are increasingly working to “mine” health data to derive new medical knowledge. Unlike experimental data, which are collected according to a research protocol, clinical data serve primarily to help clinicians care for patients, so their collection is often not systematic. Missing and/or biased data can therefore hinder medical knowledge discovery and data mining efforts. Existing approaches to imputing missing health data often focus only on cross-sectional correlation (e.g., correlation across subjects or across variables) and neglect autocorrelation (e.g., correlation across time points). Moreover, they often model incompleteness but neglect the biases in health data. Modeling both incompleteness and bias may contribute to a better understanding of health data and better support clinical decision making.

We propose a novel framework, Bias-Aware Missing data Imputation with Cross-sectional correlation and Autocorrelation (BAMICA), and leverage clinical notes to better inform methods that would otherwise rely on structured health data only. In addition to evaluating its imputation accuracy, we will apply the proposed framework to downstream tasks such as predictive modeling for multiple outcomes across a diverse range of clinical and cohort study datasets.

Aim 1 introduces the MICA framework, which jointly considers cross-sectional correlation and autocorrelation. In Aim 2, we will augment MICA to be bias-aware (hence BAMICA) to account for biases stemming from multiple sources, such as the healthcare process, and use them as features in imputing missing health data. This augmentation is achieved by a novel recurrent neural network architecture that tracks the evolution of both health data variables and bias factors. In Aim 3, we will supplement structured health data with unstructured clinical notes for modeling incompleteness and biases, using a novel architecture of a graph neural network on top of a memory network. We will apply graph neural networks to process clinical notes in order to learn suitable representations as input to the memory networks for imputation and downstream predictive modeling tasks. Depending on the clinical problem and data availability, not all modules may be needed. Our proposed BAMICA framework is therefore designed to be flexible and consists of selectable modules to meet some or all of the above needs.

In summary, our proposal bridges a key knowledge gap in jointly modeling incompleteness and biases in health data, and uses unstructured clinical notes to supplement and augment such modeling in order to better support predictive modeling and clinical decision making. We will demonstrate generalizability by experimenting on four large clinical and cohort study datasets, and by scaling up to the eMERGE network spanning 11 institutions nationwide. We will disseminate the open-source framework. The principled and flexible framework generated by this project will bring significant methodological advancement and have a direct impact on enhancing discovery from health data.

Project Outcomes

Journal articles (22)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Advances in Machine Learning Approaches to Heart Failure with Preserved Ejection Fraction.
  • DOI:
    10.1016/j.hfc.2021.12.002
  • Publication date:
    2022-04
  • Journal:
  • Impact factor:
    3.4
  • Authors:
    Ahmad FS; Luo Y; Wehbe RM; Thomas JD; Shah SJ
  • Corresponding author:
    Shah SJ
Distinct clinical phenotypes in paediatric cancer patients with sepsis are associated with different outcomes-an international multicentre retrospective study.
  • DOI:
    10.1016/j.eclinm.2023.102252
  • Publication date:
    2023-11
  • Journal:
  • Impact factor:
    15.1
  • Authors:
    Wosten-van Asperen, Roelie M.; la Roi-Teeuw, Hannah M.; van Amstel, Rombout B. E.; Bos, Lieuwe D. J.; Tissing, Wim J. E.; Jordan, Iolanda; Dohna-Schwake, Christian; Bottari, Gabriella; Pappachan, John; Crazzolara, Roman; Comoretto, Rosanna I.; Mizia-Malarz, Agniezka; Moscatelli, Andrea; Sanchez-Martin, Maria; Willems, Jef; Rogerson, Colin M.; Bennett, Tellen D.; Luo, Yuan; Atreya, Mihir R.; Faustino, E. Vincent S.; Geva, Alon; Weiss, Scott L.; Schlapbach, Luregn J.; Sanchez-Pinto, L. Nelson
  • Corresponding author:
    Sanchez-Pinto, L. Nelson
Integrative analysis of functional genomic screening and clinical data identifies a protective role for spironolactone in severe COVID-19.
  • DOI:
    10.1016/j.crmeth.2023.100503
  • Publication date:
    2023-07-24
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
Hyperchloremia in critically ill patients: association with outcomes and prediction using electronic health record data.
Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes.

Other grants by YUAN LUO

Modeling the Incompleteness and Biases of Health Data
  • Grant number:
    10381541
  • Fiscal year:
    2020
  • Funding amount:
    $307,500
  • Project category:
National Infrastructure for Standardized and Portable EHR Phenotyping Algorithms
  • Grant number:
    10021669
  • Fiscal year:
    2017
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    7455616
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    7188740
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    7070002
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    7283658
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    6947778
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    7694239
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
In vivo Studies of Ginkgo biloba Neuroprotection
  • Grant number:
    6827981
  • Fiscal year:
    2004
  • Funding amount:
    $307,500
  • Project category:
SIGNALING MECHANISMS IN DOPAMINE RECEPTOR SYNERGISM
  • Grant number:
    7235701
  • Fiscal year:
    2003
  • Funding amount:
    $307,500
  • Project category:

Similar Overseas Grants

CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $307,500
  • Project category:
    Continuing Grant
Collaborative Research: SHF: Small: Artificial Intelligence of Things (AIoT): Theory, Architecture, and Algorithms
  • Grant number:
    2221742
  • Fiscal year:
    2022
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant
Collaborative Research: SHF: Small: Artificial Intelligence of Things (AIoT): Theory, Architecture, and Algorithms
  • Grant number:
    2221741
  • Fiscal year:
    2022
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant
Algorithms and Architecture for Super Terabit Flexible Multicarrier Coherent Optical Transmission
  • Grant number:
    533529-2018
  • Fiscal year:
    2020
  • Funding amount:
    $307,500
  • Project category:
    Collaborative Research and Development Grants
OAC Core: Small: Architecture and Network-aware Partitioning Algorithms for Scalable PDE Solvers
  • Grant number:
    2008772
  • Fiscal year:
    2020
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant
Algorithms and Architecture for Super Terabit Flexible Multicarrier Coherent Optical Transmission
  • Grant number:
    533529-2018
  • Fiscal year:
    2019
  • Funding amount:
    $307,500
  • Project category:
    Collaborative Research and Development Grants
Visualization of FPGA CAD Algorithms and Target Architecture
  • Grant number:
    541812-2019
  • Fiscal year:
    2019
  • Funding amount:
    $307,500
  • Project category:
    University Undergraduate Student Research Awards
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
  • Grant number:
    1759836
  • Fiscal year:
    2018
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
  • Grant number:
    1759796
  • Fiscal year:
    2018
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant
Collaborative Research: ABI Innovation: Algorithms for recovering root architecture from 3D imaging
  • Grant number:
    1759807
  • Fiscal year:
    2018
  • Funding amount:
    $307,500
  • Project category:
    Standard Grant