Causal graphical methods for high-dimensional heterogeneous biomedical data

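Basic information

  • Grant number:
    10388447
  • Principal investigator:
  • Amount:
    $46,800
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-03-21 to 2025-03-20
  • Project status:
    Not yet concluded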

Project abstract

In the past decade, there has been an explosion of data collected from biological and biomedical systems, both in terms of type and volume. Mining these high-dimensional, heterogeneous, and often dynamic datasets to make biologically or medically important inferences or develop predictive models requires new sophisticated data analytics methods. New machine learning methods have begun filling this gap, but most of these methods generate “black box” models that lack clear interpretability. Additionally, these methods are associative, and are thus incapable of teasing out the complex cause-effect relationships among features in the dataset. Directed causal graphical models (DCGMs) are a powerful tool for filling this gap. DCGMs, learned from observational datasets, can represent causal relationships between variables. This allows DCGMs to generate hypotheses of mechanisms and construct parsimonious, causally informed predictive models. However, biomedical datasets often have features that make it difficult to construct causal graphical models over the full dataset. Examples include: data type heterogeneity, high dimensionality, multicollinearity, cyclicity, and nonstationarity. To address these problems, I propose to develop methods for learning causal graphs in datasets containing (1) a heterogeneous mixture of continuous, categorical, and censored variables, (2) high dimensionality and multicollinearity, and (3) cyclicity and nonstationarity. In Aim 1, I will develop a new causal discovery algorithm that accommodates continuous, categorical and censored variables (e.g., survival). In Aim 2, I will test and compare various methods for matrix decomposition and dimensionality reduction in their ability to learn a meaningful low-dimensional latent feature space to be used in graph learning methods. In Aim 3, I will develop a new method for causal discovery in dynamic, possibly cyclic, gene regulatory networks at single cell resolution. 
In all cases, testing and validation will be performed on synthetic and real-life publicly available datasets. These methodological improvements constitute important steps forward in the field of causal discovery and they can be utilized together or independently to provide a flexible and powerful platform for analysis of a wide range of biomedical datasets. Once made available, they will enable researchers to make inferences about causal mechanisms, generate hypotheses, and build robust, parsimonious predictive models.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Tyler Lovelace

Causal graphical methods for high-dimensional heterogeneous biomedical data
  • Grant number:
    10625257
  • Fiscal year:
    2022
  • Funding amount:
    $46,800
  • Project category:

Similar overseas grants

DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $46,800
  • Project category:
    Continuing Grant