Causal Structure Learning from Sparse High Dimensional Data


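Basic information

  • Grant number:
    RGPIN-2021-02856
  • Principal investigator:
    Ali, Rebecca
  • Amount:
    $13,100
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Start and end dates:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed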

Project abstract

This research program studies causal learning in complex natural systems. The foundational challenge lies in inferring links in a high-dimensional network from limited observed data. For example, which genes can best prevent boar taint (an off-taste in pork)? Which regions of the brain are associated with cognitive function? Which plant and pollinator traits drive plant-pollinator interactions, and can we predict those interactions? Global concerns such as climate change, disease control, food security, and environmental management are replete with open problems of this kind, and no general solution exists. Our approach is to develop powerful learning algorithms that can help researchers uncover these relationships.

Animal breeding programs often require learning the structure of large genetic networks and identifying candidates for genetic selection. Consider boar taint, which is caused by high levels of two compounds. Candidate targets would be genes that reduce levels of one compound without adversely affecting fertility or production traits. The doubly sparse regression incorporating graphical structure of predictors (DSRIG) model leverages the (undirected) graph structure over predictors (gene expression levels) to improve prediction for a quantitative trait (boar taint). However, DSRIG is computationally intensive and does not distinguish between predictors that influence the response and variables that are influenced by the response.

Conservation management programs require learning the drivers of link formation, such as the plant and pollinator species traits relevant to pollination. Anticipating which plant and pollinator species are most vulnerable to extinction, or identifying species that are important in structuring the community, can inform resource management and allocation efforts. Regularized grouped Dirichlet-multinomial (DM) regression is a consumer-resource model that describes plant-pollinator interactions as a function of plant and pollinator traits. Unfortunately, survey data often underrepresent or exclude rare interactions, and there is no established method to incorporate environmental covariates in the model or to compare community structures across networks.

The main objectives of the proposed program are to extend (1) the DSRIG framework to the causal (directed) graph setting, and (2) the grouped DM framework to compare networks over space or time. Short-term goals include improving optimization of DSRIG; exploiting the directed structure of the predictor graph for a univariate response; extending the DM regression framework for zero-inflation; and comparing two networks over a gradient (e.g., soil). Long-term goals include extending DSRIG to the directed multivariate response setting and modelling bipartite networks more broadly over space and time. This research would benefit Canadian genetic selection and animal health monitoring programs and inform conservation and resource management practices.
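
As a rough illustration of the Dirichlet-multinomial regression idea mentioned in the abstract, the sketch below evaluates a penalized DM negative log-likelihood for one pollinator's visit counts across plants. It is not the program's grouped DM model: the log-linear link, the feature layout (an intercept plus one trait-match score per plant), and the plain L1 penalty standing in for a grouped penalty are all assumptions made for illustration, as are the function names.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        out += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return out


def concentrations(features, beta):
    """Assumed log-linear link: alpha_j = exp(features_j . beta)."""
    return [math.exp(sum(f * b for f, b in zip(row, beta))) for row in features]


def penalized_neg_loglik(counts, features, beta, lam):
    """Negative DM log-likelihood plus an L1 penalty (intercept unpenalized)."""
    alpha = concentrations(features, beta)
    l1 = sum(abs(b) for b in beta[1:])
    return -dm_log_pmf(counts, alpha) + lam * l1


# Hypothetical data: visits by one pollinator to three plants, with
# features [intercept, trait-match score] for each plant.
counts = [5, 0, 2]
features = [[1.0, 0.8], [1.0, -0.5], [1.0, 0.1]]
objective = penalized_neg_loglik(counts, features, beta=[0.0, 0.5], lam=2.0)
```

In practice the objective would be minimized over `beta` with a numerical optimizer; the sketch only shows how traits enter the likelihood through the concentration parameters.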

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)



Other grants by Ali, Rebecca

Causal Structure Learning from Sparse High Dimensional Data
  • Grant number:
    RGPIN-2021-02856
  • Fiscal year:
    2021
  • Amount:
    $13,100
  • Project category:
    Discovery Grants Program - Individual

Similar overseas grants

Postdoctoral Fellowship: OPP-PRF: Leveraging Community Structure Data and Machine Learning Techniques to Improve Microbial Functional Diversity in an Arctic Ocean Ecosystem Model
  • Grant number:
    2317681
  • Fiscal year:
    2024
  • Amount:
    $13,100
  • Project category:
    Standard Grant
Structure-Focused Multi-task Learning Approach for structural pattern recognition and analysis
  • Grant number:
    24K20789
  • Fiscal year:
    2024
  • Amount:
    $13,100
  • Project category:
    Grant-in-Aid for Early-Career Scientists
CAREER: Structure Exploiting Multi-Agent Reinforcement Learning for Large Scale Networked Systems: Locality and Beyond
  • Grant number:
    2339112
  • Fiscal year:
    2024
  • Amount:
    $13,100
  • Project category:
    Continuing Grant
CIF: Small: Efficient and Secure Federated Structure Learning from Bad Data
  • Grant number:
    2341359
  • Fiscal year:
    2024
  • Amount:
    $13,100
  • Project category:
    Standard Grant
Learning novel structure across time and sleep
  • Grant number:
    10657210
  • Fiscal year:
    2023
  • Amount:
    $13,100
  • Project category:
CAREER: Structure Learning and Forecasting of Large-Scale Time Series
  • Grant number:
    2239102
  • Fiscal year:
    2023
  • Amount:
    $13,100
  • Project category:
    Continuing Grant
A macromolecular structure building toolkit for machine learning and cloud applications
  • Grant number:
    BB/X006492/1
  • Fiscal year:
    2023
  • Amount:
    $13,100
  • Project category:
    Research Grant
Accurate, reliable, and interpretable machine learning for assessment of neonatal and pediatric brain micro-structure
  • Grant number:
    10566299
  • Fiscal year:
    2023
  • Amount:
    $13,100
  • Project category:
Structure Analysis of Science and Mathematics Problems and Application for Individually Optimal Learning
  • Grant number:
    23K02748
  • Fiscal year:
    2023
  • Amount:
    $13,100
  • Project category:
    Grant-in-Aid for Scientific Research (C)