A reliable and scalable approach to causal inference for large-scale multivariate data

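Basic information

  • Award number:
    1407028
  • Principal investigator:
  • Amount:
    $120,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Project period:
    2014-08-15 to 2017-07-31
  • Project status:
    Completed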

Project abstract

With masses of large-scale data being generated, a key challenge facing many scientists is to infer relationships amongst variables of interest. In particular, inferring causal or functional relationships amongst genes, proteins, and other biological elements is of fundamental interest to scientists. This project will develop methods for inferring causal or functional relations between genetic, proteomic, and transcriptomic features, both for the ENCODE human genome project and for data on mice with different susceptibility to obesity and diabetes. For both types of data, this project will develop frameworks that comprise: (1) domain knowledge that informs the choice of model and algorithm; (2) fast, parallelizable algorithms with provable run-time guarantees; and (3) statistical consistency guarantees for the algorithms developed, under assumptions that are likely to be satisfied in practice.
Directed graphical models, or Bayesian networks, provide a useful framework for representing causal or functional relationships. A number of algorithms have been developed for inferring directed or Bayesian networks from data. However, prior approaches either are unreliable, because they require assumptions that are rarely satisfied in practice, or do not scale to larger datasets. The proposed project will address this issue by developing algorithms for inferring directed networks with both statistical consistency guarantees and run-time guarantees. The new algorithms will exploit connections between concepts in graph theory and techniques from numerical linear algebra for developing fast solvers of linear systems. Algorithms will be coded in R and will exploit parallel processing. Evaluation will involve both small-scale and large-scale synthetic graphical models with known network structure, real datasets involving yeast data where some of the directions are known, and new biochemistry data in which most of the directions are unknown. Theoretical guarantees on run-time and statistical consistency will be provided using a combination of tools from graph theory, numerical linear algebra, and concentration of measure that the PI has used and developed in prior work.
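The evaluation on synthetic graphical models with known network structure can be illustrated with a minimal sketch. It is not the project's code (the abstract states the algorithms will be written in R); it is a Python illustration under stated assumptions: a linear structural equation model with standard Gaussian noise, a random directed acyclic graph whose edges run from lower to higher node index, and a simple edge-mismatch count as the error measure. The function names `random_dag`, `sample_linear_sem`, and `count_edge_errors` are hypothetical.

```python
import random


def random_dag(num_nodes, edge_prob, rng):
    """Return a weight matrix W where W[i][j] != 0 means an edge i -> j.

    Assumption: edges only run from a lower to a higher index, which
    guarantees that the graph is acyclic.
    """
    weights = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if rng.random() < edge_prob:
                # Keep weights away from zero so that edges are detectable.
                weights[i][j] = rng.choice([-1.0, 1.0]) * rng.uniform(0.5, 1.5)
    return weights


def sample_linear_sem(weights, num_samples, rng):
    """Draw samples from X_j = sum_i W[i][j] * X_i + noise_j, noise_j ~ N(0, 1)."""
    n = len(weights)
    samples = []
    for _ in range(num_samples):
        x = [0.0] * n
        for j in range(n):  # index order is a topological order of the DAG
            parent_part = sum(weights[i][j] * x[i] for i in range(j))
            x[j] = parent_part + rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def count_edge_errors(true_w, est_w):
    """Count ordered pairs (i, j) whose edge presence differs.

    A reversed edge counts twice here (one missing, one extra).
    """
    n = len(true_w)
    return sum(
        (true_w[i][j] != 0) != (est_w[i][j] != 0)
        for i in range(n)
        for j in range(n)
        if i != j
    )


rng = random.Random(0)
true_w = random_dag(5, 0.4, rng)
data = sample_linear_sem(true_w, 200, rng)
print(count_edge_errors(true_w, true_w))  # 0: a perfect estimate has no errors
```

In such a setup, any structure-learning method would be run on `data`, and its estimated matrix passed to `count_edge_errors` together with `true_w` to score the recovered directions.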

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Garvesh Raskutti

Network estimation via Poisson autoregressive models
The Frugal Inference of Causal Relations
Minimax Optimal Convex Methods for Poisson Inverse Problems Under $\ell_{q}$-Ball Sparsity
Testing for high-dimensional network parameters in auto-regressive models
  • DOI:
    10.1214/19-ejs1646
  • Publication year:
    2018
  • Journal:
  • Impact factor:
    1.1
  • Authors:
    Lili Zheng; Garvesh Raskutti
  • Corresponding author:
    Garvesh Raskutti
Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares

Other grants held by Garvesh Raskutti

Estimation, inference and testing for large-scale directed network models
  • Award number:
    1811767
  • Fiscal year:
    2018
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant

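Similar grants from the National Natural Science Foundation of China

Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    (unit: 10,000 CNY)
  • Project type:
    Collaborative Innovation Research Team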

Similar overseas grants

SHF: Small: QED - A New Approach to Scalable Verification of Hardware Memory Consistency
  • Award number:
    2332891
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
CAREER: A Novel Electrically-assisted Multimaterial Printing Approach for Scalable Additive Manufacturing of Bioinspired Heterogeneous Materials Architectures
  • Award number:
    2338752
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Automated, Scalable, and Machine Learning-Driven Approach for Generating and Optimizing Scientific Application Codes
  • Award number:
    23K24856
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
In vivo Perturb-map: scalable genetic screens with single-cell and spatial resolution in intact tissues
  • Award number:
    10578616
  • Fiscal year:
    2023
  • Funding amount:
    $120,000
  • Project type:
A Uniquely Scalable Approach to Sequence Tens of Millions of Single Cells Without Compromising Performance
  • Award number:
    10700398
  • Fiscal year:
    2023
  • Funding amount:
    $120,000
  • Project type:
GARDE: Scalable Clinical Decision Support for Individualized Cancer Risk Management
  • Award number:
    10741231
  • Fiscal year:
    2023
  • Funding amount:
    $120,000
  • Project type:
Development of a facile, robust, scalable, and versatile chemoenzymatic glycan-remodeling approach for site-specific antibody conjugation
  • Award number:
    10615237
  • Fiscal year:
    2022
  • Funding amount:
    $120,000
  • Project type:
SBIR Phase I: A highly-scalable, rapid, in-season approach to tune a nitrogen model for accurate prediction of a corn crop's remaining nitrogen need
  • Award number:
    2127096
  • Fiscal year:
    2022
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Automated, Scalable, and Machine Learning-Driven Approach for Generating and Optimizing Scientific Application Codes
  • Award number:
    22H03600
  • Fiscal year:
    2022
  • Funding amount:
    $120,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Development of a facile, robust, scalable, and versatile chemoenzymatic glycan-remodeling approach for site-specific antibody conjugation
  • Award number:
    10484443
  • Fiscal year:
    2022
  • Funding amount:
    $120,000
  • Project type: