Improving Causal Inference Methods in Statistics for Analyzing Big Data

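Basic Information

  • Grant number:
    RGPIN-2018-05044
  • Principal investigator:
  • Amount:
    $15,300
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Start and end dates:
    2020-01-01 to 2021-12-31
  • Project status:
    Completed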

Project Abstract

The increasing availability and declining cost of computing hardware, together with the wider application of smart and cloud-based technologies, have led to a growing trend of collecting large-scale information for business, utilitarian and scientific purposes. These databases generally contain a considerable number of variables, cover large populations with long follow-up, and better reflect 'real-world' daily practices than data derived from carefully controlled randomized experiments. However, these datasets are not primarily collected for research purposes, and in the absence of randomization, confounding poses a critical challenge in exploring the cause-and-effect relationship between the intervention and the outcome. A vast statistical and causal inference literature on confounding adjustment guides the selection of variables to adjust for, e.g., controlling for confounders and risk factors, but not adjusting for instruments and noise variables. Because these databases are complex and large, with thousands of variables, it is not tenable for a domain expert to (i) hand-pick the important confounders or identify which variables are instruments, (ii) guess with reasonable accuracy the functional form of the covariates in the intervention model (in the propensity score context) or the outcome model, or (iii) adequately assess covariate balance across so many variables.

To address these challenges, this proposal has four specific research objectives:
1. To develop confounder selection approaches in a high-dimensional setting, incorporating the principles established in the causal inference literature.
2. To study the robustness of various data-adaptive methods under model misspecification in a high-dimensional setting.
3. To propose appropriate metrics for assessing 'covariate balance' in the context of propensity scores estimated from high-dimensional covariates.
4. To investigate the above issues when longitudinal data are available.
These methods will be evaluated through theoretical developments, real-life applications, and realistic simulations.

I am positioned in a unique interdisciplinary research environment, as an Assistant Professor in the UBC Faculty of Medicine, a biostatistician at St. Paul's Hospital, and an alumnus of the Statistics department at UBC, with close research ties with UBC and McGill. In this big-data era, there is a huge demand for students with training in statistical modeling who can take causal structures into consideration while analyzing a large data set. Training of highly qualified personnel within an interdisciplinary environment is an essential component of this research. Trainees will receive training and access to high-quality research datasets and to methodological and applied research questions that will have a real-life impact.
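The third objective in the abstract concerns metrics for covariate balance under propensity score adjustment. As a rough illustration only, and not the method proposed in this grant, the sketch below computes one widely used balance metric, the absolute standardized mean difference (SMD), before and after inverse probability weighting. Everything in it is an assumption made for illustration: the data are simulated with a single confounder, the true propensity score is treated as known (in practice it would be estimated), and the function names `standardized_mean_difference` and `demo` are hypothetical.

```python
import math
import random


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def variance(values):
    """Population variance (divides by n), adequate for a sketch."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def standardized_mean_difference(x, treated, weights=None):
    """Absolute SMD of covariate x between treated (1) and control (0) units.

    The numerator uses the (optionally weighted) group means. The denominator
    is the pooled SD of the unweighted groups, so that the scale is the same
    before and after weighting.
    """
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, t in zip(x, treated) if t == 1]
    w1 = [w for w, t in zip(weights, treated) if t == 1]
    x0 = [v for v, t in zip(x, treated) if t == 0]
    w0 = [w for w, t in zip(weights, treated) if t == 0]
    diff = weighted_mean(x1, w1) - weighted_mean(x0, w0)
    pooled_sd = math.sqrt((variance(x1) + variance(x0)) / 2)
    return abs(diff) / pooled_sd


def demo(n=20000, seed=0):
    """Simulate one confounder and compare SMD before and after weighting."""
    rng = random.Random(seed)
    x, treated, weights = [], [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        # Assumed (hypothetical) true propensity score: logistic in x.
        ps = 1.0 / (1.0 + math.exp(-xi))
        ti = 1 if rng.random() < ps else 0
        x.append(xi)
        treated.append(ti)
        # Inverse probability of treatment weights.
        weights.append(1.0 / ps if ti == 1 else 1.0 / (1.0 - ps))
    before = standardized_mean_difference(x, treated)
    after = standardized_mean_difference(x, treated, weights)
    return before, after


smd_before, smd_after = demo()
print(f"SMD before weighting: {smd_before:.3f}")
print(f"SMD after weighting:  {smd_after:.3f}")
```

In this simulation the weighted SMD should be much smaller than the unweighted one, because weighting by the true propensity score balances the confounder in expectation. A common rule of thumb treats an absolute SMD below 0.1 as acceptable balance; whether such per-covariate checks remain adequate with thousands of covariates is the kind of question the abstract raises.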

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Karim, Mohammad

Interlaboratory performance and quantitative PCR data acceptance metrics for NIST SRM® 2917.
  • DOI:
    10.1016/j.watres.2022.119162
  • Publication date:
    2022-10-15
  • Journal:
  • Impact factor:
    12.8
  • Authors:
    Sivaganesan, Mano;Willis, Jessica R.;Karim, Mohammad;Babatola, Akin;Catoe, David;Boehm, Alexandria B.;Wilder, Maxwell;Green, Hyatt;Lobos, Aldo;Harwood, Valerie J.;Hertel, Stephanie;Klepikow, Regina;Howard, Mondraya F.;Laksanalamai, Pongpan;Roundtree, Alexis;Mattioli, Mia;Eytcheson, Stephanie;Molina, Marirosa;Lane, Molly;Rediske, Richard;Ronan, Amanda;D'Souza, Nishita;Rose, Joan B.;Shrestha, Abhilasha;Hoar, Catherine;Silverman, Andrea I.;Faulkner, Wyatt;Wickman, Kathleen;Kralj, Jason G.;Servetas, Stephanie L.;Hunter, Monique E.;Jackson, Scott A.;Shanks, Orin C.
  • Corresponding author:
    Shanks, Orin C.
Discovery of Tumor-Targeted 6-Methyl Substituted Pemetrexed and Related Antifolates with Selective Loss of RFC Transport.
  • DOI:
    10.1021/acsmedchemlett.3c00326
  • Publication date:
    2023-12-14
  • Journal:
  • Impact factor:
    4.2
  • Authors:
    Kaku, Krishna;Ravindra, Manasa P.;Tong, Nian;Choudhary, Shruti;Li, Xinxin;Yu, Jianming;Karim, Mohammad;Brzezinski, Madelyn;O'Connor, Carrie;Hou, Zhanjun;Matherly, Larry H.;Gangjee, Aleem
  • Corresponding author:
    Gangjee, Aleem
Zinc trafficking 1. Probing the roles of proteome, metallothionein, and glutathione
  • DOI:
    10.1093/mtomcs/mfab055
  • Publication date:
    2021-09-02
  • Journal:
  • Impact factor:
    3.4
  • Authors:
    Mahim, Afsana;Karim, Mohammad;Petering, David H.
  • Corresponding author:
    Petering, David H.

Other grants held by Karim, Mohammad

Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2022
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2021
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2019
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2018
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    DGECR-2018-00235
  • Fiscal year:
    2018
  • Funding amount:
    $15,300
  • Project category:
    Discovery Launch Supplement

Similar Overseas Grants

"Improving Understanding Of Weight Stigma With Causal Inference Methods And General Population Survey Data".
  • Grant number:
    ES/X000486/1
  • Fiscal year:
    2023
  • Funding amount:
    $15,300
  • Project category:
    Research Grant
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2022
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2021
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving representativeness in non-probability surveys and causal inference with regularized regression and post-stratification
  • Grant number:
    10219956
  • Fiscal year:
    2020
  • Funding amount:
    $15,300
  • Project category:
Improving representativeness in non-probability surveys and causal inference with regularized regression and post-stratification
  • Grant number:
    10400107
  • Fiscal year:
    2020
  • Funding amount:
    $15,300
  • Project category:
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2019
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
III: SMALL: Moving Beyond Knowledge to Action: Evaluating and Improving the Utility of Causal Inference
  • Grant number:
    1907951
  • Fiscal year:
    2019
  • Funding amount:
    $15,300
  • Project category:
    Continuing Grant
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    RGPIN-2018-05044
  • Fiscal year:
    2018
  • Funding amount:
    $15,300
  • Project category:
    Discovery Grants Program - Individual
Improving Causal Inference Methods in Statistics for Analyzing Big Data
  • Grant number:
    DGECR-2018-00235
  • Fiscal year:
    2018
  • Funding amount:
    $15,300
  • Project category:
    Discovery Launch Supplement
Improving Causal Inference Tools for Addiction Researchers
  • Grant number:
    9769684
  • Fiscal year:
    2018
  • Funding amount:
    $15,300
  • Project category: