Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research

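Basic information

  • Grant number:
    RGPIN-2017-06586
  • Principal investigator:
  • Amount:
    $10,200
  • Host institution:
  • Host institution country:
    Canada
  • Program:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Start and end dates:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed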

Project abstract

Modern machine learning methods, such as boosting, support vector machines, and neural networks, have had a great impact on statistical research and application, mostly through improved predictive and prognostic accuracy. Their enhanced ability to model complex interactions and non-linear effects could also be used to explain the underlying physical or physiological phenomena and to generate specific scientific hypotheses for further study. In applications that are not strictly predictive, however, the use of many modern methods is hampered by their black-box nature and by the lack of inferential tools that would allow researchers to attach statistical confidence measures to inferred relationships. The simplest form of statistical inference, universal in classical models, concerns statements about individual covariates. For example, is the covariate "Gender" an important factor in a model of disease progression? In classical models this is answered by calculating inferential quantities (p-values, confidence intervals) for a parameter, or a small set of parameters, connected with "Gender" in the model. In contrast, machine learning methods take a non-parametric approach in which a covariate's influence on the outcome is not controlled by a small set of parameters. Hence the classical approach is not applicable, and the importance of any particular covariate in the model of the outcome is not easily tested. While many model-specific or approximate measures have been proposed, in particular the variable importance metric in a random forest model, there is no universal, statistically coherent approach in the literature. We propose to develop, validate, apply and disseminate, in the form of freely available software packages, a set of tools for classical inference that will allow researchers to test the importance and influence of covariates of interest in non-parametric machine learning models of the outcome.
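To make the question in the abstract concrete, the sketch below shows one common model-agnostic heuristic, a permutation test of a single covariate on held-out data. It is an illustration only and is not the method proposed in this project. Everything in it is an assumption made for the example: the simulated data, the k-nearest-neighbour regressor standing in for a black-box learner, and the names knn_predict, mse and permutation_test.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5):
    # Average the outcomes of the k nearest training points
    # (squared Euclidean distance). Stand-in for any black-box model.
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def mse(train_X, train_y, test_X, test_y):
    # Mean squared prediction error on the held-out points.
    errs = [(knn_predict(train_X, train_y, x) - y) ** 2
            for x, y in zip(test_X, test_y)]
    return sum(errs) / len(errs)


def permutation_test(train_X, train_y, test_X, test_y, j, n_perm=99, rng=None):
    # Shuffle covariate j in the held-out set and compare the resulting
    # errors with the error on the intact held-out set.
    rng = rng or random.Random(0)
    observed = mse(train_X, train_y, test_X, test_y)
    permuted_errors = []
    for _ in range(n_perm):
        col = [x[j] for x in test_X]
        rng.shuffle(col)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(test_X, col)]
        permuted_errors.append(mse(train_X, train_y, shuffled, test_y))
    importance = sum(permuted_errors) / n_perm - observed
    # Share of permutations that predict at least as well as the intact data.
    p_value = (1 + sum(e <= observed for e in permuted_errors)) / (n_perm + 1)
    return importance, p_value


# Simulated data (hypothetical): the outcome depends non-linearly on
# covariate 0 only; covariate 1 is pure noise.
rng = random.Random(42)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(120)]
y = [math.sin(3 * x[0]) + rng.gauss(0, 0.1) for x in X]
train_X, test_X = X[:60], X[60:]
train_y, test_y = y[:60], y[60:]

imp_x1, p_x1 = permutation_test(train_X, train_y, test_X, test_y, j=0,
                                rng=random.Random(1))
imp_x2, p_x2 = permutation_test(train_X, train_y, test_X, test_y, j=1,
                                rng=random.Random(2))
print(f"covariate 0: importance={imp_x1:.3f}, p={p_x1:.2f}")
print(f"covariate 1: importance={imp_x2:.3f}, p={p_x2:.2f}")
```

Shuffling a covariate breaks its relationship with the other covariates as well as with the outcome, so the resulting p-value is best read as a rough diagnostic. That limitation is an instance of the gap the abstract describes: such measures are approximate and do not amount to a universal, statistically coherent inferential procedure.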

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Kustra, Rafal

5-hmC in the brain is abundant in synaptic genes and shows differences at the exon-intron boundary.
  • DOI:
    10.1038/nsmb.2372
  • Published:
    2012-10
  • Journal:
  • Impact factor:
    16.8
  • Authors:
    Khare, Tarang;Pai, Shraddha;Koncevicius, Karolis;Pal, Mrinal;Kriukiene, Edita;Liutkeviciute, Zita;Irimia, Manuel;Jia, Peixin;Ptak, Carolyn;Xia, Menghang;Tice, Raymond;Tochigi, Mamoru;Morera, Solange;Nazarians, Anaies;Belsham, Denise;Wong, Albert H. C.;Blencowe, Benjamin J.;Wang, Sun Chong;Kapranov, Philipp;Kustra, Rafal;Labrie, Viviane;Klimasauskas, Saulius;Petronis, Arturas
  • Corresponding author:
    Petronis, Arturas
Predictors of all-cause mortality among patients hospitalized with influenza, respiratory syncytial virus, or SARS-CoV-2.
  • DOI:
    10.1111/irv.13004
  • Published:
    2022-11
  • Journal:
  • Impact factor:
    4.4
  • Authors:
    Hamilton, Mackenzie A.;Liu, Ying;Calzavara, Andrew;Sundaram, Maria E.;Djebli, Mohamed;Darvin, Dariya;Baral, Stefan;Kustra, Rafal;Kwong, Jeffrey C.;Mishra, Sharmistha
  • Corresponding author:
    Mishra, Sharmistha
Data-Fusion in Clustering Microarray Data: Balancing Discovery and Interpretability
CFH and ARMS2 genetic risk determines progression to neovascular age-related macular degeneration after antioxidant and zinc supplementation
A factor analysis model for functional genomics
  • DOI:
    10.1186/1471-2105-7-216
  • Published:
    2006-04-21
  • Journal:
  • Impact factor:
    3
  • Authors:
    Kustra, Rafal;Shioda, Romy;Zhu, Mu
  • Corresponding author:
    Zhu, Mu

Other grants held by Kustra, Rafal

Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2010
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2009
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2008
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2007
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Statistics for high-dimensional data with applications in analysis of high-throughput genomics experiments
  • Grant number:
    240006-2006
  • Fiscal year:
    2006
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
High-performance computing resource for statistical genomics
  • Grant number:
    330595-2006
  • Fiscal year:
    2005
  • Amount:
    $10,200
  • Program:
    Research Tools and Instruments - Category 1 (<$150,000)
Biostatistical analysis of medical signals and images
  • Grant number:
    240006-2001
  • Fiscal year:
    2005
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Biostatistical analysis of medical signals and images
  • Grant number:
    240006-2001
  • Fiscal year:
    2004
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Biostatistical analysis of medical signals and images
  • Grant number:
    240006-2001
  • Fiscal year:
    2003
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Biostatistical analysis of medical signals and images
  • Grant number:
    240006-2001
  • Fiscal year:
    2002
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual

Similar international grants

Evaluating probabilistic inferential models of learnt sound representations in auditory cortex
  • Grant number:
    BB/X013391/1
  • Fiscal year:
    2023
  • Amount:
    $10,200
  • Program:
    Research Grant
A semantic/pragmatic analyses of interaction between predicates of personal taste and inferential expressions
  • Grant number:
    23K12181
  • Fiscal year:
    2023
  • Amount:
    $10,200
  • Program:
    Grant-in-Aid for Early-Career Scientists
EHSCAN-Exploring Early Holocene Saharan Cultural Adaptation and social Networks through socio-ecological inferential modelling.
  • Grant number:
    EP/Y028430/1
  • Fiscal year:
    2023
  • Amount:
    $10,200
  • Program:
    Fellowship
Emergence of ostensive inferential communication
  • Grant number:
    23K02901
  • Fiscal year:
    2023
  • Amount:
    $10,200
  • Program:
    Grant-in-Aid for Scientific Research (C)
Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research
  • Grant number:
    RGPIN-2017-06586
  • Fiscal year:
    2021
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Investigation of semantic universals in the inferential domain
  • Grant number:
    21K12991
  • Fiscal year:
    2021
  • Amount:
    $10,200
  • Program:
    Grant-in-Aid for Early-Career Scientists
Repro Sampling Method: A Transformative Artificial-Sample-Based Inferential Framework with Applications to Discrete Parameter, High-Dimensional Data, and Rare Events Inferences
  • Grant number:
    2015373
  • Fiscal year:
    2020
  • Amount:
    $10,200
  • Program:
    Standard Grant
Computational and Inferential Tools for Machine Learning Methods in Biostatistical Research
  • Grant number:
    RGPIN-2017-06586
  • Fiscal year:
    2020
  • Amount:
    $10,200
  • Program:
    Discovery Grants Program - Individual
Does Inferential Confusion Moderate the Relationship Between Fear-of-self and OCD Symptoms?
  • Grant number:
    449807
  • Fiscal year:
    2020
  • Amount:
    $10,200
  • Program:
    Studentship Programs
Inferential apparatuses producing predictions: perspectives from literature, mathematics, the history of art, the history of philosophy of science, and cognitive science
  • Grant number:
    19H01201
  • Fiscal year:
    2019
  • Amount:
    $10,200
  • Program:
    Grant-in-Aid for Scientific Research (B)