Semiparametric Statistical (Machine) Learning

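Basic information

  • Grant number:
    RGPIN-2018-04868
  • Principal investigator:
  • Amount:
    $16,800
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Duration:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed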

Project abstract

Background: Statistical learning algorithms (SLAs) are used worldwide as modern methods for predicting results from complex relationships. SLAs learn from a set of data consisting of a response (the target for prediction) and one or more explanatory variables (the inputs). The learning consists of minimizing the loss, a distance between the predictions and the observed responses across the entire data set. SLAs have features called tuning parameters that users can vary to get a better-fitting result. However, finding the best values for tuning parameters is cumbersome: it relies on a computationally intensive process called cross-validation, in which the entire learning process is repeated numerous times. Furthermore, different SLAs can provide somewhat different results on a given data set, and it is unclear in advance which one should be used.

Work to be done: The proposed research will generate ways to simplify the process of tuning SLAs. We will also provide new ways to combine SLAs into one ensemble predictor that makes use of the different strengths of each individual SLA. To do this, we will develop a new method for mathematically deriving an information criterion (IC) on each SLA. ICs measure the fit of a statistical model to a data set, balancing a smaller loss against excessive model complexity. Because SLAs are not based on statistical models, we will mathematically approximate the SLA (or features of the SLA) with a statistical model, from which the IC will be developed. The model complexity component is difficult to estimate exactly, but our group has had success doing this on related problems, resulting in new statistical analysis methods with better properties than existing ones.

There are hundreds of different SLAs that could be targets for this work. We will perform preliminary tests of many of these candidates to select the ones that have the best combinations of prediction quality and structurally amenable features upon which ICs can be created. These ICs can then be computed on different versions of a single SLA or on multiple SLAs to help select the best-fitting ones. The ICs can also be used for model averaging, where different SLA predictions are averaged into a single prediction using weights determined by their ICs.

Outcomes and Benefits: The results of this research will be new statistical methods that lead to faster and better prediction algorithms. These algorithms will be developed into software that is freely accessible to millions of users nationwide and worldwide. Users who need fast, accurate answers to important questions (for example, forecasting consumer trends, comparing potential patient responses to different therapies, and anticipating impacts of public policies) will have better tools for performing these tasks.
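The abstract describes averaging SLA predictions with weights determined by their ICs but does not say how the weights are computed. The following is a minimal sketch of one common convention, assuming Akaike-style weights proportional to exp(-ΔIC/2) and a plain Gaussian AIC as the criterion. The function names, the example numbers, and the choice of AIC are all illustrative assumptions, not the project's method (which proposes deriving new ICs for algorithms that are not model-based).

```python
import math


def gaussian_aic(rss, n, k):
    """AIC for a Gaussian model, up to an additive constant: n*log(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k


def ic_weights(ic_values):
    """Turn IC values (smaller is better) into weights that sum to 1."""
    best = min(ic_values)
    # Subtract the best IC first so the exponentials stay numerically stable.
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


def average_predictions(predictions, weights):
    """Weighted average of several models' predictions, point by point."""
    n_points = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions))
        for i in range(n_points)
    ]


# Hypothetical example: three fitted models on n = 50 observations,
# each described by (residual sum of squares, number of parameters).
ics = [gaussian_aic(rss, n=50, k=k) for rss, k in [(120.0, 2), (100.0, 5), (98.0, 12)]]
weights = ic_weights(ics)
combined = average_predictions([[1.0, 2.0], [1.2, 1.9], [1.1, 2.4]], weights)
```

The penalty term 2k is what keeps the most complex model from dominating merely because its loss is smallest. Unlike cross-validation, computing the weights requires no refitting.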

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Loughin, Thomas

Isolation and morphological and metabolic characterization of common endophytes in annually burned tallgrass prairie
  • DOI:
    10.3852/09-212
  • Publication date:
    2010-07-01
  • Journal:
  • Impact factor:
    2.8
  • Authors:
    Mandyam, Keerthi;Loughin, Thomas;Jumpponen, Ari
  • Corresponding author:
    Jumpponen, Ari

Other grants by Loughin, Thomas

Semiparametric Statistical (Machine) Learning
  • Grant number:
    RGPIN-2018-04868
  • Fiscal year:
    2022
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Semiparametric Statistical (Machine) Learning
  • Grant number:
    RGPIN-2018-04868
  • Fiscal year:
    2021
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Semiparametric Statistical (Machine) Learning
  • Grant number:
    RGPIN-2018-04868
  • Fiscal year:
    2020
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Semiparametric Statistical (Machine) Learning
  • Grant number:
    RGPIN-2018-04868
  • Fiscal year:
    2019
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Heteroscedastic trees, ensembles, and other joint models for means and variances
  • Grant number:
    342205-2013
  • Fiscal year:
    2017
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Heteroscedastic trees, ensembles, and other joint models for means and variances
  • Grant number:
    342205-2013
  • Fiscal year:
    2015
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Heteroscedastic trees, ensembles, and other joint models for means and variances
  • Grant number:
    342205-2013
  • Fiscal year:
    2014
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Heteroscedastic trees, ensembles, and other joint models for means and variances
  • Grant number:
    342205-2013
  • Fiscal year:
    2013
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Identifying dispersion effects in unreplicated multilevel factorial experiments
  • Grant number:
    342205-2007
  • Fiscal year:
    2011
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual
Identifying dispersion effects in unreplicated multilevel factorial experiments
  • Grant number:
    342205-2007
  • Fiscal year:
    2010
  • Funding amount:
    $16,800
  • Program type:
    Discovery Grants Program - Individual

Similar overseas grants

EAGER: SSMCDAT2023: Revealing Local Symmetry Breaking in Intermetallics: Combining Statistical Mechanics and Machine Learning in PDF Analysis
  • Grant number:
    2334261
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Standard Grant
REU Site: University of North Carolina at Greensboro - Complex Data Analysis using Statistical and Machine Learning Tools
  • Grant number:
    2244160
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Standard Grant
Comparison of Machine Learning and Conventional Statistical Modeling for Predicting Readmission Following Acute Heart Failure Hospitalization
  • Grant number:
    495410
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
Modern Statistics and Statistical Machine Learning
  • Grant number:
    2886365
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Studentship
Modern Statistics and Statistical Machine Learning
  • Grant number:
    2886852
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Studentship
Next-Generation Algorithms in Statistical Genetics Based on Modern Machine Learning
  • Grant number:
    10714930
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
Unravel machine learning blackboxes -- A general, effective and performance-guaranteed statistical framework for complex and irregular inference problems in data science
  • Grant number:
    2311064
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Standard Grant
Modern Statistics and Statistical Machine Learning
  • Grant number:
    2886723
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Studentship
A Novel Approach to Semi-Supervised Statistical Machine Learning
  • Grant number:
    DP230101671
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Discovery Projects
Modern Statistics and Statistical Machine Learning
  • Grant number:
    2886777
  • Fiscal year:
    2023
  • Funding amount:
    $16,800
  • Program type:
    Studentship