Approximations of computationally intensive statistical learning algorithms: theory and methods

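Basic information

  • Grant number:
    RGPIN-2019-06487
  • Principal investigator:
  • Amount:
    $13,600
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2019
  • Funding country:
    Canada
  • Start and end dates:
    2019-01-01 to 2020-12-31
  • Project status:
    Completed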

Project abstract

Practitioners of quantitative sciences (statisticians, engineers, physicists, etc.) often face intractable quantities, such as an integral that cannot be solved analytically or the optimum of a function that is not known explicitly. In both cases, the quantity of interest is called intractable because its exact (mathematical) value is out of reach. Such problems are usually addressed with stochastic numerical algorithms: iterative methods, implemented on a computer, that use sequences of random numbers and return a numerical value approximating the exact solution of the intractable problem. Based on this numerical output, the practitioner can classify data, assess a model, interpret an experiment, etc. The convergence of the algorithm must therefore be well understood, and the value it returns should be probabilistically accurate. Since the 1950s, a great deal of research in statistics and machine learning has been devoted to designing algorithms that have a solid theoretical foundation. Such algorithms include Markov chain Monte Carlo methods, the Expectation-Maximization algorithm, the gradient algorithm, etc. They are referred to as standard algorithms because they are known and used by most applied scientists: in regular situations, the algorithm, seen as a black box, converges and returns a trustworthy solution to the problem as long as it iterates for long enough.

Paradoxically, the increasing computational capacity of today's computers challenges the efficiency of standard algorithms. We outline two situations where they scale poorly with the dimension of the problem.

  • Big data: improved storage capacity and better data-acquisition devices mean that algorithms can be supplied with more data (and more accurate data). Standard algorithms become computationally slow and, in practice, unusable.
  • High dimensionality: novel computer architectures (parallel computing, GPU, etc.) allow scientists to attempt more complex problems, such as integrating functions of several hundred variables. Standard algorithms become statistically slow, i.e. they need many more iterations to achieve a given accuracy than they do for lower-dimensional problems.

The main research line of this proposal deals with the approximation of some standard algorithms. Expected outputs include statistical methods that are computationally and statistically more efficient than standard algorithms while still retaining, in some capacity, their black-box aspect. It is essential to design an approximation framework that guarantees that most theoretical properties of the standard algorithm are preserved in the noisy version. Promising results have already been obtained for some algorithms and have been successfully applied to social network analysis and computer vision. Current research aims at making those approximate methods more generic and at unifying the theoretical frameworks for analyzing and designing new approximate algorithms.
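
As a purely illustrative sketch (not taken from the proposal), the idea of a stochastic numerical algorithm returning a probabilistically accurate value can be shown with plain Monte Carlo integration. The integrand exp(-x^2) on [0, 1], the function name `mc_integrate`, the sample size and the seed are all assumptions chosen for the example; the integrand has no elementary antiderivative, so the error function is used only as a reference value.

```python
import math
import random


def mc_integrate(f, n_samples, seed=0):
    """Estimate the integral of f over [0, 1] by averaging f at uniform random points.

    Returns the estimate and its estimated standard error.
    (Hypothetical helper, written for illustration only.)
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        y = f(rng.random())
        total += y
        total_sq += y * y
    mean = total / n_samples
    # Sample variance of f(U); clipped at 0 to guard against rounding error.
    var = max(total_sq / n_samples - mean * mean, 0.0)
    std_err = math.sqrt(var / n_samples)
    return mean, std_err


def integrand(x):
    # exp(-x^2) has no elementary antiderivative.
    return math.exp(-x * x)


estimate, std_err = mc_integrate(integrand, 100_000)
# Reference value via the error function: (sqrt(pi) / 2) * erf(1).
reference = 0.5 * math.sqrt(math.pi) * math.erf(1.0)
print(f"estimate = {estimate:.4f} +/- {1.96 * std_err:.4f}, reference = {reference:.4f}")
```

The standard error shrinks like one over the square root of the number of samples, which is one way to read the abstract's point that reaching a given accuracy can require many iterations.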

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Maire, Florian

Identification of Ion Series Using Ion Mobility Mass Spectrometry: The Example of Alkyl-Benzothiophene and Alkyl-Dibenzothiophene Ions in Diesel Fuels
  • DOI:
    10.1021/ac400731d
  • Publication date:
    2013-06-04
  • Journal:
  • Impact factor:
    7.4
  • Authors:
    Maire, Florian;Neeson, Kieran;Giusti, Pierre
  • Corresponding author:
    Giusti, Pierre
Efficient Bayesian inference for exponential random graph models by correcting the pseudo-posterior distribution
  • DOI:
    10.1016/j.socnet.2017.03.013
  • Publication date:
    2017-07-01
  • Journal:
  • Impact factor:
    3.1
  • Authors:
    Bouranis, Lampros;Friel, Nial;Maire, Florian
  • Corresponding author:
    Maire, Florian
Traveling Wave Ion Mobility Mass Spectrometry Study of Low Generation Polyamidoamine Dendrimers
Atmospheric Solid Analysis Probe-Ion Mobility Mass Spectrometry of Polypropylene
  • DOI:
    10.1021/ac302109q
  • Publication date:
    2012-11-06
  • Journal:
  • Impact factor:
    7.4
  • Authors:
    Barrere, Caroline;Maire, Florian;Giusti, Pierre
  • Corresponding author:
    Giusti, Pierre
A Mutasynthesis Approach with a Penicillium chrysogenum ΔroqA Strain Yields New Roquefortine D Analogues
  • DOI:
    10.1002/cbic.201402686
  • Publication date:
    2015-04-13
  • Journal:
  • Impact factor:
    3.2
  • Authors:
    Ouchaou, Kahina;Maire, Florian;Overkleeft, Herman S.
  • Corresponding author:
    Overkleeft, Herman S.

Other grants held by Maire, Florian

Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2022
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2021
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2020
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    DGECR-2019-00269
  • Fiscal year:
    2019
  • Amount:
    $13,600
  • Project category:
    Discovery Launch Supplement

Similar overseas grants

Computationally Intensive Methods for Large Spatio-Temporal Data Sets
  • Grant number:
    RGPIN-2018-04604
  • Fiscal year:
    2022
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2022
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
  • Grant number:
    RGPIN-2018-04604
  • Fiscal year:
    2021
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2021
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
  • Grant number:
    RGPIN-2018-04604
  • Fiscal year:
    2020
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    RGPIN-2019-06487
  • Fiscal year:
    2020
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
  • Grant number:
    RGPIN-2018-04604
  • Fiscal year:
    2019
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
  • Grant number:
    DGECR-2019-00269
  • Fiscal year:
    2019
  • Amount:
    $13,600
  • Project category:
    Discovery Launch Supplement
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
  • Grant number:
    RGPIN-2018-04604
  • Fiscal year:
    2018
  • Amount:
    $13,600
  • Project category:
    Discovery Grants Program - Individual
Closed-Loop Data Science for Complex, Computationally- and Data-Intensive Analytics
  • Grant number:
    EP/R018634/1
  • Fiscal year:
    2018
  • Amount:
    $13,600
  • Project category:
    Research Grant