
Principled approaches to deep learning: generalization under distribution shift and predictive uncertainty

Basic information

Grant number:
RGPIN-2022-03609
Principal investigator:
Oberman, Adam
Amount:
$19,700
Host institution:
McGill University
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Start and end dates:
2022-01-01 to 2023-12-31
Project status:
Completed

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=759084
Keywords:
Principled approaches deep learning generalization

Project abstract

Deep Learning (DL) addresses problems that could not previously be solved with traditional Machine Learning (ML) methods. However, unlike ML, which has a solid theoretical foundation, DL has so far relied on engineering practices. In particular, while very successful in computer applications, DL has so far been limited in its applications to real-world problems, such as autonomous vehicles. To continue broadening its applications, deep learning requires a theoretical foundation in the form of error estimates, which provide control over the accuracy of models on new inputs. Thirty years ago, ML was in a situation similar to the one DL faces today: the methods were far ahead of the theory. In a short time, ML theory was able to solve theoretical problems, allowing ML models to be safely deployed in a wide range of applications. The proposed program will do for DL what was done thirty years ago for ML.

Error estimation can be addressed in two settings. The first is generalization error bounds, which apply before the inputs are seen by the model. The second is predictive uncertainty, which applies after the inputs are seen by the model (but before a decision is made). For example, suppose we have a decision problem where we can only act if the probability of error is less than 1%, but our model error is 5% on average. Predictive uncertainty can tell us, on a case-by-case basis, on which inputs the model accuracy is high enough to act. For a second example, we train computer vision models using databases of images, but we want to deploy them on real-world images, which are statistically different from the training set. We propose to extend data transformation techniques, and methods for measuring dataset differences, in order to better estimate the model error on real-world images.

This proposal will address the problem of applying deep learning in a broader setting by building a mathematical theory for DL out-of-distribution (OOD) generalization. It will:
1. Formulate suitable definitions and assumptions which allow us to state the deep learning out-of-distribution generalization problem in mathematical terms.
2. Prove a theorem estimating (with high probability) the OOD generalization gap, in terms of relevant problem inputs.
3. Determine relevant problem inputs through mathematical modelling.
Determining the relevant hypotheses requires applying the scientific method, which in this case corresponds to computer experiments designed to probe the generalization behaviour of deep neural networks. Stating the definitions and assumptions precisely involves mathematical modelling. Proving the theorem described above involves mathematical analysis.
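
The decision example in the abstract (act only when the probability of error is below 1%, while the model's average error is 5%) can be illustrated with a small sketch. This is not the method proposed in the grant; it only shows the idea of acting on a case-by-case basis. The per-input error estimates, the function names, and the toy numbers are all hypothetical, and the sketch assumes the estimates are well calibrated, which is the hard part in practice.

```python
def average_error(error_probs):
    """Mean estimated error probability over all inputs."""
    return sum(error_probs) / len(error_probs)


def inputs_safe_to_act_on(error_probs, max_error=0.01):
    """Indices of inputs whose estimated error probability is below max_error."""
    return [i for i, p in enumerate(error_probs) if p < max_error]


# Hypothetical per-input error estimates (assumed calibrated):
# 16 inputs the model is very sure about, 4 it is unsure about.
error_probs = [0.005] * 16 + [0.23] * 4

accepted = inputs_safe_to_act_on(error_probs)
print(f"average error: {average_error(error_probs):.3f}")
print(f"act on {len(accepted)} of {len(error_probs)} inputs")
```

With these toy numbers the average error is about 5%, so a rule based on the average alone would never permit action, while the per-input estimates still identify a subset of inputs that meets the 1% requirement.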

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Oberman, Adam

Deep relaxation: partial differential equations for optimizing deep neural networks

DOI:
10.1007/s40687-018-0148-y
Published:
2018-06-28
Journal:
RESEARCH IN THE MATHEMATICAL SCIENCES
Impact factor:
1.2
Authors:
Chaudhari, Pratik; Oberman, Adam; Carlier, Guillaume
Corresponding author:
Carlier, Guillaume

ANISOTROPIC TOTAL VARIATION REGULARIZED L1 APPROXIMATION AND DENOISING/DEBLURRING OF 2D BAR CODES

DOI:
10.3934/ipi.2011.5.591
Published:
2011-08-01
Journal:
INVERSE PROBLEMS AND IMAGING
Impact factor:
1.3
Authors:
Choksi, Rustum; van Gennip, Yves; Oberman, Adam
Corresponding author:
Oberman, Adam

NUMERICAL METHODS FOR MATCHING FOR TEAMS AND WASSERSTEIN BARYCENTERS

DOI:
10.1051/m2an/2015033
Published:
2015-11-01
Journal:
ESAIM-MATHEMATICAL MODELLING AND NUMERICAL ANALYSIS-MODELISATION MATHEMATIQUE ET ANALYSE NUMERIQUE
Impact factor:
0
Authors:
Carlier, Guillaume; Oberman, Adam; Oudet, Edouard
Corresponding author:
Oudet, Edouard