Learning Better Representations With Kernels

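Basic Information

Grant number:
RGPIN-2021-02974
Principal investigator:
Sutherland, Danica
Amount:
about $21,100
Host institution:
University of British Columbia
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2022
Funding country:
Canada
Start and end dates:
2022-01-01 to 2023-12-31
Project status:
Completed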

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=749925
Keywords:
Learning Better Representations Kernels

Project Abstract

Many of the most impressive gains in machine learning in the past decade - helping computers "understand" images, text, and more - have been based on the idea of representation learning. Rather than processing an image as a set of red, green, and blue values for a grid of pixels, we learn a set of numbers describing the image in some abstract way, so that two similar images - which may have very different pixel values - have similar representations.

Most work on representation learning, however, assumes that we will use the representation only in a very simple (linear) way. But in some settings, especially when instead of trying to find a representation for one static problem we want to find a representation that works for many related problems, this assumption can make our job much harder. A machine learning approach known as kernel methods provides richer ways to work with a representation. We can, then, potentially use simpler, easier-to-find representations than if we force ourselves to use them linearly - especially in settings where we want to find representations good for more than one thing.

This scheme has seen good success in several areas already, including in generative models (training a computer to output, e.g., images of fake people) and in density estimation (figuring out the "shape" of a dataset, to know which kinds of points we might expect to see in the future). We believe that it can be applied more widely, to improve our ability to find representations in a variety of problems. For instance, a method known as "invariant risk minimization" tries to find predictors which don't rely on random correlations that happen to be present in the training data, but may not hold when we go to apply the model to slightly different data. This method currently works with only linear predictions based on a given representation, and that assumption can cause it to behave very poorly even on some extremely simple datasets. We believe that a version of the method that incorporates kernel models will be more robust and reliable in applications.

One area in particular that has already benefited from this approach is called two-sample testing: telling whether two different datasets are fundamentally different from one another, not just different due to random chance. For example, this is used to tell whether the control group and treatment group are different in a medical trial. In practice, though, it's often the case that the two datasets are different - but only in ways we don't really care about. The methods to help understand how two datasets differ, rather than just whether they're different, are much less developed. We believe that we can use kernels to find representations that will, in the end, help people actually understand the differences between datasets.
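
To make the two-sample testing idea concrete, here is a minimal sketch of a kernel two-sample test in Python. It is an illustration only, not the project's method: it uses one standard kernel statistic, the maximum mean discrepancy (MMD), which the abstract does not name, with a fixed Gaussian kernel applied directly to the raw data. The bandwidth of 1.0, the number of permutations, the synthetic "control" and "treatment" samples, and all function names are assumptions chosen for the example. A learned representation, as the abstract proposes, would replace the raw inputs here.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def gram_matrix(points, bandwidth=1.0):
    """Kernel value for every pair of points (computed once, reused per permutation)."""
    return [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]


def mmd2_from_gram(gram, idx_x, idx_y):
    """Biased (V-statistic) estimate of the squared MMD between two index sets."""
    def mean_block(rows, cols):
        return sum(gram[i][j] for i in rows for j in cols) / (len(rows) * len(cols))

    return (mean_block(idx_x, idx_x) + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def mmd_permutation_test(xs, ys, bandwidth=1.0, n_permutations=200, seed=0):
    """Return (squared-MMD estimate, permutation p-value) for samples xs and ys."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    gram = gram_matrix(pooled, bandwidth)
    n = len(xs)
    indices = list(range(len(pooled)))
    observed = mmd2_from_gram(gram, indices[:n], indices[n:])

    # Under the null hypothesis the group labels are exchangeable, so we
    # reshuffle them and count how often the statistic is at least as large.
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)
        if mmd2_from_gram(gram, indices[:n], indices[n:]) >= observed:
            at_least_as_large += 1

    # The +1 terms count the observed labelling itself among the permutations.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical data: one-dimensional measurements for two trial groups.
    rng = random.Random(1)
    control = [(rng.gauss(0.0, 1.0),) for _ in range(25)]
    treatment = [(rng.gauss(3.0, 1.0),) for _ in range(25)]
    statistic, p_value = mmd_permutation_test(control, treatment)
    print(f"MMD^2 estimate: {statistic:.3f}, p-value: {p_value:.3f}")
```

A small p-value suggests the two samples come from different distributions. Note that this sketch only answers whether the samples differ. It says nothing about how they differ, which is the less developed question the abstract highlights.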
