Learning Better Representations With Kernels

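Basic information

Grant number:
RGPIN-2021-02974
Principal investigator:
Sutherland, Dougal
Amount:
About $21,100
Host institution:
University of British Columbia
Host institution country:
Canada
Program:
Discovery Grants Program - Individual
Fiscal year:
2021
Funding country:
Canada
Start and end dates:
2021-01-01 to 2022-12-31
Project status:
Completed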

Source:
https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=742314
Keywords:
Learning Better Representations Kernels

Project abstract

Many of the most impressive gains in machine learning in the past decade - helping computers "understand" images, text, and more - have been based on the idea of representation learning. Rather than processing an image as a set of red, green, and blue values for a grid of pixels, we learn a set of numbers describing the image in some abstract way, so that two similar images - which may have very different pixel values - have similar representations. Most work on representation learning, however, assumes that we will use the representation only in a very simple (linear) way. But in some settings, especially when instead of trying to find a representation for one static problem we want to find a representation that works for many related problems, this assumption can make our job much harder. A machine learning approach known as kernel methods provides richer ways to work with a representation. We can, then, potentially use simpler, easier-to-find representations than if we force ourselves to use them linearly - especially in settings where we want to find representations good for more than one thing.

This scheme has seen good success in several areas already, including in generative models (training a computer to output, e.g., images of fake people) and in density estimation (figuring out the "shape" of a dataset, to know which kinds of points we might expect to see in the future). We believe that it can be applied more widely, to improve our ability to find representations in a variety of problems. For instance, a method known as "invariant risk minimization" tries to find predictors which don't rely on random correlations that happen to be present in the training data but that may not hold when we apply the model to slightly different data. This method currently works with only linear predictions based on a given representation, and that assumption can cause it to behave very poorly even on some extremely simple datasets. We believe that a version of the method that incorporates kernel models will be more robust and reliable in applications.

One area in particular that has already benefited from this approach is called two-sample testing: telling whether two different datasets are fundamentally different from one another, not just different due to random chance. For example, this is used to tell whether the control group and treatment group are different in a medical trial. In practice, though, it's often the case that the two datasets are different - but only in ways we don't really care about. The methods to help understand how two datasets differ, rather than just whether they're different, are much less developed. We believe that we can use kernels to find representations that will, in the end, help people actually understand the differences between datasets.
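
To make the two-sample testing problem concrete, here is a minimal sketch of one standard kernel approach: a permutation test on the maximum mean discrepancy (MMD). It is not taken from the project itself. The Gaussian kernel, the fixed bandwidth, the biased estimator, and every function name below are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def gram_matrix(points, bandwidth=1.0):
    """Symmetric matrix of kernel values for every pair of points."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = gaussian_kernel(points[i], points[j], bandwidth)
            gram[i][j] = gram[j][i] = value
    return gram


def mmd2_biased(gram, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a precomputed Gram matrix."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))

    return (mean_block(idx_x, idx_x)
            + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def mmd_permutation_test(xs, ys, bandwidth=1.0, num_permutations=200, seed=0):
    """Return (observed MMD^2, permutation p-value) for samples xs and ys.

    The null distribution is simulated by reshuffling which pooled points
    are labelled as the first sample and which as the second.
    """
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    gram = gram_matrix(pooled, bandwidth)
    n = len(xs)
    indices = list(range(len(pooled)))
    observed = mmd2_biased(gram, indices[:n], indices[n:])

    at_least_as_large = 0
    for _ in range(num_permutations):
        rng.shuffle(indices)
        if mmd2_biased(gram, indices[:n], indices[n:]) >= observed:
            at_least_as_large += 1
    # The +1 terms count the observed split itself, so the p-value is never 0.
    p_value = (at_least_as_large + 1) / (num_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical data: two 1-D Gaussian samples whose means differ.
    data_rng = random.Random(1)
    control = [[data_rng.gauss(0.0, 1.0)] for _ in range(30)]
    treatment = [[data_rng.gauss(3.0, 1.0)] for _ in range(30)]
    statistic, p_value = mmd_permutation_test(control, treatment)
    print(f"MMD^2 = {statistic:.4f}, p-value = {p_value:.4f}")
```

A small p-value says only that the two samples differ, which is exactly the limitation the abstract points out: the test does not say how they differ. The kernel and its bandwidth are fixed by hand here. Choosing or learning them well is the kind of representation question the abstract raises.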
