Learning Better Representations With Kernels
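Basic information
- Grant number: RGPIN-2021-02974
- Principal investigator:
- Amount: $21,100
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2021
- Funding country: Canada
- Period: 2021-01-01 to 2022-12-31
- Status: Completed
- Source:
- Keywords:
Project abstract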
Many of the most impressive gains in machine learning in the past decade - helping computers "understand" images, text, and more - have been based on the idea of representation learning. Rather than processing an image as a set of red, green, and blue values for a grid of pixels, we learn a set of numbers describing the image in some abstract way, so that two similar images - which may have very different pixel values - have similar representations. Most work on representation learning, however, assumes that we will use the representation only in a very simple (linear) way. But in some settings, especially when we want a representation that works for many related problems rather than for one static problem, this assumption can make our job much harder.
A machine learning approach known as kernel methods provides richer ways to work with a representation. We can then potentially use simpler, easier-to-find representations than if we force ourselves to use them linearly - especially in settings where we want representations that are good for more than one thing. This scheme has already seen good success in several areas, including generative models (training a computer to output, e.g., images of fake people) and density estimation (figuring out the "shape" of a dataset, to know which kinds of points we might expect to see in the future). We believe that it can be applied more widely, to improve our ability to find representations in a variety of problems.
For instance, a method known as "invariant risk minimization" tries to find predictors which don't rely on random correlations that happen to be present in the training data but may not hold when we apply the model to slightly different data. This method currently works only with linear predictions based on a given representation, and that assumption can cause it to behave very poorly even on some extremely simple datasets. We believe that a version of the method that incorporates kernel models will be more robust and reliable in applications.
One area in particular that has already benefited from this approach is called two-sample testing: telling whether two different datasets are fundamentally different from one another, not just different due to random chance. For example, this is used to tell whether the control group and treatment group are different in a medical trial. In practice, though, it's often the case that the two datasets are different - but only in ways we don't really care about. Methods that help us understand how two datasets differ, rather than just whether they're different, are much less developed. We believe that we can use kernels to find representations that will, in the end, help people actually understand the differences between datasets.
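As a rough illustration of the kernel two-sample testing idea described in the abstract, the sketch below runs a permutation test on the maximum mean discrepancy (MMD) with a Gaussian kernel. The abstract does not name a specific statistic or kernel, so MMD, the fixed `bandwidth`, and all function names here are assumptions chosen for illustration. In particular, the sketch uses a fixed kernel on the raw inputs, not a learned representation as the project proposes.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_from_kernel(k, n):
    """Biased squared-MMD estimate; the first n rows/columns of k are sample X."""
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=500, seed=0):
    """Return (observed squared MMD, permutation p-value) for samples x and y."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_from_kernel(k, n)
    count = 0
    for _ in range(num_permutations):
        # Shuffling the pooled points simulates "both samples share one distribution".
        perm = rng.permutation(len(pooled))
        if mmd2_from_kernel(k[np.ix_(perm, perm)], n) >= observed:
            count += 1
    p_value = (count + 1) / (num_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    control = rng.normal(size=(100, 2))
    treatment = rng.normal(loc=1.0, size=(100, 2))  # hypothetical shifted group
    stat, p = mmd_permutation_test(control, treatment)
    print(f"squared MMD = {stat:.4f}, p-value = {p:.4f}")
```

A small p-value suggests the two samples come from different distributions, but it says nothing about how they differ, which is the gap the abstract highlights.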
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sutherland, Dougal
Other grants by Sutherland, Dougal
Learning Better Representations With Kernels
- Grant number: DGECR-2021-00195
- Fiscal year: 2021
- Funding amount: $21,100
- Program: Discovery Launch Supplement
Similar overseas grants
How can we make use of one or more computationally powerful virtual robots, to create a hive mind network to better coordinate multi-robot teams?
- Grant number: 2594635
- Fiscal year: 2025
- Funding amount: $21,100
- Program: Studentship
MFB: Better Homologous Folding using Computational Linguistics and Deep Learning
- Grant number: 2330737
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Standard Grant
Creating Better Opportunities in the South West Through a Growth-Mindset-of-Opportunity Intervention
- Grant number: ES/Z502480/1
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Research Grant
Decision 360: Open Finance for better lending decisions
- Grant number: 10099934
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Collaborative R&D
Presymptom: development of a novel machine-learning-derived diagnostic test to rule out infection to enable enhanced clinical care and better targeted anti-microbial use
- Grant number: 10089281
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Investment Accelerator
Healthy Jozi: A Staged Approach to Better Workplace Food Choices and Chronic Disease Screening and Linkage to Care
- Grant number: MR/Z000467/1
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Research Grant
Designing synthetic matrices for enhanced organoid development: A step towards better disease understanding
- Grant number: MR/Y033760/1
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Research Grant
CAREER: Understanding Photo-thermoelectric Phenomena in Bulk and Nanomaterials for Better Optical Sensing
- Grant number: 2340728
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Continuing Grant
Doctoral Dissertation Research: Thinking ahead to do better now: Legacy-focused cognition and its link to environmental sustainability
- Grant number: 2343645
- Fiscal year: 2024
- Funding amount: $21,100
- Program: Standard Grant