Learning Better Representations With Kernels

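Basic information

  • Grant number:
    RGPIN-2021-02974
  • Principal investigator:
  • Amount:
    $21,100
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed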

Project abstract

Many of the most impressive gains in machine learning in the past decade - helping computers "understand" images, text, and more - have been based on the idea of representation learning. Rather than processing an image as a set of red, green, and blue values for a grid of pixels, we learn a set of numbers describing the image in some abstract way, so that two similar images - which may have very different pixel values - have similar representations.

Most work on representation learning, however, assumes that we will use the representation only in a very simple (linear) way. But in some settings, especially when instead of trying to find a representation for one static problem we want to find a representation that works for many related problems, this assumption can make our job much harder. A machine learning approach known as kernel methods provides richer ways to work with a representation. We can, then, potentially use simpler, easier-to-find representations than if we force ourselves to use them linearly - especially in settings where we want to find representations good for more than one thing.

This scheme has seen good success in several areas already, including in generative models (training a computer to output, e.g., images of fake people) and in density estimation (figuring out the "shape" of a dataset, to know which kinds of points we might expect to see in the future). We believe that it can be applied more widely, to improve our ability to find representations in a variety of problems.

For instance, a method known as "invariant risk minimization" tries to find predictors which don't rely on random correlations that happen to be present in the training data, but may not hold when we go to apply the model on slightly different data. This method currently works with only linear predictions based on a given representation, and that assumption can cause it to behave very poorly even on some extremely simple datasets. We believe that a version of the method that incorporates kernel models will be more robust and reliable in applications.

One area in particular that has already benefited from this approach is called two-sample testing: telling whether two different datasets are fundamentally different from one another, not just different due to random chance. For example, this is used to tell whether the control group and treatment group are different in a medical trial. In practice, though, it's often the case that the two datasets are different - but only in ways we don't really care about. The methods to help understand how two datasets differ, rather than just whether they're different, are much less developed. We believe that we can use kernels to find representations that will, in the end, help people actually understand the differences between datasets.
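
As an illustration of the two-sample testing idea described in the abstract, the following is a minimal sketch of a kernel two-sample test. The abstract does not name a specific statistic; this sketch assumes the maximum mean discrepancy (MMD) with a Gaussian kernel, which is one common choice. The function names, the median-heuristic bandwidth, the permutation count, and the synthetic "control" and "treatment" data are all hypothetical choices made for illustration, not the project's method.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_biased(k, n):
    """Biased squared-MMD estimate from a pooled kernel matrix.

    The first n rows/columns of k belong to sample X, the rest to sample Y.
    """
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def mmd_permutation_test(x, y, bandwidth=None, num_permutations=500, seed=0):
    """Return (statistic, p_value) for H0: x and y share one distribution."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    if bandwidth is None:
        # Median heuristic: a common default, assumed here, not a tuned choice.
        diffs = pooled[:, None, :] - pooled[None, :, :]
        dists = np.sqrt((diffs ** 2).sum(axis=-1))
        bandwidth = np.median(dists[np.triu_indices(len(pooled), k=1)])
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_biased(k, n)
    # Shuffle group membership to simulate the statistic under the null.
    count = 0
    for _ in range(num_permutations):
        perm = rng.permutation(len(pooled))
        if mmd2_biased(k[np.ix_(perm, perm)], n) >= observed:
            count += 1
    p_value = (count + 1) / (num_permutations + 1)
    return observed, p_value


# Synthetic example (illustrative data only): the "treatment" mean is shifted.
data_rng = np.random.default_rng(1)
control = data_rng.normal(0.0, 1.0, size=(100, 2))
treatment = data_rng.normal(1.0, 1.0, size=(100, 2))
statistic, p_value = mmd_permutation_test(control, treatment)
print(f"MMD^2 = {statistic:.4f}, p = {p_value:.4f}")
```

A small p-value suggests the two samples come from different distributions. Note that a test like this only reports whether the samples differ; it does not explain how they differ, which is the gap the abstract points to.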

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Sutherland, Danica

Similar overseas grants

How can we make use of one or more computationally powerful virtual robots, to create a hive mind network to better coordinate multi-robot teams?
  • Grant number:
    2594635
  • Fiscal year:
    2025
  • Funding amount:
    $21,100
  • Project category:
    Studentship
MFB: Better Homologous Folding using Computational Linguistics and Deep Learning
  • Grant number:
    2330737
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Standard Grant
Creating Better Opportunities in the South West Through a Growth-Mindset-of-Opportunity Intervention
  • Grant number:
    ES/Z502480/1
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Research Grant
Decision 360: Open Finance for better lending decisions
  • Grant number:
    10099934
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Collaborative R&D
Presymptom: development of a novel machine-learning-derived diagnostic test to rule out infection to enable enhanced clinical care and better targeted anti-microbial use
  • Grant number:
    10089281
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Investment Accelerator
Healthy Jozi: A Staged Approach to Better Workplace Food Choices and Chronic Disease Screening and Linkage to Care
  • Grant number:
    MR/Z000467/1
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Research Grant
Designing synthetic matrices for enhanced organoid development: A step towards better disease understanding
  • Grant number:
    MR/Y033760/1
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Research Grant
CAREER: Understanding Photo-thermoelectric Phenomena in Bulk and Nanomaterials for Better Optical Sensing
  • Grant number:
    2340728
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Continuing Grant
Doctoral Dissertation Research: Thinking ahead to do better now: Legacy-focused cognition and its link to environmental sustainability
  • Grant number:
    2343645
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    Standard Grant
New Users for a Better ICOS
  • Grant number:
    10101542
  • Fiscal year:
    2024
  • Funding amount:
    $21,100
  • Project category:
    EU-Funded