
RI: Medium: Foundations of Self-Supervised Learning Through the Lens of Probabilistic Generative Models

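Basic information

Award number:
2211907
Principal investigator:
Pradeep Ravikumar
Amount:
$1.1279 million
Host institution:
Carnegie Mellon University
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Start and end dates:
2022-10-01 to 2026-09-30
Project status:
Ongoing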

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2211907&HistoricalAwards=false
Keywords:
RI Medium Foundations Self Supervised

Project abstract

Supervised learning of modern machine learning models requires very large, high-quality labeled datasets. Labeling data requires human annotations, which are often too expensive for under-resourced end-users of machine learning. Unsupervised learning of machine learning models from unlabeled data has the promise to vastly increase the accessibility and inclusivity of modern machine learning. An emerging paradigm for such unsupervised learning is self-supervised learning (SSL), wherein a machine learning model is trained on tasks for which labels can be automatically generated. This approach is at the core of high-performing language and image machine learning models like BERT and DALL-E. However, despite its promise on many benchmarks across diverse domains, much of the current methodology for developing SSL methods is opaque and heuristic, and evaluation relies on ad-hoc choices of performance metrics. The goal of this project is to build scientific and mathematical foundations of SSL, and consequently also improve its practice.

In some of the earliest work in this area, SSL was used to speed up tasks involving the learning of probabilistic models. Progressively, via a series of approximations for scalability, the outputs of SSL could no longer be rigorously tied to probabilistic model parameters, and the goal shifted to learning features that are "useful" for downstream tasks, that is, representation learning. "Useful", however, can often be mathematically difficult to pin down, so it is frequently not clear (even empirically, much less theoretically) what these methods learn about the data. At present, designing a well-performing SSL method entails trying many combinations of tasks and model architectures until a particular one gives good results on the downstream tasks. This has two downsides: (i) it requires a substantial amount of trial and error; (ii) on a scientific level, it doesn't yield any understanding of what makes a particular task/architecture suitable, and what the features learned capture about the data distribution. This project will repair the severed tie between probabilistic models and feature learning via self-supervised models by analyzing the aspects of a deep generative model that can be recovered via self-supervised learning. Moreover, through this lens, we propose to understand the relative advantages, both statistical and algorithmic, of self-supervised learning methods over other methods for learning probabilistic models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (7)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Identifiability of deep generative models without auxiliary information

DOI:
Publication date:
2022-06
Journal:
Impact factor:
0
Authors:
Bohdan Kivva; Goutham Rajendran; Pradeep Ravikumar; Bryon Aragam
Corresponding authors:
Bohdan Kivva; Goutham Rajendran; Pradeep Ravikumar; Bryon Aragam

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

DOI:
Publication date:
2021-06
Journal:
ArXiv
Impact factor:
0
Authors:
Yining Chen; Elan Rosenfeld; Mark Sellke; Tengyu Ma; Andrej Risteski
Corresponding authors:
Yining Chen; Elan Rosenfeld; Mark Sellke; Tengyu Ma; Andrej Risteski

Masked Prediction: A Parameter Identifiability View

DOI:
Publication date:
2022
Journal:
Impact factor:
0
Authors:
Bingbin Liu; Daniel J. Hsu; Pradeep Ravikumar; Andrej Risteski
Corresponding authors:
Bingbin Liu; Daniel J. Hsu; Pradeep Ravikumar; Andrej Risteski

Continual learning: a feature extraction formalization, an efficient algorithm, and barriers

DOI:
Publication date:
Journal:
Impact factor:
0
Authors:
Binghui Peng; Andrej Risteski
Corresponding authors:
Binghui Peng; Andrej Risteski

Concept Gradient: Concept-based Interpretation Without Linear Assumption

DOI:
10.48550/arxiv.2208.14966
Publication date:
2022-08
Journal:
ArXiv
Impact factor:
0
Authors:
Andrew Bai; Chih-Kuan Yeh; Pradeep Ravikumar; Neil Y. C. Lin; Cho-Jui Hsieh
Corresponding authors:
Andrew Bai; Chih-Kuan Yeh; Pradeep Ravikumar; Neil Y. C. Lin; Cho-Jui Hsieh

Other publications by Pradeep Ravikumar

Ordinal Graphical Models: A Tale of Two Approaches

DOI:
10.5555/3305890.3306018
Publication date:
2017
Journal:
ArXiv
Impact factor:
0
Authors:
A. Suggala; Eunho Yang; Pradeep Ravikumar
Corresponding author:
Pradeep Ravikumar

XMRF: an R package to fit Markov Networks to high-throughput genetics data

DOI:
Publication date:
2015
Journal:
bioRxiv
Impact factor:
0
Authors:
Ying; Genevera I. Allen; Yulia Baker; Eunho Yang; Pradeep Ravikumar; Zhandong Liu
Corresponding author:
Zhandong Liu

Deep Density Destructors

DOI:
Publication date:
2018
Journal:
International Conference on Machine Learning
Impact factor:
0
Authors:
David I. Inouye; Pradeep Ravikumar
Corresponding author:
Pradeep Ravikumar

Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images

DOI:
Publication date:
2008
Journal:
Neural Information Processing Systems
Impact factor:
0
Authors:
Pradeep Ravikumar; Vincent Q. Vu; Bin Yu; Thomas Naselaris; Kendrick Norris Kay; J. Gallant
Corresponding author:
J. Gallant