CAREER: Variational Inference for Resource-Efficient Learning

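Basic information

Award number:
2047418
Principal investigator:
Stephan Mandt
Amount:
$446,500
Awardee institution:
University of California-Irvine
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-09-01 to 2026-08-31
Project status:
Ongoing (not yet closed)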

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2047418&HistoricalAwards=false
Keywords:
CAREER Variational Inference Resource Efficient

Project abstract

The power of Deep Learning (DL) comes with enormous energy and storage costs due to massive data needs and parameter-rich models. For example, recent models for natural language generation contain more than a hundred billion parameters and require huge amounts of training data. Training such a model can entail nearly five times the lifetime carbon dioxide emissions of the average American car. This project develops a holistic approach to resource-efficient DL based on a common set of methodologies: DL models and algorithms are viewed through the lens of information theory, making it possible to formally quantify and minimize the required resources. Outcomes of this project include new methods for the compression of both models (neural networks) and data (images and video), as well as new training algorithms for DL that reduce data requirements and improve runtime efficiency. These research activities will also inform summer teaching activities for under-represented students and lead to new open-source software for resource-efficient machine learning, as well as workshops and symposia on neural compression and statistical machine learning.

In more detail, the project approaches resource-efficient machine learning from the perspective of variational inference (VI) and contains three thrusts that focus on different inefficiencies: (A) bandwidth inefficiency: a model's inefficient representation of data or parameters, (B) data inefficiency: a model's extensive need for training data, and (C) runtime inefficiency: a learning or inference algorithm's inability to produce desired answers within a given computational time budget. Thrust A draws on the connection between VI and rate-distortion theory to derive new neural compression algorithms with improved compression performance. Thrust B designs informative priors for effective learning with limited data in non-stationary environments. Finally, Thrust C develops highly scalable training algorithms for Bayesian neural networks that hybridize Markov Chain Monte Carlo and VI, trading off precision for convergence speed. The project contains applications from the domains of image and video compression as well as climate science.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (22)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Supervised Compression for Resource-constrained Edge Computing Systems

DOI:
10.1109/wacv51458.2022.00100
Published:
2022
Journal:
IEEE Winter Conference on Applications of Computer Vision (IEEE WACV)
Impact factor:
0
Authors:
Matsubara, Y;Yang, R.;Mandt, S;Levorato, M.
Corresponding author:
Levorato, M.

Neural Transformation Learning for Deep Anomaly Detection Beyond Images

DOI:
Published:
2021-03
Journal:
ArXiv
Impact factor:
0
Authors:
Chen Qiu;Timo Pfrommer;M. Kloft;S. Mandt;Maja R. Rudolph
Corresponding authors:
Chen Qiu;Timo Pfrommer;M. Kloft;S. Mandt;Maja R. Rudolph

Towards Empirical Sandwich Bounds on the Rate-Distortion Function

DOI:
Published:
2021-11
Journal:
ArXiv
Impact factor:
0
Authors:
Yibo Yang;S. Mandt
Corresponding authors:
Yibo Yang;S. Mandt

Probabilistic Querying of Continuous-Time Event Sequences

DOI:
Published:
2023
Journal:
Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Boyd, Alex;Chang, Yuxin;Mandt, Stephan;Smyth, Padhraic
Corresponding author:
Smyth, Padhraic

Structured Stochastic Gradient MCMC

DOI:
Published:
2021-07
Journal:
ArXiv
Impact factor:
0
Authors:
Antonios Alexos;Alex Boyd;S. Mandt
Corresponding authors:
Antonios Alexos;Alex Boyd;S. Mandt

Other publications by Stephan Mandt

Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds

DOI:
Published:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Justus C. Will;A. M. Jenney;Kara D. Lamb;Michael S. Pritchard;Colleen Kaul;Po;Kyle Pressel;Jacob Shpund;M. Lier;Stephan Mandt
Corresponding author:
Stephan Mandt

Preserving Identity with Variational Score for General-purpose 3D Editing

DOI:
Published:
2024
Journal:
Impact factor:
0
Authors:
Duong H. Le;Tuan Pham;Aniruddha Kembhavi;Stephan Mandt;Wei;Jiasen Lu
Corresponding author:
Jiasen Lu

HANNA: hard-constraint neural network for consistent activity coefficient prediction

DOI:
10.1039/d4sc05115g
Published:
2024-11-05
Journal:
Chemical Science
Impact factor:
7.400
Authors:
Thomas Specht;Mayank Nagda;Sophie Fellenz;Stephan Mandt;Hans Hasse;Fabian Jirasek
Corresponding author:
Fabian Jirasek

Advancing thermodynamic group-contribution methods by machine learning: UNIFAC 2.0

DOI:
10.1016/j.cej.2024.158667
Published:
2025-01-15
Journal:
Chemical Engineering Journal
Impact factor:
13.200
Authors:
Nicolas Hayer;Thorsten Wendel;Stephan Mandt;Hans Hasse;Fabian Jirasek
Corresponding author:
Fabian Jirasek

JANET: Joint Adaptive predictioN-region Estimation for Time-series

DOI:
10.1007/s10994-025-06812-2
Published:
2025-06-23
Journal:
Machine Learning
Impact factor:
2.900
Authors:
Eshant English;Eliot Wong-Toi;Matteo Fontana;Stephan Mandt;Padhraic Smyth;Christoph Lippert
Corresponding author:
Christoph Lippert