Hyper-fast hyper-parameter tuning for the next generation of machine learning
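Basic information
- Grant number: RGPIN-2022-03669
- Principal investigator:
- Amount: $40,100
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract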
Machine learning (ML) uses large amounts of data to set the parameters of a model, and is having success in a growing number of applications, from speech recognition to computer vision to language translation. But ML has a problem with "hyper-parameters", the variables affecting the learning algorithm and the structure of the model. We typically still need to spend enormous amounts of time tuning hyper-parameters in order to obtain good performance. It was recently highlighted that tuning current ML language models has a carbon output comparable to that of five cars over their lifetimes. This situation will only get worse, as the next generation of models will have far more hyper-parameters. We will not be able to solve many important problems until we address this issue.
The long-term goal of this research program is to develop algorithms that can train ML models on enormous datasets in a short amount of time. During the last 6 years we have focused on developing algorithms that converge faster, but we now need to turn to the pressing issue of dealing with the hyper-parameters. The short-term goal of this project is to address the issues associated with the two typical sources of hyper-parameters: (A) the algorithm used to do the learning typically has hyper-parameters, such as the learning rate; (B) the model being used typically has hyper-parameters, such as the depth of a deep learning model.
We have already made progress on (A). In 2019 we gave the first method that automatically tunes one of the most important hyper-parameters, the learning rate, during training. This method is guaranteed to perform at least as well as the best fixed learning rate for modern "over-parameterized" models. We plan to develop algorithms that are insensitive to other learning hyper-parameters and that do not require the over-parameterized assumption.
My lab is uniquely positioned to address problem (B). Current strategies for tuning the parameters of network architectures tend to use discrete parameterizations of the hyper-parameters. In work from 10 years ago, I showed a variety of ways to use continuous parameterizations to yield high-quality approximate solutions to learning problems that involve searching over graph and hyper-graph structures. These continuous relaxations gave enormous speedups over previous approaches based on discrete parameterizations, and we will develop new methods like these to address tuning model hyper-parameters.
The proposed research has huge potential impact, with applications ranging from medicine to scientific discovery to self-driving cars. In ML we want to build algorithms and models that work across many applications (we want to "build a better hammer" that can be used for many tasks). Thus, breakthroughs on the algorithms underlying ML models immediately impact many applications that use ML (or will use it in the future).
Project outputs
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Schmidt, Mark
Experimental quantification of the effect of Mg on calcite-aqueous fluid oxygen isotope fractionation
- DOI: 10.1016/j.chemgeo.2012.03.027
- Published: 2012-06-05
- Journal:
- Impact factor: 3.9
- Authors: Mavromatis, Vasileios; Schmidt, Mark; Oelkers, Eric H.
- Corresponding author: Oelkers, Eric H.
Convex Optimization for Big Data
- DOI: 10.1109/msp.2014.2329397
- Published: 2014-09-01
- Journal:
- Impact factor: 14.9
- Authors: Cevher, Volkan; Becker, Stephen; Schmidt, Mark
- Corresponding author: Schmidt, Mark
A Portable and Autonomous Mass Spectrometric System for On-Site Environmental Gas Analysis
- DOI: 10.1021/acs.est.6b03669
- Published: 2016-12-20
- Journal:
- Impact factor: 11.4
- Authors: Brennwald, Matthias S.; Schmidt, Mark; Kipfer, Rolf
- Corresponding author: Kipfer, Rolf
Dimensions in major depressive disorder and their relevance for treatment outcome.
- DOI: 10.1016/j.jad.2013.10.020
- Published: 2014-02
- Journal:
- Impact factor: 6.6
- Authors: Vrieze, Elske; Demyttenaere, Koen; Bruffaerts, Ronny; Hermans, Dirk; Pizzagalli, Diego A.; Sienaert, Pascal; Hompes, Titia; de Boer, Peter; Schmidt, Mark; Claes, Stephan
- Corresponding author: Claes, Stephan
Fe-Si-oxyhydroxide deposits at a slow-spreading centre with thickened oceanic crust: The Lilliput hydrothermal field (9°33′S, Mid-Atlantic Ridge)
- DOI: 10.1016/j.chemgeo.2010.09.012
- Published: 2010-11-15
- Journal:
- Impact factor: 3.9
- Authors: Dekov, Vesselin M.; Petersen, Sven; Schmidt, Mark
- Corresponding author: Schmidt, Mark
Other grants held by Schmidt, Mark
Large-Scale Machine Learning
- Grant number: CRC-2019-00358
- Fiscal year: 2022
- Amount: $40,100
- Project category: Canada Research Chairs
Tractable Big Data and Big Models in Machine Learning
- Grant number: RGPIN-2015-06068
- Fiscal year: 2021
- Amount: $40,100
- Project category: Discovery Grants Program - Individual
Large-Scale Machine Learning
- Grant number: CRC-2019-00358
- Fiscal year: 2021
- Amount: $40,100
- Project category: Canada Research Chairs
Tractable Big Data and Big Models in Machine Learning
- Grant number: RGPIN-2015-06068
- Fiscal year: 2020
- Amount: $40,100
- Project category: Discovery Grants Program - Individual
Large-Scale Machine Learning
- Grant number: CRC-2019-00358
- Fiscal year: 2020
- Amount: $40,100
- Project category: Canada Research Chairs
Tractable Big Data and Big Models in Machine Learning
- Grant number: RGPIN-2015-06068
- Fiscal year: 2019
- Amount: $40,100
- Project category: Discovery Grants Program - Individual
Tractable Big Data and Big Models in Machine Learning
- Grant number: RGPIN-2015-06068
- Fiscal year: 2018
- Amount: $40,100
- Project category: Discovery Grants Program - Individual
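Similar National Natural Science Foundation of China (NSFC) grants
Study of multi-band pulsar radiation mechanisms based on FAST searches and observations
- Grant number: 12403046
- Approval year: 2024
- Amount: CNY 0
- Project category: Young Scientists Fund
Pipeline development for processing FAST continuous-observation data
- Grant number:
- Approval year: 2024
- Amount: CNY 0
- Project category: Provincial or municipal project
Research on neural-network-based fusion measurement algorithms for the FAST feed
- Grant number: 12363010
- Approval year: 2023
- Amount: CNY 310,000
- Project category: Regional Science Fund
A survey of extragalactic neutral hydrogen absorption lines using FAST
- Grant number: 12373011
- Approval year: 2023
- Amount: CNY 520,000
- Project category: General Program
Deep learning methods for radio pulsar searching and candidate identification based on FAST
- Grant number: 12373107
- Approval year: 2023
- Amount: CNY 540,000
- Project category: General Program
Statistical and evolutionary study of repeating fast radio bursts based on FAST observations
- Grant number: 12303042
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Studying the neutral hydrogen mass function of galaxies using broadband spectral-line data from the FAST drift-scan, multi-science-goal simultaneous sky survey
- Grant number: 12373012
- Approval year: 2023
- Amount: CNY 520,000
- Project category: General Program
Deep pulsar searches and studies based on the FAST telescope and supercomputing
- Grant number: 12373109
- Approval year: 2023
- Amount: CNY 550,000
- Project category: General Program
Systematic search for and study of dark galaxies based on high-sensitivity, high-spectral-resolution FAST neutral hydrogen data
- Grant number: 12373001
- Approval year: 2023
- Amount: CNY 520,000
- Project category: General Program
Nanohertz gravitational wave research based on FAST
- Grant number: LY23A030001
- Approval year: 2023
- Amount: CNY 0
- Project category: Provincial or municipal project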
Similar overseas grants
Mem-Fast Membranes as Enablers for Future Biorefineries: from Fabrication to Advanced Separation Technologies
- Grant number: EP/Y032004/1
- Fiscal year: 2024
- Amount: $40,100
- Project category: Research Grant
Model order reduction for fast phase-field fracture simulations
- Grant number: EP/Y002474/1
- Fiscal year: 2024
- Amount: $40,100
- Project category: Research Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
- Grant number: 2338816
- Fiscal year: 2024
- Amount: $40,100
- Project category: Continuing Grant
CRII: RI: Deep neural network pruning for fast and reliable visual detection in self-driving vehicles
- Grant number: 2412285
- Fiscal year: 2024
- Amount: $40,100
- Project category: Standard Grant
Accelerated discovery of ultra-fast ionic conductors with machine learning
- Grant number: 24K08582
- Fiscal year: 2024
- Amount: $40,100
- Project category: Grant-in-Aid for Scientific Research (C)
SBIR Phase I: An Interplanetary Smallsat for Fast Connectivity, Navigation, and Positioning
- Grant number: 2322390
- Fiscal year: 2024
- Amount: $40,100
- Project category: Standard Grant
FAST CAR-T: Faster, Adaptive and Scalable Technologies For CAR-T Manufacture
- Grant number: EP/Z532770/1
- Fiscal year: 2024
- Amount: $40,100
- Project category: Research Grant
Luminescent Organometallic Complexes with Fast Radiative Rates
- Grant number: 2348784
- Fiscal year: 2024
- Amount: $40,100
- Project category: Continuing Grant
CAREER: Understanding Radiation Belt Electron Fast, Deep Injections in the Inner Magnetosphere
- Grant number: 2338125
- Fiscal year: 2024
- Amount: $40,100
- Project category: Continuing Grant
CAREER: Fast coherent and incoherent control of atomic ions in scalable platforms
- Grant number: 2338897
- Fiscal year: 2024
- Amount: $40,100
- Project category: Continuing Grant