RI: Small: Taming Massive Pre-trained Models under Label Scarcity via an Optimization Lens

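Basic information

Award number:
2226152
Principal investigator:
Tuo Zhao
Amount:
approximately $539,900
Awardee institution:
Georgia Tech Research Corporation
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Period:
2022-09-01 to 2025-08-31
Status:
Ongoing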

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2226152&HistoricalAwards=false
Keywords:
RI Small Taming Massive Pre

Project abstract

Deep transfer learning (DTL) has made significant progress in many real-world applications such as image and speech recognition. Training deep learning models in these applications often requires large amounts of labeled data (e.g., images with annotated objects). Labeling these data by hand, however, can be very expensive and time-consuming, which significantly limits the broader adoption of deep learning. The issue is more pronounced in certain domains (e.g., the biomedical domain), where labeled data are scarce. To address label scarcity, researchers have resorted to deep transfer learning, where a massive deep learning model is first pre-trained using only unlabeled data and then adapted to the downstream task of interest with only limited labeled data. Because of the gap between the enormous size of the pre-trained models and the limited labeled data, however, such an approach is prone to overfitting and fails to generalize well to unseen data, especially when the labels are noisy. Moreover, the enormous model sizes make practical deployment very difficult under constraints on storage/memory usage, inference latency, and energy consumption, especially on edge devices.

This project aims to develop an efficient computational framework that improves the generalization of deep transfer learning and reduces model sizes by leveraging cutting-edge optimization and machine learning techniques. Specifically, this project aims to develop: (I) new adversarial regularization methods, which can regularize the complexity of deep learning models and prevent overfitting to the training data, (II) new self-training methods that are robust to noisy labels in the training data, and (III) new optimization methods, which can improve the training of compact deep learning models in deep transfer learning. Moreover, we will develop new generalization and approximation theories for understanding the benefits of our proposed methods in transfer learning. The proposed research will also deliver open-source software in the form of easy-to-use libraries, which help researchers and practitioners apply DTL in related fields.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (4)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

DOI:
10.48550/arxiv.2306.11222
Published:
2023-06
Journal:
arXiv
Impact factor:
0
Authors:
Yixiao Li;Yifan Yu;Qingru Zhang;Chen Liang;Pengcheng He;Weizhu Chen;Tuo Zhao
Corresponding authors:
Yixiao Li;Yifan Yu;Qingru Zhang;Chen Liang;Pengcheng He;Weizhu Chen;Tuo Zhao

Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

DOI:
10.48550/arxiv.2310.10810
Published:
2023-10
Journal:
arXiv
Impact factor:
0
Authors:
Alexander W. Bukharin;Yan Li;Yue Yu;Qingru Zhang;Zhehui Chen;Simiao Zuo;Chao Zhang;Songan Zhang;Tuo Zhao
Corresponding authors:
Alexander W. Bukharin;Yan Li;Yue Yu;Qingru Zhang;Zhehui Chen;Simiao Zuo;Chao Zhang;Songan Zhang;Tuo Zhao

Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

DOI:
10.48550/arxiv.2303.10512
Published:
2023
Journal:
arXiv
Impact factor:
0
Authors:
Qingru Zhang;Minshuo Chen;Alexander Bukharin;Pengcheng He;Yu Cheng;Weizhu Chen;Tuo Zhao
Corresponding authors:
Qingru Zhang;Minshuo Chen;Alexander Bukharin;Pengcheng He;Yu Cheng;Weizhu Chen;Tuo Zhao

Machine Learning Force Fields with Data Cost Aware Training

DOI:
10.48550/arxiv.2306.03109
Published:
2023-06
Journal:
Impact factor:
0
Authors:
Alexander W. Bukharin;Tianyi Liu;Sheng Wang;Simiao Zuo;Weihao Gao;Wen Yan;Tuo Zhao
Corresponding authors:
Alexander W. Bukharin;Tianyi Liu;Sheng Wang;Simiao Zuo;Weihao Gao;Wen Yan;Tuo Zhao

Other publications by Tuo Zhao

Learning explainable task-relevant state representation for model-free deep reinforcement learning

DOI:
10.1016/j.neunet.2024.106741
Published:
2024-12-01
Journal:
Research article
Impact factor:
Authors:
Tingting Zhao;Guixi Li;Tuo Zhao;Yarui Chen;Ning Xie;Gang Niu;Masashi Sugiyama
Corresponding author:
Masashi Sugiyama

Effects of different mixed ratio of maize straw and cabbage wastes on silage quality

DOI:
10.1166/jbmb.2015.1494
Published:
2015
Journal:
Journal of Biobased Materials and Bioenergy, 2015, 9(1): 88-94
Impact factor:
Authors:
Haiwei Ren;Na Xu;Jinping Li;Zhizhong Li;Tuo Zhao;Fangxia Pei;Xingquan Yao;Yongming Sun
Corresponding author:
Yongming Sun

Ensemble Acoustic Modeling for CD-DNN-HMM Using Random Forests of Phonetic Decision Trees

DOI:
10.1007/s11265-015-1001-9
Published:
2015
Journal:
Journal of Signal Processing Systems
Impact factor:
0
Authors:
Tuo Zhao;Yunxin Zhao;Xin Chen
Corresponding author:
Xin Chen

TDOA Estimation of Speech Source in Noisy Reverberant Environments

DOI:
10.1109/slt54892.2023.10023256
Published:
2023
Journal:
2022 IEEE Spoken Language Technology Workshop (SLT)
Impact factor:
0
Authors:
Suliang Bu;Tuo Zhao;Yunxin Zhao
Corresponding author:
Yunxin Zhao

Methodology for optimal parametrization of the Polymer Membrane Fuel Cell based on Elman Neural Network method and Quantum Water Strider Algorithm

DOI:
10.1016/j.egyr.2021.04.058
Published:
2021-11-01
Journal:
Research article
Impact factor:
Authors:
Kai Sun;Tuo Zhao;Zhaolin Li;Lu Wang;Ruichen Wang;Xijie Chen;Qing Yang;Ehsan Ramezani
Corresponding author:
Ehsan Ramezani