Reducing Training Data in Deep Learning
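Basic information
- Grant number: RGPIN-2019-06222
- Principal investigator:
- Amount: $40,100
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2019
- Funding country: Canada
- Period: 2019-01-01 to 2020-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract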
Deep neural networks have been highly successful in supervised learning for a variety of AI-related applications, including automatic speech recognition, image classification and face recognition in computer vision, and natural language translation. However, their success relies on huge volumes of labeled training data, which are time-consuming and expensive to obtain. While data is abundant in today's digital world of the Web, mobile devices, and the Internet of Things, unsupervised learning (which does not need labels) has yet to live up to its promise.
In this research program we plan to study machine learning problems that call for human-like capabilities. These problems typically involve abundant unlabeled data and very few labeled data, and abstract concepts and knowledge are learned and accumulated across many tasks. Human learning is also active and interactive. Progress on these problems would lead to new theory and algorithms that not only significantly reduce the amount of labeled data needed in supervised learning, but also advance our understanding of machine learning in solving difficult real-world problems.
We propose a novel deep learning framework in which autoencoders and classifiers are coupled and optimized simultaneously to make maximal use of both unlabeled and labeled data. The autoencoder networks are trained on a large set of unlabeled data, but only need to recall enough detail to classify a small number of labeled examples. The proposed research unifies and integrates supervised and unsupervised learning, feature learning, representation learning, lifelong learning, and few-shot learning.
The research proposal consists of two long-term objectives and five short-term objectives, each with clear and feasible methodologies. These will provide ample opportunities for training PhD and MSc students. In total, the proposal will train 4 PhD students and 6 MSc students, as well as one postdoc, over the next 5 years of the proposed research.
As deep learning in AI is an extremely popular area that attracts both academia and industry, I expect that the highly qualified personnel (HQP) trained in this research will be in high demand and will make an impact in their future research careers in academia and industry.
We expect to make significant contributions not only to academic research in machine learning and deep learning, but also to various real-world applications. We expect that less than 10% of the training data (or the labeling cost) would be needed to train the deep neural networks, without much effect on predictive accuracy or computational cost. The savings would be very significant in any real-world application of deep learning.
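The coupling of autoencoder and classifier described in the abstract can be illustrated with a minimal sketch. It is not taken from the proposal: the tiny linear layers, the `tanh` encoder, the function names, and the `recon_weight` trade-off parameter are all assumptions made for illustration. The idea shown is only that a shared encoder feeds both a decoder (scored on unlabeled data) and a classifier (scored on a few labeled examples), and that the two losses are summed into one objective so they can be optimized together. The optimization step itself is omitted.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_params(n_in, n_hidden, n_classes, seed=0):
    """Hypothetical parameter layout: one shared encoder, one decoder, one classifier head."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "enc": mat(n_hidden, n_in),      # shared by both tasks
        "dec": mat(n_in, n_hidden),      # autoencoder branch
        "clf": mat(n_classes, n_hidden)  # classifier branch
    }


def encode(params, x):
    """Shared representation used by both the decoder and the classifier."""
    return [math.tanh(v) for v in matvec(params["enc"], x)]


def reconstruction_loss(params, x):
    """Mean squared error between x and its reconstruction (needs no label)."""
    x_hat = matvec(params["dec"], encode(params, x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def classification_loss(params, x, label):
    """Cross-entropy of the classifier head on one labeled example."""
    probs = softmax(matvec(params["clf"], encode(params, x)))
    return -math.log(max(probs[label], 1e-12))


def joint_loss(params, unlabeled, labeled, recon_weight=1.0):
    """Single objective: classification on labeled data plus weighted reconstruction on unlabeled data."""
    recon = sum(reconstruction_loss(params, x) for x in unlabeled) / len(unlabeled)
    clf = sum(classification_loss(params, x, y) for x, y in labeled) / len(labeled)
    return clf + recon_weight * recon


# Toy data: three unlabeled inputs, two labeled ones.
params = init_params(n_in=4, n_hidden=3, n_classes=2)
unlabeled = [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6], [0.5, 0.5, 0.5, 0.5]]
labeled = [([0.1, 0.2, 0.3, 0.4], 0), ([0.9, 0.8, 0.7, 0.6], 1)]
print(joint_loss(params, unlabeled, labeled, recon_weight=0.5))
```

Because the encoder weights appear in both terms, minimizing `joint_loss` with any gradient-based optimizer would update the shared representation from unlabeled and labeled data at the same time. Setting `recon_weight` to zero reduces the objective to plain supervised training.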
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Ling, Charles
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2022
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2021
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPAS-2019-00084
- Fiscal year: 2020
- Funding amount: $40,100
- Project category: Discovery Grants Program - Accelerator Supplements
Reducing Training Data in Deep Learning
- Grant number: RGPIN-2019-06222
- Fiscal year: 2020
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Reducing Training Data in Deep Learning
- Grant number: RGPAS-2019-00084
- Fiscal year: 2019
- Funding amount: $40,100
- Project category: Discovery Grants Program - Accelerator Supplements
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2018
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2017
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Improving Highstreet's Assets Model with Advanced Machine Learning
- Grant number: 501559-2016
- Fiscal year: 2016
- Funding amount: $40,100
- Project category: Engage Plus Grants Program
Improving Information Retrieval with Machine Learning
- Grant number: 46392-2012
- Fiscal year: 2015
- Funding amount: $40,100
- Project category: Discovery Grants Program - Individual
Improving customer service with predictive models for IBM London Software Lab
- Grant number: 491890-2015
- Fiscal year: 2015
- Funding amount: $40,100
- Project category: Engage Grants Program
Similar overseas grants
CAREER: Mitigating the Lack of Labeled Training Data in Machine Learning Based on Multi-level Optimization
- Grant number: 2339216
- Fiscal year: 2024
- Funding amount: $40,100
- Project category: Continuing Grant
ZooCELL: Tracing the evolution of sensory cell types in animal diversity: multidisciplinary training in 3D cellular reconstruction, multimodal data ..
- Grant number: EP/Y037049/1
- Fiscal year: 2024
- Funding amount: $40,100
- Project category: Research Grant
Tracing the evolution of sensory cell types in animal diversity: multidisciplinary training in 3D cellular reconstruction, multimodal data analysis
- Grant number: EP/Y037081/1
- Fiscal year: 2024
- Funding amount: $40,100
- Project category: Research Grant
Measurement and analysis of radiotherapy small field dosimetry data to support the development of a simulation training product for clinical Radiotherapy Physicists.
- Grant number: 10089179
- Fiscal year: 2024
- Funding amount: $40,100
- Project category: Collaborative R&D
Generative Visual Pre-training on Unlabelled Big Data
- Grant number: DP240101848
- Fiscal year: 2024
- Funding amount: $40,100
- Project category: Discovery Projects
Collaborative Research: Implementation: Medium: Secure, Resilient Cyber-Physical Energy System Workforce Pathways via Data-Centric, Hardware-in-the-Loop Training
- Grant number: 2320972
- Fiscal year: 2023
- Funding amount: $40,100
- Project category: Standard Grant
Collaborative Research: Implementation: Medium: Secure, Resilient Cyber-Physical Energy System Workforce Pathways via Data-Centric, Hardware-in-the-Loop Training
- Grant number: 2320975
- Fiscal year: 2023
- Funding amount: $40,100
- Project category: Standard Grant
MCA Pilot PUI: Data Intensive Research Training (DIRT) in forecasting soil respiration at core terrestrial NEON sites
- Grant number: 2321958
- Fiscal year: 2023
- Funding amount: $40,100
- Project category: Standard Grant
NRT-HDR: Integrative Training in Data Science-Enabled Sensing of the Environment for Climate Adaptation (DataSENSE)
- Grant number: 2244403
- Fiscal year: 2023
- Funding amount: $40,100
- Project category: Continuing Grant
Network Connector: DEDICATE: Data Science Equity-Driven Inquiry to Create Accessible Project-based Training for Social Impact Education
- Grant number: 2304100
- Fiscal year: 2023
- Funding amount: $40,100
- Project category: Continuing Grant