Expressive data augmentation in deep learning
Basic information
- Grant number: RGPIN-2022-04651
- Principal investigator: Summers, Cecilia
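- Amount: $18,200
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract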
Deep learning is a subfield of machine learning, itself a type of artificial intelligence, whose goal is to automatically learn how to solve problems using data. For example, a typical deep learning task is "image classification": learning to sort images into categories (e.g. "cat", "dog", "human") given a dataset of images labeled with their categories. For a human this task is easy, but since a computer can only "see" an image as a collection of ones and zeros, it is hard to encode a solution as an algorithm, that is, a set of mechanical instructions that a computer can follow. In recent years, deep learning has enabled new applications where designing such algorithms is difficult, making advances on tasks involving images, audio, and language, among a variety of others.
One key limitation of deep learning is that it typically requires a large amount of data to work well. In image classification, for example, several thousand to several million labeled images are required for reasonable performance, a prohibitive cost for most applications. To help compensate, it is common to artificially generate new data from existing data, a process known as "data augmentation". A basic example is to randomly make slight alterations to an image's brightness while keeping its label: an image of a cat is still an image of a cat, even if the brightness changes by a small amount. Data augmentation expands the effective size of the dataset used to learn algorithms without requiring the costly collection of new data.
Despite its considerable utility, data augmentation poses a number of challenges, which my research intends to solve. When applying it to a new problem, for example, one needs to define its basic operations (e.g. the random brightness change itself) and decide on their precise strengths, which may be costly.
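The brightness example in the abstract can be illustrated with a minimal sketch. It is not taken from the project itself: the function names, the `max_delta` strength parameter, its default value, and the toy 2x2 grayscale images are all assumptions made for illustration, and only the Python standard library is used.

```python
import random

def random_brightness(image, max_delta=0.1, rng=random):
    """Return a copy of `image` with one random brightness shift applied.

    `image` is a list of rows of pixel intensities in [0.0, 1.0].
    `max_delta` is the augmentation strength (a hypothetical default).
    """
    delta = rng.uniform(-max_delta, max_delta)
    # Shift every pixel by the same amount, then clamp to the valid range.
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]

def augment_dataset(dataset, copies=3, max_delta=0.1, seed=0):
    """Expand (image, label) pairs with brightness-shifted copies.

    Labels are carried over unchanged: a slightly brighter cat is still a cat.
    """
    rng = random.Random(seed)
    augmented = list(dataset)
    for image, label in dataset:
        for _ in range(copies):
            augmented.append((random_brightness(image, max_delta, rng), label))
    return augmented

# Toy 2x2 grayscale "images" standing in for real data.
dataset = [
    ([[0.2, 0.4], [0.6, 0.8]], "cat"),
    ([[0.9, 0.1], [0.5, 0.3]], "dog"),
]
augmented = augment_dataset(dataset)
```

Here two labeled images become eight training examples without collecting any new data. Choosing `max_delta` by hand is exactly the kind of strength decision the abstract describes as potentially costly.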
My research will define augmentation operations automatically by learning operations that vary the precise appearance of images while preserving their desired labels. Then, to tune the strength of each operation, my research will investigate the learned behavior of algorithms trained without data augmentation: if an algorithm's output varies greatly with respect to a particular operation, then applying that operation strongly as data augmentation is likely to improve the algorithm's robustness to it. If successful, my research will allow for the automatic creation of expressive data augmentation policies, substantially reducing the amount of data required to unlock new applications of deep learning throughout both research and industry in Canada. One particularly exciting application is in medicine, since the data available for most medical tasks is limited. Ideally, this would make both improved and novel diagnostics possible, advancing Canadian medical research and eventually the health of the Canadian public as a whole.
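The strength-tuning idea, measuring how much a model's output changes under each operation, might be sketched roughly as below. This is an illustrative assumption, not the project's actual method: `mean_pixel_model` is a hypothetical stand-in for a model trained without augmentation, and the two operations, the `sensitivity` measure (mean absolute output change), and the toy images are invented for the example.

```python
def mean_pixel_model(image):
    """Hypothetical stand-in for a trained model: scores an image by mean intensity."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def brighten(image, amount=0.1):
    """Candidate operation: raise every pixel by `amount`, capped at 1.0."""
    return [[min(1.0, p + amount) for p in row] for row in image]

def flip_horizontal(image):
    """Candidate operation: mirror each row left to right."""
    return [list(reversed(row)) for row in image]

def sensitivity(model, operation, images):
    """Average absolute change in model output when `operation` is applied."""
    changes = [abs(model(operation(img)) - model(img)) for img in images]
    return sum(changes) / len(changes)

# Toy 2x2 grayscale "images" standing in for real data.
images = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.9, 0.1], [0.5, 0.3]],
]
operations = {"brighten": brighten, "flip_horizontal": flip_horizontal}
scores = {
    name: sensitivity(mean_pixel_model, op, images)
    for name, op in operations.items()
}
# Operations the model is most sensitive to come first; under the abstract's
# hypothesis these are the candidates for stronger augmentation.
ranking = sorted(scores, key=scores.get, reverse=True)
```

With this toy model, brightness changes shift the output while flips leave it essentially unchanged, so brightness would be the operation to apply more strongly. A real study would replace the stand-in with a trained network and a task-appropriate output distance.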
Project outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Summers, Cecilia
Expressive data augmentation in deep learning
- Grant number: DGECR-2022-00408
- Fiscal year: 2022
- Funding amount: $18,200
- Project category: Discovery Launch Supplement
Similar NSFC grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Collaborative Innovation Research Team
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Research Fund for International Young Scientists
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Grant number:
- Approval year: 2020
- Funding amount: CNY 400,000
- Project category:
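Efficient estimation and applications of semiparametric spatial autoregressive panel models
- Grant number: 71961011
- Approval year: 2019
- Funding amount: CNY 160,000
- Project category: Regional Science Fund Project
Estimation and applications of high-dimensional volatility matrices based on high-frequency information
- Grant number: 71901118
- Approval year: 2019
- Funding amount: CNY 180,000
- Project category: Young Scientists Fund
Statistical inference, forecasting, and applications of volatility for high-frequency data
- Grant number: 71971118
- Approval year: 2019
- Funding amount: CNY 500,000
- Project category: General Program
Individual-analysis-based projective nonlinear nonnegative tensor factorization for pattern analysis of high-dimensional unstructured data
- Grant number: 61502059
- Approval year: 2015
- Funding amount: CNY 190,000
- Project category: Young Scientists Fund
Key technologies for semantic interoperability of Web services based on Linked Open Data
- Grant number: 61373035
- Approval year: 2013
- Funding amount: CNY 770,000
- Project category: General Program
New methods for volume data representation and rendering
- Grant number: 61170206
- Approval year: 2011
- Funding amount: CNY 550,000
- Project category: General Program
A new class of regime-switching models and their applications in financial modeling
- Grant number: 11061041
- Approval year: 2010
- Funding amount: CNY 240,000
- Project category: Regional Science Fund Project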
Similar overseas grants
All for data, data for all: Improving accessibility of healthcare data through a co-designed augmentation of an existing online rehabilitation application.
- Grant number: 10054277
- Fiscal year: 2023
- Funding amount: $18,200
- Project category: Grant for R&D
Prime editing for Crumbs homologue 1 (CRB1) Inherited Retinal Dystrophies
- Grant number: 10636325
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Genome Editing Therapy for Usher Syndrome Type 3
- Grant number: 10759804
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Identifying mechanistic pathways underlying RPE pathogenesis in models of pattern dystrophy
- Grant number: 10636678
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Precision genome editing in vivo to treat retinal diseases
- Grant number: 10565189
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Novel Implementation of Microporous Annealed Particle HydroGel for Next-generation Posterior Pharyngeal Wall Augmentation
- Grant number: 10727361
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Biomechanical Treatment of CTS Via Carpal Arch Space Augmentation: A Pilot Clinical Trial
- Grant number: 10725257
- Fiscal year: 2023
- Funding amount: $18,200
- Project category:
Using data augmentation, active learning, and visual analytics for learning with limited examples on mobility data sets
- Grant number: DGECR-2022-00386
- Fiscal year: 2022
- Funding amount: $18,200
- Project category: Discovery Launch Supplement
Bayesian Data Augmentation for Recurrent Events in Electronic Medical Records of Patients with Cancer
- Grant number: 10436083
- Fiscal year: 2022
- Funding amount: $18,200
- Project category:
Bayesian Data Augmentation for Recurrent Events in Electronic Medical Records of Patients with Cancer
- Grant number: 10579304
- Fiscal year: 2022
- Funding amount: $18,200
- Project category: