Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
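Basic Information
- Grant number: RGPIN-2014-04402
- Principal investigator:
- Amount: $28,400
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2017
- Funding country: Canada
- Period: 2017-01-01 to 2018-12-31
- Status: Completed
- Source:
- Keywords:
Project Abstract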
The term ‘Big Data’ has recently emerged to characterize a wide variety of techniques and problems that involve the capture, management, processing, analysis and use of large quantities of data. The increase in our ability to capture, store and process data has reached a point where the impact of Big Data is now seen on the front pages of national newspapers. The applicant has extensive experience across a variety of prototypical Big Data problems, including a previous discovery grant on ‘large scale data mining’. Research will focus on developing broadly applicable techniques for visual data processing and analysis by examining a variety of problems involving visual data that have the potential for high impact. Key areas will include medical image analysis and object and activity recognition, with a focus on applications to video indexing, next-generation computer animation, intelligent transportation and robotics.

Research will focus on the following common needs, problems, challenges and research questions, identified through the applicant’s first-hand experience with previous Big Data research projects:
1) The need for acceleration techniques capable of processing large data sets, covering pre-processing, feature extraction, data modeling, optimization and analysis.
2) The need for principled theory and implementations of techniques that optimize over relevant measures of performance while also accounting for different pre-processing, model complexity, resource constraints and the amount of labelled vs. unlabelled data.
3) The challenges associated with how to most effectively exploit potentially enormous quantities of unlabelled data, as well as weakly, partially and/or noisily labelled data.
4) The problems associated with data collection and reducing human labelling effort.
5) The open question of how to transfer models or representations learned from data in one domain to another domain more effectively.
To achieve these goals, we will perform experiments using standard evaluation data sets, and we will further develop, curate and create a number of our own data sets covering the themes of faces, emotions, human activities, scene types, objects and medical imagery. In particular, to obtain large quantities of weakly or noisily labelled data and to perform experiments on methods for learning with noisy labels, we will further develop a large data set of video that has been annotated by Descriptive Video Services for the blind. We will also collect our own 3D and 4D object and activity recognition data sets centred on the themes of intelligent transportation, robotics and computer animation.

Recent research combining large data sets with highly accelerated optimization of deep neural network learning techniques has yielded impressive results on a wide variety of competitive problems. Here we will explore and compare the ways in which other novel deep architectures and other techniques can benefit from the combination of big data and algorithm acceleration. Accelerated algorithms will then be used to develop techniques for the complete optimization of pipelines, including both pre-processing steps and hyper-parameters.

We also wish to enhance the transferability of models and representations when they are applied to test data collected in related but different settings. We hypothesize that complete pipeline optimization may lead to more transferable methods if validation sets representative of the underlying types of domain transfer are used. For visual recognition, we also hypothesize that techniques explicitly accounting for the 4D nature of our world may yield improved transferability. The project will result in the training of highly qualified personnel in the high-demand area of Big Data.
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Pal, Christopher
Robust Motion In-betweening
- DOI: 10.1145/3386569.3392480
- Published: 2020-07-01
- Journal:
- Impact factor: 6.2
- Authors: Harvey, Felix G.; Yurick, Mike; Pal, Christopher
- Corresponding author: Pal, Christopher
A New Smooth Approximation to the Zero One Loss with a Probabilistic Interpretation
- DOI: 10.1145/3365672
- Published: 2020-02-01
- Journal:
- Impact factor: 3.6
- Authors: Hasan, Md Kamrul; Pal, Christopher
- Corresponding author: Pal, Christopher
The Liver Tumor Segmentation Benchmark (LiTS).
- DOI: 10.1016/j.media.2022.102680
- Published: 2023-02
- Journal:
- Impact factor: 10.9
- Authors: Bilic, Patrick; Christ, Patrick; Li, Hongwei Bran; Vorontsov, Eugene; Ben-Cohen, Avi; Kaissis, Georgios; Szeskin, Adi; Jacobs, Colin; Mamani, Gabriel Efrain Humpire; Chartrand, Gabriel; Lohoefer, Fabian; Holch, Julian Walter; Sommer, Wieland; Hofmann, Felix; Hostettler, Alexandre; Lev-Cohain, Naama; Drozdzal, Michal; Amitai, Michal Marianne; Vivanti, Refael; Sosna, Jacob; Ezhov, Ivan; Sekuboyina, Anjany; Navarro, Fernando; Kofler, Florian; Paetzold, Johannes C.; Shit, Suprosanna; Hu, Xiaobin; Lipkova, Jana; Rempfler, Markus; Piraud, Marie; Kirschke, Jan; Wiestler, Benedikt; Zhang, Zhiheng; Huelsemeyer, Christian; Beetz, Marcel; Ettlinger, Florian; Antonelli, Michela; Bae, Woong; Bellver, Miriam; Bi, Lei; Chen, Hao; Chlebus, Grzegorz; Dam, Erik B.; Dou, Qi; Fu, Chi-Wing; Georgescu, Bogdan; Giro-I-Nieto, Xavier; Gruen, Felix; Han, Xu; Heng, Pheng-Ann; Hesser, Jurgen; Moltz, Jan Hendrik; Igel, Christian; Isensee, Fabian; Jaeger, Paul; Jia, Fucang; Kaluva, Krishna Chaitanya; Khened, Mahendra; Kim, Ildoo; Kim, Jae-Hun; Kim, Sungwoong; Kohl, Simon; Konopczynski, Tomasz; Kori, Avinash; Krishnamurthi, Ganapathy; Li, Fan; Li, Hongchao; Li, Junbo; Li, Xiaomeng; Lowengrub, John; Ma, Jun; Maier-Hein, Klaus; Maninis, Kevis-Kokitsi; Meine, Hans; Merhof, Dorit; Pai, Akshay; Perslev, Mathias; Petersen, Jens; Pont-Tuset, Jordi; Qi, Jin; Qi, Xiaojuan; Rippel, Oliver; Roth, Karsten; Sarasua, Ignacio; Schenk, Andrea; Shen, Zengming; Torres, Jordi; Wachinger, Christian; Wang, Chunliang; Weninger, Leon; Wu, Jianrong; Xu, Daguang; Yang, Xiaoping; Yu, Simon Chun-Ho; Yuan, Yading; Yue, Miao; Zhang, Liping; Cardoso, Jorge; Bakas, Spyridon; Braren, Rickmer; Heinemann, Volker; Pal, Christopher; Tang, An; Kadoury, Samuel; Soler, Luc; van Ginneken, Bram; Greenspan, Hayit; Joskowicz, Leo; Menze, Bjoern
- Corresponding author: Menze, Bjoern
3D segmentation of abdominal CT imagery with graphical models, conditional random fields and learning
- DOI: 10.1007/s00138-013-0497-x
- Published: 2014-02-01
- Journal:
- Impact factor: 3.3
- Authors: Bhole, Chetan; Pal, Christopher; Wismueller, Axel
- Corresponding author: Wismueller, Axel
Other grants held by Pal, Christopher
From Perception and Learning to Understanding and Action
- Grant number: RGPIN-2020-06837
- Fiscal year: 2022
- Amount: $28,400
- Program: Discovery Grants Program - Individual
From Perception and Learning to Understanding and Action
- Grant number: RGPIN-2020-06837
- Fiscal year: 2021
- Amount: $28,400
- Program: Discovery Grants Program - Individual
NSERC industrial research chair (IRC) on deep AI for multimedia and assistive technology
- Grant number: 523846-2017
- Fiscal year: 2020
- Amount: $28,400
- Program: Industrial Research Chairs
From Perception and Learning to Understanding and Action
- Grant number: RGPIN-2020-06837
- Fiscal year: 2020
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
- Grant number: RGPIN-2014-04402
- Fiscal year: 2019
- Amount: $28,400
- Program: Discovery Grants Program - Individual
NSERC industrial research chair (IRC) on deep AI for multimedia and assistive technology
- Grant number: 523847-2017
- Fiscal year: 2018
- Amount: $28,400
- Program: Industrial Research Chairs
Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
- Grant number: RGPIN-2014-04402
- Fiscal year: 2018
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
- Grant number: RGPIN-2014-04402
- Fiscal year: 2016
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
- Grant number: RGPIN-2014-04402
- Fiscal year: 2015
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Big Data Processing and Analytics - Mining Noisy Visual Data and Learning Transferrable Predictive Models
- Grant number: RGPIN-2014-04402
- Fiscal year: 2014
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Similar National Natural Science Foundation of China (NSFC) Grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Year approved: 2024
- Amount:
- Program: Collaborative Innovation Research Team
Data-driven Recommendation System Construction of an Online Medical Platform Based on the Fusion of Information
- Grant number:
- Year approved: 2024
- Amount:
- Program: Research Fund for International Young Scientists
Development of a Linear Stochastic Model for Wind Field Reconstruction from Limited Measurement Data
- Grant number:
- Year approved: 2020
- Amount: CNY 400,000
- Program:
Key Technologies for Semantic Interoperability of Web Services Based on Linked Open Data
- Grant number: 61373035
- Year approved: 2013
- Amount: CNY 770,000
- Program: General Program
Molecular Interaction Reconstruction of Rheumatoid Arthritis Therapies Using Clinical Data
- Grant number: 31070748
- Year approved: 2010
- Amount: CNY 340,000
- Program: General Program
Functional Data Analysis Methods for High-Dimensional Data
- Grant number: 11001084
- Year approved: 2010
- Amount: CNY 160,000
- Program: Young Scientists Fund
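Role of datA, a Negative Regulator of Chromosome Replication, in the Cell Cycle
- Grant number: 31060015
- Year approved: 2010
- Amount: CNY 250,000
- Program: Regional Science Fund Project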
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Year approved: 2006
- Amount: CNY 170,000
- Program: Young Scientists Fund
Similar Overseas Grants
Construction of big data analysis platform for fish behavior in the sea by image processing, change detection, and machine learning techniques
- Grant number: 23K14005
- Fiscal year: 2023
- Amount: $28,400
- Program: Grant-in-Aid for Early-Career Scientists
Complex Big Data Processing Framework for Pervasive Traceability
- Grant number: 23H03399
- Fiscal year: 2023
- Amount: $28,400
- Program: Grant-in-Aid for Scientific Research (B)
Advanced Security and Privacy Techniques for Secure Big Data Query, Sharing and Processing
- Grant number: RGPIN-2022-03244
- Fiscal year: 2022
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Distributed Systems Support for Processing Big Data from Sensor Networks
- Grant number: RGPIN-2019-06776
- Fiscal year: 2022
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Mining Social Media Big Data for Toxicovigilance: Studying Substance Use via Natural Language Processing and Machine Learning Methods
- Grant number: 10588855
- Fiscal year: 2022
- Amount: $28,400
- Program:
Quality-Assured End-to-End Big Data Approximation Processing
- Grant number: 22H03594
- Fiscal year: 2022
- Amount: $28,400
- Program: Grant-in-Aid for Scientific Research (B)
Distributed Systems Support for Processing Big Data from Sensor Networks
- Grant number: RGPIN-2019-06776
- Fiscal year: 2021
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Robust Geometry Processing for Big Dirty Data
- Grant number: RGPIN-2017-05235
- Fiscal year: 2021
- Amount: $28,400
- Program: Discovery Grants Program - Individual
Big Data Processing with Compressed Secure Computation
- Grant number: 21H05052
- Fiscal year: 2021
- Amount: $28,400
- Program: Grant-in-Aid for Scientific Research (S)
Just-in-Time Compilation of Big Data Analytics for Graphics Processing Units
- Grant number: 534143-2019
- Fiscal year: 2021
- Amount: $28,400
- Program: Alexander Graham Bell Canada Graduate Scholarships - Doctoral