Research on Image Set Classification Methods for Video-Based Object Recognition - 猫眼课题宝



Research on Image Set Classification Methods for Video-Based Object Recognition

Final report

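Approval number: 61379083

Project category: General Program

Funding: CNY 760,000 (76.0 万元)

Principal investigator: 王瑞平 (Ruiping Wang)

Host institution: Institute of Computing Technology, Chinese Academy of Sciences

Discipline classification: F0210. Computer image and video processing and multimedia technology

Completion year: 2017

Approval year: 2013

Project status: Completed

Participants: 阚美娜, 黄智武, 李绍欣, 李岩, 张杰, 刘昕, 王雯, 刘梦怡, 严灿祥

Keywords: video object recognition; image set classification; manifold learning; metric learning; set statistics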


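Chinese abstract (translated)

Image set classification studies the modeling, representation, and classification of sets of images that cover complex appearance variations in large-scale video sequences. It is a research topic in computer vision with important theoretical value and broad application prospects. Building on manifold geometry and statistical learning theory, this project will analyze the statistical distribution of complex image set data patterns, study compact feature representations and modeling theory for set data, and establish a framework of distance metrics and classification learning methods on set models. The specific research content is as follows. To handle the complex nonlinear variations of set data, the project will study class-related multi-manifold collaborative representation learning and propose a multi-manifold discriminant classification method within a local probabilistic model framework. It will analyze the statistical properties of set data distributions, build set models represented by sample statistics, and propose a set distance metric learning method that fuses multi-order statistical features. It will also study robust statistical optimization of set distribution structure and construct a joint learning model of manifolds and statistics, addressing challenges such as data noise and sampling bias in order to achieve image set classification with high precision and high reliability. The project is expected to deliver theoretical innovation and technological breakthroughs and to promote the wide application of video recognition.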

English abstract

Image set classification studies how to model and classify sets of images, which usually come from large-scale video sequences and cover complex appearance variations. The problem is an active topic in the computer vision research community, with important theoretical value and broad application prospects. Building on manifold geometry and statistical learning theory, this project will analyze the statistical distribution of complex set data patterns, study compact feature representations and models of image sets, and establish a framework of distance metrics and classification learning methods on set models. Specifically, the project will address three research issues. First, to model the complex nonlinear variations of set data, we will study multi-manifold collaborative learning theory and propose a multi-manifold discriminant classification approach based on local probabilistic models for manifold representation. Second, we will analyze the statistical properties of set data distributions, build a set model based on sample statistics, and derive a set distance metric learning method that integrates multi-order statistical features. Third, we will study robust statistics and optimization theory for set data structure and construct a joint learning model that combines appearance manifold and statistical modeling. The joint model is intended to address the challenges of data noise and sampling bias and to achieve image set classification with both high precision and high reliability. The project targets practical applications such as video security surveillance, video retrieval and clustering, and web album management. It is expected to achieve both theoretical innovation and technological breakthroughs and to promote the wide application of video-based object recognition.

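Motivated by practical scenarios such as video security surveillance and film and television video retrieval, this project carried out in-depth research on image set classification, a key problem in video-based object recognition. It made important progress in statistical manifold modeling of image sets, Riemannian metric learning on sets, and video hashing. The main work is as follows:

(1) For video classification, a structured image set modeling method that fuses multi-order statistics was proposed. Gaussian mixture models describe the complex nonlinear variations of set data, a series of Riemannian kernel functions for Gaussians were defined and proved theoretically, and a discriminative learning framework was built on the Riemannian manifold of Gaussian distributions.

(2) For recognition across images and videos, several statistical manifold models of video data were explored, and a cross Euclidean-Riemannian heterogeneous metric learning method bridged by kernel space embedding was proposed. It addresses cross-modal matching between the image vector space and the video matrix manifold and achieves unified representation and classification of multi-form data in heterogeneous spaces.

(3) For video retrieval, a video manifold representation that combines inter-frame and intra-frame statistics was proposed, and a multi-functional hashing architecture that couples high-level categories with mid-level attributes was designed. It overcomes the efficiency bottleneck of frame-by-frame encoding in traditional methods and supports multi-level, multi-granularity retrieval across categories and attributes.

(4) To handle the complex appearance variations of real-world video, a deep set feature representation learning method guided by discriminative statistics was proposed, and a deep network framework that jointly optimizes image feature representation and set metric learning was established. It yields discriminative set representations tightly coupled with the classification task and achieves end-to-end, high-accuracy image set classification.

During the project, 34 papers were published or accepted in mainstream international journals and conferences (17 of them CCF-A). These comprise 12 journal papers (1 in IEEE Trans. on PAMI, 5 in IEEE Trans. on Image Processing, 1 in IEEE Trans. on CSVT) and 22 conference papers (8 at IEEE CVPR, 2 at ICCV, 1 at ICML). Together this work forms a fairly systematic "nonlinear metric learning" research paradigm, enriches the classical metric learning framework, and has drawn broad attention and follow-up work from international peers. The published papers have 749 Google Scholar citations, with a maximum of 104 for a single paper. Two national invention patent applications were filed. With this work as the algorithmic core, the team won the ACM ICMI 2014 EmotiW video-based expression recognition challenge and the IEEE FG 2015 PaSC video face recognition challenge. The results effectively supported technology development in the group's related industrialization projects and promoted the wide application of video recognition.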
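The report describes modeling an image set by multi-order statistics and comparing sets with a Riemannian metric. The following is a minimal sketch of that general idea only, not the project's actual method. It assumes each frame is already a feature vector, summarizes a set by its mean (first order) and a diagonal covariance (second order), and compares covariances with the log-Euclidean distance, which for diagonal matrices reduces to the Euclidean distance between log-variance vectors. The names `set_statistics` and `set_distance` and the parameters `alpha` and `eps` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def set_statistics(frames, eps=1e-6):
    """Summarize an image set (a list of equal-length feature vectors)
    by its per-dimension mean and variance.

    Assumption: the covariance is treated as diagonal, so correlations
    between feature dimensions are ignored.
    """
    dims = list(zip(*frames))  # one tuple of values per feature dimension
    means = [fmean(col) for col in dims]
    # eps keeps every variance strictly positive so its logarithm is defined
    variances = [pvariance(col) + eps for col in dims]
    return means, variances


def set_distance(set_a, set_b, alpha=0.5):
    """Weighted sum of a first-order and a second-order set distance.

    alpha is a hypothetical mixing weight, not a value from the report.
    """
    mean_a, var_a = set_statistics(set_a)
    mean_b, var_b = set_statistics(set_b)
    # First-order term: Euclidean distance between the set means
    d_mean = math.dist(mean_a, mean_b)
    # Second-order term: log-Euclidean distance between diagonal covariances
    log_a = [math.log(v) for v in var_a]
    log_b = [math.log(v) for v in var_b]
    d_cov = math.dist(log_a, log_b)
    return alpha * d_mean + (1 - alpha) * d_cov


if __name__ == "__main__":
    video_a = [[0.0, 1.0], [2.0, 1.0]]
    video_b = [[1.0, 1.0], [3.0, 1.0]]  # same spread as video_a, shifted mean
    print(set_distance(video_a, video_a))
    print(set_distance(video_a, video_b))
```

A full covariance matrix would need a matrix logarithm in place of the element-wise one, and a learned metric would replace the fixed weight `alpha`. Both are left out here to keep the sketch dependency-free.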

Journal papers


A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database
DOI: 10.1109/tip.2015.2493448
Published: 2015-10
Journal: IEEE Transactions on Image Processing
Impact factor: 10.6
Authors: Zhiwu Huang; S. Shan; Ruiping Wang; Haihong Zhang; S. Lao; Alifu Kuerban; Xilin Chen
Corresponding author: Zhiwu Huang; S. Shan; Ruiping Wang; Haihong Zhang; S. Lao; Alifu Kuerban; Xilin Chen

Video modeling and learning on Riemannian manifold for emotion recognition in the wild
DOI: 10.1007/s12193-015-0204-5
Published: 2015-11
Journal: Journal on Multimodal User Interfaces
Impact factor: 2.9
Authors: Mengyi Liu; Ruiping Wang; Shaoxin Li; Zhiwu Huang; S. Shan; Xilin Chen
Corresponding author: Mengyi Liu; Ruiping Wang; Shaoxin Li; Zhiwu Huang; S. Shan; Xilin Chen

Geometry-Aware Similarity Learning on SPD Manifolds for Visual Recognition
DOI: 10.1109/tcsvt.2017.2729660
Published: 2018-10-01
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Impact factor: 8.4
Authors: Huang, Zhiwu; Wang, Ruiping; Chen, Xilin
Corresponding author: Chen, Xilin

Face recognition on large-scale video in the wild with hybrid Euclidean-and-Riemannian metric learning
DOI: 10.1016/j.patcog.2015.03.011
Published: 2015-10
Journal: Pattern Recognit.
Impact factor: --
Authors: Zhiwu Huang; Ruiping Wang; S. Shan; Xilin Chen
Corresponding author: Zhiwu Huang; Ruiping Wang; S. Shan; Xilin Chen

Deep and Structured Robust Information Theoretic Learning for Image Analysis
DOI: 10.1109/tip.2016.2588330
Published: 2016-09-01
Journal: IEEE Transactions on Image Processing
Impact factor: 10.6
Authors: Deng, Yue; Bao, Feng; Dai, Qionghai
Corresponding author: Dai, Qionghai

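Research on Knowledge-Guided Cross-Modal Image-Text Generation Methods for Natural Scenes

Approval number: U21B2025
Project category: Joint Fund Project
Funding: CNY 2,550,000 (255万元)
Approval year: 2021
Principal investigator: 王瑞平 (Ruiping Wang)
Host institution: Institute of Computing Technology, Chinese Academy of Sciences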

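Nonlinear Modeling and Metric Learning for Visual Data

Approval number: 61922080
Project category: Excellent Young Scientists Fund
Funding: CNY 1,300,000 (130万元)
Approval year: 2019
Principal investigator: 王瑞平 (Ruiping Wang)
Host institution: Institute of Computing Technology, Chinese Academy of Sciences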

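Research on Large-Scale Object Recognition Methods in Open Scenes

Approval number: 61772500
Project category: General Program
Funding: CNY 620,000 (62.0万元)
Approval year: 2017
Principal investigator: 王瑞平 (Ruiping Wang)
Host institution: Institute of Computing Technology, Chinese Academy of Sciences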
