Research on Video Semantic Representation Methods Based on Multimodal Fusion Mechanisms - 猫眼课题宝


Research on Video Semantic Representation Methods Based on Multimodal Fusion Mechanisms

Final report

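Approval No.: 61702313
Project category: Young Scientists Fund
Funding amount: CNY 250,000 (25.0万元)
Principal investigator: 侯素娟 (Sujuan Hou)
Host institution: 山东师范大学 (Shandong Normal University)
Discipline: F0210. Computer image and video processing and multimedia technology
Completion year: 2020
Approval year: 2017
Project status: Completed
Participants: 秦茂玲, 梁成, 赵艳娜, 李卓然, 姜岩芸, 赵彦会
Keywords: feature extraction; multimodal fusion; video semantics; video representation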


Abstract

With the development of Internet applications and storage technology, video data is growing explosively. Video carries a large volume of information, is highly redundant, has a low level of abstraction, and is unstructured. On the one hand, a video contains several kinds of media data, including images, audio, and text, which are temporally correlated with one another; on the other hand, its content has a strong logical structure. These properties pose major challenges for the intelligent analysis of video data. Building on multimedia analysis technology and taking video as its object of study, this project combines deep learning, image processing, pattern recognition, and topic models, and plans to pursue three lines of research: (1) study mechanisms for automatically learning and extracting visual features from video, and construct an adaptive deep model that describes video at the visual level; (2) study effective mechanisms for fusing the multimodal information in video data, with the aim of finding invariant relationships in multi-source heterogeneous data; (3) study video semantic representation that incorporates domain knowledge, building semantic representation models on top of multimodal fusion that are tailored to the characteristics of each video category and to different application scenarios. Completing the project will not only broaden the application areas of topic models but also offer new perspectives and theoretical advances for the intelligent analysis of video in various domains.

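Under this project, the research group focused on video representation, taking advertising videos as the representative case, and on logo object detection. On the one hand, in constructing the video representation, the group considered not only the visual and audio features of the video but also its domain characteristics, and went on to build a video representation descriptor that carries high-level semantic properties. On the other hand, the group studied short videos, again represented by advertising videos, in greater depth, specifically by mining the logo and brand information they contain. At the current stage, it has built a large-scale logo dataset and implemented logo object detection in images. During the project the group published a number of academic papers, including 6 SCI-indexed papers and 1 CCF Class A paper, and applied for 3 invention patents, 1 of which has been granted. It also helped train 5 master's students.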

Journal papers


Solving Jigsaw Puzzles via Nonconvex Quadratic Programming With the Projected Power Method

DOI: 10.1109/tmm.2020.3009501
Published: 2021
Journal: IEEE Transactions on Multimedia
Impact factor: 7.3
Authors: Fang Yan; Yuanjie Zheng; Jinyu Cong; Liu Liu-Liu; D. Tao; Sujuan Hou
Corresponding author: Fang Yan; Yuanjie Zheng; Jinyu Cong; Liu Liu-Liu; D. Tao; Sujuan Hou

Classifying advertising video by topicalizing high-level semantic concepts

DOI: 10.1007/s11042-018-5801-3
Published: 2018-10-01
Journal: Multimedia Tools and Applications
Impact factor: 3.6
Authors: Hou, Sujuan; Zhou, Shangbo; Zheng, Yuanjie
Corresponding author: Zheng, Yuanjie

Kirsch Direction Template Despeckling Algorithm of High-Resolution SAR Images-Based on Structural Information Detection

DOI: 10.1109/lgrs.2020.2966369
Published: 2021-01
Journal: IEEE Geoscience and Remote Sensing Letters
Impact factor: 4.8
Authors: Sujuan Hou; Zengguo Sun; Liu Yang; Yunjing Song
Corresponding author: Yunjing Song

A Generative Model for OCT Retinal Layer Segmentation by Groupwise Curve Alignment

DOI: 10.1109/access.2018.2825397
Published: 2018
Journal: IEEE Access
Impact factor: 3.9
Authors: Duan Wenjun; Zheng Yuanjie; Ding Yanhui; Hou Sujuan; Tang Yufang; Xu Yan; Qin Maoling; Wu Jianfeng; Shen Dinggang; Bi Hongsheng
Corresponding author: Bi Hongsheng

MAMDA: Inferring microRNA-Disease Associations with Manifold Alignment

DOI: 10.1016/j.compbiomed.2019.05.014
Published: 2019
Journal: Computers in Biology and Medicine
Impact factor: 7.7
Authors: Yan Fang; Zheng Yuanjie; Jia Weikuan; Hou Sujuan; Xiao Rui
Corresponding author: Xiao Rui

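Research on Key Technologies for Logo Recognition and Detection in Open Environments

Approval No.: 62372278
Project category: General Program
Funding amount: CNY 500,000 (50万元)
Approval year: 2023
Principal investigator: 侯素娟 (Sujuan Hou)
Host institution: 山东师范大学 (Shandong Normal University)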

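Research on Logo Detection Technology for Internet Short-Video Data

Approval No.: --
Project category: General Program
Funding amount: CNY 570,000 (57万元)
Approval year: 2020
Principal investigator: 侯素娟 (Sujuan Hou)
Host institution: 山东师范大学 (Shandong Normal University)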
