

Application of Knowledge-Integrated Neural Networks in Multimodal Data Learning

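Final Report

Approval number:

61972361

Project category:

General Program

Funding amount:

CNY 590,000

Principal investigator:

Zhang Jian

Host institution:

Zhejiang International Studies University

Discipline:

Computer image and video processing and multimedia technology

Completion year:

2023

Approval year:

2019

Project status:

Completed

Project participants:

Zhang Jian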

Keywords:

Feature extraction; network modularization; knowledge learning; semantic fusion; adaptive reinforcement learning


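Abstract (translated from the Chinese)

Learning semantic knowledge from multimodal data is a frontier topic in artificial intelligence. Addressing feature extraction and semantic fusion for multimodal data, this project studies an interpretable neural network that incorporates human knowledge. Hierarchical knowledge, including human common sense, is integrated into the process of multimodal feature extraction so that the network can effectively learn concepts and the relations among them. This endows the feature representations with strong high-level semantic information and makes semantic interaction with the network's mid-level representations interpretable and controllable. On this basis, the project studies cross-modal knowledge transfer, task-driven multimodal semantic fusion, and modularization of the network system, in order to build networks flexibly and at low cost. The significance of this research is that it offers a method of network construction and training that differs from the end-to-end paradigm; it is expected to greatly improve the precision of knowledge extraction, reduce network size, and lower the cost of training and knowledge sharing, and it represents a new line of thinking in neural network research. The proposed methods are tested on multimodal disease diagnosis and on multimodal action recognition and retrieval. New techniques proposed in the course of the research, such as concept convolution, nonlinear attention mechanisms, cross-modal knowledge distillation, and adaptive reinforcement learning, are also expected to advance the development of deep learning.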

English Abstract

Learning semantic knowledge from multimodal data is one of the frontier topics in artificial intelligence. Targeting multimodal feature extraction and semantic fusion, this project proposes a knowledge-integrated, interpretable neural network model that fuses hierarchical knowledge, including human common sense, into the feature extraction procedure, so that the network can learn concepts as well as the interconnections among them and encode this strong semantic information into the feature representations. The model makes semantic interaction with the network’s mid-level representations interpretable and controllable. Building on this model, we study cross-modal knowledge transfer, task-driven multimodal semantic fusion, and network modularization to enable low-cost, flexible construction of neural networks. The significance of this research is that it provides an approach to network construction and training that differs from the end-to-end paradigm; the approach can improve the quality of feature extraction, reduce network scale, and lower the cost of training and knowledge sharing. The proposed model represents a new line of thinking in neural network research. We validate the model on multimodal disease diagnosis and on multimodal action recognition and retrieval. The new techniques proposed during the research, e.g., concept convolution, a nonlinear attention mechanism, cross-modal knowledge distillation, and adaptive reinforcement learning, are expected to contribute to the development of deep learning.

Journal Papers

Multi-features guided robust visual tracking

DOI: 10.1007/s11042-020-08791-z

Published: 2020-05

Journal:

Multimedia Tools and Applications

Impact factor: 3.6

Authors:

Liang Yun; Zhang Jian; Wang Mei-hua; Lin Chen; Xiao Jun

Corresponding author: Xiao Jun

SPRNet: Single-Pixel Reconstruction for One-Stage Instance Segmentation

DOI: 10.1109/tcyb.2020.2969046

Published: 2019-04

Journal:

IEEE Transactions on Cybernetics

Impact factor: 11.8

Authors:

Yu Jun; Yao Jinghan; Zhang Jian; Yu Zhou; Tao Dacheng

Corresponding author: Tao Dacheng

Vector of Locally and Adaptively Aggregated Descriptors for Image Feature Representation

DOI: 10.1016/j.patcog.2021.107952

Published: 2021-03

Journal:

Pattern Recognition

Impact factor: --

Authors:

Jian Zhang; Yunyin Cao; Qun Wu

Corresponding authors: Jian Zhang; Yunyin Cao; Qun Wu

Position constrained network for 3D human pose estimation

DOI: 10.1007/s00530-021-00880-9

Published: 2022-02

Journal:

Multimedia Systems

Impact factor: 3.9

Authors:

Xie Dong; Jun Yu; Jian Zhang

Corresponding authors: Xie Dong; Jun Yu; Jian Zhang

Graph and dynamics interpretation in robotic reinforcement learning task

DOI: 10.1016/j.ins.2022.08.041

Published: 2022-08

Journal:

Information Sciences

Impact factor: --

Authors:

Zonggui Yao; Jun Yu; Jian Zhang; W. He

Corresponding authors: Zonggui Yao; Jun Yu; Jian Zhang; W. He

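Performance-Driven Facial Animation Generation Integrating Spatio-Temporal Constraints and Prior Knowledge

Approval number:
61303143
Project category:
Young Scientists Fund
Funding amount:
CNY 260,000
Approval year:
2013
Principal investigator:
Zhang Jian
Host institution:
Zhejiang International Studies University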
