Learning-Based Wavelet Video Coding Using Deep Adaptive Lifting

Basic Information

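Grant number:
461649014
Principal investigator:
Professor Dr.-Ing. André Kaup
Amount:
--
Host institution:
Lehrstuhl für Multimediakommunikation und Signalverarbeitung
Host institution country:
Germany
Project type:
Research Grants
Fiscal year:
Funding country:
Germany
Project period:
Project status:
Ongoing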

Source:
https://gepris.dfg.de/gepris/projekt/461649014?language=en
Keywords:
Learning Based Wavelet Video Coding

Project Abstract

Learning-based methods from artificial intelligence have been applied successfully in many areas of image and video processing. In lossy image compression, they have also achieved significant gains in rate-distortion performance over classic image coders. Rate-distortion performance describes the maximum achievable compression for a given reproduction fidelity. Classic image and video coders are, moreover, based on the concept of variable rates, which allows a single codec to provide different bit rates depending on the desired reconstruction quality. An example use case is serving networks with varying channel capacities using the same codec. Current end-to-end trained learning-based methods are characterized by good signal adaptivity, which improves compression performance compared to classic approaches. A crucial disadvantage, however, is the limited understanding of how the neural networks function, because deep learning architectures are usually not designed systematically but manually, by trial and error. In addition, training is computationally expensive, since variable rates are usually obtained by training multiple models separately.
Therefore, this research proposal aims to develop a novel variable-rate learning-based video coder using motion-compensated wavelet lifting. Besides rate adaptivity, the coder achieves spatial and temporal scalability, resulting in a fully scalable bit stream. The method is based on the lifting structure, which allows any non-linear operation to be applied without harming the reconstruction property of the transform. Neural networks can therefore be implemented within the lifting structure, increasing the efficiency of the wavelet lifting. Learned wavelet coefficients are expected to achieve better signal adaptivity and data compaction. In contrast to end-to-end trained methods, a further advantage of the proposed approach is that the well-known architecture of the lifting structure makes the behavior of the neural networks easier to understand, so changes in the network architectures can be tracked and interpreted directly. Moreover, the lifting structure supports fully in-place calculation, which needs no auxiliary memory. Applying such a deep adaptive lifting structure to video compression has not been considered so far and is a promising new concept for learning-based video compression, combining variable rates and high interpretability in one model.
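
The reconstruction property mentioned in the abstract can be illustrated with a minimal sketch of a single lifting step (split, predict, update). This is not the project's coder. The names `lifting_forward`, `lifting_inverse`, `toy_predict` and `toy_update` are hypothetical, the two toy operators are arbitrary non-linear stand-ins for neural networks, and the sketch assumes an even-length integer signal so that the round trip is exact.

```python
from typing import Callable, List, Tuple

Signal = List[int]
Operator = Callable[[Signal], Signal]


def lifting_forward(x: Signal, predict: Operator, update: Operator) -> Tuple[Signal, Signal]:
    """One lifting step on an even-length signal: split, predict, update."""
    even, odd = x[0::2], x[1::2]
    # Predict step: the highpass band is the residual of predicting odd from even samples.
    high = [o - p for o, p in zip(odd, predict(even))]
    # Update step: the lowpass band is the even samples corrected by a function of the residual.
    low = [e + u for e, u in zip(even, update(high))]
    return low, high


def lifting_inverse(low: Signal, high: Signal, predict: Operator, update: Operator) -> Signal:
    """Undo the steps in reverse order by flipping the signs."""
    even = [l - u for l, u in zip(low, update(high))]
    odd = [h + p for h, p in zip(high, predict(even))]
    x = [0] * (len(even) + len(odd))
    x[0::2] = even
    x[1::2] = odd
    return x


def toy_predict(even: Signal) -> Signal:
    """Arbitrary non-linear predictor (hypothetical stand-in for a network)."""
    out = []
    for i, a in enumerate(even):
        b = even[min(i + 1, len(even) - 1)]  # right neighbour, clamped at the border
        out.append(min(a, b) + abs(a - b) // 3)
    return out


def toy_update(high: Signal) -> Signal:
    """Arbitrary non-linear update with clipping (hypothetical stand-in for a network)."""
    out = []
    for i, h in enumerate(high):
        prev = high[max(i - 1, 0)]  # left neighbour, clamped at the border
        out.append(max(-8, min(8, (prev + h) // 4)))
    return out


signal = [3, 7, 1, 8, 2, 9, 4, 6]
low, high = lifting_forward(signal, toy_predict, toy_update)
restored = lifting_inverse(low, high, toy_predict, toy_update)
print(restored == signal)
```

The inverse never needs to invert `toy_predict` or `toy_update`. It only re-evaluates them on data the decoder already holds and subtracts or adds the result, which is why the operators can be non-linear. Integers are used here because floating-point rounding could otherwise introduce small reconstruction errors.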
