Conditional Coding for Learned Image and Video Compression

Basic information

Project abstract

This joint research project between the Institut für Informationsverarbeitung (TNT) of the Leibniz Universität Hannover (LUH) and the Department of Computer Science of the National Chiao Tung University (NYCU) in Taiwan addresses end-to-end learned video compression from the perspective of conditional coding, with a meta-learning-based regularization and tailoring scheme.

The arrival of deep learning has spurred a new wave of developments in end-to-end learned compression. Recent years have witnessed the success of learned image compression, with the state of the art showing better MS-SSIM results than (and comparable PSNR results to) VVC Intra. By comparison, the development of end-to-end learned video compression is still at an early stage. Most learned video codecs follow the traditional hybrid coding architecture, namely temporal prediction followed by transform-based residual coding. A recent publication indicates that although the state-of-the-art learned video codecs show better results than x265, they can hardly compete with the HEVC Test Model (HM) under more realistic test conditions.

Recently, a new school of thought, known as inter-frame conditional coding, has emerged, taking end-to-end learned video coding to a new level of compression performance. The idea of conditional coding is to learn the data distribution of a coding frame conditioned on useful contextual information, in order to reach a lower conditional entropy rate and thus better compression.

The emergence of deep generative models, such as variational autoencoders (VAE) and normalizing flow models, opens up new opportunities for a paradigm shift in learning-based compression. Currently, the VAE is a popular choice for the compression backbone. Representing a new attempt, this joint research proposal introduces a special type of normalizing flow model, called augmented normalizing flows (ANF), for conditional coding. We choose ANF because it has been shown to be more expressive than the VAE and to include the VAE as a special case.

Another notable aspect of this joint research project is that it addresses the generalizability and adaptability of learned video codecs. Learned codecs often suffer from the domain gap between the training and the test data; that is, they may not generalize well to unseen data. In a more general sense, they can hardly achieve optimal compression for individual test images or videos, each of which can in fact be considered a distinct domain. To improve generalizability, this proposal shall incorporate Noether’s theorem in the form of meta learning to learn an inductive bias that encourages decoded video frames to conserve a certain latent consistency in the temporal dimension. We shall also use this learned inductive bias to adapt the encoder and/or the decoder at inference time to suit individual videos. Because the approach is unsupervised, it has the striking feature of not having to signal any additional information in the bitstream.
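
The entropy argument behind conditional coding in the abstract above can be checked on a toy example. The sketch below is only an illustration under stated assumptions, not the project's ANF-based codec: the 4-level "pixel" alphabet, the joint distribution, and the helper names `entropy` and `marginal` are all hypothetical. It compares the bits needed to code a symbol X on its own, to code the residual X − C against a context C (the hybrid residual-coding view), and to code X conditioned on C.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, key):
    """Push a joint pmf {(x, c): p} through key(x, c) and sum probabilities."""
    out = defaultdict(float)
    for (x, c), p in joint.items():
        out[key(x, c)] += p
    return dict(out)


# Hypothetical toy source: the context C is uniform on {0, 1, 2, 3}, and the
# symbol X equals C plus an offset whose sign depends on the context
# (even C: offset 0 or +1; odd C: offset -1 or 0), each with probability 1/2.
joint = {}
for c in range(4):
    offsets = (0, 1) if c % 2 == 0 else (-1, 0)
    for d in offsets:
        joint[(c + d, c)] = joint.get((c + d, c), 0.0) + 0.25 * 0.5

h_x = entropy(marginal(joint, lambda x, c: x))        # H(X): no context
h_res = entropy(marginal(joint, lambda x, c: x - c))  # H(X - C): residual
h_c = entropy(marginal(joint, lambda x, c: c))        # H(C)
h_cond = entropy(joint) - h_c                         # H(X|C) = H(X,C) - H(C)

print(f"H(X)     = {h_x:.2f} bits")
print(f"H(X - C) = {h_res:.2f} bits")
print(f"H(X | C) = {h_cond:.2f} bits")
```

For this toy distribution the three values come out as 2.00, 1.50, and 1.00 bits. The ordering H(X | C) ≤ H(X − C) holds in general, because H(X − C) ≥ H(X − C | C) = H(X | C); subtraction is just one fixed way of using the context, whereas a conditional model is free to use it in whatever way lowers the rate most.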

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants of Professor Dr.-Ing. Jörn Ostermann

Marker-free identification of components for bearing rings
  • Grant number:
    423957182
  • Fiscal year:
    2019
  • Funding amount:
    --
  • Project category:
    Research Grants (Transfer Project)
Contour-based Multidirectional Prediction for Intra Coding
  • Grant number:
    397975900
  • Fiscal year:
    2018
  • Funding amount:
    --
  • Project category:
    Research Grants
Inductive transfer-learning for the classification of aerial and satellite images using Bayesian methods
  • Grant number:
    246374192
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
    Research Grants
Videorealistische Gesichtsanimation mit natürlichem Gesichtsausdruck für interaktive Dienste
(Video-realistic facial animation with natural facial expressions for interactive services)
  • Grant number:
    158317137
  • Fiscal year:
    2009
  • Funding amount:
    --
  • Project category:
    Research Grants
Planar polymer-optical sensor networks for 2D strain measurement
  • Grant number:
    444745111
  • Fiscal year:
  • Funding amount:
    --
  • Project category:
    Research Grants

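Similar NSFC grants

Mechanism by which long non-coding RNA activated by TGF-β (lncRNA-ATB) affects diabetic wound healing through fibroblasts
  • Grant number:
    LQ23H150003
  • Approval year:
    2023
  • Funding amount:
    CNY 0
  • Project category:
    Provincial/municipal project
Role and mechanism of non-coding RNA in the differential efficacy of RAS inhibitors in treating IgA nephropathy
  • Grant number:
    81770709
  • Approval year:
    2017
  • Funding amount:
    CNY 520,000
  • Project category:
    General Program
Identification, function, and regulatory mechanisms of pathogenicity-related non-coding RNAs in the rice bacterial brown stripe pathogen
  • Grant number:
    31571971
  • Approval year:
    2015
  • Funding amount:
    CNY 600,000
  • Project category:
    General Program
Regulatory role of the long non-coding RNA MEG3 in glioma stem cells
  • Grant number:
    81402438
  • Approval year:
    2014
  • Funding amount:
    CNY 230,000
  • Project category:
    Young Scientists Fund
Functional analysis of non-coding RNAs (ncRNAs) regulating silkworm development
  • Grant number:
    31172158
  • Approval year:
    2011
  • Funding amount:
    CNY 600,000
  • Project category:
    General Program
Information-theoretic study and functional prediction of conserved non-genic sequences (CNGs), non-coding RNAs, and introns
  • Grant number:
    90403010
  • Approval year:
    2004
  • Funding amount:
    CNY 250,000
  • Project category:
    Major Research Plan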

Similar overseas grants

From microscale structure to population coding of normal and learned behavior
  • Grant number:
    10225540
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
From microscale structure to population coding of normal and learned behavior
  • Grant number:
    9981045
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
From microscale structure to population coding of normal and learned behavior
  • Grant number:
    9770569
  • Fiscal year:
    2017
  • Funding amount:
    --
  • Project category:
Fluctuation-induced robust information coding, learned from neuron
  • Grant number:
    16K12508
  • Fiscal year:
    2016
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Challenging Exploratory Research
Cortical mechanisms of learned spatial-temporal sequence coding
  • Grant number:
    8425728
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
Cortical mechanisms of learned spatial-temporal sequence coding
  • Grant number:
    9264583
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
Cortical mechanisms of learned spatial-temporal sequence coding
  • Grant number:
    8978909
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
Cortical mechanisms of learned spatial-temporal sequence coding
  • Grant number:
    8600324
  • Fiscal year:
    2013
  • Funding amount:
    --
  • Project category:
Neural Coding and Perception of Learned Vocalizations
  • Grant number:
    8024546
  • Fiscal year:
    2010
  • Funding amount:
    --
  • Project category:
Neural Coding and Perception of Learned Vocalizations
  • Grant number:
    8423779
  • Fiscal year:
    2010
  • Funding amount:
    --
  • Project category: