ERI: Generative Adversarial Networks for Video Coding

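Basic Information

Award number:
2138635
Principal investigator:
Ying Liu
Amount:
About $196,200
Host institution:
Santa Clara University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2022
Funding country:
United States
Period:
2022-02-01 to 2025-01-31
Status:
Not yet concluded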

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2138635&HistoricalAwards=false
Keywords:
ERI Generative Adversarial Networks Video

Project Abstract

This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2).

Video coding is an important technology that compresses video signals to save transmission bandwidth and to provide Internet users with visually pleasing decoded videos. Inspired by recent breakthroughs in deep learning, convolutional neural networks have been increasingly incorporated into video coding algorithms, providing significant coding gains over conventional approaches. Nevertheless, existing convolutional neural network-based video coding schemes tend to generate blurry decoded images that are inconsistent with human perception, and the high computational complexity of these schemes hinders their deployment on power-constrained and computation resource-limited devices, such as smartphones and tablets. Recently, the generative adversarial network demonstrated its capability of decoding sharp and photo-realistic images at low bit rates, but little research has investigated its potential for video compression. This project will develop generative adversarial network-based video coding systems that enhance coding efficiency while providing decoded videos with high perceptual quality. The project will also investigate low-complexity algorithms to reduce the power consumption and accelerate the inference speed of the proposed video coding systems so that they are suitable for mobile and low-latency applications. The success of the project is expected to accelerate the economic growth of streaming video services to benefit people’s daily professional and entertainment activities. It will also advance surveillance video services to enhance public safety in places such as airports, offices, highways, and road intersections.

The research activities of the project will provide opportunities to train graduate and undergraduate students, including minority and under-represented groups, through thesis research, senior design projects, and machine learning and artificial intelligence courses. The research results of the project will be showcased in a summer engineering seminar program to motivate high school students to pursue science and engineering majors in college.

This project will address two problems:

(1) How to leverage temporal correlations among video frames and explore scene dynamics in a generative adversarial network-based video coding architecture? Two approaches are proposed: a hierarchical predictive coding approach, and a spatial-temporal coding architecture based on 3-dimensional convolution. Since most existing generative adversarial network models are for still image compression, the success of this research will open the door to generative adversarial network-based coding systems for video coding professionals.

(2) How to reduce the computational complexity of deep video coding networks? Despite the performance benefits of deep learning-based video coding tools, few of them are currently being adopted in real-world scenarios. This is due to the high computational complexity, slow inference speed, and large graphics processing unit (GPU) memory requirements associated with deep network computation. To address this problem, the proposed research will develop algorithms that reduce the complexity, model size, and number of parameters of deep learning-based video coding models via separable convolution operations. The research results will accelerate the deployment of deep video coding models in real-world applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
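
The abstract names separable convolution as the route to smaller models but does not say which variant is intended. As a rough illustration only, the sketch below assumes the common depthwise separable form (a per-channel k x k convolution followed by a 1 x 1 pointwise convolution) and compares its weight count with that of a standard convolution. The function names and the channel and kernel sizes are hypothetical and do not come from the award.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k 2D convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias ignored).

    Assumption: this is the depthwise separable variant; the award
    abstract only says "separable convolution operations".
    """
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int) -> float:
    """Separable weights divided by standard weights; equals 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 128 output channels, 3 x 3 kernel.
    c_in, c_out, k = 128, 128, 3
    print(standard_conv_params(c_in, c_out, k))
    print(depthwise_separable_params(c_in, c_out, k))
    print(round(reduction_ratio(c_in, c_out, k), 3))
```

For this hypothetical layer, the standard convolution has 147,456 weights and the separable version has 17,536, roughly 12% as many. The ratio 1/c_out + 1/k² shows that the saving is bounded mainly by the kernel size once the channel count is large.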

Project Outcomes

Journal papers (6)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

A generative adversarial network for video compression

DOI:
10.1117/12.2618714
Published:
2022-05
Journal:
Impact factor:
0
Authors:
Pengli Du; Ying Liu; Nam Ling; Lingzhi Liu; Yongxiong Ren; M. Hsu
Corresponding authors:
Pengli Du; Ying Liu; Nam Ling; Lingzhi Liu; Yongxiong Ren; M. Hsu

Learned image compression with transformers

DOI:
10.1117/12.2656516
Published:
2023-06
Journal:
Impact factor:
0
Authors:
Tianma Shen; Y. Liu
Corresponding authors:
Tianma Shen; Y. Liu

Side Information Driven Image Coding for Machines

DOI:
10.1109/pcs56426.2022.10018039
Published:
2022-12
Journal:
2022 Picture Coding Symposium (PCS)
Impact factor:
0
Authors:
Zhongpeng Zhang; Y. Liu
Corresponding authors:
Zhongpeng Zhang; Y. Liu

Generative Video Compression with a Transformer-Based Discriminator

DOI:
10.1109/pcs56426.2022.10018030
Published:
2022-12
Journal:
2022 Picture Coding Symposium (PCS)
Impact factor:
0
Authors:
Pengli Du; Y. Liu; Nam Ling; Yongxiong Ren; Lingzhi Liu
Corresponding authors:
Pengli Du; Y. Liu; Nam Ling; Yongxiong Ren; Lingzhi Liu

A Survey of Efficient Deep Learning Models for Moving Object Segmentation

DOI:
10.1561/116.00000140
Published:
2023
Journal:
APSIPA Transactions on Signal and Information Processing
Impact factor:
3.2
Authors:
Hou, Bingxin; Liu, Ying; Ling, Nam; Ren, Yongxiong; Liu, Lingzhi
Corresponding author:
Liu, Lingzhi

Other publications by Ying Liu

Low RCS microstrip antenna using polarization-dependent frequency selective surface

DOI:
10.1109/aps.2014.6905361
Published:
2014-07
Journal:
Electronics Letters
Impact factor:
1.1
Authors:
Yongtao Jia; Ying Liu; Yuwen Hao; Shuxi Gong
Corresponding author:
Shuxi Gong

Phenotypic and genetic evidence for ecological speciation of Aquilegia japonica and A. oxysepala

DOI:
Published:
2014
Journal:
New Phytologist
Impact factor:
0
Authors:
Lin-Feng Li; Hua-Ying Wang; Di Pang; Ying Liu; Bao Liu; Hong-Xing Xiao
Corresponding author:
Hong-Xing Xiao

Down regulation of UCP2 expression in RPE cells under oxidative stress

DOI:
Published:
2019
Journal:
Int J Ophthalmol
Impact factor:
0
Authors:
Ying Liu; Yuan Ren; Xia Wang; Xu Liu; Yuan He
Corresponding author:
Yuan He

Effects of B on the segregation of Mo at the Fe-Cr-Ni Sigma 5(210) grain boundary

DOI:
Published:
2019
Journal:
Physica B: Condensed Matter
Impact factor:
0
Authors:
Jianguo Li; Caili Zhang; Xu Li; Zhuxia Zhang; Nan Dong; Ying Liu; Jian Wang; Yanlu Zhang; Lixia Ling; Peide Han
Corresponding author:
Peide Han