TY - JOUR
T1 - Generative Adversarial Networks and Perceptual Losses for Video Super-Resolution
AU - Lucas, Alice
AU - Lopez-Tapia, Santiago
AU - Molina, Rafael
AU - Katsaggelos, Aggelos K.
N1 - Funding Information:
Manuscript received July 3, 2018; revised December 14, 2018 and January 21, 2019; accepted January 23, 2019. Date of publication January 29, 2019; date of current version May 14, 2019. This work was supported in part by the Sony 2016 Research Award Program Research Project and in part by the National Science Foundation under Grant DGE-1450006. The work of S. López-Tapia was supported in part by the Spanish Ministry of Economy and Competitiveness under Project DPI2016-77869-C2-2-R, in part by the Visiting Scholar Program at the University of Granada, and in part by the Spanish FPU Program. The work of R. Molina was supported in part by the Spanish Ministry of Economy and Competitiveness under Project DPI2016-77869-C2-2-R and in part by the Visiting Scholar Program at the University of Granada. Preliminary experiments of this work were presented at the 2018 IEEE International Conference on Image Processing (ICIP) [1]. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Emanuele Salerno. (Corresponding author: Alice Lucas.) A. Lucas and A. K. Katsaggelos are with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208 USA (e-mail: alicelucas2015@u.northwestern.edu).
PY - 2019/7
Y1 - 2019/7
N2 - Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial-based and perceptual losses to improve the performance on various image restoration tasks; however, these have yet to be applied to video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during GAN training. We further enhance our VSR GAN formulation with two regularizers, a distance loss in feature space and one in pixel space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with the mean-squared-error loss alone quantitatively surpasses the current state-of-the-art VSR models. We then employ the PercepDist metric to compare the state-of-the-art VSR models, and show that this metric more accurately evaluates the perceptual quality of SR solutions obtained from neural networks than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed VSRResFeatGAN model outperforms the current state-of-the-art SR models, both quantitatively and qualitatively.
AB - Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial-based and perceptual losses to improve the performance on various image restoration tasks; however, these have yet to be applied to video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during GAN training. We further enhance our VSR GAN formulation with two regularizers, a distance loss in feature space and one in pixel space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with the mean-squared-error loss alone quantitatively surpasses the current state-of-the-art VSR models. We then employ the PercepDist metric to compare the state-of-the-art VSR models, and show that this metric more accurately evaluates the perceptual quality of SR solutions obtained from neural networks than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed VSRResFeatGAN model outperforms the current state-of-the-art SR models, both quantitatively and qualitatively.
KW - Artificial neural networks
KW - image generation
KW - image resolution
KW - video signal processing
UR - http://www.scopus.com/inward/record.url?scp=85065998195&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065998195&partnerID=8YFLogxK
U2 - 10.1109/TIP.2019.2895768
DO - 10.1109/TIP.2019.2895768
M3 - Article
C2 - 30714918
AN - SCOPUS:85065998195
VL - 28
SP - 3312
EP - 3327
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
SN - 1057-7149
IS - 7
M1 - 8629024
ER -