TY - JOUR
T1 - A single video super-resolution GAN for multiple downsampling operators based on pseudo-inverse image formation models
AU - López-Tapia, Santiago
AU - Lucas, Alice
AU - Molina, Rafael
AU - Katsaggelos, Aggelos K.
N1 - Funding Information:
This work was supported in part by the Sony 2016 Research Award Program Research Project. The work of SLT and RM was supported by the Spanish Ministry of Economy and Competitiveness through project DPI2016-77869-C2-2-R and by the Visiting Scholar program at the University of Granada. SLT received financial support from the Spanish Ministry of Science, Innovation and Universities through the FPU program.
Publisher Copyright:
© 2020 Elsevier Inc.
PY - 2020/9
Y1 - 2020/9
N2 - The popularity of high and ultra-high definition displays has created a need for methods that improve the quality of videos originally obtained at much lower resolutions. Many current CNN-based video super-resolution methods are designed and trained to handle a specific degradation operator (e.g., bicubic downsampling) and are not robust to a mismatch between the training and testing degradation models, which causes their performance to deteriorate in real-life applications. Furthermore, many of them use the mean squared error as the only loss during learning, causing the resulting images to be overly smooth. In this work we propose a new convolutional neural network for video super-resolution that is robust to multiple degradation models. During training, which is performed on a large dataset of scenes with slow and fast motions, the network uses the pseudo-inverse image formation model as part of its architecture, in conjunction with perceptual losses and a smoothness constraint that eliminates the artifacts originating from these perceptual losses. Experimental validation shows that our approach outperforms current state-of-the-art methods and is robust to multiple degradations.
AB - The popularity of high and ultra-high definition displays has created a need for methods that improve the quality of videos originally obtained at much lower resolutions. Many current CNN-based video super-resolution methods are designed and trained to handle a specific degradation operator (e.g., bicubic downsampling) and are not robust to a mismatch between the training and testing degradation models, which causes their performance to deteriorate in real-life applications. Furthermore, many of them use the mean squared error as the only loss during learning, causing the resulting images to be overly smooth. In this work we propose a new convolutional neural network for video super-resolution that is robust to multiple degradation models. During training, which is performed on a large dataset of scenes with slow and fast motions, the network uses the pseudo-inverse image formation model as part of its architecture, in conjunction with perceptual losses and a smoothness constraint that eliminates the artifacts originating from these perceptual losses. Experimental validation shows that our approach outperforms current state-of-the-art methods and is robust to multiple degradations.
KW - Convolutional neural networks
KW - Generative adversarial networks
KW - Perceptual loss functions
KW - Super-resolution
KW - Video
UR - http://www.scopus.com/inward/record.url?scp=85087921670&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087921670&partnerID=8YFLogxK
U2 - 10.1016/j.dsp.2020.102801
DO - 10.1016/j.dsp.2020.102801
M3 - Article
AN - SCOPUS:85087921670
SN - 1051-2004
VL - 104
JO - Digital Signal Processing: A Review Journal
JF - Digital Signal Processing: A Review Journal
M1 - 102801
ER -