Convolutional neural networks (CNN) are a special type of deep neural networks (DNN). They have so far been successfully applied to image super-resolution (SR) as well as other image restoration tasks. In this paper, we consider the problem of video super-resolution. We propose a CNN that is trained on both the spatial and the temporal dimensions of videos to enhance their spatial resolution. Consecutive frames are motion compensated and used as input to a CNN that provides super-resolved video frames as output. We investigate different options of combining the video frames within one CNN architecture. While large image databases are available to train deep neural networks, it is more challenging to create a large video database of sufficient quality to train neural nets for video restoration. We show that by using images to pretrain our model, a relatively small video database is sufficient for the training of our model to achieve and even improve upon the current state-of-the-art. We compare our proposed approach to current video as well as image SR algorithms.
|Original language||English (US)|
|Title of host publication||IEEE Transactions on Computational Imaging|
|Number of pages||14|
|State||Published - 2016|