DeepBinaryMask: Learning a binary mask for video compressive sensing

Michael Iliadis*, Leonidas Spinoulas, Aggelos K. Katsaggelos

*Corresponding author for this work

Research output: Contribution to journal › Article

Abstract

In this paper, we propose an encoder-decoder neural network model, referred to as DeepBinaryMask, for video compressive sensing. In video compressive sensing, one frame is acquired using a set of coded masks (the sensing matrix), from which a number of video frames equal to the number of coded masks is reconstructed. The proposed framework is an end-to-end model in which the sensing matrix is trained along with the video reconstruction. The encoder maps a video block to compressive measurements by learning the binary elements of the sensing matrix. The decoder is trained to map the measurements from a video patch back to a video block via several hidden layers of a Multi-Layer Perceptron network. The predicted video blocks are stacked together to recover the unknown video sequence. Reconstruction performance is found to improve when using the trained sensing mask from the network, compared with other mask designs such as random masks, across a wide variety of compressive sensing reconstruction algorithms. Finally, our analysis and discussion offer insights into the characteristics of the trained mask designs that lead to the improved reconstruction quality.
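
The acquisition step described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used temporal compressive sensing model, in which each of T frames is multiplied element-wise by its own binary mask and the results are summed into a single coded frame; the abstract does not spell out this formula. The function names `random_binary_masks` and `measure` are hypothetical, and the masks here are random (the baseline design the abstract compares against), not the learned masks.

```python
import random


def random_binary_masks(num_frames, height, width, seed=0):
    """Draw one random 0/1 mask per frame (a baseline design, not the learned mask)."""
    rng = random.Random(seed)
    return [[[rng.randint(0, 1) for _ in range(width)]
             for _ in range(height)]
            for _ in range(num_frames)]


def measure(video, masks):
    """Collapse T masked frames into one coded frame: y = sum_t mask_t * x_t (element-wise).

    Assumed forward model; `video` and `masks` are nested lists indexed [t][row][col].
    """
    height, width = len(video[0]), len(video[0][0])
    y = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(video, masks):
        for i in range(height):
            for j in range(width):
                y[i][j] += mask[i][j] * frame[i][j]
    return y


# Toy video block: 3 frames of 4x4 pixels, frame t filled with the value t + 1.
video = [[[float(t + 1)] * 4 for _ in range(4)] for t in range(3)]
masks = random_binary_masks(num_frames=3, height=4, width=4)
y = measure(video, masks)  # a single 4x4 coded measurement
```

Under this model the number of masks equals the number of frames to be recovered, so a decoder has to estimate T frames from one measurement; how well it can do so depends on the mask design, which is what the paper proposes to learn.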

Original language: English (US)
Article number: 102591
Journal: Digital Signal Processing: A Review Journal
Volume: 96
DOI: 10.1016/j.dsp.2019.102591
State: Published - Jan 2020

Keywords

  • Binary mask
  • Compressive sensing
  • Deep learning
  • Mask optimization
  • Video reconstruction

ASJC Scopus subject areas

  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Statistics, Probability and Uncertainty
  • Computational Theory and Mathematics
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Applied Mathematics

Cite this

@article{5be8500fad954846bac17277173dc10d,
title = "DeepBinaryMask: Learning a binary mask for video compressive sensing",
abstract = "In this paper, we propose an encoder-decoder neural network model referred to as DeepBinaryMask for video compressive sensing. In video compressive sensing one frame is acquired using a set of coded masks (sensing matrix) from which a number of video frames, equal to the number of coded masks, is reconstructed. The proposed framework is an end-to-end model where the sensing matrix is trained along with the video reconstruction. The encoder maps a video block to compressive measurements by learning the binary elements of the sensing matrix. The decoder is trained to map the measurements from a video patch back to a video block via several hidden layers of a Multi-Layer Perceptron network. The predicted video blocks are stacked together to recover the unknown video sequence. The reconstruction performance is found to improve when using the trained sensing mask from the network as compared to other mask designs such as random, across a wide variety of compressive sensing reconstruction algorithms. Finally, our analysis and discussion offers insights into understanding the characteristics of the trained mask designs that lead to the improved reconstruction quality.",
keywords = "Binary mask, Compressive sensing, Deep learning, Mask optimization, Video reconstruction",
author = "Iliadis, Michael and Spinoulas, Leonidas and Katsaggelos, {Aggelos K.}",
year = "2020",
month = "1",
doi = "10.1016/j.dsp.2019.102591",
language = "English (US)",
volume = "96",
journal = "Digital Signal Processing: A Review Journal",
issn = "1051-2004",
publisher = "Elsevier Inc.",
}

DeepBinaryMask: Learning a binary mask for video compressive sensing. / Iliadis, Michael; Spinoulas, Leonidas; Katsaggelos, Aggelos K.

In: Digital Signal Processing: A Review Journal, Vol. 96, 102591, 01.2020.

TY - JOUR

T1 - DeepBinaryMask

T2 - Learning a binary mask for video compressive sensing

AU - Iliadis, Michael

AU - Spinoulas, Leonidas

AU - Katsaggelos, Aggelos K.

PY - 2020/1

Y1 - 2020/1

AB - In this paper, we propose an encoder-decoder neural network model referred to as DeepBinaryMask for video compressive sensing. In video compressive sensing one frame is acquired using a set of coded masks (sensing matrix) from which a number of video frames, equal to the number of coded masks, is reconstructed. The proposed framework is an end-to-end model where the sensing matrix is trained along with the video reconstruction. The encoder maps a video block to compressive measurements by learning the binary elements of the sensing matrix. The decoder is trained to map the measurements from a video patch back to a video block via several hidden layers of a Multi-Layer Perceptron network. The predicted video blocks are stacked together to recover the unknown video sequence. The reconstruction performance is found to improve when using the trained sensing mask from the network as compared to other mask designs such as random, across a wide variety of compressive sensing reconstruction algorithms. Finally, our analysis and discussion offers insights into understanding the characteristics of the trained mask designs that lead to the improved reconstruction quality.

KW - Binary mask

KW - Compressive sensing

KW - Deep learning

KW - Mask optimization

KW - Video reconstruction

UR - http://www.scopus.com/inward/record.url?scp=85074162015&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074162015&partnerID=8YFLogxK

U2 - 10.1016/j.dsp.2019.102591

DO - 10.1016/j.dsp.2019.102591

M3 - Article

AN - SCOPUS:85074162015

VL - 96

JO - Digital Signal Processing: A Review Journal

JF - Digital Signal Processing: A Review Journal

SN - 1051-2004

M1 - 102591

ER -