Deep Networks for Degraded Document Image Binarization through Pyramid Reconstruction

Gaofeng Meng, Kun Yuan, Ying Wu, Shiming Xiang, Chunhong Pan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Scopus citations

Abstract

Binarization of document images is an important processing step for document image analysis and recognition. However, this problem is quite challenging in some cases because of quality degradation in document images, such as varying illumination, complicated backgrounds, and image noise due to ink spots, water stains, or document creases. In this paper, we propose a framework based on a deep convolutional neural network (DCNN) for adaptive binarization of degraded document images. The basic idea of our method is to decompose a degraded document image into a spatial pyramid structure by using a DCNN, with each layer at a different scale. The foreground image is then sequentially reconstructed from these layers in a coarse-to-fine manner by using a deconvolutional network. This kind of decomposition is quite beneficial, since multi-resolution supervision information can be directly introduced into network learning. We also define several loss functions on label consistency and foreground smoothing to further regularize the training of the network. Experimental results demonstrate the effectiveness of the proposed method.
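The idea of multi-resolution supervision described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the network, the pyramid construction, or the exact loss terms. The sketch assumes a ground-truth mask pyramid built by 2x2 average pooling, a per-scale binary cross-entropy as a stand-in for the label-consistency term, and a total-variation-style penalty as a stand-in for the foreground-smoothing term. All function names and the `smooth_weight` parameter are hypothetical.

```python
from math import log

def downsample(mask):
    """Halve the resolution by 2x2 average pooling (assumes even height and width)."""
    h, w = len(mask), len(mask[0])
    return [[(mask[2 * i][2 * j] + mask[2 * i][2 * j + 1]
              + mask[2 * i + 1][2 * j] + mask[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w // 2)]
            for i in range(h // 2)]

def build_pyramid(mask, levels):
    """Return a list of masks ordered from fine (input size) to coarse."""
    pyramid = [mask]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy; an assumed stand-in for a label-consistency loss."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= t * log(p) + (1.0 - t) * log(1.0 - p)
            count += 1
    return total / count

def smoothness(pred):
    """Mean absolute difference between adjacent pixels; an assumed smoothing penalty."""
    h, w = len(pred), len(pred[0])
    diffs = [abs(pred[i][j] - pred[i][j + 1]) for i in range(h) for j in range(w - 1)]
    diffs += [abs(pred[i][j] - pred[i + 1][j]) for i in range(h - 1) for j in range(w)]
    return sum(diffs) / len(diffs) if diffs else 0.0

def multiscale_loss(preds, targets, smooth_weight=0.1):
    """Sum the per-scale losses so that every pyramid level receives supervision."""
    return sum(bce(p, t) + smooth_weight * smoothness(p)
               for p, t in zip(preds, targets))

# Toy 4x4 foreground mask (1 = ink, 0 = background).
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 1]]
targets = build_pyramid(mask, levels=3)  # 4x4, 2x2, 1x1
```

In a trained model, `preds` would be the per-scale outputs of the reconstruction network; here any list of probability maps with the same shapes as `targets` can be passed to `multiscale_loss`.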

Original language: English (US)
Title of host publication: Proceedings - 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017
Publisher: IEEE Computer Society
Pages: 727-732
Number of pages: 6
ISBN (Electronic): 9781538635865
DOIs: https://doi.org/10.1109/ICDAR.2017.124
State: Published - Jan 25 2018
Event: 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017 - Kyoto, Japan
Duration: Nov 9 2017 to Nov 15 2017

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 1
ISSN (Print): 1520-5363

Other

Other: 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017
Country: Japan
City: Kyoto
Period: 11/9/17 to 11/15/17

Keywords

  • Convolutional neural networks
  • Document image binarization
  • Document image processing
  • Pyramid reconstruction

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

  • Cite this

    Meng, G., Yuan, K., Wu, Y., Xiang, S., & Pan, C. (2018). Deep Networks for Degraded Document Image Binarization through Pyramid Reconstruction. In Proceedings - 14th IAPR International Conference on Document Analysis and Recognition, ICDAR 2017 (pp. 727-732). (Proceedings of the International Conference on Document Analysis and Recognition, ICDAR; Vol. 1). IEEE Computer Society. https://doi.org/10.1109/ICDAR.2017.124