PR2024 GDB: Gated convolutions-based Document Binarization. This repository comprehensively collects the datasets that may be used in document binarization.

Royalvice/GDB

GDB: Gated convolutions-based Document Binarization

Description

This is an official implementation for the paper GDB: Gated convolutions-based Document Binarization.

This repository also comprehensively collects the datasets that may be used in document binarization.

The datasets used for document binarization are summarized in the Datasets section below.

Environment

  • Python >= 3.7
  • torch >= 1.7.0
  • torchvision >= 0.8.0

Usage

Prepare the dataset

Note: The pre-processing code has not been released yet, but it is planned.

Download the datasets listed in the Datasets section below and put them in the datasets_ori folder. When evaluating performance on the DIBCO2019 dataset, prepare the data as follows:

  • Gather all datasets except DIBCO2019 and place the images and ground-truth images in the img and gt folders under the datasets_ori directory.
  • Crop the images and ground-truth images into 256 × 256 patches and place them in the img and gt folders under the datasets/DIBCO2019 directory.
  • Binarize the images under datasets/img with the Otsu thresholding method and place the results in the datasets/otsu folder.
  • Process the images under datasets/img with the Sobel operator and place the results in the datasets/sobel folder.

Once these preprocessing steps are complete, pass ./datasets/img as the --dataRoot argument to train.py and begin training.
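Until the official pre-processing code is available, the patch cropping, Otsu, and Sobel steps described above can be sketched as follows. This is a minimal NumPy-only illustration, not the authors' implementation. The function names (crop_patches, otsu_binarize, sobel_magnitude), the decision to drop partial patches at the image border, and the clipping of the Sobel magnitude to 0-255 are all assumptions. Image loading and saving are omitted; each function expects a 2-D uint8 grayscale array.

```python
import numpy as np

PATCH_SIZE = 256  # patch size stated in the README


def crop_patches(image, size=PATCH_SIZE):
    """Split a 2-D array into non-overlapping size x size patches.

    Assumption: border regions that do not fill a whole patch are dropped.
    """
    h, w = image.shape[:2]
    return [
        image[y:y + size, x:x + size]
        for y in range(0, h - size + 1, size)
        for x in range(0, w - size + 1, size)
    ]


def otsu_threshold(gray):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean of the dark class
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds with an empty class
    return int(np.argmax(sigma_b))


def otsu_binarize(gray):
    """Binarize a uint8 image: pixels above the Otsu threshold become 255."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)


def sobel_magnitude(gray):
    """Gradient magnitude from 3x3 Sobel kernels, clipped to uint8."""
    h, w = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 correlation one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.clip(np.hypot(gx, gy), 0, 255).astype(np.uint8)
```

In a full pipeline, each source image would be loaded as grayscale, passed through otsu_binarize and sobel_magnitude, and the outputs written to the otsu and sobel folders named above. Whether the repository expects overlapping patches or a particular Sobel normalization is not stated, so those details may need adjusting.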

Training

python train.py --dataRoot ./datasets/img

Testing

python test.py

Datasets

  • DIBCO 2009
  • H-DIBCO 2010
  • DIBCO 2011
  • H-DIBCO 2012
  • DIBCO 2013
  • H-DIBCO 2014
  • H-DIBCO 2016
  • DIBCO 2017
  • H-DIBCO 2018
  • DIBCO 2019
  • Palm Leaf Manuscript
  • Persian Heritage Image Binarization Dataset (PHIBD)
  • Einsiedeln
  • Noisy Office
  • Synchromedia Multispectral dataset
  • Bickley Diary dataset
  • IAM Historical Document Database

To-do list

  • Add the code for training
  • Add the code for testing
  • Add the code for pre-processing
  • Restructure the code
  • Upload the pretrained weights
  • Comprehensively collate document binarization benchmark datasets
  • Add the code for evaluating the performance of the model

License

This work may be used for academic research purposes only. For commercial use, please contact the author.

Citation

  • If this work is useful, please cite it as:
@article{yang2024gdb,
  title={GDB: gated convolutions-based document binarization},
  author={Yang, Zongyuan and Liu, Baolin and Xiong, Yongping and Wu, Guibin},
  journal={Pattern Recognition},
  volume={146},
  pages={109989},
  year={2024},
  publisher={Elsevier}
}
