GDB: Gated convolutions-based Document Binarization

Description

This is the official implementation of the paper GDB: Gated convolutions-based Document Binarization.

This repository also comprehensively collects the datasets that may be used in document binarization.

Datasets

Below is a summary table of the datasets used for document binarization, along with links to download them.

Environment

Python >= 3.7
torch >= 1.7.0
torchvision >= 0.8.0
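
The minimum versions above can be checked before training with a small script. The following is only an illustrative sketch and is not part of this repository; the helper names parse_version, meets_minimum, and check_environment are hypothetical.

```python
import re
import sys

# Minimum versions copied from the Environment list above.
MINIMUMS = {"torch": "1.7.0", "torchvision": "0.8.0"}


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad so that '0.8' compares equal to '0.8.0'.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed, required):
    """Return True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def check_environment():
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python >= 3.7 is required")
    for name, minimum in MINIMUMS.items():
        try:
            module = __import__(name)
        except ImportError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(module.__version__, minimum):
            problems.append(f"{name} {module.__version__} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result from check_environment() means the interpreter and both packages satisfy the listed minimums.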

Usage

Prepare the dataset

Note: The pre-processing code is not provided yet, but it is on the way.

You can download the datasets from the links below and put them in the datasets_ori folder. When evaluating performance on the DIBCO2019 dataset, prepare the training data as follows:

1. Gather all datasets except DIBCO2019 and place them in the img and gt folders under the datasets_ori directory.
2. Crop the images and ground truth images into 256 × 256 patches and place them in the img and gt folders under the datasets/DIBCO2019 directory.
3. Binarize the images under datasets/img with the Otsu thresholding method and place the results in the datasets/otsu folder.
4. Process the images under datasets/img with the Sobel operator and place the results in the datasets/sobel folder.
5. Pass ./datasets/img as the argument for the --dataRoot parameter of train.py and begin training.
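
Until the official pre-processing code is released, the cropping, Otsu, and Sobel steps described above could be sketched roughly as follows. This is not the authors' implementation. It assumes NumPy is available (it is not listed under Environment), that inputs are uint8 grayscale arrays already loaded with an image library of your choice, that border regions smaller than a full patch are dropped, and that the Sobel output is max-normalized to 0..255. All function names are hypothetical.

```python
import numpy as np  # assumption: NumPy is installed; the README lists only torch and torchvision

PATCH_SIZE = 256  # patch side length given in the README


def crop_patches(image, size=PATCH_SIZE):
    """Split an (H, W) or (H, W, C) array into non-overlapping size x size tiles.

    Assumption: border strips that do not fill a whole tile are dropped.
    """
    rows, cols = image.shape[0] // size, image.shape[1] // size
    return [
        image[r * size:(r + 1) * size, c * size:(c + 1) * size]
        for r in range(rows)
        for c in range(cols)
    ]


def otsu_threshold(gray):
    """Return the gray level that maximizes between-class variance (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of the class at or below each level
    mu = np.cumsum(prob * np.arange(256))   # cumulative first moment up to each level
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mu[-1] * omega - mu) ** 2 / denom
    between[denom == 0] = 0.0               # levels that leave one class empty
    return int(np.argmax(between))


def otsu_binarize(gray):
    """Map pixels above the Otsu threshold to 255 and the rest to 0."""
    return np.where(gray > otsu_threshold(gray), 255, 0).astype(np.uint8)


def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel kernels, rescaled to 0..255.

    Assumptions: replicate padding at the borders and max-normalization;
    the repository may scale its Sobel maps differently.
    """
    g = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right neighbors minus left neighbors, weighted 1-2-1.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    # Vertical derivative: lower neighbors minus upper neighbors, weighted 1-2-1.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak > 0:
        magnitude = magnitude * 255.0 / peak
    return np.rint(magnitude).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic stand-in for a scanned page: a dark band on a light background.
    page = np.full((600, 520), 200, dtype=np.uint8)
    page[100:140, 50:400] = 50
    for patch in crop_patches(page):
        binary = otsu_binarize(patch)
        edges = sobel_magnitude(patch)
        print(patch.shape, binary.dtype, edges.dtype)
```

In this sketch the threshold and the Sobel normalization are computed per patch, which matches the README's order of cropping first and then processing the patches. Reading the source images and writing the results into the img, gt, otsu, and sobel folders is left out on purpose, since the file naming expected by train.py is not documented here.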

Training

python train.py --dataRoot ./datasets/img

Testing

python test.py

Datasets

| Dataset |
|---------|
| DIBCO 2009 |
| H-DIBCO 2010 |
| DIBCO 2011 |
| H-DIBCO 2012 |
| DIBCO 2013 |
| H-DIBCO 2014 |
| H-DIBCO 2016 |
| DIBCO 2017 |
| H-DIBCO 2018 |
| DIBCO 2019 |
| Palm Leaf Manuscript |
| Persian Heritage Image Binarization Dataset (PHIBD) |
| Einsiedeln |
| Noisy Office |
| Synchromedia Multispectral dataset |
| Bickley diary dataset |
| IAM Historical Document Database |

To-do list

[x] Add the code for training
[x] Add the code for testing
[ ] Add the code for pre-processing
[ ] Restructure the code
[ ] Upload the pretrained weights
[x] Comprehensively collate document binarization benchmark datasets
[ ] Add the code for evaluating the performance of the model

License

This work is permitted for academic research purposes only. For commercial use, please contact the author.

Citation

If this work is useful, please cite it as:

@article{yang2024gdb,
  title={GDB: gated convolutions-based document binarization},
  author={Yang, Zongyuan and Liu, Baolin and Xiong, Yongping and Wu, Guibin},
  journal={Pattern Recognition},
  volume={146},
  pages={109989},
  year={2024},
  publisher={Elsevier}
}
