GDB
PR2024 GDB: Gated convolutions-based Document Binarization. This repository comprehensively collects the datasets that may be used in document binarization.
GDB: Gated convolutions-based Document Binarization
Description
This is an official implementation for the paper GDB: Gated convolutions-based Document Binarization.
This repository also comprehensively collects the datasets that may be used in document binarization.
Datasets
The Datasets section further below gives a summary table of the datasets used for document binarization, along with links to download them.
Environment
- Python >= 3.7
- torch >= 1.7.0
- torchvision >= 0.8.0
Usage
Prepare the dataset
Note: The pre-processing code is not provided yet, but it is on the way.
You can download the datasets from the links below and put them in the datasets_ori folder.
When evaluating performance on the DIBCO2019 dataset,
first gather all datasets except DIBCO2019 and place their images and ground-truth images in the img and gt folders under the datasets_ori directory.
Then crop the images and ground-truth images into 256 × 256 patches and place them in the img and gt folders under the datasets/DIBCO2019 directory.
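Since the pre-processing code is not yet provided, the cropping step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `crop_patches` and `crop_pair` are hypothetical, and the choice of a non-overlapping grid that drops incomplete border tiles is an assumption the text does not specify. Loading and saving image files is left to whatever image library you use.

```python
import numpy as np

PATCH = 256  # patch side length stated in the text


def crop_patches(image, patch=PATCH, stride=PATCH):
    """Return all patch x patch tiles that fit entirely inside the image.

    Assumption: incomplete tiles at the right and bottom borders are dropped.
    """
    height, width = image.shape[:2]
    tiles = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            tiles.append(image[top:top + patch, left:left + patch])
    return tiles


def crop_pair(img, gt, patch=PATCH, stride=PATCH):
    """Crop an image and its ground truth on the same grid so patches stay aligned."""
    if img.shape[:2] != gt.shape[:2]:
        raise ValueError("image and ground truth must have the same height and width")
    return list(zip(crop_patches(img, patch, stride),
                    crop_patches(gt, patch, stride)))
```

Passing a `stride` smaller than `patch` would produce overlapping patches, which is a common way to enlarge a training set; whether the original pipeline does so is not stated.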
Next, use the Otsu thresholding method to binarize the images
under datasets/img and place the results in the datasets/otsu folder.
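As a rough sketch of the Otsu step, the threshold can be computed from the grayscale histogram by maximizing the between-class variance. The code below is an illustrative NumPy version, not the repository's code: the names `otsu_threshold` and `binarize_otsu` are hypothetical, the input is assumed to be an 8-bit grayscale array, and mapping pixels above the threshold to 255 (background) is an assumed convention.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold of an 8-bit grayscale image (uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    numer = (mu_total * omega - mu) ** 2
    sigma_b = np.zeros(256)
    valid = denom > 1e-12                     # skip thresholds with an empty class
    sigma_b[valid] = numer[valid] / denom[valid]
    return int(np.argmax(sigma_b))            # maximizes between-class variance


def binarize_otsu(gray):
    """Map pixels above the Otsu threshold to 255 and the rest to 0."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

Image libraries such as OpenCV and scikit-image ship their own Otsu routines, which may be preferable in practice; the version above only makes the computation explicit.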
Use the Sobel operator to process the images under datasets/img
and place the results in the datasets/sobel folder.
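The Sobel step can be sketched in the same spirit. The text does not say which Sobel output is stored, so the example below assumes the gradient magnitude, rescaled to the 0 to 255 range, with replicated borders; these choices and the names `filter3x3` and `sobel_magnitude` are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=np.float64)
KY = KX.T


def filter3x3(gray, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel, replicating the borders."""
    height, width = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    out = np.zeros((height, width))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out


def sobel_magnitude(gray):
    """Return the Sobel gradient magnitude as a uint8 image scaled to 0..255."""
    gx = filter3x3(gray, KX)
    gy = filter3x3(gray, KY)
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak > 0:
        mag = mag * (255.0 / peak)  # assumed normalization, not from the source
    return np.rint(mag).astype(np.uint8)
```

Cross-correlation differs from true convolution only in the sign of the gradients, so the magnitude is unaffected.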
With these preprocessing steps completed,
pass ./datasets/img as the argument to the --dataRoot parameter of train.py and begin training.
Training
python train.py
Testing
python test.py
Datasets
| Dataset |
|---------|
| DIBCO 2009 |
| H-DIBCO 2010 |
| DIBCO 2011 |
| H-DIBCO 2012 |
| DIBCO 2013 |
| H-DIBCO 2014 |
| H-DIBCO 2016 |
| DIBCO 2017 |
| H-DIBCO 2018 |
| DIBCO 2019 |
| Palm Leaf Manuscript |
| Persian Heritage Image Binarization Dataset (PHIBD) |
| Einsiedeln |
| Noisy Office |
| Synchromedia Multispectral dataset |
| Bickley Diary dataset |
| IAM Historical Document Database |
To-do list
- [x] Add the code for training
- [x] Add the code for testing
- [ ] Add the code for pre-processing
- [ ] Restructure the code
- [ ] Upload the pretrained weights
- [x] Comprehensively collate document binarization benchmark datasets
- [ ] Add the code for evaluating the performance of the model
License
This work is permitted for academic research purposes only. For commercial use, please contact the author.
Citation
- If this work is useful, please cite it as:
@article{yang2024gdb,
title={GDB: gated convolutions-based document binarization},
author={Yang, Zongyuan and Liu, Baolin and Xiong, Yongping and Wu, Guibin},
journal={Pattern Recognition},
volume={146},
pages={109989},
year={2024},
publisher={Elsevier}
}
