DRRG
Deep relational reasoning graph network for arbitrary shape text detection; Accepted by CVPR 2020 (Oral). http://arxiv.org/abs/2003.07493
This is an implementation of “Deep relational reasoning graph network for arbitrary shape text detection”.

News
- [x] Our new work is available at https://github.com/GXYM/TextBPN-Plus-Plus.
- [x] This project has been reproduced in MMOCR.
- [x] This project has been reproduced in PaddleOCR.
- [x] A Paddle implementation of this project is available in DRRG_Paddle. The reproduction is described on Paddle AI Studio.
Prerequisites
Python 3.7;
PyTorch 1.2.0;
NumPy >= 1.16;
CUDA 10.1;
GCC >= 9.0;
opencv-python < 4.5.0;
NVIDIA GPU (with 10 GB or more of GPU memory for inference).
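A quick way to compare an installed environment against the list above is a small version check. The following is a minimal sketch and not part of the repository: the helper names `version_tuple`, `satisfies`, and `check_environment` are hypothetical, and it assumes the packages are importable as `numpy`, `torch`, and `cv2` and expose a `__version__` attribute. It does not check CUDA, GCC, or GPU memory.

```python
import importlib
import sys

# (import name, operator, required version), taken from the prerequisites list.
REQUIREMENTS = [
    ("numpy", ">=", "1.16"),
    ("torch", "==", "1.2.0"),
    ("cv2", "<", "4.5.0"),
]


def version_tuple(text):
    """Turn '1.2.0', '4.4.0.46' or '1.2.0+cu92' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings; '==' matches only the components given in `required`."""
    a, b = version_tuple(installed), version_tuple(required)
    if op == "==":
        return a[:len(b)] == b
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    if op == ">=":
        return a >= b
    if op == "<":
        return a < b
    raise ValueError("unsupported operator: " + op)


def check_environment():
    """Return (name, installed version or None, ok) for Python and each package."""
    py = ".".join(str(n) for n in sys.version_info[:3])
    results = [("python", py, satisfies(py, "==", "3.7"))]
    for name, op, required in REQUIREMENTS:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results.append((name, None, False))
            continue
        installed = getattr(module, "__version__", "")
        ok = bool(installed) and satisfies(installed, op, required)
        results.append((name, installed, ok))
    return results
```

Calling `check_environment()` and printing each tuple gives a one-line-per-dependency report; a `False` in the last position marks a version that differs from the list above.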
Compile
Run make in each of the two extension directories, starting from the repository root:
(cd ./csrc && make)
(cd ./nmslib/lanms && make)
Data Links
Note: download the data and put it under the data directory.
Models
- Trained models for Total-Text, CTW-1500, MSRA-TD500, MLT2017, and ICDAR2015 are all available here:
Google Drive or Baidu Drive (download code: cfat)
Train
cd tool
sh train_CTW1500.sh # or run one of the other training shell scripts
Modify the training parameters in the script to suit your environment, such as the GPU id (--gpu) and --input_size:
#!/bin/bash
cd ../
CUDA_LAUNCH_BLOCKING=1 python train_TextGraph.py --exp_name Ctw1500 --max_epoch 600 --batch_size 6 --gpu 0 --input_size 640 --optim SGD --lr 0.001 --start_epoch 0 --viz --net vgg
# --resume pretrained/mlt2017_pretain/textgraph_vgg_100.pth ### loads the pretrained model; change this path to your own
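The flags in the script above can also be assembled programmatically when several runs with different GPU ids or input sizes are needed. This is a minimal sketch under stated assumptions: `build_train_command` and `as_shell_line` are hypothetical helpers, not part of the repository, and only the flags that appear in the script above are used. How `train_TextGraph.py` parses them is not verified here.

```python
import shlex


def build_train_command(exp_name="Ctw1500", gpu=0, input_size=640,
                        batch_size=6, max_epoch=600, lr=0.001,
                        optim="SGD", net="vgg", start_epoch=0,
                        viz=True, resume=None):
    """Return the argument list for train_TextGraph.py (defaults copied from the script)."""
    cmd = [
        "python", "train_TextGraph.py",
        "--exp_name", exp_name,
        "--max_epoch", str(max_epoch),
        "--batch_size", str(batch_size),
        "--gpu", str(gpu),
        "--input_size", str(input_size),
        "--optim", optim,
        "--lr", str(lr),
        "--start_epoch", str(start_epoch),
        "--net", net,
    ]
    if viz:
        cmd.append("--viz")
    if resume is not None:
        # Path to a pretrained checkpoint; supply your own.
        cmd += ["--resume", resume]
    return cmd


def as_shell_line(cmd):
    """Quote each argument so the list can be pasted into a shell script."""
    return " ".join(shlex.quote(part) for part in cmd)
```

For example, `as_shell_line(build_train_command(gpu=1, input_size=512))` produces a command line that differs from the one above only in the `--gpu` and `--input_size` values. The list form can be passed directly to `subprocess.run` from the repository root.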
Eval
First, modify the relevant parameters in config.py and option.py as needed, then run:
python eval_TextGraph.py # evaluate the model from a single training round
or
python batch_eval.py # evaluate the models from multiple training rounds
Qualitative results (figures omitted)


Citing the related works
@inproceedings{DBLP:conf/cvpr/ZhangZHLYWY20,
author = {Shi{-}Xue Zhang and
Xiaobin Zhu and
Jie{-}Bo Hou and
Chang Liu and
Chun Yang and
Hongfa Wang and
Xu{-}Cheng Yin},
title = {Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection},
booktitle = {2020 {IEEE/CVF} Conference on Computer Vision and Pattern Recognition,
{CVPR} 2020, Seattle, WA, USA, June 13-19, 2020},
pages = {9696--9705},
publisher = {Computer Vision Foundation / {IEEE}},
year = {2020},
doi = {10.1109/CVPR42600.2020.00972},
}
@inproceedings{DBLP:conf/iccv/Zhang0YWY21,
author = {Shi{-}Xue Zhang and
Xiaobin Zhu and
Chun Yang and
Hongfa Wang and
Xu{-}Cheng Yin},
title = {Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection},
booktitle = {2021 {IEEE/CVF} International Conference on Computer Vision, {ICCV} 2021, Montreal, QC, Canada, October 10-17, 2021},
pages = {1285--1294},
publisher = {{IEEE}},
year = {2021},
}
@article{zhang2023arbitrary,
title={Arbitrary shape text detection via boundary transformer},
author={Zhang, Shi-Xue and Yang, Chun and Zhu, Xiaobin and Yin, Xu-Cheng},
journal={IEEE Transactions on Multimedia},
year={2023},
publisher={IEEE}
}
@article{DBLP:journals/pami/ZhangZCHY23,
author = {Shi{-}Xue Zhang and
Xiaobin Zhu and
Lei Chen and
Jie{-}Bo Hou and
Xu{-}Cheng Yin},
title = {Arbitrary Shape Text Detection via Segmentation With Probability Maps},
journal = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
volume = {45},
number = {3},
pages = {2736--2750},
year = {2023},
url = {https://doi.org/10.1109/TPAMI.2022.3176122},
doi = {10.1109/TPAMI.2022.3176122},
}
@article{zhang2022kernel,
title={Kernel proposal network for arbitrary shape text detection},
author={Zhang, Shi-Xue and Zhu, Xiaobin and Hou, Jie-Bo and Yang, Chun and Yin, Xu-Cheng},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2022},
publisher={IEEE}
}
License
This project is licensed under the MIT License; see the LICENSE.md file for details.
