HDR
[AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents
🌟 Highlight

- We introduce a <u>H</u>istorical <u>D</u>ocument <u>R</u>epair (HDR) task, which endeavors to predict the original appearance of damaged historical document images.
- We build a large-scale historical document repair dataset, termed HDR28K, which includes <u>28,552</u> damaged-repaired image pairs with character-level annotations and multi-style degradation.
- 🔥🔥🔥 We propose a <u>Diff</u>usion-based <u>H</u>istorical <u>D</u>ocument <u>R</u>epair method (DiffHDR), which augments the DDPM framework with semantic and spatial information.
📰 News
- 2025.07.15: 🎉 We propose a novel historical document restoration method, AutoHDR. Welcome to try our demo!
- 2025.03.20: 🎉🎉 The Historical Document Repair dataset HDR28K is released!
- 2024.12.17: Release inference code.
- 2024.12.10: 🎉🎉 Our paper is accepted by AAAI2025.
🔥 Model Zoo
| Model | Checkpoint | Status |
|---------|-----------------------------|----------|
| DiffHDR | GoogleDrive / BaiduYun:x62f | Released |
🔥 Dataset Zoo
| Dataset | Download | Status |
|---------|---------------|----------|
| HDR28K | BaiduYun:upm9 | Released |
The dataset is organized as follows:
- character_missing
  - test
    - char_mask_images
    - content_images
    - degraded_images
    - original_images
  - train
    - char_mask_images
    - content_images
    - degraded_images
    - original_images
- ink erosion
  - similar to 'character_missing'
- paper damage
  - similar to 'character_missing'
- test_image_only_damage
  - hole_M5_image_2000_32_467_544_979_degrade0.png
  - ......
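The layout above spreads each sample across four sub-folders. A minimal Python sketch for collecting those files per sample might look like the following. It assumes that the files belonging to one sample share the same file name in every sub-folder, which this README does not state; `list_samples` and `SUBDIRS` are illustrative names, not part of the repository.

```python
from pathlib import Path

# Sub-folder names taken from the dataset layout listed above.
SUBDIRS = ("degraded_images", "original_images", "content_images", "char_mask_images")


def list_samples(root, degradation="character_missing", split="train"):
    """Return one dict of paths per sample that is present in all four sub-folders.

    Assumption: the files of one sample share the same file name in every
    sub-folder. Adjust the matching rule if the released data differs.
    """
    split_dir = Path(root) / degradation / split
    # Index every sub-folder by file name.
    by_name = {
        sub: {p.name: p for p in (split_dir / sub).iterdir() if p.is_file()}
        for sub in SUBDIRS
    }
    # Keep only the names that occur in every sub-folder.
    common = set.intersection(*(set(files) for files in by_name.values()))
    return [{sub: by_name[sub][name] for sub in SUBDIRS} for name in sorted(common)]
```

Calling `list_samples("/path/to/HDR28K", "character_missing", "test")` would return a list of dictionaries keyed by sub-folder name. A missing sub-folder raises `FileNotFoundError`, which makes an incomplete download visible early.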
NOTE: The test_image_only_damage folder contains the ground-truth (GT) images obtained by replacing the non-damaged region of $x_r$ with the target $x_{target}$.
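The replacement described in the note can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the function name `composite_damaged_region` is hypothetical, and the mask convention (non-zero marks damaged pixels) is a guess, since the dataset's own masks may be encoded differently.

```python
import numpy as np


def composite_damaged_region(x_r, x_target, damage_mask):
    """Copy x_target into the non-damaged region of x_r.

    Assumptions: x_r and x_target are arrays of identical shape, (H, W) or
    (H, W, C); damage_mask has shape (H, W) and is non-zero where the
    document is damaged.
    """
    x_r = np.asarray(x_r)
    x_target = np.asarray(x_target)
    if x_r.shape != x_target.shape:
        raise ValueError("x_r and x_target must have the same shape")
    damaged = np.asarray(damage_mask) != 0
    if damaged.shape != x_r.shape[:2]:
        raise ValueError("damage_mask must match the image height and width")
    if x_r.ndim == 3:
        damaged = damaged[..., None]  # broadcast the mask over the channel axis
    # Damaged pixels keep x_r; every other pixel is taken from x_target.
    return np.where(damaged, x_r, x_target)
```

Under this convention, only the damaged region of the result can differ from the target image.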
🚧 Installation
Prerequisites (Recommended)
- Linux
- Python 3.9
- PyTorch 1.13.1
- CUDA 11.7
Environment Setup
Clone this repo:
git clone https://github.com/yeungchenwa/HDR.git
Step 0: Download and install Miniconda from the official website.
Step 1: Create a conda environment and activate it.
conda create -n diffhdr python=3.9 -y
conda activate diffhdr
Step 2: Install the matching version of PyTorch, following the official PyTorch installation guide.
# Suggested
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
Step 3: Install the required packages.
pip install -r requirements.txt
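After the steps above, a short Python check can confirm that the installed packages match the recommended versions. This is a sketch, not a script shipped with the repository: `check_environment` and `RECOMMENDED` are illustrative names, and the version numbers are the ones listed in the installation steps.

```python
import sys
from importlib import metadata

# Versions recommended in the installation steps above.
RECOMMENDED = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}


def check_environment(recommended=RECOMMENDED):
    """Map each package name to (installed_version_or_None, matches_recommendation)."""
    report = {}
    for name, wanted in recommended.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Wheels from the cu117 index report versions such as "1.13.1+cu117",
        # so the local suffix after "+" is ignored in the comparison.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])), "(3.9 recommended)")
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "differs from recommendation"
        print(f"{name}: {installed or 'not installed'} [{status}]")
```

The check only reads package metadata, so it also runs on a machine without a GPU.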
📺 Inference
Use DiffHDR to repair damaged historical documents. Example damaged images, mask images, and content images are provided in /examples:
sh scripts/inference.sh
The script accepts the following parameters:
- device: The device used for inference (CUDA or CPU).
- image_path: The damaged image path.
- mask_image_path: The mask image path.
- content_image_path: The content image path.
- save_dir: The directory for saving the repaired image.
- content_mask_guidance_scale: The guidance scale of the content image and mask image.
- degraded_guidance_scale: The guidance scale of the damaged image.
- ckpt_path: The UNet checkpoint path.
- num_inference_steps: The number of inference steps.
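Before launching a run, it can help to validate these settings. The sketch below groups the parameters listed above into a dataclass and reports obvious problems. `InferenceSettings` and `problems` are hypothetical names, not the repository's API, and no default values are assumed because the README does not give any. The guidance scales are left unchecked, since their valid range is not documented here.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class InferenceSettings:
    """Hypothetical container; the field names mirror the documented parameters."""

    device: str
    image_path: str
    mask_image_path: str
    content_image_path: str
    save_dir: str
    ckpt_path: str
    content_mask_guidance_scale: float
    degraded_guidance_scale: float
    num_inference_steps: int

    def problems(self):
        """Return readable problems; an empty list means the settings look usable."""
        found = []
        if self.device != "cpu" and not self.device.startswith("cuda"):
            found.append(f"device should be 'cpu' or 'cuda', got {self.device!r}")
        for name in ("image_path", "mask_image_path", "content_image_path", "ckpt_path"):
            value = getattr(self, name)
            if not Path(value).exists():
                found.append(f"{name} does not exist: {value}")
        if self.num_inference_steps <= 0:
            found.append("num_inference_steps must be positive")
        return found
```

Printing the result of `problems()` before editing scripts/inference.sh catches typos in paths without loading the model.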
📊 HDR28K

📏 Evaluation
Coming soon ...
💙 Acknowledgement
⛔️ Copyright
- This repository can only be used for non-commercial research purposes.
- For commercial use, please contact Prof. Lianwen Jin (eelwjin@scut.edu.cn).
- Copyright 2024, Deep Learning and Vision Computing Lab (DLVC-Lab), South China University of Technology.
📇 Citation
@inproceedings{yang2024fontdiffuser,
title={Predicting the Original Appearance of Damaged Historical Documents},
author={Yang, Zhenhua and Peng, Dezhi and Shi, Yongxin and Zhang, Yuyi and Liu, Chongyu and Jin, Lianwen},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2025}
}
