HDR

[AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents


Predicting the Original Appearance of Damaged Historical Documents

<a href="#🖼️-Gallery">🖼️ Gallery </a> • <a href="#📊-HDR28K">📊 HDR28K </a> • <a href="#🔥-Model-Zoo">🔥 Model Zoo</a> • <a href="#🔥-Dataset-Zoo">🔥 Dataset Zoo</a> • <a href="#🚧-Installation">🚧 Installation</a> • <a href="#📺-Inference">📺 Inference</a> • <a href="#📏-Evaluation">📏 Evaluation</a>

🌟 Highlight


- We introduce the Historical Document Repair (HDR) task, which aims to predict the original appearance of damaged historical document images.
- We build a large-scale historical document repair dataset, HDR28K, which contains 28,552 damaged-repaired image pairs with character-level annotations and multi-style degradation.
- 🔥🔥🔥 We propose a Diffusion-based Historical Document Repair method (DiffHDR), which augments the DDPM framework with semantic and spatial information.

📰 News

- 2025.07.15: 🎉 We propose a novel historical document restoration method, AutoHDR. Welcome to try our demo!
- 2025.03.20: 🎉🎉 The Historical Document Repair dataset HDR28K is released!
- 2024.12.17: Release inference code.
- 2024.12.10: 🎉🎉 Our paper is accepted by AAAI2025.

🔥 Model Zoo

| Model | Checkpoint | Status |
|-------|------------|--------|
| DiffHDR | GoogleDrive / BaiduYun:x62f | Released |

🔥 Dataset Zoo

| Dataset | Download | Status |
|---------|----------|--------|
| HDR28K | BaiduYun:upm9 | Released |

The dataset is organized as follows:

- character_missing
  - test
    - char_mask_images
    - content_images
    - degraded_images
    - original_images
  - train
    - char_mask_images
    - content_images
    - degraded_images
    - original_images
- ink erosion
  - similar to 'character_missing'
- paper damage
  - similar to 'character_missing'
- test_image_only_damage
  - hole_M5_image_2000_32_467_544_979_degrade0.png
  - ......
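As a minimal sketch of how this layout could be traversed, the snippet below collects the four images that belong to each sample. It assumes that the four sub-directories of a split use identical file names for the same sample, which the README does not state; `list_samples` and `SUBDIRS` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Sub-directory names taken from the tree above.
SUBDIRS = ("degraded_images", "original_images", "char_mask_images", "content_images")


def list_samples(root, degradation="character_missing", split="train"):
    """Return one dict per sample, mapping each sub-directory name to a file path."""
    base = Path(root) / degradation / split
    samples = []
    for degraded in sorted((base / "degraded_images").iterdir()):
        if not degraded.is_file():
            continue
        # Assumption: the same file name identifies one sample in all four folders.
        sample = {name: base / name / degraded.name for name in SUBDIRS}
        missing = [name for name, path in sample.items() if not path.is_file()]
        if missing:
            raise FileNotFoundError(f"{degraded.name} has no counterpart in {missing}")
        samples.append(sample)
    return samples
```

For example, `list_samples("HDR28K", "character_missing", "test")` would list the test pairs of the character-missing subset, where `"HDR28K"` stands for wherever the archive was extracted.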

NOTE: The `test_image_only_damage` directory contains the ground-truth (gt) images obtained by replacing the non-damaged region of $x_r$ with the target $x_{target}$.
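The replacement described in this note can be read as a masked composite. The sketch below illustrates that reading on tiny nested lists. The symbols `x_r` and `x_target` come from the note; the binary `damage_mask`, its polarity (1 marks a damaged pixel), and the function name `composite` are assumptions made for illustration and may differ from how the released images were produced.

```python
def composite(x_r, x_target, damage_mask):
    """Keep x_r inside the damaged region and x_target everywhere else.

    All three arguments are equally sized 2-D lists. damage_mask holds 1 for
    damaged pixels and 0 otherwise (an assumed convention).
    """
    return [
        [r if m else t for r, t, m in zip(row_r, row_t, row_m)]
        for row_r, row_t, row_m in zip(x_r, x_target, damage_mask)
    ]


# Toy 2x2 "images" with made-up pixel values.
x_r = [[10, 20], [30, 40]]
x_target = [[1, 2], [3, 4]]
damage_mask = [[1, 0], [0, 1]]

result = composite(x_r, x_target, damage_mask)
print(result)  # [[10, 2], [3, 40]]
```

With an array library the same operation is the element-wise expression `mask * x_r + (1 - mask) * x_target`.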

🚧 Installation

Prerequisites (Recommended)

- Linux
- Python 3.9
- PyTorch 1.13.1
- CUDA 11.7

Environment Setup

Clone this repo:

git clone https://github.com/yeungchenwa/HDR.git

Step 0: Download and install Miniconda from the official website.

Step 1: Create a conda environment and activate it.

conda create -n diffhdr python=3.9 -y
conda activate diffhdr

Step 2: Install the PyTorch build that matches your CUDA version, following the official PyTorch installation instructions.

# Suggested
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

Step 3: Install the required packages.

pip install -r requirements.txt
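After the steps above, a quick check can confirm that the environment matches the recommended prerequisites. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper that reports the Python version and, if PyTorch is installed, its version, the CUDA version it was built against, and whether a GPU is visible.

```python
import importlib.util
import sys


def check_environment():
    """Collect the interpreter and PyTorch details relevant to the prerequisites."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment.

    import torch

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

With the suggested install command, the report would be expected to show Python 3.9, a `1.13.1+cu117` PyTorch build, and CUDA 11.7; other combinations may still work but are not the recommended setup.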

📺 Inference

Use DiffHDR to repair damaged historical documents (example damaged images, mask images, and content images are provided in /examples):

sh scripts/inference.sh

- `device`: The device used for inference, CUDA or CPU.
- `image_path`: The path to the damaged image.
- `mask_image_path`: The path to the mask image.
- `content_image_path`: The path to the content image.
- `save_dir`: The directory in which the repaired image is saved.
- `content_mask_guidance_scale`: The guidance scale for the content image and mask image.
- `degraded_guidance_scale`: The guidance scale for the damaged image.
- `ckpt_path`: The path to the UNet checkpoint.
- `num_inference_steps`: The number of inference steps.
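To keep the parameters above together when scripting several runs, they could be grouped in a small configuration object. This is only a sketch: the field names mirror the parameter list, but the default values, the example file names, and the assumption that each parameter maps to a `--name value` command-line flag are all illustrative and are not taken from `scripts/inference.sh`.

```python
from dataclasses import dataclass, fields


@dataclass
class InferenceConfig:
    # Field names follow the parameter list above; defaults are placeholders,
    # not the values used by the repository.
    image_path: str
    mask_image_path: str
    content_image_path: str
    ckpt_path: str
    save_dir: str = "outputs"
    device: str = "cuda"
    content_mask_guidance_scale: float = 1.0
    degraded_guidance_scale: float = 1.0
    num_inference_steps: int = 50

    def to_cli_args(self):
        """Render every field as a '--name value' pair (assumed flag format)."""
        args = []
        for field in fields(self):
            args += [f"--{field.name}", str(getattr(self, field.name))]
        return args


# Hypothetical file names, for illustration only.
config = InferenceConfig(
    image_path="examples/damaged.png",
    mask_image_path="examples/mask.png",
    content_image_path="examples/content.png",
    ckpt_path="ckpt/unet.pth",
)
print(" ".join(config.to_cli_args()))
```

Check the actual argument names in `scripts/inference.sh` before passing the generated list to any script.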

📊 HDR28K

HDR28K

📏 Evaluation

Coming soon ...

💙 Acknowledgement

diffusers

⛔️ Copyright

This repository can only be used for non-commercial research purposes.
For commercial use, please contact Prof. Lianwen Jin (eelwjin@scut.edu.cn).
Copyright 2024, Deep Learning and Vision Computing Lab (DLVC-Lab), South China University of Technology.

📇 Citation

@inproceedings{yang2024fontdiffuser,
  title={Predicting the Original Appearance of Damaged Historical Documents},
  author={Yang, Zhenhua and Peng, Dezhi and Shi, Yongxin and Zhang, Yuyi and Liu, Chongyu and Jin, Lianwen},
  booktitle={Proceedings of the AAAI conference on artificial intelligence},
  year={2025}
}

