AutoHDR
[ACL 2025 main] The official GitHub page of "Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration"
Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration
</div>
## Important Note

The original data of the dataset is sourced from public channels such as the Internet, and its copyright remains with the original providers. The collated and annotated dataset presented here is for non-commercial use only and is currently licensed to universities and research institutions. To apply for the use of this dataset, please fill in the corresponding application form in accordance with the requirements specified on the dataset's official website. The applicant must be a full-time employee of a university or research institute and is required to sign the application form. To facilitate review, it is recommended to affix an official seal (a seal of a secondary-level department is acceptable).
## 🌟 Highlights
- We propose a novel fully Automated solution for HDR (AutoHDR), inspired by mirroring the workflow of expert historians.
- We introduce a pioneer Full-Page HDR dataset (FPHDR), which supports comprehensive HDR model training and evaluation.
- Extensive experiments demonstrate the superior performance of our method on both text and appearance restoration.
- The modular design enables flexible adjustments, allowing AutoHDR to collaborate effectively with historians.
## 📅 News

- 2025.07.21: 📢 Released the FPHDR dataset!
- 2025.07.17: 🚀 The pretrained model has been released!
- 2025.07.13: 🔥🎉 The 💻 demo is now live! Welcome to try it out!
- 2025.07.09: Released the inference code.
- 2025.07.08: Our paper is now available on arXiv.
- 2025.05.15: 🎉🎉 Our paper was accepted to ACL 2025 (main conference).
## 🚧 TODO List
- [x] Release inference code
- [x] Release pretrained model
- [x] Release a WebUI
- [x] Release dataset
- [ ] Upload pretrained model to Hugging Face
## 🔥 Model Zoo

| Model | Checkpoint | Status |
|-------|------------|--------|
| AutoHDR-Qwen2-1.5B | BaiduYun:W2wq | Released |
| AutoHDR-Qwen2-7B | BaiduYun:6o84 | Released |
| DiffHDR | BaiduYun:63a3 | Released |
| Damage Localization Model | BaiduYun:2QC7 | Released |
| OCR Model | BaiduYun:1X88 | Released |
## 🔥 FPHDR Dataset

| Dataset | Link | Status |
|---------|------|--------|
| Real data | BaiduYun:ryk3 | Released |
| Synthetic data | BaiduYun:m8yn | Released |
Note:
- The FPHDR dataset can only be used for non-commercial research purposes. Scholars or organizations who want to use the FPHDR dataset should first fill in this Application Form, sign the Legal Commitment, and email both to us (eelwjin@scut.edu.cn, cc: yuyi.zhang11@foxmail.com). When submitting the application form, please list or attach 1-2 of your publications from the recent 6 years to indicate that you (or your team) do research in related fields such as OCR, historical document analysis and restoration, or document image processing.
- We will give you the decompression password after your application has been received and approved.
- All users must follow all use conditions; otherwise, the authorization will be revoked.
<details>
<summary><b>Dataset Structure</b></summary>

```
images/
├── FS_2_2_1.jpg
├── FS_2_9_1.jpg
├── ...
labels/
├── FS_2_2_1.json
├── FS_2_9_1.json
├── ...
```
</details>
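After downloading and decompressing the dataset, each image in `images/` shares its filename stem with a JSON file in `labels/`. A minimal sketch of pairing the two, assuming only the directory layout shown above (`pair_samples` is a hypothetical helper, not part of this repo):

```python
from pathlib import Path

def pair_samples(root):
    """Match each image in images/ with its JSON label in labels/ by filename stem."""
    root = Path(root)
    # Index label files by stem, e.g. "FS_2_2_1" -> labels/FS_2_2_1.json
    labels = {p.stem: p for p in (root / "labels").glob("*.json")}
    pairs = []
    for img in sorted((root / "images").glob("*.jpg")):
        label = labels.get(img.stem)
        if label is not None:  # skip images without annotations
            pairs.append((img, label))
    return pairs
```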
<details>
<summary><b>Label Annotation Format</b></summary>

```json
{
  "columns": [
    {
      "x": ...,
      "y": ...,
      "w": ...,
      "h": ...,
      "column_id": "...",
      "idx": ...
    },
    ...
  ],
  "chars": [
    {
      "x": ...,
      "y": ...,
      "w": ...,
      "h": ...,
      "txt": "...",
      "cid": ...,
      "char_id": "...",
      "idx": ...,
      "grade": "light|medium|severe|null"
    },
    ...
  ]
}
```
- `columns`: Column bounding boxes (`x`, `y`, `w`, `h`)
- `chars`: Character annotations (`txt`, `x`, `y`, `w`, `h`, `grade`)
- `grade`: Damage level (`light`, `medium`, `severe`, or `null` for undamaged characters)
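The annotation format above can be consumed with a few lines of standard-library Python. A minimal sketch, assuming only the field names shown in the schema (`count_damage` is an illustrative helper, not part of this repo):

```python
import json
from collections import Counter

def count_damage(label_path):
    """Tally character damage grades in one FPHDR label file.

    Characters whose grade is null are counted as "undamaged".
    """
    with open(label_path, encoding="utf-8") as f:
        label = json.load(f)
    counts = Counter()
    for char in label.get("chars", []):
        # json.load turns JSON null into Python None
        counts[char.get("grade") or "undamaged"] += 1
    return counts
```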
## 🚧 Installation

### Prerequisites

- Linux (tested on Ubuntu 20.04)
- Python 3.10
- PyTorch 2.3.0
- CUDA 11.8
### Environment Setup

Clone this repo:

```bash
git clone https://github.com/SCUT-DLVCLab/AutoHDR.git
cd AutoHDR
```

Step 0: Download and install Miniconda from the official website.

Step 1: Create a conda environment and activate it.

```bash
conda create -n autohdr python=3.10 -y
conda activate autohdr
```

Step 2: Install the required packages.

```bash
pip install -r requirements.txt
```
## 📺 Inference

Step 0: Download all model files (except the OCR model) from the Model Zoo and put them in the `ckpt` folder.

Step 1: Download the OCR model files from the Model Zoo, unzip the package, and move the extracted files into the `dist` folder.

Step 2: Run AutoHDR to restore damaged historical documents:

```bash
CUDA_VISIBLE_DEVICES=<gpu_id> python infer_pipeline.py
```
## 🚀 Run the WebUI

We provide two convenient ways to run the WebUI demo:

(1) Visit our deployed online demo directly: demo

(2) Run the demo locally:

```bash
CUDA_VISIBLE_DEVICES=<gpu_id> python demo_gradio.py
```

Example:

## ☎️ Contact

If you have any questions, feel free to contact Yuyi Zhang at yuyi.zhang11@foxmail.com.
## 🌄 Gallery

## 💙 Acknowledgement
## 📜 License

The code and dataset are released under the CC BY-NC-ND 4.0 license for non-commercial research purposes.
## ⛔️ Copyright
- This repository can only be used for non-commercial research purposes.
- For commercial use, please contact Prof. Lianwen Jin (eelwjin@scut.edu.cn).
- Copyright 2025, Deep Learning and Vision Computing Lab (DLVC-Lab), South China University of Technology.
## ✒️ Citation

If you find AutoHDR helpful, please consider giving this repo a ⭐ and citing:

```bibtex
@inproceedings{Zhang2025autohdr,
  title={Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration},
  author={Yuyi Zhang and Peirong Zhang and Zhenhua Yang and Pengyu Yan and Yongxin Shi and Pengwei Liu and Fengjun Guo and Lianwen Jin},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics},
  year={2025},
}
```
Thanks for your support!