# SRStitcher
Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model
Paper
## Requirements

- Python >= 3.9
- GPU (NVIDIA CUDA compatible)

Create a virtual environment (optional but recommended):

```shell
conda create -n srstitcher python==3.9
conda activate srstitcher
```

Install the required dependencies:

```shell
pip install -r requirements.txt
```

Note: make sure that transformers==4.35.2 and diffusers==0.27.2 are installed; other versions may report errors.
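Since the pinned transformers and diffusers versions matter here, a quick sanity check can catch mismatches before a long run. This is a hypothetical helper, not part of the repository:

```python
from importlib import metadata

# Versions pinned above; other versions may report errors.
REQUIRED = {"transformers": "4.35.2", "diffusers": "0.27.2"}

def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    return {pkg: (want, installed.get(pkg))
            for pkg, want in required.items()
            if installed.get(pkg) != want}

def installed_versions(packages):
    """Look up installed versions, recording None for missing packages."""
    found = {}
    for pkg in packages:
        try:
            found[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            found[pkg] = None
    return found

if __name__ == "__main__":
    for pkg, (want, got) in find_mismatches(REQUIRED, installed_versions(REQUIRED)).items():
        print(f"{pkg}: expected {want}, found {got}")
```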
## Dataset

We provide an examples folder to reproduce Figure 2 of our paper.

The complete UDIS-D dataset can be obtained from UDIS. Aligned images and masks can be obtained with UDIS or UDIS++.

The dataset should be organized as follows:

```
dataset
├── warp1
│   ├── 000001.jpg
│   ├── ...
├── warp2
│   ├── 000001.jpg
│   ├── ...
├── mask1
│   ├── 000001.jpg
│   ├── ...
├── mask2
│   ├── 000001.jpg
│   ├── ...
```
## Usage

- Run the script to get the SRStitcher results of Figure 2:

  ```shell
  python run.py --config configs/inpaint_config.yaml
  ```

  See the results in the folder SRStitcherResults.

- Run the script to measure the CCS of a stitched image:

  ```shell
  python evaluation_ccs.py
  ```

- To get the SRStitcher results on UDIS-D, modify the datapath in configs/inpaint_config.yaml and run the same script.
## Variants

We provide implementations of three variants.

### SRStitcher-S

Implementation of SRStitcher based on stable-diffusion-2-1-base.

Run the script to get the SRStitcher-S results:

```shell
python run.py --config configs/SD2_config.yaml
```

### SRStitcher-U

Implementation of SRStitcher based on stable-diffusion-2-1-unclip-small.

Run the script to get the SRStitcher-U results:

```shell
python run.py --config configs/unclipSD2_config.yaml
```

### SRStitcher-C

Implementation of SRStitcher based on control_v11p_sd15_inpaint.

Run the script to get the SRStitcher-C results:

```shell
python run.py --config configs/controlnet_config.yaml
```
## Seed Robustness

The generation results of SD models are affected by `torch.manual_seed()`. We tested the stability of our method, as shown in the figure below.

<img src="examples.png" width="800px"/>

However, random-seed behavior is known to depend on the CUDA version, the PyTorch version, and even the hardware device; see the PyTorch docs on Reproducibility. Therefore, even with the same seed, your results may differ from ours, although the overall performance should be close to our reported results. If there is a large difference, please report your test environment in an Issue to help us improve the method. Thank you very much.
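To make runs as comparable as possible, the usual pattern is to seed every RNG before generation. A minimal sketch (assuming NumPy is installed; the PyTorch lines are skipped when torch is absent):

```python
import random

import numpy as np

def seed_everything(seed: int = 0) -> None:
    """Seed the Python, NumPy, and (when available) PyTorch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade speed for determinism in cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Even with identical seeds, results can still vary across CUDA versions, PyTorch versions, and GPUs, so treat seeding as a way to reduce variance rather than eliminate it.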
## Limitations of Current T2I Models

In the test results you may find that some image content outside the mask is damaged; Chinese characters are affected particularly badly. This is an inherent flaw of the SD2 model; see the limitations section of stable-diffusion-2-inpainting. SD3 has largely fixed this problem, especially the damage to English characters, so you can try reproducing our method on SD3-Controlnet-Inpainting. (This version is not open source due to further work, but it is not difficult to reproduce.)
## Citation

If you find our code or paper useful in your research, please consider citing our work with the following BibTeX:

```
@inproceedings{xie2024reconstructing,
  title={Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model},
  author={Xie, Ziqi and Zhao, Weidong and Liu, Xianhui and Zhao, Jian and Jia, Ning},
  booktitle={NeurIPS},
  year={2024}
}
```