# SRStitcher
Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model
Paper
## Requirements

- Python >= 3.9
- GPU (NVIDIA CUDA compatible)

Create a virtual environment (optional but recommended):

```shell
conda create -n srstitcher python==3.9
conda activate srstitcher
```

Install the required dependencies:

```shell
pip install -r requirements.txt
```

Note: make sure that transformers==4.35.2 and diffusers==0.27.2 are installed; other versions may report errors.
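Since the pinned transformers and diffusers versions matter here, a quick sanity check can catch mismatches before a long run. This is a hypothetical helper, not part of the repository:

```python
from importlib import metadata

# Versions pinned above; other versions may report errors.
REQUIRED = {"transformers": "4.35.2", "diffusers": "0.27.2"}

def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    return {pkg: (want, installed.get(pkg))
            for pkg, want in required.items()
            if installed.get(pkg) != want}

def installed_versions(packages):
    """Look up installed versions, recording None for missing packages."""
    found = {}
    for pkg in packages:
        try:
            found[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            found[pkg] = None
    return found

if __name__ == "__main__":
    for pkg, (want, got) in find_mismatches(REQUIRED, installed_versions(REQUIRED)).items():
        print(f"{pkg}: expected {want}, found {got}")
```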
## Dataset

We provide an examples folder to reproduce Figure 2 of our paper.

The complete UDIS-D dataset can be obtained from UDIS. Aligned images and masks can be obtained with UDIS or UDIS++.

The dataset should be organized as follows:

```
dataset
├── warp1
│   ├── 000001.jpg
│   ├── ...
├── warp2
│   ├── 000001.jpg
│   ├── ...
├── mask1
│   ├── 000001.jpg
│   ├── ...
├── mask2
│   ├── 000001.jpg
│   ├── ...
```
## Usage

- Run the script to get the SRStitcher results of Figure 2:

  ```shell
  python run.py --config configs/inpaint_config.yaml
  ```

  See the results in the folder SRStitcherResults.

- Run the script to measure the CCS of a stitched image:

  ```shell
  python evaluation_ccs.py
  ```

- To get the SRStitcher results on UDIS-D, modify the datapath in configs/inpaint_config.yaml and run the same script.
## Variants

We provide implementations of three variants.

### SRStitcher-S

Implementation of SRStitcher based on stable-diffusion-2-1-base.

Run the script to get the SRStitcher-S results:

```shell
python run.py --config configs/SD2_config.yaml
```

### SRStitcher-U

Implementation of SRStitcher based on stable-diffusion-2-1-unclip-small.

Run the script to get the SRStitcher-U results:

```shell
python run.py --config configs/unclipSD2_config.yaml
```

### SRStitcher-C

Implementation of SRStitcher based on control_v11p_sd15_inpaint.

Run the script to get the SRStitcher-C results:

```shell
python run.py --config configs/controlnet_config.yaml
```
## Seed Robustness

The generation results of SD models are affected by `torch.manual_seed()`. We tested the stability of our method, as shown in the figure below.

<img src="examples.png" width="800px"/>

However, random-seed behavior is known to depend on the CUDA version, the PyTorch version, and even the hardware device; see the PyTorch docs on Reproducibility. Therefore, even with the same seed, your results may differ from ours, although the overall performance should be close to our reported results. If there is a large difference, please report your test environment in an Issue to help us improve the method. Thank you very much.
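To make runs as comparable as possible, the usual pattern is to seed every RNG before generation. A minimal sketch (assuming NumPy is installed; the PyTorch lines are skipped when torch is absent):

```python
import random

import numpy as np

def seed_everything(seed: int = 0) -> None:
    """Seed the Python, NumPy, and (when available) PyTorch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade speed for determinism in cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Even with identical seeds, results can still vary across CUDA versions, PyTorch versions, and GPUs, so treat seeding as a way to reduce variance rather than eliminate it.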
## Limitations of Current T2I Models

In the test results you may find that some image content outside the mask is damaged; Chinese characters are affected particularly badly. This is an inherent flaw of the SD2 model; see the limitations section of stable-diffusion-2-inpainting. SD3 has largely fixed this problem, especially the damage to English characters, so you can try reproducing our method on SD3-Controlnet-Inpainting. (This version is not open source due to further work, but it is not difficult to reproduce.)
## Citation

If you find our code or paper useful in your research, please consider citing our work with the following BibTeX:

```
@inproceedings{xie2024reconstructing,
  title={Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model},
  author={Xie, Ziqi and Zhao, Weidong and Liu, Xianhui and Zhao, Jian and Jia, Ning},
  booktitle={NeurIPS},
  year={2024}
}
```