
RefLDMSeg

[AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models


Explore In-Context Segmentation via Latent Diffusion Models

<div align="center">

arXiv Project Website

</div> <div> <p align="center" style="font-size: larger;"> <strong>AAAI 2025</strong> </p> </div> <p align="center"> <img src="assets/teaser.png" width="95%"> </p>

Requirements

  1. Install torch==2.1.0.
  2. Install the pip packages with pip install -r requirements.txt, then install alpha_clip.
  3. Our model is based on Stable Diffusion. Download it and place it in datasets/pretrain. Place the alpha_clip checkpoints in datasets/pretrain/alpha-clip.
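
The setup steps above can be sanity-checked with a short Python sketch. This is not part of the repository: the helper names `torch_version_ok` and `missing_pretrain_dirs` are hypothetical, and the sketch only compares the installed torch version string and checks that the two directories named above exist. It does not validate the checkpoint files themselves.

```python
from importlib import metadata
from pathlib import Path


def torch_version_ok(installed: str, required: str = "2.1.0") -> bool:
    """Return True if the installed version matches, ignoring local build tags such as '+cu121'."""
    return installed.split("+", 1)[0] == required


def missing_pretrain_dirs(root: str = "datasets/pretrain") -> list[str]:
    """List the pretrain directories from the steps above that do not exist yet."""
    expected = [Path(root), Path(root) / "alpha-clip"]
    return [str(p) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        installed = ""  # torch is not installed at all
    print("torch==2.1.0 installed:", torch_version_ok(installed))
    print("missing directories:", missing_pretrain_dirs())
```

Run it from the repository root so that the relative path datasets/pretrain resolves correctly.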

Data Preparation

Please download the following datasets: COCO 2014, DAVIS16, VSPW, and PASCAL (which comprises PASCAL VOC 2012 and SBD). Then download the meta files. Place everything under datasets and arrange it as follows.

datasets
├── pascal
│   ├── JPEGImages
│   ├── SegmentationClassAug
│   └── metas
├── davis16
│   ├── JPEGImages
│   ├── Annotations
│   └── metas
├── vspw
│   ├── images
│   ├── masks
│   └── metas
└── coco20i
    ├── annotations
    │   ├── train2014
    │   └── val2014
    ├── metas
    ├── train2014
    └── val2014
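
Before training, it may help to confirm that the tree above is in place. The following is a minimal sketch, assuming the layout shown here is complete; `EXPECTED_LAYOUT` and `missing_dataset_dirs` are hypothetical names, not part of the repository, and the check only looks for directories, not for their contents.

```python
from pathlib import Path

# Expected sub-directories per dataset, copied from the tree above.
EXPECTED_LAYOUT = {
    "pascal": ["JPEGImages", "SegmentationClassAug", "metas"],
    "davis16": ["JPEGImages", "Annotations", "metas"],
    "vspw": ["images", "masks", "metas"],
    "coco20i": [
        "annotations/train2014",
        "annotations/val2014",
        "metas",
        "train2014",
        "val2014",
    ],
}


def missing_dataset_dirs(root: str = "datasets") -> list[str]:
    """Return the expected directories (relative to root) that are absent."""
    base = Path(root)
    missing = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        for sub in subdirs:
            if not (base / dataset / sub).is_dir():
                missing.append(f"{dataset}/{sub}")
    return missing


if __name__ == "__main__":
    for path in missing_dataset_dirs():
        print("missing:", path)
```

An empty result means every directory in the tree was found under datasets.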

Train

The code in scripts is launched with accelerate. The output path is set by the --output_dir argument defined in args.

# ldis1
accelerate launch --multi_gpu --num_processes [GPUS] scripts/modelf.py --config configs/cfg.py
# ldisn
accelerate launch --multi_gpu --num_processes [GPUS] scripts/modeln.py --config configs/cfg.py --mask_alpha 0.4

Inference

# ldis1
accelerate launch --multi_gpu --num_processes [GPUS] scripts/modelf.py --config configs/cfg.py --only_val 1 --val_dataset pascal --output_dir [the path of ckpt]
# ldisn
accelerate launch --multi_gpu --num_processes [GPUS] scripts/modeln.py --config configs/cfg.py --only_val 1 --val_dataset pascal --output_dir [the path of ckpt] --mask_alpha 0.4
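
The training and inference commands above differ only in a few flags, so they can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_launch_command` is a hypothetical helper, and it uses only the flags that appear in the commands above. It builds the argument list without running it.

```python
import shlex


def build_launch_command(script, num_gpus, config="configs/cfg.py",
                         mask_alpha=None, val_dataset=None, output_dir=None):
    """Assemble an accelerate launch command as an argument list.

    Passing val_dataset switches to inference by adding --only_val 1.
    """
    cmd = [
        "accelerate", "launch", "--multi_gpu",
        "--num_processes", str(num_gpus),
        script, "--config", config,
    ]
    if val_dataset is not None:
        cmd += ["--only_val", "1", "--val_dataset", val_dataset]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    if mask_alpha is not None:
        cmd += ["--mask_alpha", str(mask_alpha)]
    return cmd


if __name__ == "__main__":
    # Training command for ldisn on 4 GPUs (the GPU count is an example value).
    print(shlex.join(build_launch_command("scripts/modeln.py", 4, mask_alpha=0.4)))
    # Inference command for ldis1; "ckpt_dir" is a placeholder for the checkpoint path.
    print(shlex.join(build_launch_command(
        "scripts/modelf.py", 4, val_dataset="pascal", output_dir="ckpt_dir")))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues when the checkpoint path contains spaces.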

The pretrained models can be found here.

Citation

If you find our work useful, please kindly consider citing our paper:

@article{wang2024explore,
  title={Explore In-Context Segmentation via Latent Diffusion Models},
  author={Wang, Chaoyang and Li, Xiangtai and Ding, Henghui and Qi, Lu and Zhang, Jiangning and Tong, Yunhai and Loy, Chen Change and Yan, Shuicheng},
  journal={arXiv preprint arXiv:2403.09616},
  year={2024}
}

License

MIT license
