OQTR
Official implementation of Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query
Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query. TMM, 2022.
<div align=center> <img src="docs/OQTR.png" height=350 width=600> </div>

Official implementation of TMM 2022 "Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query"
Environment preparation
The code is tested with CUDA 10.1 and PyTorch 1.6.0. Adjust the versions in the commands below to match your own setup.
conda create -n oqtr python=3.8 -y
conda activate oqtr
git clone https://github.com/ssecv/OQTR
cd OQTR
conda install -c pytorch torchvision
pip install -r requirements.txt
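Since the tested versions are CUDA 10.1 and PyTorch 1.6.0, it can help to confirm what the new environment actually contains before training. The following is a minimal sketch, not part of the repository; the helper name `describe_env` is hypothetical, and it only reads version attributes that PyTorch exposes.

```python
import importlib
import platform


def describe_env():
    """Collect the Python, PyTorch, and CUDA versions visible to this interpreter."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "cuda_available": False,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info  # PyTorch is not installed in this environment
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

Run it inside the activated `oqtr` environment and compare the reported versions with the ones listed above.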
Data preparation
Revise the build_sis function in datasets/coco.py.
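The body of build_sis is not reproduced in this README. As a rough illustration of the kind of change involved, the sketch below assumes the function follows the common DETR-style pattern of mapping a split name to an image folder and an annotation file. The helper name `sis_paths` and every folder and file name in it are hypothetical; replace them with the names in your download.

```python
from pathlib import Path


def sis_paths(root, image_set):
    """Return (image_folder, annotation_file) for a split under a hypothetical layout."""
    root = Path(root)
    # Hypothetical folder and file names; they are not taken from the repository.
    layout = {
        "train": (root / "train", root / "annotations" / "train.json"),
        "val": (root / "val", root / "annotations" / "val.json"),
    }
    if image_set not in layout:
        raise ValueError(f"unknown split: {image_set!r}")
    return layout[image_set]


if __name__ == "__main__":
    img_folder, ann_file = sis_paths("data/SIS10K", "val")
    print(img_folder, ann_file)
```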
Download the SIS10K dataset
- SIS10K
- Baidu Disk (verification code: hust)
- Google Disk
JSON files: Baidu (verification code: hust) / Google
Run model
run demo
python visualize.py --input {INPUT_IMG} --output_dir {OUTPUT_DIR} --resume {WEIGHT_PATH}
- {INPUT_IMG}: path to the input image
- {OUTPUT_DIR}: directory where the output is written
- {WEIGHT_PATH}: path to the model weights
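The demo command takes one image at a time. To process a whole folder, a small wrapper can call it once per image. This is a sketch under assumptions: it uses only the three flags listed above, assumes it is run from the repository root, and the helper names and the suffix list are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: adjust to your image types


def build_demo_commands(input_dir, output_dir, weight_path):
    """Build one visualize.py command per image found directly in input_dir."""
    commands = []
    for image in sorted(Path(input_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            commands.append([
                sys.executable, "visualize.py",
                "--input", str(image),
                "--output_dir", str(output_dir),
                "--resume", str(weight_path),
            ])
    return commands


def run_demo(input_dir, output_dir, weight_path):
    """Run the demo over every image; stops at the first failing command."""
    for command in build_demo_commands(input_dir, output_dir, weight_path):
        subprocess.run(command, check=True)
```

Building the command lists separately from running them makes it easy to print and inspect them before launching anything.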
run training
python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py \
--masks --dataset_file sis \
--epochs {EPOCHS} --lr_drop {DROP} --num_queries {NUM_QUERIES} --lr {LR} --batch_size {BATCH_SIZE} \
--coco_path {PATH_TO_COCO} \
--frozen_weights {PRETRAIN_PATH} \
--output_dir {OUTPUT_DIR} \
--saliency_query
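The interaction between {LR}, {DROP}, and {EPOCHS} is easier to see with a small calculation. The sketch below assumes OQTR keeps the step schedule used by DETR, on which this project is based, where the learning rate is multiplied by 0.1 every `lr_drop` epochs. Both that assumption and the helper name `lr_at_epoch` are not confirmed by this README, so check main.py before relying on it.

```python
def lr_at_epoch(base_lr, lr_drop, epoch, gamma=0.1):
    """Learning rate under a step schedule: multiply by gamma every lr_drop epochs."""
    if lr_drop <= 0:
        raise ValueError("lr_drop must be a positive number of epochs")
    return base_lr * gamma ** (epoch // lr_drop)


if __name__ == "__main__":
    # Illustrative values only; they are not the settings used in the paper.
    for epoch in (0, 199, 200, 299):
        print(epoch, lr_at_epoch(1e-4, 200, epoch))
```

Under this assumption, choosing {DROP} greater than or equal to {EPOCHS} means the learning rate never decays during the run.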
run evaluation
python eval.py --no_aux_loss --masks --coco_path {PATH_TO_COCO} \
--dataset_file sis --saliency_query --resume {WEIGHT_PATH}
Replace {PATH_TO_COCO} with the directory of your COCO-style dataset and {WEIGHT_PATH} with the path to the model weights.
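Before pointing the training or evaluation command at a dataset, it can be worth checking that the annotation file really is COCO-style. The sketch below is an illustrative check, not part of the repository. It relies only on the standard COCO layout (top-level `images`, `annotations`, and `categories` lists, with each annotation carrying `id`, `image_id`, and `segmentation`), and the function names are hypothetical.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_annotations(data):
    """Return a list of problems found in a COCO-style annotation dict."""
    problems = [f"missing top-level key: {key}" for key in REQUIRED_KEYS if key not in data]
    if problems:
        return problems
    image_ids = {image["id"] for image in data["images"]}
    for ann in data["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']} refers to unknown image {ann['image_id']}")
        if "segmentation" not in ann:
            problems.append(f"annotation {ann['id']} has no segmentation")
    return problems


def check_coco_file(path):
    """Load a JSON annotation file and run the checks above on it."""
    with open(path, "r", encoding="utf-8") as handle:
        return check_coco_annotations(json.load(handle))
```

An empty list means the file passed these basic checks; it does not guarantee that the masks themselves are valid.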
Resources
- OQTR-R50
Citation
@article{pei2022oqtr,
title={Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query},
author={Pei, Jialun and Cheng, Tianyang and Tang, He and Chen, Chuanbo},
journal={IEEE Transactions on Multimedia},
year={2022},
publisher={IEEE}
}
Acknowledgement
This project is based on DETR and CPD. We thank the authors for their great work!
