OQTR

Official implementation of Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query

Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query. IEEE Transactions on Multimedia (TMM), 2022.

Environment preparation

The code is tested with CUDA 10.1 and PyTorch 1.6.0. Adjust the versions in the commands below to match your own setup.

conda create -n oqtr python=3.8 -y
conda activate oqtr
git clone https://github.com/ssecv/OQTR
cd OQTR
conda install -c pytorch torchvision
pip install -r requirements.txt
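After installation, it can help to confirm which PyTorch and CUDA builds ended up in the environment. The following is a minimal sketch, not part of the repository: the helper names `version_tuple` and `describe_environment` are hypothetical, and the comparison against 1.6.0 only reflects the tested version stated above.

```python
import importlib

# Version this README reports as tested; other versions may still work.
TESTED_TORCH = (1, 6, 0)


def version_tuple(version):
    """Turn a version string such as '1.6.0+cu101' into (1, 6, 0)."""
    core = version.split("+")[0]  # drop a local build tag such as '+cu101'
    parts = []
    for piece in core.split(".")[:3]:
        if not piece.isdigit():
            break  # stop at pre-release fragments such as '0a0'
        parts.append(int(piece))
    return tuple(parts)


def describe_environment():
    """Return a one-line summary of the installed PyTorch build."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return "PyTorch is not installed in this environment."
    installed = version_tuple(torch.__version__)
    note = "matches" if installed == TESTED_TORCH else "differs from"
    return (
        f"PyTorch {torch.__version__} ({note} the tested 1.6.0), "
        f"CUDA build: {torch.version.cuda}, "
        f"GPU visible: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_environment())
```

Run it inside the activated oqtr environment. A CUDA build of None means a CPU-only PyTorch was installed.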

Data preparation

Revise the build_sis function in datasets/coco.py so that it matches the location of your local copy of the dataset.

Download the SIS10K dataset

SIS10K
- Baidu Disk (verification code: hust)
- Google Drive

JSON files: Baidu Disk (verification code: hust) / Google Drive
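The training and evaluation commands expect a COCO-style dataset, so a quick sanity check of the downloaded JSON files can catch path or download problems early. The sketch below assumes only the standard COCO layout (top-level `images`, `annotations`, and `categories` lists). The function names are hypothetical and the toy data is made up for illustration.

```python
import json
from collections import Counter

# Top-level keys of a standard COCO-style instance annotation file.
REQUIRED_KEYS = ("images", "annotations", "categories")


def summarize_coco(data):
    """Check a loaded COCO-style dict and return basic counts."""
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")
    image_ids = {img["id"] for img in data["images"]}
    per_image = Counter(ann["image_id"] for ann in data["annotations"])
    return {
        "num_images": len(image_ids),
        "num_instances": len(data["annotations"]),
        # images that have no annotated instance at all
        "images_without_instances": len(image_ids - set(per_image)),
        # annotations that point at an image id that does not exist
        "orphan_image_ids": sorted(set(per_image) - image_ids),
    }


def summarize_coco_file(path):
    """Load a JSON annotation file from disk and summarize it."""
    with open(path, "r", encoding="utf-8") as handle:
        return summarize_coco(json.load(handle))


if __name__ == "__main__":
    # Tiny made-up example; replace with summarize_coco_file("<your json>").
    toy = {
        "images": [{"id": 1}, {"id": 2}],
        "annotations": [{"image_id": 1}, {"image_id": 1}],
        "categories": [{"id": 1, "name": "salient"}],
    }
    print(summarize_coco(toy))
```

A non-empty `orphan_image_ids` list usually means the JSON file and the image set come from different splits.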

Run model

Run demo

python visualize.py --input {INPUT_IMG} --output_dir {OUTPUT_DIR} --resume {WEIGHT_PATH}

{INPUT_IMG}: path to the input image
{OUTPUT_DIR}: directory where the output is written
{WEIGHT_PATH}: path to the model weights
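The demo command takes a single image. To process a whole folder, one option is a small wrapper that calls visualize.py once per file. This is a sketch under assumptions: it uses only the three flags shown above, the helper names and the list of image suffixes are hypothetical, and it must be run from the repository root so that visualize.py is found.

```python
import subprocess
import sys
from pathlib import Path

# Assumed set of image extensions; extend it if your data uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_demo_command(image, output_dir, weights):
    """Build the visualize.py call for one image as an argument list."""
    return [
        sys.executable, "visualize.py",
        "--input", str(image),
        "--output_dir", str(output_dir),
        "--resume", str(weights),
    ]


def run_demo_on_folder(image_dir, output_dir, weights, dry_run=True):
    """Call visualize.py for every image in image_dir.

    With dry_run=True the commands are only collected and returned,
    which is useful for checking paths before launching anything.
    """
    commands = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        command = build_demo_command(image, output_dir, weights)
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)  # stop on the first failure
    return commands
```

Each call reloads the model weights, so this is convenient rather than fast. For large folders, looping inside a single Python process would avoid the repeated loading.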

Run training

python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py \
        --masks --dataset_file sis \
        --epochs {EPOCHS} --lr_drop {DROP} --num_queries {NUM_QUERIES} --lr {LR} --batch_size {BATCH_SIZE} \
        --coco_path {PATH_TO_COCO} \
        --frozen_weights {PRETRAIN_PATH} \
        --output_dir {OUTPUT_DIR} \
        --saliency_query
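When choosing {EPOCHS}, {DROP}, and {LR}, it helps to see how they interact. The sketch below assumes that --lr_drop follows the DETR convention, where a step scheduler multiplies the learning rate by 0.1 every lr_drop epochs. This README does not confirm that OQTR keeps that behaviour, so check main.py before relying on it. The function name and the example values are hypothetical.

```python
def lr_at_epoch(base_lr, lr_drop, epoch, gamma=0.1):
    """Learning rate at a given epoch under a step schedule.

    Assumption: the rate is multiplied by `gamma` once every `lr_drop`
    epochs, as in DETR's StepLR setup. Epochs are counted from 0.
    """
    return base_lr * gamma ** (epoch // lr_drop)


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the schedule.
    base_lr, lr_drop, epochs = 1e-4, 200, 300
    for epoch in (0, lr_drop - 1, lr_drop, epochs - 1):
        print(f"epoch {epoch}: lr = {lr_at_epoch(base_lr, lr_drop, epoch):.2e}")
```

Under this assumption, setting {DROP} larger than {EPOCHS} keeps the learning rate constant for the whole run.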

Run evaluation

python eval.py --no_aux_loss --masks --coco_path {PATH_TO_COCO} \
  --dataset_file sis --saliency_query --resume {WEIGHT_PATH}

Replace {PATH_TO_COCO} with the directory of your COCO-style dataset and {WEIGHT_PATH} with the path to the model weights.

Resources

OQTR-R50

Citation

@article{pei2022oqtr,
  title={Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query},
  author={Pei, Jialun and Cheng, Tianyang and Tang, He and Chen, Chuanbo},
  journal={IEEE Transactions on Multimedia},
  year={2022},
  publisher={IEEE}
}

Acknowledgement

This project is based on DETR and CPD. Thanks to their authors for their great work!
