USIS10K & USIS-SAM


This repository is the official implementation of "Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset".

If you find this project useful, please give us a star ⭐️ or cite us in your paper. This is the greatest support and encouragement for us.

:speech_balloon: Updates

🚩 News (2024.05) This paper has been accepted at ICML 2024, receiving an average rating of 6 with a confidence of 4.25.

:rocket: Highlights:

USIS10K dataset: We construct the first large-scale USIS10K dataset for the underwater salient instance segmentation task, which contains 10,632 images and pixel-level annotations of 7 categories. As far as we know, this is the largest salient instance segmentation dataset available that simultaneously includes Class-Agnostic and Multi-Class labels.
SOTA performance: We make the first attempt to apply SAM to underwater salient instance segmentation and propose USIS-SAM, which aims to improve segmentation accuracy in complex underwater scenes. Extensive experiments under public evaluation criteria verify the effectiveness of our USIS10K dataset and USIS-SAM model.

Installation

Requirements

Python 3.7+
PyTorch 2.0+ (we use PyTorch 2.1.2)
CUDA 12.1 or another version compatible with your PyTorch build
mmengine
mmcv>=2.0.0
transformers <= 4.50.3 (for newer versions, refer to this issue and change the code accordingly)
MMDetection 3.0+

Environment Installation

<details> <summary>Install on Environment</summary> <br/>

Step 0: Download and install Miniconda from the official website.

Step 1: Create a conda environment and activate it.

conda create -n usis python=3.9 -y
conda activate usis

Step 2: Install PyTorch. If you have experience with PyTorch and have already installed it, you can skip to the next section.

Step 3: Install MMEngine, MMCV, and MMDetection using MIM.

pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install mmdet

Step 4: Install other dependencies from requirements.txt

pip install -r requirements.txt

</details>
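After installation, it can help to confirm that the installed packages match the requirements listed above. The following is a minimal sketch, not part of this repository: it assumes the usual PyPI distribution names (`torch`, `mmengine`, `mmcv`, `mmdet`, `transformers`) and needs Python 3.8+ for `importlib.metadata`. The helper names are hypothetical.

```python
from importlib import metadata

# Version bounds copied from the requirements list; the distribution names
# are assumed to be the usual PyPI names.
MINIMUM_VERSIONS = {
    "torch": (2, 0),
    "mmengine": (0,),  # no minimum is stated, so any installed version passes
    "mmcv": (2, 0, 0),
    "mmdet": (3, 0),
}
MAXIMUM_VERSIONS = {"transformers": (4, 50, 3)}


def parse_version(text):
    """Turn '2.1.2+cu121' into (2, 1, 2), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(found, minimum):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(found) >= pad(minimum)


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Return {package: (installed_version_or_None, requirement_met)}."""
    report = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        found = installed_version(name)
        report[name] = (found, found is not None and at_least(parse_version(found), minimum))
    for name, maximum in MAXIMUM_VERSIONS.items():
        found = installed_version(name)
        report[name] = (found, found is not None and at_least(maximum, parse_version(found)))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        print(f"{name:<14}{found or 'not installed':<18}{'OK' if ok else 'CHECK'}")
```

A package flagged `CHECK` is either missing or outside the listed range; the script only reads package metadata and does not import the packages themselves.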

Datasets

Please create a data folder in your working directory and put USIS10K in it for training or testing. Alternatively, you can change the dataset path in the config file. If you want to use other datasets, you can refer to the MMDetection documentation to prepare them.

data
  ├── USIS10K
  │   ├── foreground_annotations
  │   │   ├── foreground_train_annotations.json
  │   │   ├── foreground_val_annotations.json
  │   │   ├── foreground_test_annotations.json
  │   ├── multi_class_annotations
  │   │   ├── multi_class_train_annotations.json
  │   │   ├── multi_class_val_annotations.json
  │   │   ├── multi_class_test_annotations.json
  │   ├── train
  │   │   ├── train_00001.jpg
  │   │   ├── ...
  │   ├── val
  │   │   ├── val_00001.jpg
  │   │   ├── ...
  │   ├── test
  │   │   ├── test_00001.jpg
  │   │   ├── ...

You can get our USIS10K dataset from Baidu Disk (pwd:icml) or Google Drive.
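Before training, you may want to confirm that the extracted dataset matches the directory tree shown above. The sketch below is one way to do that; it is not a script shipped with this repository, and the function names are hypothetical. It only checks that the six annotation files and three image folders exist, not their contents.

```python
from pathlib import Path

# Names taken from the directory tree shown above.
ANNOTATION_GROUPS = ("foreground", "multi_class")
SPLITS = ("train", "val", "test")


def expected_paths(root):
    """List every annotation file and image folder from the directory tree."""
    root = Path(root)
    paths = []
    for group in ANNOTATION_GROUPS:
        for split in SPLITS:
            paths.append(root / f"{group}_annotations" / f"{group}_{split}_annotations.json")
    paths.extend(root / split for split in SPLITS)
    return paths


def missing_paths(root="data/USIS10K"):
    """Return the expected paths that do not exist under root."""
    return [path for path in expected_paths(root) if not path.exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing dataset entries:")
        for path in missing:
            print(f"  {path}")
    else:
        print("Dataset layout looks complete.")
```

If you keep the dataset somewhere else, pass that location to `missing_paths` instead of the default `data/USIS10K`.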

Model Zoo

| Model | Test | Epoch | mAP | AP50 | AP75 | config | download |
|:---:|:--:|:-------:|:----------:|:---------:|:--------:|:--------:|:----------:|
| USIS-SAM | Class-Agnostic | 24 | 64.3 | 84.9 | 74.0 | config | Baidu (pwd:usis) / Google |
| USIS-SAM | Multi-Class | 24 | 43.9 | 59.6 | 50.0 | config | Baidu (pwd:usis) / Google |

Note: We optimized the code and data augmentation strategy of USIS-SAM without substantial changes to make it more efficient for training and inference, so the results here are slightly higher than the experiments in the paper.

Model Training

Download SAM model weights from Hugging Face

We provide a simple script to download the model weights from Hugging Face. You can also download the weights from another source.

cd pretrain
bash download_huggingface.sh facebook/sam-vit-huge sam-vit-huge
cd ..

After downloading, please modify the model weight path in the config file.
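If you prefer to fetch the weights from Python, the sketch below shows one possible alternative to the shell script. It rests on several assumptions that are not confirmed by this repository: that the target folder is `pretrain/sam-vit-huge`, that a usable checkpoint folder contains `config.json` plus either `pytorch_model.bin` or `model.safetensors`, and that the optional `huggingface_hub` package is installed. The function names are hypothetical.

```python
from pathlib import Path

# Assumed weight file names for a Hugging Face style checkpoint folder.
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def sam_weights_ready(folder="pretrain/sam-vit-huge"):
    """Return True if the folder holds a config file and at least one weight file."""
    folder = Path(folder)
    has_config = (folder / "config.json").is_file()
    has_weights = any((folder / name).is_file() for name in WEIGHT_FILES)
    return has_config and has_weights


def download_sam(folder="pretrain/sam-vit-huge", repo_id="facebook/sam-vit-huge"):
    """Download the checkpoint with huggingface_hub unless it is already present."""
    if sam_weights_ready(folder):
        return Path(folder)
    # Imported lazily so the readiness check works without huggingface_hub.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, local_dir=str(folder))
    return Path(folder)


if __name__ == "__main__":
    print(f"SAM weights available in: {download_sam()}")
```

Whichever way you download the weights, the path in the config file still has to point at the resulting folder.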

Training

You can use the following command for single-card training.

python tools/train.py project/our/configs/multiclass_usis_train.py

Or you can use the following command for multi-card training, replacing nums_gpu with the number of GPUs to use.

bash tools/dist_train.sh project/our/configs/multiclass_usis_train.py nums_gpu

For more ways to train or test, please refer to the MMDetection User Guides. We include MMDetection's tools and test toolkits in this code base.
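The two training commands above differ only in the launcher and the GPU count. As a small illustration, the hypothetical helper below (not part of this repository) picks the right command for a given number of GPUs; it assumes you run it from the repository root so that the `tools/` paths resolve.

```python
import subprocess

# Config path taken from the training commands above.
CONFIG = "project/our/configs/multiclass_usis_train.py"


def build_train_command(config, num_gpus=1):
    """Return the single-card or multi-card training command as an argument list."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        return ["python", "tools/train.py", config]
    # The last argument is the value that replaces nums_gpu in the README command.
    return ["bash", "tools/dist_train.sh", config, str(num_gpus)]


def launch_training(config=CONFIG, num_gpus=1):
    """Run the training command and raise if it exits with an error."""
    subprocess.run(build_train_command(config, num_gpus), check=True)
```

For example, `launch_training(num_gpus=2)` would run the multi-card script with two GPUs, while the default call falls back to single-card training.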

Visualization

If you want to visualize the inference results, you can use vis_infer.py. This file was provided by Kaiying Han; thanks for his contribution to this project!

Citation

If you find our repo or USIS10K dataset useful for your research, please cite us:

@inproceedings{lian2024diving,
  title     = {Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset},
  author    = {Lian, Shijie and Zhang, Ziyi and Li, Hua and Li, Wenjie and Yang, Laurence Tianruo and Kwong, Sam and Cong, Runmin},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {29545--29559},
  year      = {2024},
  url       = {https://proceedings.mlr.press/v235/lian24c.html},
}

Acknowledgement

This repository is implemented based on the MMDetection framework and Segment Anything Model. In addition, we referenced some of the code in the RSPrompter repository. Thanks to them for their excellent work.
