COSeg
[CVPR 2024] This repo contains the code for our paper: Rethinking Few-shot 3D Point Cloud Semantic Segmentation
[CVPR 2024] Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An, Guolei Sun<sup>†</sup>, Yun Liu<sup>†</sup>, Fayao Liu, Zongwei Wu, Dan Wang, Luc Van Gool, Serge Belongie
Welcome to the official PyTorch implementation repository of our paper Rethinking Few-shot 3D Point Cloud Semantic Segmentation, accepted to CVPR 2024 [Paper] [Bibtex].
News :triangular_flag_on_post:
- 2025/04/11: 🔥 GFS-VL code is now publicly available! We've released pre-trained model weights and benchmark datasets to facilitate research. Building on the efficient Pointcept codebase, our GFS-VL repository provides everything you need to get started with 3D few-shot learning powered by vision-language models.
- 2025/03/21: 📣 Our CVPR 2025 paper Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model is now available! Read the full paper for efficient 3D few-shot learning with vision-language models. Code will be released in our GFS-VL repository.
- 2025/03/09: 🏆 Our paper Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation has been accepted to ICLR 2025 as a Spotlight presentation! Check out the paper and our implementation repository for more details.
Highlight
The first thing we want you to take away from this paper:
<p align="center"><i>please ensure you are using our <strong>corrected setting</strong> for the development and evaluation of your 3D few-shot models</i>.</p><div align="center"> <img src="figs/sampling.jpg"/> </div>
- Identification of Key Issues: We pinpoint two significant issues in the current Few-shot 3D Point Cloud Semantic Segmentation (FS-PCS) setting: foreground leakage and sparse point distribution. These issues have undermined the validity of previous progress and hindered further advancements.
- Standardized Setting and Benchmark: To rectify existing issues, we propose a standardized FS-PCS setting along with a new benchmark. This enables fair comparisons and fosters future advancements in the field. Our repository implements an effective few-shot running pipeline on our proposed standard FS-PCS setting, facilitating easy development for future researchers based on our code base.
- Novel Method (COSeg): Our method introduces a novel correlation optimization paradigm, diverging from the traditional feature optimization approach used by all previous FS-PCS models. COSeg achieves state-of-the-art performance on both S3DIS and ScanNetv2 datasets, demonstrating effective contextual learning and background correlation adjustment ability.
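To make the correlation-centric view concrete: instead of refining point features directly, a correlation optimization approach operates on per-point similarities between query features and class prototypes. The sketch below is illustrative only (names, shapes, and the use of plain cosine similarity are our assumptions, not the actual COSeg implementation):

```python
import numpy as np

def correlation_map(query_feats, prototypes):
    """Illustrative sketch: cosine correlations between query point
    features (N, C) and class prototypes (K, C), giving an (N, K) map
    that a correlation-optimization model would refine instead of the
    raw point features."""
    q = query_feats / np.linalg.norm(query_feats, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return q @ p.T  # (N, K) correlation map
```

Subsequent layers would then update this (N, K) map (e.g., with the contextual and background-adjustment modules described in the paper) rather than the (N, C) features themselves.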
📝 Citation
If you find this project useful, please consider giving a star :star: and citation 📚:
@inproceedings{an2025generalized,
title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
booktitle={CVPR},
year={2025}
}
@inproceedings{an2024multimodality,
title={Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Wu, Min
and Cheng, Ming-Ming and Konukoglu, Ender and Belongie, Serge},
booktitle={ICLR},
year={2025}
}
@inproceedings{an2024rethinking,
title={Rethinking Few-shot 3D Point Cloud Semantic Segmentation},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Liu, Fayao and Wu, Zongwei and Wang, Dan and Van Gool, Luc and Belongie, Serge},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={3996--4006},
year={2024}
}
Environment
The following environment setup instructions have been tested on RTX 3090 GPUs with GCC 6.3.0.
- Install dependencies
```bash
pip install -r requirements.txt
```
If you have any problems with the above command, you can also install the dependencies individually:

```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install torch_points3d==1.3.0
pip install torch-scatter==2.1.1
pip install torch-points-kernels==0.6.10
pip install torch-geometric==1.7.2
pip install timm==0.9.2
pip install tensorboardX==2.6
pip install numpy==1.20.3
```
For incompatible-installation issues, such as wanting a higher torch version (e.g., 2.1.0) that conflicts with torch_points3d, please refer to this thread: https://github.com/ZhaochongAn/COSeg/issues/16 or feel free to open a new discussion for further assistance.
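After installing, you can sanity-check the pinned versions with a small stdlib-only helper (the helper name and the idea of ignoring CUDA suffixes like `+cu113` are our own; adjust the pin list to your environment):

```python
from importlib.metadata import version, PackageNotFoundError

def check_pins(pinned):
    """Report whether each pinned package is installed at the expected
    version, ignoring local suffixes such as +cu113."""
    report = {}
    for pkg, want in pinned.items():
        try:
            have = version(pkg)
            base = want.split("+")[0]
            report[pkg] = "ok" if have.startswith(base) else f"mismatch ({have})"
        except PackageNotFoundError:
            report[pkg] = "missing"
    return report

# Versions pinned by the commands above.
print(check_pins({"torch": "1.11.0+cu113", "timm": "0.9.2", "numpy": "1.20.3"}))
```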
- Compile pointops
Ensure you have gcc, cuda, and nvcc installed. Compile and install pointops2 as follows:
```bash
cd lib/pointops2
python3 setup.py install
```
Datasets Preparation
You can either download the preprocessed datasets directly from the links provided below or perform the preprocessing steps on your own.
Preprocessed Datasets
| Dataset | Download |
| ------- | -------- |
| S3DIS | Download link |
| ScanNet | Download link |
Preprocessing Instructions
S3DIS
- Download: S3DIS Dataset Version 1.2.
- Preprocessing: Re-organize raw data into `npy` files:

```bash
cd preprocess
python collect_s3dis_data.py --data_path [PATH_to_S3DIS_raw_data] --save_path [PATH_to_S3DIS_processed_data]
```

The generated numpy files will be stored in `PATH_to_S3DIS_processed_data/scenes`.
- Splitting Rooms into Blocks:

```bash
python room2blocks.py --data_path [PATH_to_S3DIS_processed_data]/scenes
```
ScanNet
- Download: ScanNet V2.
- Preprocessing: Re-organize raw data into `npy` files:

```bash
cd preprocess
python collect_scannet_data.py --data_path [PATH_to_ScanNet_raw_data] --save_path [PATH_to_ScanNet_processed_data]
```

The generated numpy files will be stored in `PATH_to_ScanNet_processed_data/scenes`.
- Splitting Rooms into Blocks:

```bash
python room2blocks.py --data_path [PATH_to_ScanNet_processed_data]/scenes
```
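The room-to-block step above can be sketched as follows. The folder name `blocks_bs1_s1` suggests a 1 m block size and 1 m stride on the xy-plane, but this is our reading; the actual `room2blocks.py` logic (point sampling, padding, etc.) may differ:

```python
import numpy as np

def split_into_blocks(points, block_size=1.0, stride=1.0):
    """Illustrative sketch: partition room points (N, >=3 columns,
    xyz first) into block_size x block_size xy blocks swept with the
    given stride. Returns a list of per-block point arrays."""
    xy_min = points[:, :2].min(axis=0)
    xy_max = points[:, :2].max(axis=0)
    nx = int(np.ceil((xy_max[0] - xy_min[0]) / stride)) or 1
    ny = int(np.ceil((xy_max[1] - xy_min[1]) / stride)) or 1
    blocks = []
    for i in range(nx):
        for j in range(ny):
            lo = xy_min + np.array([i * stride, j * stride])
            hi = lo + block_size
            mask = np.all((points[:, :2] >= lo) & (points[:, :2] < hi), axis=1)
            if mask.any():  # skip empty blocks
                blocks.append(points[mask])
    return blocks
```

With stride equal to block size, as here, the blocks tile the room without overlap; a smaller stride would produce overlapping blocks.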
After preprocessing the datasets, a folder named `blocks_bs1_s1` will be generated under `PATH_to_DATASET_processed_data`. Make sure to update the `data_root` entry in the `.yaml` config file to `[PATH_to_DATASET_processed_data]/blocks_bs1_s1/data`.
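For example, assuming `data_root` is a top-level key (check your actual config file for the exact structure and filename):

```yaml
# Illustrative config fragment; only the data_root key is taken from the
# instructions above, the surrounding layout is an assumption.
data_root: [PATH_to_DATASET_processed_data]/blocks_bs1_s1/data
```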
Model weights
We provide trained model weights across different few-shot settings and datasets below. Training and testing used 4 RTX 3090 GPUs. Please note that these weights have been retrained by us, so results may differ slightly from those reported in the paper. You can directly load these weights for evaluation or train your own models following the training instructions.

| Model name | Dataset | CVFOLD | N-way K-shot | Model Weight |
| ---------- | ------- | ------ | ------------ | ------------ |
| s30_1w1s | S3DIS | 0 | 1-way 1-shot | Download link |
| s30_1w5s | S3DIS | 0 | 1-way 5-shot | Download link |
| s30_2w1s | S3DIS | 0 | 2-way 1-shot | Download link |
| s30_2w5s | S3DIS | 0 | 2-way 5-shot | Download link |
| s31_1w1s | S3DIS | 1 | 1-way 1-shot | Download link |
| s31_1w5s | S3DIS | 1 | 1-way 5-shot | Download link |
| s31_2w1s | S3DIS | 1 | 2-way 1-shot | Download link |
| s31_2w5s | S3DIS | 1 | 2-way 5-shot | Download link |
| sc0_1w1s | ScanNet | 0 | 1-way 1-shot | Download link |
| sc0_1w5s | ScanNet | 0 | 1-way 5-shot | Download link |
| sc0_2w1s | ScanNet | 0 | 2-way 1-shot | Download link |
| sc0_2w5s | ScanNet | 0 | 2-way 5-shot | Download link |
| sc1_1w1s | ScanNet | 1 | 1-way 1-shot | [Download link](https://drive.google.com/drive/u/1/fo |