COSeg
[CVPR 2024] This repo contains the code for our paper: Rethinking Few-shot 3D Point Cloud Semantic Segmentation
[CVPR 2024] Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An, Guolei Sun<sup>†</sup>, Yun Liu<sup>†</sup>, Fayao Liu, Zongwei Wu, Dan Wang, Luc Van Gool, Serge Belongie
Welcome to the official PyTorch implementation repository of our paper Rethinking Few-shot 3D Point Cloud Semantic Segmentation, accepted to CVPR 2024 [Paper] [Bibtex].
News :triangular_flag_on_post:
- 2025/04/11: 🔥 GFS-VL code is now publicly available! We've released pre-trained model weights and benchmark datasets to facilitate research. Building on the efficient Pointcept codebase, our GFS-VL repository provides everything you need to get started with 3D few-shot learning powered by vision-language models.
- 2025/03/21: 📣 Our CVPR 2025 paper Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model is now available! Read the full paper for efficient 3D few-shot learning with vision-language models. Code will be released in our GFS-VL repository.
- 2025/03/09: 🏆 Our paper Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation has been accepted to ICLR 2025 as a Spotlight presentation! Check out the paper and our implementation repository for more details.
Highlight
The first thing we want you to take away from this paper:
<p align="center"><i>please ensure you are using our <strong>corrected setting</strong> for the development and evaluation of your 3D few-shot models</i>.</p><div align="center"> <img src="figs/sampling.jpg"/> </div>
- Identification of Key Issues: We pinpoint two significant issues in the current Few-shot 3D Point Cloud Semantic Segmentation (FS-PCS) setting: foreground leakage and sparse point distribution. These issues have undermined the validity of previous progress and hindered further advancements.
- Standardized Setting and Benchmark: To rectify existing issues, we propose a standardized FS-PCS setting along with a new benchmark. This enables fair comparisons and fosters future advancements in the field. Our repository implements an effective few-shot running pipeline on our proposed standard FS-PCS setting, facilitating easy development for future researchers based on our code base.
- Novel Method (COSeg): Our method introduces a novel correlation optimization paradigm, diverging from the traditional feature optimization approach used by all previous FS-PCS models. COSeg achieves state-of-the-art performance on both S3DIS and ScanNetv2 datasets, demonstrating effective contextual learning and background correlation adjustment ability.
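To make the correlation-centric view concrete: instead of refining point features directly, a correlation optimization approach operates on per-point similarities between query features and class prototypes. The sketch below is illustrative only (names, shapes, and the use of plain cosine similarity are our assumptions, not the actual COSeg implementation):

```python
import numpy as np

def correlation_map(query_feats, prototypes):
    """Illustrative sketch: cosine correlations between query point
    features (N, C) and class prototypes (K, C), giving an (N, K) map
    that a correlation-optimization model would refine instead of the
    raw point features."""
    q = query_feats / np.linalg.norm(query_feats, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return q @ p.T  # (N, K) correlation map
```

Subsequent layers would then update this (N, K) map (e.g., with the contextual and background-adjustment modules described in the paper) rather than the (N, C) features themselves.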
📝 Citation
If you find this project useful, please consider giving a star :star: and citation 📚:
@inproceedings{an2025generalized,
title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
booktitle={CVPR},
year={2025}
}
@inproceedings{an2024multimodality,
title={Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Wu, Min
and Cheng, Ming-Ming and Konukoglu, Ender and Belongie, Serge},
booktitle={ICLR},
year={2025}
}
@inproceedings{an2024rethinking,
title={Rethinking Few-shot 3D Point Cloud Semantic Segmentation},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Liu, Fayao and Wu, Zongwei and Wang, Dan and Van Gool, Luc and Belongie, Serge},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={3996--4006},
year={2024}
}
Environment
The following environment setup instructions have been tested on RTX 3090 GPUs with GCC 6.3.0.
- Install dependencies
```bash
pip install -r requirements.txt
```
If you have any problems with the above command, you can also install the dependencies individually:

```bash
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install torch_points3d==1.3.0
pip install torch-scatter==2.1.1
pip install torch-points-kernels==0.6.10
pip install torch-geometric==1.7.2
pip install timm==0.9.2
pip install tensorboardX==2.6
pip install numpy==1.20.3
```
For incompatible-installation issues, such as wanting a higher torch version (e.g., 2.1.0) that conflicts with torch_points3d, please refer to this thread: https://github.com/ZhaochongAn/COSeg/issues/16 or feel free to open a new discussion for further assistance.
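After installing, you can sanity-check the pinned versions with a small stdlib-only helper (the helper name and the idea of ignoring CUDA suffixes like `+cu113` are our own; adjust the pin list to your environment):

```python
from importlib.metadata import version, PackageNotFoundError

def check_pins(pinned):
    """Report whether each pinned package is installed at the expected
    version, ignoring local suffixes such as +cu113."""
    report = {}
    for pkg, want in pinned.items():
        try:
            have = version(pkg)
            base = want.split("+")[0]
            report[pkg] = "ok" if have.startswith(base) else f"mismatch ({have})"
        except PackageNotFoundError:
            report[pkg] = "missing"
    return report

# Versions pinned by the commands above.
print(check_pins({"torch": "1.11.0+cu113", "timm": "0.9.2", "numpy": "1.20.3"}))
```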
- Compile pointops
Ensure you have gcc, cuda, and nvcc installed. Compile and install pointops2 as follows:
```bash
cd lib/pointops2
python3 setup.py install
```
Datasets Preparation
You can either download the preprocessed datasets directly from the links provided below or perform the preprocessing steps on your own.
Preprocessed Datasets
| Dataset | Download |
| ------- | -------- |
| S3DIS | Download link |
| ScanNet | Download link |
Preprocessing Instructions
S3DIS
- Download: S3DIS Dataset Version 1.2.
- Preprocessing: Re-organize raw data into `npy` files:

```bash
cd preprocess
python collect_s3dis_data.py --data_path [PATH_to_S3DIS_raw_data] --save_path [PATH_to_S3DIS_processed_data]
```

The generated numpy files will be stored in `PATH_to_S3DIS_processed_data/scenes`.
- Splitting Rooms into Blocks:

```bash
python room2blocks.py --data_path [PATH_to_S3DIS_processed_data]/scenes
```
ScanNet
- Download: ScanNet V2.
- Preprocessing: Re-organize raw data into `npy` files:

```bash
cd preprocess
python collect_scannet_data.py --data_path [PATH_to_ScanNet_raw_data] --save_path [PATH_to_ScanNet_processed_data]
```

The generated numpy files will be stored in `PATH_to_ScanNet_processed_data/scenes`.
- Splitting Rooms into Blocks:

```bash
python room2blocks.py --data_path [PATH_to_ScanNet_processed_data]/scenes
```
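The room-to-block step above can be sketched as follows. The folder name `blocks_bs1_s1` suggests a 1 m block size and 1 m stride on the xy-plane, but this is our reading; the actual `room2blocks.py` logic (point sampling, padding, etc.) may differ:

```python
import numpy as np

def split_into_blocks(points, block_size=1.0, stride=1.0):
    """Illustrative sketch: partition room points (N, >=3 columns,
    xyz first) into block_size x block_size xy blocks swept with the
    given stride. Returns a list of per-block point arrays."""
    xy_min = points[:, :2].min(axis=0)
    xy_max = points[:, :2].max(axis=0)
    nx = int(np.ceil((xy_max[0] - xy_min[0]) / stride)) or 1
    ny = int(np.ceil((xy_max[1] - xy_min[1]) / stride)) or 1
    blocks = []
    for i in range(nx):
        for j in range(ny):
            lo = xy_min + np.array([i * stride, j * stride])
            hi = lo + block_size
            mask = np.all((points[:, :2] >= lo) & (points[:, :2] < hi), axis=1)
            if mask.any():  # skip empty blocks
                blocks.append(points[mask])
    return blocks
```

With stride equal to block size, as here, the blocks tile the room without overlap; a smaller stride would produce overlapping blocks.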
After preprocessing the datasets, a folder named `blocks_bs1_s1` will be generated under `PATH_to_DATASET_processed_data`. Make sure to update the `data_root` entry in the `.yaml` config file to `[PATH_to_DATASET_processed_data]/blocks_bs1_s1/data`.
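For example, assuming `data_root` is a top-level key (check your actual config file for the exact structure and filename):

```yaml
# Illustrative config fragment; only the data_root key is taken from the
# instructions above, the surrounding layout is an assumption.
data_root: [PATH_to_DATASET_processed_data]/blocks_bs1_s1/data
```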
Model weights
We provide trained model weights across different few-shot settings and datasets below. Training and testing used 4 RTX 3090 GPUs. Please note that these weights have been retrained by us, so results may differ slightly from those reported in the paper. You can directly load these weights for evaluation or train your own models following the training instructions.

| Model name | Dataset | CVFOLD | N-way K-shot | Model Weight |
| ---------- | ------- | ------ | ------------ | ------------ |
| s30_1w1s | S3DIS | 0 | 1-way 1-shot | Download link |
| s30_1w5s | S3DIS | 0 | 1-way 5-shot | Download link |
| s30_2w1s | S3DIS | 0 | 2-way 1-shot | Download link |
| s30_2w5s | S3DIS | 0 | 2-way 5-shot | Download link |
| s31_1w1s | S3DIS | 1 | 1-way 1-shot | Download link |
| s31_1w5s | S3DIS | 1 | 1-way 5-shot | Download link |
| s31_2w1s | S3DIS | 1 | 2-way 1-shot | Download link |
| s31_2w5s | S3DIS | 1 | 2-way 5-shot | Download link |
| sc0_1w1s | ScanNet | 0 | 1-way 1-shot | Download link |
| sc0_1w5s | ScanNet | 0 | 1-way 5-shot | Download link |
| sc0_2w1s | ScanNet | 0 | 2-way 1-shot | Download link |
| sc0_2w5s | ScanNet | 0 | 2-way 5-shot | Download link |
| sc1_1w1s | ScanNet | 1 | 1-way 1-shot | [Download link](https://drive.google.com/drive/u/1/fo |