# OpenIns3D

**[ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation**
## Highlights

- **2 Aug 2024**: Major update 🔥: we have released optimized, easy-to-use code for OpenIns3D that reproduces all the results in the paper and the demo.
- **1 Jul 2024**: OpenIns3D has been accepted at ECCV 2024 🎉. We will release more code for various experiments soon.
- **6 Jan 2024**: We have released a major revision incorporating S3DIS and ScanNet benchmark code. Try out the latest version.
- **31 Dec 2023**: We release the batch inference code on ScanNet.
- **31 Dec 2023**: We release the zero-shot inference code; test it on your own data!
- **Sep 2023**: OpenIns3D is released on arXiv, along with an explanatory video and project page. We will release the code at the end of this year.
## Overview

- Installation
- Reproducing All Benchmark Results
- Replacing Snap with RGBD
- Zero-Shot Inference with Single Vocabulary
- Zero-Shot Inference with Multiple Vocabularies
- Citation
- Acknowledgement
## Installation

Please check the installation file for instructions on setting up OpenIns3D.
## Reproducing All Benchmark Results
### 🗂️ Replica

🔧 **Data Preparation:**

- Execute the following commands to set up the Replica dataset, including the scene `.ply` files, predicted masks, and ground truth:

```bash
sh scripts/prepare_replica.sh
sh scripts/prepare_yoloworld.sh
```
📊 **Open Vocabulary Instance Segmentation:**

```bash
python openins3d/main.py --dataset replica --task OVIS --detector yoloworld
```
📈 **Results Log:**

| Task | AP | AP50 | AP25 | Log |
|--------------------------|:----:|:----:|:----:|:---:|
| Replica OVIS (in paper)  | 13.6 | 18.0 | 19.7 |     |
| Replica OVIS (this code) | 15.4 | 19.5 | 25.2 | log |
### 🗂️ ScanNet

🔧 **Data Preparation:**

- Make sure you have completed the ScanNet form to obtain access.
- Place the `download-scannet.py` script in the `scripts` directory.
- Run the following command to download all `_vh_clean_2.ply` files for the validation set, as well as instance ground truth, GT masks, and detected masks:

```bash
sh scripts/prepare_scannet.sh
```
📊 **Open Vocabulary Object Recognition:**

```bash
python openins3d/main.py --dataset scannet --task OVOR --detector odise
```
📈 **Results Log:**

| Task | Top-1 Accuracy | Log |
|---------------------------|:--------------:|:---:|
| ScanNet_OVOR (in paper)   | 60.4           |     |
| ScanNet_OVOR (this code)  | 64.2           | log |
📊 **Open Vocabulary Object Detection:**

```bash
python openins3d/main.py --dataset scannet --task OVOD --detector odise
```
📊 **Open Vocabulary Instance Segmentation:**

```bash
python openins3d/main.py --dataset scannet --task OVIS --detector odise
```
📈 **Results Log:**

| Task | AP | AP50 | AP25 | Log |
|--------------------------|:----:|:----:|:----:|:---:|
| ScanNet_OVOD (in paper)  | 17.8 | 28.3 | 36.0 |     |
| ScanNet_OVOD (this code) | 20.7 | 29.9 | 39.7 | log |
| ScanNet_OVIS (in paper)  | 19.9 | 28.7 | 38.9 |     |
| ScanNet_OVIS (this code) | 23.3 | 34.6 | 42.6 | log |
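The three ScanNet tasks above differ only in the `--task` flag, so they can be driven from one small script. A minimal sketch (the `scannet_commands` helper is hypothetical and not part of the repo; it only assembles the CLI calls documented above):

```python
# Hypothetical helper: assemble the three ScanNet commands documented above.
# Replace print() with subprocess.run(shlex.split(cmd)) to actually run them.
def scannet_commands(detector: str = "odise") -> list[str]:
    return [
        f"python openins3d/main.py --dataset scannet --task {task} --detector {detector}"
        for task in ("OVOR", "OVOD", "OVIS")
    ]

for cmd in scannet_commands():
    print(cmd)
```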
### 🗂️ S3DIS

🔧 **Data Preparation:**

- Make sure you have completed the S3DIS form to obtain access.
- Then run the following command to acquire the scene `.ply` files, predicted masks, and ground truth:

```bash
sh scripts/prepare_s3dis.sh
```
📊 **Open Vocabulary Instance Segmentation:**

```bash
python openins3d/main.py --dataset s3dis --task OVIS --detector odise
```
📈 **Results Log:**

| Task | AP | AP50 | AP25 | Log |
|------------------------|:----:|:----:|:----:|:---:|
| S3DIS OVIS (in paper)  | 21.1 | 28.3 | 29.5 |     |
| S3DIS OVIS (this code) | 22.9 | 29.0 | 31.4 | log |
### 🗂️ STPLS3D

🔧 **Data Preparation:**

- Make sure you have completed the STPLS3D form to gain access.
- Then run the following command to obtain the scene `.ply` files, predicted masks, and ground truth:

```bash
sh scripts/prepare_stpls3d.sh
```
📊 **Open Vocabulary Instance Segmentation:**

```bash
python openins3d/main.py --dataset stpls3d --task OVIS --detector odise
```
📈 **Results Log:**

| Task | AP | AP50 | AP25 | Log |
|--------------------------|:----:|:----:|:----:|:---:|
| STPLS3D OVIS (in paper)  | 11.4 | 14.2 | 17.2 |     |
| STPLS3D OVIS (this code) | 15.3 | 17.3 | 17.4 | log |
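All four OVIS benchmarks above share the same command shape, varying only the dataset and detector. A sketch that enumerates them in one pass (the dataset/detector pairs are taken from the commands in this section; the loop itself is illustrative, not repo code):

```python
# Dataset/detector pairs exactly as documented in this section.
OVIS_RUNS = [
    ("replica", "yoloworld"),
    ("scannet", "odise"),
    ("s3dis", "odise"),
    ("stpls3d", "odise"),
]

for dataset, detector in OVIS_RUNS:
    print(f"python openins3d/main.py --dataset {dataset} --task OVIS --detector {detector}")
```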
## Replacing Snap with RGBD
We also evaluate the performance of OpenIns3D when the Snap module is replaced with the original RGBD images, keeping the rest of the design intact.
### 🗂️ Replica

🔧 **Data Preparation:**

- Download the Replica dataset and RGBD images:

```bash
sh scripts/prepare_replica.sh
sh scripts/prepare_replica2d.sh
sh scripts/prepare_yoloworld.sh
```

📊 **Open Vocabulary Instance Segmentation:**

```bash
python openins3d/main.py --dataset replica --task OVIS --detector yoloworld --use_2d true
```
📈 **Results Log:**
| Task | AP | AP50 | AP25 | Log |
|----------------|:----:|:----:|:----:|:----------------------------------------:|
| OpenMask3D | 13.1 | 18.4 | 24.2 | |
| Open3DIS | 18.5 | 24.5 | 28.2 | |
| OpenIns3D | 21.1 | 26.2 | 30.6 | log |
## Zero-Shot Inference with Single Vocabulary
We demonstrate how to perform single-vocabulary instance segmentation, similar to the teaser image in the paper. The key new feature is a CLIP ranking-and-filtering module that reduces false-positive results. (Works best with RGBD but is also fine with Snap.)
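The idea behind ranking and filtering can be sketched in a few lines. This is a minimal illustration, not the repo's actual module: the `rank_and_filter` helper is hypothetical, and it assumes you already have L2-normalized CLIP embeddings for each detected mask and for the single query vocabulary.

```python
import numpy as np

def rank_and_filter(mask_embs: np.ndarray, text_emb: np.ndarray, thresh: float = 0.25):
    """mask_embs: (N, D) L2-normalized CLIP embeddings, one per detected mask.
    text_emb: (D,) L2-normalized CLIP embedding of the query vocabulary.
    Returns indices of masks above the similarity threshold, best first."""
    sims = mask_embs @ text_emb            # cosine similarity per mask
    keep = np.where(sims >= thresh)[0]     # filter likely false positives
    return keep[np.argsort(-sims[keep])]   # rank survivors by similarity

# Toy example: 3 candidate masks in a 2-D embedding space.
masks = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query = np.array([1.0, 0.0])
print(rank_and_filter(masks, query))  # -> [0 2]: masks 0 and 2 survive, best first
```

The threshold is the knob that trades recall for precision; a stricter value discards more borderline detections, which is what suppresses false positives in the single-vocabulary setting.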
**Quick Start:**

1. 📥 Download the demo dataset:

```bash
sh scripts/prepare_demo_single.sh
```

2. 🚀 Run the model:

```bash
python zero_sho
```
