[ICCV 2023 R6D] PyTorch implementation of CNOS: A Strong Baseline for CAD-based Novel Object Segmentation, based on Segment Anything and DINOv2


<div align="center"> <h2> CNOS: A Strong Baseline for CAD-based Novel Object Segmentation </h2> <h3> <a href="https://nv-nguyen.github.io/" target="_blank"><nobr>Van Nguyen Nguyen</nobr></a> &emsp; <a href="http://imagine.enpc.fr/~groueixt/" target="_blank"><nobr>Thibault Groueix</nobr></a> &emsp;

<a href="https://ponimatkin.github.io/" target="_blank"><nobr>Georgy Ponimatkin</nobr></a> &emsp; <a href="https://vincentlepetit.github.io/" target="_blank"><nobr>Vincent Lepetit</nobr></a> &emsp; <a href="https://cmp.felk.cvut.cz/~hodanto2/" target="_blank"><nobr>Tomáš Hodaň</nobr></a><br>

<p></p>

<a href="https://nv-nguyen.github.io/cnos/"><img src="https://img.shields.io/badge/-Webpage-blue.svg?colorA=333&logo=html5" height=25em></a> <a href="http://arxiv.org/abs/2307.11067"><img src="https://img.shields.io/badge/-Paper-blue.svg?colorA=333&logo=arxiv" height=25em></a>

<p></p>

[Framework overview figure]

[Qualitative results figure]

</h3> </div>

CNOS is a simple three-stage approach for CAD-based novel object segmentation. It is built on Segment Anything and DINOv2, and it can segment arbitrary objects without retraining. CNOS outperforms the supervised Mask R-CNN (from CosyPose) that was trained on the target objects, and it served as the baseline for Task 5 and Task 6 of the BOP Challenge 2023!
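For intuition, the core of the third stage is matching segmentation proposals to rendered CAD templates by descriptor similarity. The sketch below illustrates that idea with random vectors standing in for SAM proposal descriptors and DINOv2 template descriptors; it is a minimal illustration, not the repository's actual pipeline:

```python
import numpy as np

def match_proposals(proposal_feats, template_feats):
    """Assign each segmentation proposal to its best-matching CAD template.

    proposal_feats: (P, D) descriptors of mask proposals (e.g. DINOv2 CLS tokens).
    template_feats: (T, D) descriptors of rendered CAD templates.
    Returns (template_ids, scores): best template index and cosine similarity
    for each proposal.
    """
    # L2-normalize so the dot product equals cosine similarity.
    p = proposal_feats / np.linalg.norm(proposal_feats, axis=1, keepdims=True)
    t = template_feats / np.linalg.norm(template_feats, axis=1, keepdims=True)
    sim = p @ t.T                      # (P, T) cosine-similarity matrix
    return sim.argmax(axis=1), sim.max(axis=1)

# Toy example: 3 proposals matched against 4 templates of random descriptors.
rng = np.random.default_rng(0)
ids, scores = match_proposals(rng.normal(size=(3, 32)), rng.normal(size=(4, 32)))
```

In the actual method each object's score is aggregated over many template views per CAD model; the single argmax here is only the simplest variant.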

[BOP challenge results figure]

Here are some qualitative results of CNOS on the YCBV dataset; for better visibility, only detections with a confidence score above 0.5 are shown.

[YCBV qualitative results figure]

If our project is helpful for your research, please consider citing:

@inproceedings{nguyen2023cnos,
title={CNOS: A Strong Baseline for CAD-based Novel Object Segmentation},
author={Nguyen, Van Nguyen and Groueix, Thibault and Ponimatkin, Georgy and Lepetit, Vincent and Hodan, Tomas},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={2134--2140},
year={2023}
}

You can also give the repository a star :star: if the code is useful to you.

If you like this project, check out related works from our group:

Updates:

Installation :construction_worker:

<details><summary>Click to expand</summary>

Please make sure you update the user configuration before running any experiments.

1. Create conda environment

conda env create -f environment.yml
conda activate cnos

# for using SAM
pip install git+https://github.com/facebookresearch/segment-anything.git

# for using fastSAM
pip install ultralytics==8.0.135

2. Datasets and model weights

2.1. Download datasets from BOP challenge:

For BOP challenge 2024 core datasets (HOPE, HANDAL, HOT-3D), download each dataset with the following command:

pip install -U "huggingface_hub[cli]"
export DATASET_NAME=hope
python -m src.scripts.download_bop_h3 dataset_name=$DATASET_NAME

# For model-free tasks
python -m src.scripts.download_modelfree_onboarding_bop_h3

For BOP challenge 2023 core datasets (LMO, TLESS, TUDL, ICBIN, ITODD, HB, and YCBV), download all datasets with the following command:

python -m src.scripts.download_bop_classic

2.2. Rendering templates with Pyrender:

Note: This rendering is fast. For example, using a single V100 GPU, it can be done within 10 minutes for seven core datasets of BOP'23.

For BOP challenge 2024 core datasets (HOPE, HANDAL, HOT-3D), rendering templates with Pyrender is only required for model-based tasks, while for model-free tasks, you can skip this step since the images in onboarding videos can be used directly. To render templates for model-based tasks:

export DATASET_NAME=hope
python -m src.scripts.render_template_with_pyrender dataset_name=$DATASET_NAME

For BOP challenge 2023 core datasets (LMO, TLESS, TUDL, ICBIN, ITODD, HB, and YCBV), you can use the pre-rendered templates at this Google Drive link (4.64GB) and unzip them into $ROOT_DIR, or render the templates from scratch with:

python -m src.scripts.render_template_with_pyrender
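For intuition, rendering templates amounts to placing virtual cameras on a sphere around the CAD model, each pointing at the object. The sketch below shows one common way to do this (a Fibonacci lattice plus a look-at pose); the names `look_at` and `sphere_viewpoints` are illustrative, not the repo's API, which samples viewpoints from icosphere subdivision levels instead:

```python
import numpy as np

def look_at(cam_pos, target=np.zeros(3), up=np.array([0.0, 0.0, 1.0])):
    """4x4 camera-to-world pose: camera at cam_pos looking at target
    (OpenGL convention, -z is the viewing direction)."""
    fwd = target - cam_pos
    fwd = fwd / np.linalg.norm(fwd)
    right = np.cross(fwd, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, fwd)
    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2] = right, true_up, -fwd
    pose[:3, 3] = cam_pos
    return pose

def sphere_viewpoints(n, radius=1.0):
    """n roughly uniform camera poses on a sphere via the Fibonacci lattice."""
    i = np.arange(n)
    z = 1 - 2 * (i + 0.5) / n                 # heights avoid the exact poles
    phi = i * np.pi * (3 - np.sqrt(5))        # golden-angle azimuth increments
    r = np.sqrt(1 - z**2)
    pts = radius * np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)
    return [look_at(p) for p in pts]

poses = sphere_viewpoints(42)
```

Denser viewpoint levels simply increase `n` here; in the repo this corresponds to the `level_templates` option used later during inference.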

2.3. Download model weights of Segment Anything (SAM):

python -m src.scripts.download_sam

2.4. Download model weights of Fast Segment Anything (FastSAM):

python -m src.scripts.download_fastsam

2.5. Download BlenderProc4BOP set:

This is only required when you want to use realistic rendering with BlenderProc.

For BOP challenge 2024 core datasets (HOPE, HANDAL, HOT-3D), this download is only required for model-based tasks:

pip install -U "huggingface_hub[cli]"
export DATASET_NAME=hope
python -m src.scripts.download_train_pbr_bop24 dataset_name=$DATASET_NAME

For BOP challenge 2023 core datasets (LMO, TLESS, TUDL, ICBIN, ITODD, HB, and YCBV):

python -m src.scripts.download_train_pbr_bop23
</details>

Testing on BOP datasets :rocket:

We provide CNOS's predictions for the three core datasets of BOP challenge 2024 (with the SAM model) and the seven core datasets of BOP challenge 2023 (with both SAM and FastSAM models) at this link.

<details><summary>Click to expand</summary>
  1. Run CNOS to get predictions:

For BOP challenge 2024 datasets:

export DATASET_NAME=hope
# model-free tasks: with SAM + static onboarding
python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=onboarding_static

# model-free tasks: with SAM + dynamic onboarding
python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=onboarding_dynamic

# model-based tasks: with SAM + PBR
python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=pbr

# model-based tasks: with SAM + pyrender
python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=pyrender

Quantitative results on HOPE (only RealSense testing images, BOP'19-23) and HOPE_v2 (both RealSense and Vicon testing images, BOP'24):

| Dataset | Task | Static onboarding | Dynamic onboarding | Model-based PBR | Model-based Pyrender |
|---------|------|-------------------|--------------------|-----------------|----------------------|
| HOPE | 2D detection | 39.8 | 40.8 | 41.6 | 39.3 |
| HOPE_v2 | 2D detection | 33.4 | 32.4 | 35.4 | 33.5 |
| HOPE | 2D segmentation | 52.2 | 54.7 | 57.2 | 52.9 |
| HOPE_v2 | 2D segmentation | 43.5 | 43.4 | 47.5 | 44.9 |

For HOT3D datasets:

huggingface-cli download bop-benchmark/datasets --include "hot3d/object_models/*" --local-dir $DATASET_DIR --repo-type=dataset
python -m src.scripts.render_template_with_pyrender dataset_name=hot3d

For BOP challenge 2023 datasets:

export DATASET_NAME=lmo 
# adding CUDA_VISIBLE_DEVICES=$GPU_IDS if you want to use a specific GPU

# with FastSAM + PBR
python run_inference.py dataset_name=$DATASET_NAME model=cnos_fast

# with FastSAM + PBR + denser viewpoints
python run_inference.py dataset_name=$DATASET_NAME model=cnos_fast model.onboarding_config.level_templates=1

# with FastSAM + PyRender
python run_inference.py dataset_name=$DATASET_NAME model=cnos_fast model.onboarding_config.rendering_type=pyrender

# with SAM + PyRender
python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=pyrender

# with SAM + PBR
python run_inference.py dataset_name=$DATASET_NAME

After running this script, CNOS will write a prediction file to this dir. You can then evaluate the predictions on the BOP challenge website.
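Before submitting, it can be handy to inspect the prediction file locally, for instance to apply the same 0.5 score threshold used in the qualitative figures. Assuming a BOP-style 2D detection submission (a JSON list of detection dicts with a `score` field; adjust the field names if your file differs), a minimal sketch:

```python
import json
import os
import tempfile

def filter_detections(path, min_score=0.5):
    """Load a BOP-style JSON prediction file (a list of detection dicts,
    each carrying a 'score') and keep only confident detections."""
    with open(path) as f:
        dets = json.load(f)
    return [d for d in dets if d["score"] >= min_score]

# Toy example: two fake detections written to a temporary file.
dets = [{"scene_id": 1, "image_id": 3, "category_id": 5, "score": 0.9},
        {"scene_id": 1, "image_id": 3, "category_id": 7, "score": 0.2}]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(dets, f)
kept = filter_detections(f.name)
os.remove(f.name)
```

Note that the official BOP evaluation applies its own score handling, so thresholding like this is only for local inspection and visualization.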

  2. Visualize the predictions:

There are two options:

2.a. Using our cus
