Gigapose

[CVPR 2024] PyTorch implementation of GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence


<div align="center"> <h2> GigaPose: Fast and Robust Novel Object Pose Estimation

via One Correspondence

<p></p> </h2> <h3> <a href="https://nv-nguyen.github.io/" target="_blank"><nobr>Van Nguyen Nguyen</nobr></a> &emsp; <a href="http://imagine.enpc.fr/~groueixt/" target="_blank"><nobr>Thibault Groueix</nobr></a> &emsp; <a href="https://people.epfl.ch/mathieu.salzmann" target="_blank"><nobr>Mathieu Salzmann</nobr></a> &emsp; <a href="https://vincentlepetit.github.io/" target="_blank"><nobr>Vincent Lepetit</nobr></a> <p></p>

TL;DR: GigaPose is a "hybrid" template-patch correspondence approach to estimate 6D pose of novel objects in RGB images: GigaPose first uses templates, rendered images of the CAD models, to recover the out-of-plane rotation (2DoF) and then uses patch correspondences to estimate the remaining 4DoF.
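To make the "remaining 4DoF" concrete, here is a minimal sketch of how one 2D patch correspondence, combined with a predicted relative scale and in-plane rotation, could fix a 2D similarity transform between a template and the query image. The function names and the pure-Python matrix layout are illustrative assumptions, not the repository's API.

```python
import math

def similarity_from_one_correspondence(p_template, p_query, scale, angle_rad):
    """Build a 3x3 2D similarity transform (hypothetical helper, not GigaPose's API).

    The linear part is scale * rotation(angle_rad); the translation is chosen
    so that p_template maps exactly onto p_query.
    """
    a = scale * math.cos(angle_rad)
    b = scale * math.sin(angle_rad)
    tx = p_query[0] - (a * p_template[0] - b * p_template[1])
    ty = p_query[1] - (b * p_template[0] + a * p_template[1])
    return [[a, -b, tx], [b, a, ty], [0.0, 0.0, 1.0]]

def apply_transform(matrix, point):
    """Map a 2D point through the 3x3 transform."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )

# Illustrative values only: one matched patch center, scale 2, 90 degree in-plane rotation.
M = similarity_from_one_correspondence((10.0, 20.0), (50.0, 60.0), 2.0, math.pi / 2)
mapped = apply_transform(M, (10.0, 20.0))
```

Scale (1DoF), in-plane rotation (1DoF), and 2D translation (2DoF) together account for the four degrees of freedom; the out-of-plane rotation is assumed to come from the selected template.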

The codebase has been slightly modified to support BOP challenge 2024. If you work on BOP challenge 2023 and run into any issues, please go back to this earlier commit:

git checkout 388e8bddd8a5443e284a7f70ad103d03f3f461c5

News 📣

[May 24th, 2024] We added the instructions for running on BOP challenge 2024 datasets and fixed memory requirement issues.
[January 19th, 2024] We released the instructions for estimating the pose of novel objects from a single reference image on the LM-O dataset.
[January 11th, 2024] We released the code for both training and testing settings. We are working on the demo for custom objects including detecting novel objects with CNOS and novel object pose estimation from a single reference image by reconstructing objects with Wonder3D. Stay tuned!

Citations

@inproceedings{nguyen2024gigaPose,
    title={GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence},
    author={Nguyen, Van Nguyen and Groueix, Thibault and Salzmann, Mathieu and Lepetit, Vincent},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2024}}

GigaPose's codebase is mainly derived from CNOS and MegaPose:

@inproceedings{nguyen2023cnos,
    title={CNOS: A Strong Baseline for CAD-based Novel Object Segmentation},
    author={Nguyen, Van Nguyen and Groueix, Thibault and Ponimatkin, Georgy and Lepetit, Vincent and Hodan, Tomas},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
    pages={2134--2140},
    year={2023}
}

@inproceedings{labbe2022megapose,
    title     = {MegaPose: 6D Pose Estimation of Novel Objects via Render \& Compare},
    author    = {Labb\'e, Yann and Manuelli, Lucas and Mousavian, Arsalan and Tyree, Stephen and Birchfield, Stan and Tremblay, Jonathan and Carpentier, Justin and Aubry, Mathieu and Fox, Dieter and Sivic, Josef},
    booktitle = {Proceedings of the 6th Conference on Robot Learning (CoRL)},
    year      = {2022},
}

Installation :construction_worker:

<details><summary>Click to expand</summary>

Environment

conda env create -f environment.yml
conda activate gigapose
bash src/scripts/install_env.sh

# to install megapose
pip install -e .

# to install bop_toolkit 
pip install git+https://github.com/thodan/bop_toolkit.git

Checkpoints

# download cnos detections for BOP'23 dataset
pip install -U "huggingface_hub[cli]"
python -m src.scripts.download_default_detections

# download gigaPose's checkpoints 
python -m src.scripts.download_gigapose

# download megapose's checkpoints
python -m src.scripts.download_megapose

Datasets

All datasets are defined in BOP format.

For BOP challenge 2024 core datasets (HOPE, HANDAL, HOT-3D), download each dataset with the following command:

pip install -U "huggingface_hub[cli]"
export DATASET_NAME=hope
python -m src.scripts.download_test_bop24 test_dataset_name=$DATASET_NAME

For BOP challenge 2023 core datasets (LMO, TLESS, TUDL, ICBIN, ITODD, HB, and YCBV), download all datasets with the following command:

# download testing images and CAD models
python -m src.scripts.download_test_bop23

For BOP challenge 2024 core datasets (HOPE, HANDAL, HOT-3D), render the templates from the CAD models:

python -m src.scripts.render_bop_templates test_dataset_name=hope

For BOP challenge 2023 core datasets (LMO, TLESS, TUDL, ICBIN, ITODD, HB, and YCBV), we provide both the pre-rendered templates (from this link) and the code to render the templates from the CAD models.

# option 1: download pre-rendered templates 
python -m src.scripts.download_bop_templates

# option 2: render templates from CAD models 
python -m src.scripts.render_bop_templates

Here is the structure of $ROOT_DIR after downloading all the above files (similar to BOP HuggingFace Hub):

├── $ROOT_DIR
    ├── datasets/ 
      ├── default_detections/  
      ├── lmo/ 
      ├── ... 
      ├── templates/
    ├── pretrained/ 
      ├── gigaPose_v1.ckpt 
      ├── megapose-models/
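Before running the tests, it can help to confirm that the layout above is in place. The following is a minimal sketch; `missing_paths` and the `EXPECTED` list are hypothetical names, and the list only contains the entries shown in the tree above (per-dataset folders such as `lmo/` are left out).

```python
from pathlib import Path

# Relative paths taken from the directory tree above; extend as needed.
EXPECTED = [
    "datasets/default_detections",
    "datasets/templates",
    "pretrained/gigaPose_v1.ckpt",
    "pretrained/megapose-models",
]

def missing_paths(root_dir, expected=EXPECTED):
    """Return the expected entries that do not exist under root_dir."""
    root = Path(root_dir)
    return [rel for rel in expected if not (root / rel).exists()]
```

Calling `missing_paths(os.environ["ROOT_DIR"])` would return an empty list when all downloads succeeded, assuming `ROOT_DIR` is exported in your shell.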

[Optional] We also provide the training code and datasets, which are not necessary for testing purposes.

<details><summary>Click to expand</summary>

# download training images (> 2TB)
python -m src.scripts.download_train_metaData
python -m src.scripts.download_train_cad 
python -m src.scripts.download_train 

# render templates (162 imgs/obj; takes ~30 min for gso, ~20 hrs for shapenet)
python -m src.scripts.render_gso_templates 
python -m src.scripts.render_shapenet_templates

If you have training datasets pre-downloaded, you can create a symlink to the folder containing the datasets by running:

ln -s /path/to/datasets/gso $ROOT_DIR/datasets/gso

[Optional] Trick for faster convergence of ISTNetwork (in-plane rotation, scale, translation): load the pretrained weights of LoFTR after Kaiming initialization. Please download the weights and place them at $ROOT_DIR/pretrained/loftr_indoor_ot.ckpt.

</details> </details>

Testing on BOP datasets :rocket:

If you want to test on BOP challenge 2024 datasets, please follow the instructions below:

<details><summary>Click to expand</summary>

Running coarse prediction on a single dataset:

# for 6D detection task
python test.py test_dataset_name=hope run_id=$NAME_RUN test_setting=detection

# for 6D localization task (for only core19 datasets)
python test.py test_dataset_name=lmo run_id=$NAME_RUN test_setting=localization

Running refinement on a single dataset:

# for 6D detection task
python refine.py test_dataset_name=hope run_id=$NAME_RUN test_setting=detection

# for 6D localization task (for only core19 datasets)
python refine.py test_dataset_name=lmo run_id=$NAME_RUN test_setting=localization
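Coarse prediction and refinement share the same overrides, so the two steps can be chained from a small driver. This is a sketch only: `build_pipeline` and `run_pipeline` are hypothetical helpers, and the script names and overrides are copied from the commands above.

```python
import subprocess
import sys

def build_pipeline(dataset_name, run_id, test_setting="detection"):
    """Return the argv lists for coarse prediction followed by refinement."""
    overrides = [
        f"test_dataset_name={dataset_name}",
        f"run_id={run_id}",
        f"test_setting={test_setting}",  # "detection" or "localization"
    ]
    return [[sys.executable, script, *overrides] for script in ("test.py", "refine.py")]

def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if the coarse step fails.
            subprocess.run(cmd, check=True)

run_pipeline(build_pipeline("hope", "my_run"))
```

The default `dry_run=True` only prints the commands, which makes it easy to verify the overrides before launching a long run.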

Quantitative results on the 6D detection task on the HOPEv2 dataset:

| Method | Refinement | Model-based unseen |
|---------------|---------------|-----------|
| GigaPose | -- | 22.57 |
| GigaPose | MegaPose | -- |

Evaluating with BOP toolkit:

export INPUT_DIR=DIR_TO_YOUR_PREDICTION_FILE
export FILE_NAME=NAME_PREDICTION_FILE
cd $ROOT_DIR_OF_TOOLKIT
python scripts/eval_bop24_pose.py --results_path $INPUT_DIR --eval_path $INPUT_DIR --result_filenames=$FILE_NAME
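A malformed prediction file is a common reason for the evaluation script to fail, so a quick sanity check beforehand can save time. The sketch below assumes the usual BOP results layout, a CSV with the header `scene_id,im_id,obj_id,score,R,t,time` where `R` holds 9 and `t` holds 3 space-separated values; please confirm this against the BOP toolkit version you use. `check_bop_results` is a hypothetical helper.

```python
import csv
import io

# Assumed BOP results header; verify against your bop_toolkit version.
BOP_COLUMNS = ["scene_id", "im_id", "obj_id", "score", "R", "t", "time"]

def check_bop_results(csv_text):
    """Return a list of human-readable problems found in BOP-style results text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != BOP_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    # Data rows start on line 2, right after the header.
    for line_no, row in enumerate(reader, start=2):
        if len((row["R"] or "").split()) != 9:
            problems.append(f"line {line_no}: R must have 9 values")
        if len((row["t"] or "").split()) != 3:
            problems.append(f"line {line_no}: t must have 3 values")
    return problems
```

To check a file on disk, read it first, for example `check_bop_results(open(path).read())`, and only run the toolkit when the returned list is empty.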

</details>

If you want to test on BOP challenge 2023 datasets, please follow the instructions below:

<details><summary>Click to expand</summary>

GigaPose's coarse predictions for the seven core datasets of BOP challenge 2023 are available at this link. Below are the steps to reproduce the results and evaluate them with the BOP toolkit.

Running coarse prediction on a single dataset:

python test.py test_dataset_name=lmo run_id=$NAME_RUN

Running refinement on a single dataset:

python refine.py test_dataset_name=lmo run_id=$NAME_RUN

Running all steps for all 7 core datasets of BOP challenge:

python -m src.scripts.eval_bop

Evaluating with BOP toolkit:

export INPUT_DIR=DIR_TO_YOUR_PREDICTION_FILE
export FILE_NAME=NAME_PREDICTION_FILE
python bop_toolkit/scripts/eval_bop19_pose.py --renderer_type=vispy --results_path $INPUT_DIR --eval_path $INPUT_DIR --result_filenames=$FILE_NAME

</details>

Pose estimation from a single image on LM-O :smiley_cat:

<p align="center"> <img src=./media/wonder3d_meshe
