CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds (ICCV 2021, Oral)

teaser

Introduction

This is the official PyTorch implementation of our paper CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds.

For more information, please visit our project page.

<span class="center"><img src="images/bmvc_ours.gif" width="45%"> <img src="images/real_drawers_ours.gif" width="45%"></span>

<p style="text-align: left; width: 90%; margin-left: 0%"><b>Result visualization on real data.</b> Our models, trained on synthetic data only, can directly generalize to real data, assuming the availability of object masks but not part masks. Left: results on a laptop trajectory from BMVC dataset. Right: results on a real drawers trajectory we captured, where a Kinova Jaco2 arm pulls out the top drawer.</p>

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{weng2021captra,
    title={CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds},
    author={Weng, Yijia and Wang, He and Zhou, Qiang and Qin, Yuzhe and Duan, Yueqi and Fan, Qingnan and Chen, Baoquan and Su, Hao and Guibas, Leonidas J.},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
    month={October},
    year={2021},
    pages={13209-13218}
}

Updates

  • [2021/04/14] Released code, data, and pretrained models for testing & evaluation.
  • [2021/04/22] Released code and data for training.
  • [2021/07/22] Our paper has been accepted by ICCV 2021 as an oral presentation!
  • [2021/10/24] Released code for visualization.

Installation

  • Our code has been tested with

    • Ubuntu 16.04, 20.04, and macOS (CPU only)
    • CUDA 11.0
    • Python 3.7.7
    • PyTorch 1.6.0
  • We recommend using Anaconda to create an environment named captra dedicated to this repository, by running the following:

    conda create -n captra python=3.7
    conda activate captra
    
  • Create a directory for code, data, and experiment checkpoints.

    mkdir captra && cd captra
    
  • Clone the repository

    git clone https://github.com/HalfSummer11/CAPTRA.git
    cd CAPTRA
    
  • Install dependencies.

    pip install -r requirements.txt
    
  • Compile the CUDA code for PointNet++ backbone.

    cd network/models/pointnet_lib
    python setup.py install
    

Datasets

  • Create a directory for all datasets under captra

    mkdir data && cd data
    
    • Make sure to point basepath in CAPTRA/configs/obj_config/obj_info_*.yml to your dataset if you put it at a different location.

NOCS-REAL275

mkdir nocs_data && cd nocs_data

Test

  • Download and unzip nocs_model_corners.tar, where the 3D bounding boxes of normalized object models are saved.

    wget http://download.cs.stanford.edu/orion/captra/nocs_model_corners.tar
    tar -xzvf nocs_model_corners.tar
    
  • Create nocs_full to hold original NOCS data. Download and unzip "Real Dataset - Test" from the original NOCS dataset, which contains 6 real test trajectories.

    mkdir nocs_full && cd nocs_full
    wget http://download.cs.stanford.edu/orion/nocs/real_test.zip
    unzip real_test.zip
    
  • Generate and run the pre-processing script

    cd CAPTRA/datasets/nocs_data/preproc_nocs
    # generate the script for data preprocessing
    # --parallel & --num_proc specify the number of parallel processes in the following step
    python generate_all.py --data_path ../../../../data/nocs_data --data_type=test_only \
    			 --parallel --num_proc=10 > nocs_preproc.sh 
    # the actual data preprocessing
    bash nocs_preproc.sh 
    
  • After the steps above, the folder should look like File Structure - Dataset Folder Structure.
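
The nocs_model_corners annotations downloaded above store the 3D bounding boxes of the normalized object models. As a rough illustration only (not the repository's actual file format), the eight corners of such a box can be recovered from a model's per-axis extents:

```python
from itertools import product

def bbox_corners(points):
    """Return the 8 corners of the axis-aligned bounding box of a 3D point set."""
    lo = [min(p[i] for p in points) for i in range(3)]  # per-axis minima
    hi = [max(p[i] for p in points) for i in range(3)]  # per-axis maxima
    # each corner independently picks the min or max along x, y, and z
    return list(product(*zip(lo, hi)))

corners = bbox_corners([(0.5, 0.2, 0.1), (-0.5, -0.2, -0.1), (0.0, 0.0, 0.0)])
```

The two extreme corners of the input set always appear among the eight returned tuples.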

Train

  • Download and unzip "CAMERA Dataset - Training/Test" and "Real Dataset - Training" from the original NOCS dataset under nocs_data/nocs_full

    # current path relative to project root (captra): data/nocs_data/nocs_full
    wget http://download.cs.stanford.edu/orion/nocs/camera_train.zip
    unzip camera_train.zip
    wget http://download.cs.stanford.edu/orion/nocs/camera_val25K.zip
    unzip camera_val25K.zip
    wget http://download.cs.stanford.edu/orion/nocs/real_train.zip
    unzip real_train.zip
    
    • By now, nocs_full should be structured as follows. Note that the depth images (*_depth.png) contain only the synthetic foreground objects; for our purpose, we need complete depth images that compose the synthetic foreground with the real background.

      nocs_full
      ├── real_test
      ├── real_train 
      ├── train	
      │   ├── 00000
      │   │   ├── 0000_color.png, 0000_coord.png, 0000_depth.png, 0000_mask.png, 0000_meta.txt
      │   │   ├── 0001_color.png, ...
      │   │   └── ...
      │   ├── 00001
      │   └── ...
      └── val				# same structure as train
      
  • Download and unzip "CAMERA Dataset - Composed_depths" from the original NOCS dataset under nocs_data.

    cd ../ # current path relative to project root (captra): data/nocs_data
    wget http://download.cs.stanford.edu/orion/nocs/camera_composed_depth.zip
    unzip camera_composed_depth.zip
    

    This will result in a folder named camera_full_depths, structured as follows.

    camera_full_depths
    ├── train	
    │   ├── 00000
    │   │   ├── 0000_composed.png # depth image containing both the synthetic
    │   │   │                     # foreground objects and the real background
    │   │   ├── 0001_composed.png # same for frame 0001
    │   │   └── ...
    │   ├── 00001
    │   └── ...
    └── val				# same structure as train
    

    Then copy-merge camera_full_depths with nocs_full.

    # merge camera_full_depth/train/????? to nocs_full/train/?????
    rsync -arv camera_full_depths/ nocs_full/
    rm -r camera_full_depths
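
On systems without rsync, the same copy-merge can be sketched with Python's standard library (paths assumed relative to data/nocs_data, as in the commands above):

```python
import shutil
from pathlib import Path

def merge_trees(src, dst):
    """Copy every file under src into dst, preserving the directory layout."""
    for f in Path(src).rglob('*'):
        if f.is_file():
            target = Path(dst) / f.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)  # copy file with metadata

# mirrors: rsync -arv camera_full_depths/ nocs_full/
if Path('camera_full_depths').exists():
    merge_trees('camera_full_depths', 'nocs_full')
```

Unlike a plain `shutil.copytree`, this merges into existing directories such as nocs_full/train/00000 instead of failing when they already exist.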
    
  • Generate and run the pre-processing script

    cd CAPTRA/datasets/nocs_data/preproc_nocs
    # generate the script for data preprocessing
    # --parallel & --num_proc specify the number of parallel processes in the following step
    python generate_all.py --data_path ../../../../data/nocs_data --data_type=all \
        --parallel --num_proc=10 > nocs_preproc_all.sh
    # the actual data preprocessing
    bash nocs_preproc_all.sh
    
  • After the steps above, the folder should look like File Structure - Dataset Folder Structure.
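
generate_all.py emits the preprocessing commands to stdout (redirected into a .sh file), and --parallel/--num_proc control how many run concurrently. We have not reproduced its internals; a generic sketch of distributing N per-trajectory commands over num_proc workers (the command string below is a placeholder, not the repository's real script name) looks like:

```python
def split_round_robin(commands, num_proc):
    """Distribute shell commands over num_proc workers, round-robin."""
    buckets = [[] for _ in range(num_proc)]
    for i, cmd in enumerate(commands):
        buckets[i % num_proc].append(cmd)
    return buckets

# hypothetical per-trajectory commands
cmds = [f"python preproc_one.py --track {i:05d}" for i in range(7)]
workers = split_round_robin(cmds, 3)  # 3 lists of 3, 2, and 2 commands
```

Each worker's list can then be written to its own script (or backgrounded with `&`), which is the effect of the generated nocs_preproc_all.sh.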

SAPIEN Synthetic Articulated Object Dataset

mkdir sapien_data && cd sapien_data

Test

  • Download and unzip object URDF models and testing trajectories

    wget http://download.cs.stanford.edu/orion/captra/sapien_urdf.tar
    wget http://download.cs.stanford.edu/orion/captra/sapien_test.tar
    tar -xzvf sapien_urdf.tar # urdf
    tar -xzvf sapien_test.tar # render_seq
    

Train

  • Download and unzip training data.

    wget http://download.cs.stanford.edu/orion/captra/sapien_train.tar
    tar -xzvf sapien_train.tar # render
    

Testing & Evaluation

Download Pretrained Model Checkpoints

  • Create a folder runs under captra for experiments

    mkdir runs && cd runs
    
  • Download our pretrained model checkpoints for the datasets above (the NOCS-REAL275 checkpoints are packaged as nocs_ckpt.tar).

  • Unzip them in runs

    tar -xzvf nocs_ckpt.tar  
    

    which should give

    runs
    ├── 1_bottle_rot 	# RotationNet for the bottle category
    ├── 1_bottle_coord 	# CoordinateNet for the bottle category
    ├── 2_bowl_rot 
    └── ...
    

Testing

  • To generate pose predictions for a certain category, run the corresponding script in CAPTRA/scripts/track (unless otherwise specified, all scripts are run from the CAPTRA root), e.g. for the bottle category from NOCS-REAL275,

    bash scripts/track/nocs/1_bottle.sh
    
  • The predicted pose will be saved under the experiment folder 1_bottle_rot (see File Structure - Experiment Folder Structure).

  • To test the tracking speed for articulated objects in SAPIEN, make sure to set --batch_size=1 in the script. You may use --dataset_length=500 to avoid running through the whole test set.

Evaluation

  • To evaluate the pose predictions produced in the previous step, uncomment and run the corresponding line in CAPTRA/scripts/eval.sh, e.g. for the bottle category from NOCS-REAL275, the corresponding line is

    python misc/eval/eval.py --config config_track.yml --obj_config obj_info_nocs.yml --obj_category=1 --experiment_dir=../runs/1_bottle_rot
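
eval.py computes the pose metrics reported in the paper. For intuition, here is a generic sketch (not the repository's exact implementation) of the standard rotation-error metric, the geodesic angle between a predicted and a ground-truth rotation matrix:

```python
import math

def rotation_error_deg(r_pred, r_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T @ R_gt) equals the elementwise product sum
    trace = sum(r_pred[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # clamp for numerical safety before acos
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90° rotation about z
err = rotation_error_deg(identity, rot_z_90)   # 90.0
```

The clamp guards against floating-point values marginally outside [-1, 1], which would otherwise make `acos` raise a domain error.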
    

Visualization

  • To visualize the pose predictions as 3D bounding boxes, run the corresponding line in CAPTRA/scripts/visualize.sh, e.g. for NOCS-REAL275, running the following will generate bounding boxes for all categories.

    python misc/visualize/visualize_tracking_nocs.py --img_path ../data/nocs_data/nocs_full/real_test --exp_path ../runs --output_path ../nocs_viz --save_fig
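
The visualization script draws the predicted 3D bounding boxes onto the RGB frames. As a minimal self-contained sketch of the underlying projection (the intrinsics below are made up, not the NOCS camera parameters), a pinhole model maps each camera-frame 3D corner to pixel coordinates:

```python
def project(points, fx, fy, cx, cy):
    """Project 3D camera-frame points (x, y, z) to pixel coordinates."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points]

# one bounding-box corner 1 m in front of the camera, hypothetical intrinsics
pixels = project([(0.1, -0.1, 1.0)], fx=600, fy=600, cx=320, cy=240)
# pixels[0] == (380.0, 180.0)
```

Drawing lines between the 12 projected corner pairs of each box yields the wireframes seen in the output figures.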
    

Training

  • To train the CoordinateNet and RotationNet for a certain category, run the corresponding script in CAPTRA/scripts/train, e.g. for the bottle category from NOCS-REAL275.
