OverlapTransformer
The code for our RAL/IROS 2022 paper:
OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition. [paper]
OverlapTransformer (OT) is a novel lightweight neural network exploiting LiDAR range images to achieve fast execution, with less than 4 ms per frame in Python and less than 2 ms per frame in C++ for LiDAR similarity estimation. It is a newer version of our previous OverlapNet, and is faster and more accurate in LiDAR-based loop closure detection and place recognition.
Developed by Junyi Ma, Xieyuanli Chen and Jun Zhang.
OverlapTransformer is not a sophisticated model, but it holds natural mathematical properties in a lightweight style for surround-view observations. It can be seamlessly integrated into any range-image-based approach as a backbone, e.g., EINet (IROS 2024). You are welcome to post results in the issues if you have tried other input types (e.g., RGBD camera, Livox, 16/32-beam LiDAR).
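The yaw-angle invariance mentioned above can be illustrated with a toy example (this is not the actual OT network, which uses a transformer and NetVLAD; it only shows the underlying idea): for a surround-view LiDAR, a yaw rotation of the sensor corresponds to a circular column shift of the range image, so any descriptor built by pooling over the horizontal (yaw) axis is unaffected by it.

```python
import numpy as np

# Toy illustration (NOT the actual OT architecture): a descriptor built
# by pooling over the horizontal (yaw) axis of a range image is invariant
# to circular column shifts, which correspond to yaw rotations of a
# surround-view LiDAR scan.
rng = np.random.default_rng(0)
range_image = rng.random((64, 900))  # H x W range image (sizes assumed)

def toy_descriptor(img):
    # Max-pool over the width (yaw) dimension -> one feature per row.
    return img.max(axis=1)

# Simulate a yaw rotation of the sensor by circularly shifting columns.
rotated = np.roll(range_image, shift=300, axis=1)

assert np.allclose(toy_descriptor(range_image), toy_descriptor(rotated))
```

The real network achieves the same property with learned features, but the intuition is the same: operations that aggregate over the full yaw axis cannot distinguish circularly shifted inputs.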
News!
[2024-06] EINet successfully integrates OT into its framework as a powerful submodule and is accepted by IROS 2024!
[2023-09] The multi-view extension of OT, CVTNet, is accepted by IEEE Transactions on Industrial Informatics (TII)! Better long-term recognition performance is available :star:
[2022-12] SeqOT is accepted by IEEE Transactions on Industrial Electronics (TIE)!
[2022-09] We further develop a sequence-enhanced version of OT named SeqOT, which can be found here.
Haomo Dataset
<img src="https://github.com/haomo-ai/OverlapTransformer/blob/master/query_database_haomo.gif" >Fig. 1 An online demo for finding the top-1 candidate with OverlapTransformer on sequences 1-1 (database) and 1-3 (query) of the Haomo Dataset.
<div align=center> <img src="https://github.com/haomo-ai/OverlapTransformer/blob/master/Haomo_Dataset/haomo_dataset.png" width="98%"/> </div>Fig. 2 The Haomo Dataset, collected by HAOMO.AI.
More details of Haomo Dataset can be found in dataset description (link).
Table of Contents
- Introduction and Haomo Dataset
- Publication
- Dependencies
- How to Use
- Datasets Used by OT
- Related Work
- License
Publication
If you use the code or the Haomo dataset in your academic work, please cite our paper (PDF):
@ARTICLE{ma2022ral,
author={Ma, Junyi and Zhang, Jun and Xu, Jintao and Ai, Rui and Gu, Weihao and Chen, Xieyuanli},
journal={IEEE Robotics and Automation Letters},
title={OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition},
year={2022},
volume={7},
number={3},
pages={6958-6965},
doi={10.1109/LRA.2022.3178797}}
Dependencies
We use pytorch-gpu for neural networks.
An NVIDIA GPU is needed for faster retrieval. OverlapTransformer is also fast enough when running the neural network on a CPU.
To use a GPU, you first need to install the NVIDIA driver and CUDA.
- CUDA installation guide: link
  We use CUDA 11.3 in our work. Other versions of CUDA are also supported, but you should choose the corresponding torch version in the Torch dependencies below.
- System dependencies:
  sudo apt-get update
  sudo apt-get install -y python3-pip python3-tk
  sudo -H pip3 install --upgrade pip
- Torch dependencies:
  Following this link, you can install the Torch dependencies with pip:
  pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
  or with conda:
  conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
- Other Python dependencies (they may also work with versions different from those in the requirements file):
  sudo -H pip3 install -r requirements.txt
How to Use
We provide training and test tutorials for KITTI sequences in this repository. The tutorials for the Haomo dataset will be released together with the complete Haomo dataset.
We recommend you follow our code and data structures as follows.
Code Structure
├── config
│ ├── config_haomo.yml
│ └── config.yml
├── modules
│ ├── loss.py
│ ├── netvlad.py
│ ├── overlap_transformer_haomo.py
│ └── overlap_transformer.py
├── test
│ ├── test_haomo_topn_prepare.py
│ ├── test_haomo_topn.py
│ ├── test_kitti00_prepare.py
│ ├── test_kitti00_PR.py
│ ├── test_kitti00_topN.py
│ ├── test_results_haomo
│ │ └── predicted_des_L2_dis_bet_traj_forward.npz (to be generated)
│ └── test_results_kitti
│ └── predicted_des_L2_dis.npz (to be generated)
├── tools
│ ├── read_all_sets.py
│ ├── read_samples_haomo.py
│ ├── read_samples.py
│ └── utils
│ ├── gen_depth_data.py
│ ├── split_train_val.py
│ └── utils.py
├── train
│ ├── training_overlap_transformer_haomo.py
│ └── training_overlap_transformer_kitti.py
├── valid
│ └── valid_seq.py
├── visualize
│ ├── des_list.npy
│ └── viz_haomo.py
└── weights
├── pretrained_overlap_transformer_haomo.pth.tar
└── pretrained_overlap_transformer.pth.tar
<!---
To use our code, you need to download the following necessary files and put them in the right positions of the structure above:
- [pretrained_overlap_transformer.pth.tar](https://drive.google.com/file/d/1FNrx9pcDa9NF7z8CFtuTWyauNkeSEFW4/view?usp=sharing): Our pretrained OT on KITTI sequences for easier evaluation.
- [des_list.npy](https://drive.google.com/file/d/13btLQiUokuSHYx229WxtcHGw49-oxmX2/view?usp=sharing): descriptors of Haomo dataset generated by our pretrained OT for visualization.
-->
Dataset Structure
In the file config.yaml, the parameters of data_root are described as follows:
data_root_folder (KITTI sequences root) follows:
├── 00
│ ├── depth_map
│ ├── 000000.png
│ ├── 000001.png
│ ├── 000002.png
│ ├── ...
│ └── overlaps
│ ├── train_set.npz
├── 01
├── 02
├── ...
├── 10
└── loop_gt_seq00_0.3overlap_inactive.npz
valid_scan_folder (KITTI sequence 02 velodyne) contains:
├── 000000.bin
├── 000001.bin
...
gt_valid_folder (KITTI sequence 02 computed overlaps) contains:
├── 02
│ ├── overlap_0.npy
│ ├── overlap_10.npy
...
You need to download or generate the following files and put them in the right positions of the structure above:
- You can find the ground truth for KITTI 00 here: loop_gt_seq00_0.3overlap_inactive.npz
- You can find gt_valid_folder for sequence 02 here.
- Since the whole KITTI sequences need a large amount of memory, we recommend you generate range images such as 00/depth_map/000000.png with the preprocessing from Overlap_Localization or its C++ version, and we will not provide these images. Please note that in OverlapTransformer, the .png images are used instead of the .npy files saved in Overlap_Localization.
- More directly, you can generate .png range images with the script from OverlapNet updated by us.
- The overlaps folder of each sequence below data_root_folder is provided by the authors of OverlapNet here. You should rename the downloaded files to train_set.npz.
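The range images referenced above are produced by a spherical projection of the point cloud. A minimal sketch of such a projection is shown below; the real preprocessing lives in the scripts linked above (e.g., tools/utils/gen_depth_data.py), and the field-of-view values here are assumptions for a 64-beam sensor, not the repo's exact settings.

```python
import numpy as np

# Hypothetical minimal sketch of a spherical projection to a range image.
# The fov_up/fov_down values are assumptions for a 64-beam LiDAR; the
# repo's own preprocessing scripts define the real parameters.
def range_projection(points, h=64, w=900, fov_up=3.0, fov_down=-25.0):
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up - fov_down
    depth = np.linalg.norm(points, axis=1)           # range of each point
    yaw = -np.arctan2(points[:, 1], points[:, 0])    # horizontal angle
    pitch = np.arcsin(points[:, 2] / depth)          # vertical angle
    u = 0.5 * (yaw / np.pi + 1.0) * w                # yaw -> column
    v = (1.0 - (pitch - fov_down) / fov) * h         # pitch -> row
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)
    img = np.full((h, w), -1.0, dtype=np.float32)    # -1 marks empty pixels
    img[v, u] = depth                                # last write wins per pixel
    return img

# Two toy points: one straight ahead, one to the side, both at 10 m range.
pts = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
img = range_projection(pts)
```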
Quick Use
For quick use, you can download our model pretrained on KITTI; the following two files should also be downloaded:
- calib_file: calibration file from KITTI 00.
- poses_file: pose file from KITTI 00.
Then you should modify demo1_config in the file config.yaml.
Run the demo by:
cd demo
python ./demo_compute_overlap_sim.py
You will see a query scan (000000.bin of KITTI 00) together with a reprojected positive sample (000005.bin of KITTI 00) and a reprojected negative sample (000015.bin of KITTI 00), along with the corresponding similarities.
<img src="https://github.com/haomo-ai/OverlapTransformer/blob/master/demo.png" width="100%" height="100%">Fig. 3 Demo for calculating overlap and similarity with our approach.
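Once descriptors have been generated, place recognition reduces to nearest-neighbour retrieval in descriptor space. A hedged sketch of top-1 retrieval with L2 distance is shown below (the function and variable names are illustrative, not the demo script's API):

```python
import numpy as np

# Hedged sketch of top-1 retrieval over global descriptors
# (illustrative names; the repo's test scripts implement the real
# evaluation, including thresholds and exclusion of recent frames).
def top1_candidate(query_desc, database_descs):
    # L2 distance between the query and every database descriptor.
    dists = np.linalg.norm(database_descs - query_desc, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

db = np.eye(4)                         # 4 toy orthogonal 4-D descriptors
q = np.array([0.9, 0.1, 0.0, 0.0])     # query close to descriptor 0
idx, d = top1_candidate(q, db)
# idx == 0: the first database descriptor is the top-1 candidate
```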
Training
In the file config.yaml, training_seqs are set for the KITTI sequences used for training.
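As a purely hypothetical illustration of what such entries might look like (the key names data_root_folder and training_seqs appear above, but the paths and sequence list here are assumptions; check config/config.yml in the repository for the real keys and defaults):

```yaml
# Hypothetical illustration only -- see config/config.yml for real values.
data_root:
  data_root_folder: /path/to/kitti/sequences/
training_seqs: ["03", "04", "05", "06", "07", "08", "09"]
```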