GeMap
[ECCV'24] Online Vectorized HD Map Construction using Geometry
Zhixin Zhang<sup>1</sup>, Yiyuan Zhang<sup>2</sup>, Xiaohan Ding<sup>3</sup>, Fusheng Jin<sup>1*</sup>, Xiangyu Yue<sup>2</sup>
<sup>1</sup>Beijing Institute of Technology, <sup>2</sup>CUHK, <sup>3</sup>Tencent AI Lab
Website | arXiv | YouTube | Bilibili | Zhihu
</div>
<div align='center'> <img src='assets/demo_x0.5.gif' alt='framework' width='88%' height='auto'></img> </div>
News
We're working on more powerful and efficient models. Please stay tuned.
- (2024/7/2) GeMap was accepted to ECCV 2024, and we released a new GeMap model with 76.0 mAP.
- (2023/12/7) We released the first version of GeMap (with pre-trained checkpoints and evaluation).
- (2023/12/7) GeMap is released on arXiv.
Motivation
- Recent efforts have built strong baselines for the online vectorized HD map construction task. However, the shapes and relations of instances in urban road systems, such as parallelism, perpendicularity, and rectangular shapes, remain under-explored.
- As the ego vehicle moves, the shape of a specific instance or the relations between two instances will remain unchanged. To accurately represent such geometric features, invariance to rigid transformation is a fundamental property.
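The invariance argument above can be checked numerically. The following is a minimal sketch, not GeMap's implementation: it describes a polyline by its edge lengths and the signed angles between consecutive edges, then verifies that both stay the same after a rotation plus translation. The helper names (`edge_lengths`, `turning_angles`, `rigid_transform`) and the rectangle coordinates are made up for illustration.

```python
import math


def edge_vectors(points):
    """Displacement vectors between consecutive polyline points."""
    return [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


def edge_lengths(points):
    """Distance descriptor: length of every edge."""
    return [math.hypot(dx, dy) for dx, dy in edge_vectors(points)]


def turning_angles(points):
    """Angle descriptor: signed angle (radians) between consecutive edges."""
    vecs = edge_vectors(points)
    angles = []
    for (ax, ay), (bx, by) in zip(vecs, vecs[1:]):
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles


def rigid_transform(points, theta, tx, ty):
    """Rotate by theta (radians) about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def all_close(a, b, tol=1e-9):
    return len(a) == len(b) and all(math.isclose(u, v, abs_tol=tol) for u, v in zip(a, b))


# Hypothetical closed 4 x 2 rectangle, e.g. the outline of a pedestrian crossing.
rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0), (0.0, 0.0)]
# The same instance seen from a moved ego frame (illustrative pose change).
moved = rigid_transform(rectangle, theta=math.radians(30), tx=5.0, ty=-3.0)

lengths_match = all_close(edge_lengths(rectangle), edge_lengths(moved))
angles_match = all_close(turning_angles(rectangle), turning_angles(moved))
print(lengths_match, angles_match)
```

The raw coordinates of `moved` differ from those of `rectangle`, while the length and angle descriptors agree up to floating-point error, which is the property the bullet above asks of a geometric representation.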
Highlights
This work contributes from two perspectives:
- GeMap achieves new state-of-the-art performance on the NuScenes and Argoverse 2 datasets. Remarkably, it reaches 71.8% mAP on the large-scale Argoverse 2 dataset, outperforming MapTR V2 by 4.4 points and surpassing the 70% mAP threshold for the first time.
- GeMap learns Euclidean shapes and relations of map instances end to end, going beyond basic perception. Specifically, we design a geometric loss based on angle and distance clues, which is robust to rigid transformations. We also decouple self-attention to independently handle Euclidean shapes and relations.
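One plausible way to picture the decoupled self-attention is as two boolean attention masks over the point tokens: one that lets a token attend only within its own instance (shape) and one that lets it attend only to other instances (relations). The sketch below is an assumption-laden illustration, not the repository's code: the function name `decoupled_masks`, the instance-major token layout, and the convention that `True` means "may attend" are all hypothetical, and the masking GeMap actually uses may differ.

```python
def decoupled_masks(num_instances, points_per_instance):
    """Build two boolean masks over flattened point tokens.

    Tokens are assumed to be laid out instance-major, so token t belongs to
    instance t // points_per_instance. In both masks, True means the query
    token (row) may attend to the key token (column). Mask conventions vary
    between libraries, so invert as needed before passing to a real layer.
    """
    n = num_instances * points_per_instance
    instance_of = [t // points_per_instance for t in range(n)]
    # Shape mask: attention stays inside a single instance.
    shape_mask = [[instance_of[q] == instance_of[k] for k in range(n)] for q in range(n)]
    # Relation mask: attention goes only to tokens of other instances.
    relation_mask = [[instance_of[q] != instance_of[k] for k in range(n)] for q in range(n)]
    return shape_mask, relation_mask


# Illustrative sizes: 2 map instances with 3 points each give 6 tokens.
shape_mask, relation_mask = decoupled_masks(num_instances=2, points_per_instance=3)
for row in shape_mask:
    print("".join("x" if allowed else "." for allowed in row))
```

With these sizes the shape mask is block-diagonal and the relation mask is its complement, so every query-key pair is handled by exactly one of the two attention passes. Note that with a single instance the relation mask has empty rows, which a real attention layer would need to guard against.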
Quantitative Results
NuScenes
| Model | Objective | Backbone | Epoch | mAP | FPS | Config / Log | Checkpoint |
| :-------: | :------: | :--: | :--: | :--: | :--: | :--: | :--: |
| GeMap | simple | R50 | 110 | 62.7 | 15.6 | config/log | model |
| GeMap | simple | Camera(R50) & LiDAR(SEC) | 110 | 66.5 | 6.8 | config/log | model |
| GeMap | full | R50 | 110 | 69.4 | 13.3 | config/log | model |
| GeMap | full | Swin-T | 110 | 72.0 | 10.0 | config/log | model |
| GeMap | full | V2-99 | 110 | 72.2 | 9.5 | config/log | model |
| GeMap | full | V2-99(DD3D) | 110 | 76.0 | 9.5 | config/log | model |
Argoverse 2
| Model | Objective | Backbone | Epoch | mAP | FPS | Config / Log | Checkpoint |
| :-------: | :------: | :--: | :--: | :--: | :--: | :--: | :--: |
| GeMap | simple | R50 | 6 | 63.9 | 13.5 | config/log | model |
| GeMap | simple | R50 | 24 | 68.2 | 13.5 | config/log | model |
| GeMap | full | R50 | 24 | 71.8 | 12.1 | config/log | model |

* All models are trained on 8 NVIDIA RTX 3090 GPUs. The speed (frames per second, FPS) is evaluated on a single RTX 3090 GPU.
Visualization Results
Comparison Video
GeMap exhibits more robust predictions in occluded and rotated scenarios, especially under rainy weather conditions.
<div align='center'> <video src='https://github.com/cnzzx/GeMap-dev/assets/71703448/f5213adb-15a3-49a4-94c1-f4fe8e43babd.mp4' width='88%' height='auto'></video> </div>
More Cases of GeMap
<div align='center'> <img src="assets/doc_pres.png" width="88%" height="auto"></img> </div>
Getting Started
TODO
- [ ] Faster implementation for inference of GeMap.
- [ ] More powerful LiDAR and Camera + LiDAR models.
- [ ] Lighter and faster models with 30+ FPS.
Acknowledgements
GeMap is based on mmdetection3d. It is also greatly inspired by the following outstanding contributions to the open-source community: LSS, GKT, Swin-Transformer, VoVNet, BEVFormer, MapTR, BeMapNet, HDMapNet.
Citation
If the paper and code help your research, please kindly cite:
@article{zhang2023online,
title={Online Vectorized HD Map Construction using Geometry},
author={Zhang, Zhixin and Zhang, Yiyuan and Ding, Xiaohan and Jin, Fusheng and Yue, Xiangyu},
journal={arXiv preprint arXiv:2312.03341},
year={2023}
}
