OPEN

[ECCV 2024] OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection


<div align="center">

OPEN

OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection

Jinghua Hou <sup>1</sup>, Tong Wang <sup>2</sup>, Xiaoqing Ye <sup>2</sup>, Zhe Liu <sup>1</sup>, Shi Gong <sup>2</sup>, Xiao Tan <sup>2</sup>,<br> Errui Ding <sup>2</sup>, Jingdong Wang <sup>2</sup>, Xiang Bai <sup>1,✉</sup> <br> <sup>1</sup> Huazhong University of Science and Technology, <sup>2</sup> Baidu Inc. <br> ✉ Corresponding author. <br>

ECCV 2024

arXiv

</div>

Abstract

Accurate depth information is crucial for enhancing the performance of multi-view 3D object detection. Despite the success of some existing multi-view 3D detectors utilizing pixel-wise depth supervision, they overlook two significant phenomena: 1) the depth supervision obtained from LiDAR points is usually distributed on the surface of the object, which is not so friendly to existing DETR-based 3D detectors due to the lack of the depth of 3D object center; 2) for distant objects, fine-grained depth estimation of the whole object is more challenging. Therefore, we argue that the object-wise depth (or 3D center of the object) is essential for accurate detection. In this paper, we propose a new multi-view 3D object detector named OPEN, whose main idea is to effectively inject object-wise depth information into the network through our proposed object-wise position embedding. Specifically, we first employ an object-wise depth encoder, which takes the pixel-wise depth map as a prior, to accurately estimate the object-wise depth. Then, we utilize the proposed object-wise position embedding to encode the object-wise depth information into the transformer decoder, thereby producing 3D object-aware features for final detection. Extensive experiments verify the effectiveness of our proposed method. Furthermore, OPEN achieves a new state-of-the-art performance with 64.4% NDS and 56.7% mAP on the nuScenes test benchmark.
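The idea of turning an object-wise depth into a position embedding can be illustrated with a small sketch. This is not the code from this repository: it assumes a pinhole camera model and swaps in a fixed sinusoidal encoding where the actual model presumably uses learned layers. The function names `backproject_center` and `sinusoidal_embedding`, the intrinsics, and the pixel and depth values are all hypothetical.

```python
import math


def backproject_center(u, v, depth, fx, fy, cx, cy):
    """Lift a 2D object center (u, v), in pixels, to a 3D point in the camera
    frame using a pinhole model and an object-wise depth in metres.
    Hypothetical helper, not part of the OPEN code base."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def sinusoidal_embedding(point, num_freqs=4, temperature=10000.0):
    """Encode each coordinate with sin/cos pairs at num_freqs frequencies.
    Assumption: a fixed sinusoidal encoding stands in for a learned embedding."""
    embedding = []
    for coord in point:
        for i in range(num_freqs):
            freq = temperature ** (-i / num_freqs)
            embedding.append(math.sin(coord * freq))
            embedding.append(math.cos(coord * freq))
    return embedding


# Illustrative values only: an object centered at pixel (400, 150), 20 m away.
center_3d = backproject_center(
    u=400.0, v=150.0, depth=20.0, fx=1000.0, fy=1000.0, cx=352.0, cy=128.0
)
embedding = sinusoidal_embedding(center_3d)
print(center_3d)
print(len(embedding))
```

The point of the sketch is that one depth value per object, placed at the object center, is enough to produce a 3D-aware embedding, whereas pixel-wise depth from LiDAR mostly describes the object surface.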

[Figure: OPEN architecture overview]

News

  • 2024.07.02: Another work of ours, SEED, has also been accepted by ECCV 2024. 🎉
  • 2024.07.02: OPEN has been accepted by ECCV 2024. 🎉

Results

  • nuScenes Val Set

    The reproduced results match or slightly exceed those reported in the paper (paper -> reproduced):

    R50: 56.4 -> 56.5 NDS, 46.5 -> 47.0 mAP

    R101: 60.6 -> 60.6 NDS, 51.6 -> 51.9 mAP

| Model | Backbone | Pretrain | Resolution | NDS | mAP | Config | Download |
|:-----:|:--------:|:--------:|:----------:|:----:|:----:|:------:|:--------:|
| OPEN | V2-99 | DD3D | 320 x 800 | 61.3 | 52.1 | config | model |
| OPEN | R50 | nuImage | 256 x 704 | 56.5 | 47.0 | config | model |
| OPEN | R101 | nuImage | 512 x 1408 | 60.6 | 51.9 | config | model |

  • nuScenes Test Set

| Model | Backbone | Pretrain | Resolution | NDS | mAP | Config | Download |
|:-----:|:--------:|:--------:|:----------:|:----:|:----:|:------:|:--------:|
| OPEN | V2-99 | DD3D | 640 x 1600 | 64.4 | 56.7 | config | model |
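For readers unfamiliar with the NDS column: the nuScenes Detection Score combines mAP with five mean true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE), clipping each error at 1. The sketch below shows that standard formula. The helper name `nds` and the input numbers are made up for illustration and are not results from this repository.

```python
def nds(mean_ap, tp_errors):
    """nuScenes Detection Score:
    NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10.
    `tp_errors` maps the five TP metric names to their mean errors."""
    expected = {"mATE", "mASE", "mAOE", "mAVE", "mAAE"}
    if set(tp_errors) != expected:
        raise ValueError(f"expected exactly these metrics: {sorted(expected)}")
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_errors = {"mATE": 0.5, "mASE": 0.5, "mAOE": 0.5, "mAVE": 0.5, "mAAE": 0.5}
example_score = nds(0.5, example_errors)
print(example_score)
```

Because mAP carries half of the weight, a model can gain NDS without gaining mAP if its localization, scale, orientation, velocity, or attribute errors shrink.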

TODO

  • [x] Release the paper.
  • [x] Release the code of OPEN.

Citation

@inproceedings{
  hou2024open,
  title={OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection},
  author={Hou, Jinghua and Wang, Tong and Ye, Xiaoqing and Liu, Zhe and Gong, Shi and Tan, Xiao and Ding, Errui and Wang, Jingdong and Bai, Xiang},
  booktitle={ECCV},
  year={2024},
}

Acknowledgements

We thank these great works and open-source repositories: 3DPPE, StreamPETR, and MMDetection3D.
