TR3D: Towards Real-Time Indoor 3D Object Detection
News:
- February, 2023. TR3D on all 3 datasets is now supported in mmdetection3d as a project.
- :fire: February, 2023. TR3D is now state-of-the-art on paperswithcode on SUN RGB-D and S3DIS.
This repository contains an implementation of TR3D, a 3D object detection method introduced in our paper:
TR3D: Towards Real-Time Indoor 3D Object Detection<br> Danila Rukhovich, Anna Vorontsova, Anton Konushin <br> Samsung AI Center Moscow <br> https://arxiv.org/abs/2302.02858
Installation
For convenience, we provide a Dockerfile.
Alternatively, you can install all required packages manually. This implementation is based on the mmdetection3d framework.
Please follow the original installation guide, getting_started.md, including the MinkowskiEngine installation, replacing open-mmlab/mmdetection3d with samsunglabs/tr3d.
Most of the TR3D-related code is located in the following files:
detectors/mink_single_stage.py,
detectors/tr3d_ff.py,
dense_heads/tr3d_head.py,
necks/tr3d_neck.py.
Getting Started
Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.
Training
To start training, run train with TR3D configs:
python tools/train.py configs/tr3d/tr3d_scannet-3d-18class.py
Testing
To test a pre-trained model, run test with TR3D configs:
python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP
Visualization
Visualizations can be created with the test script.
For better visualizations, you may set score_thr in configs to 0.3:
python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP --show \
--show-dir work_dirs/tr3d_scannet-3d-18class
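The effect of raising score_thr can be illustrated with a small, framework-independent sketch. It is not the mmdetection3d implementation: the function name `filter_by_score`, the plain-list inputs, and the strict `>` comparison are assumptions made for illustration only. The idea is that detections whose confidence does not exceed the threshold are dropped, so a value such as 0.3 leaves fewer, more confident boxes in the rendered scene.

```python
from typing import List, Sequence, Tuple


def filter_by_score(
    boxes: Sequence,
    scores: Sequence[float],
    labels: Sequence[int],
    score_thr: float = 0.3,
) -> Tuple[List, List[float], List[int]]:
    """Keep only detections whose score is strictly above score_thr.

    Hypothetical helper for illustration; the real filtering happens
    inside the detector's test-time post-processing.
    """
    keep = [i for i, s in enumerate(scores) if s > score_thr]
    return (
        [boxes[i] for i in keep],
        [scores[i] for i in keep],
        [labels[i] for i in keep],
    )
```

With a low threshold most candidate boxes survive, which is useful for computing mAP but tends to clutter visualizations.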
Models
The metrics are obtained in 5 training runs followed by 5 test runs. We report both the best and the average values (the latter are given in round brackets). Inference speed (scenes per second) is measured on a single NVIDIA RTX 4090.
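As a minimal sketch of how a "best (average)" entry can be assembled, assume the per-run metric values have already been collected into a flat list. The helper names `summarize_runs` and `format_metric` are hypothetical and are not part of this repository.

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize_runs(values: Sequence[float]) -> Tuple[float, float]:
    """Return (best, average) over all collected run results."""
    return max(values), mean(values)


def format_metric(values: Sequence[float]) -> str:
    """Format results as 'best (average)' with one decimal place."""
    best, avg = summarize_runs(values)
    return f"{best:.1f} ({avg:.1f})"
```

For example, `format_metric([70.0, 72.0, 71.0])` on these made-up values yields `72.0 (71.0)`.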
TR3D 3D Detection
| Dataset | mAP@0.25 | mAP@0.5 | Scenes <br> per sec.| Download |
|:-------:|:--------:|:-------:|:-------------------:|:--------:|
| ScanNet | 72.9 (72.0) | 59.3 (57.4) | 23.7 | model \| log \| config |
| SUN RGB-D | 67.1 (66.3) | 50.4 (49.6) | 27.5 | model \| log \| config |
| S3DIS | 74.5 (72.1) | 51.7 (47.6) | 21.0 | model \| log \| config |
RGB + PC 3D Detection on SUN RGB-D
| Model | mAP@0.25 | mAP@0.5 | Scenes <br> per sec.| Download |
|:-----:|:--------:|:-------:|:-------------------:|:--------:|
| ImVoteNet | 63.4 | - | 14.8 | instruction |
| VoteNet+FF | 64.5 (63.7) | 39.2 (38.1) | - | model \| log \| config |
| TR3D+FF | 69.4 (68.7) | 53.4 (52.4) | 17.5 | model \| log \| config |
Example Detections
<p align="center"><img src="./resources/github.png" alt="drawing" width="90%"/></p>
Citation
If you find this work useful for your research, please cite our paper:
@misc{rukhovich2023tr3d,
doi = {10.48550/ARXIV.2302.02858},
url = {https://arxiv.org/abs/2302.02858},
author = {Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
title = {TR3D: Towards Real-Time Indoor 3D Object Detection},
publisher = {arXiv},
year = {2023}
}
