Tr3d

TR3D: Towards Real-Time Indoor 3D Object Detection

News:

February 2023. TR3D on all 3 datasets is now supported in mmdetection3d as a project.
February 2023. TR3D is now state-of-the-art on paperswithcode on SUN RGB-D and S3DIS.

This repository contains an implementation of TR3D, a 3D object detection method introduced in our paper:

TR3D: Towards Real-Time Indoor 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
Samsung AI Center Moscow
https://arxiv.org/abs/2302.02858

Installation

For convenience, we provide a Dockerfile.

Alternatively, you can install all required packages manually. This implementation is based on the mmdetection3d framework. Please follow the original installation guide, getting_started.md, including the MinkowskiEngine installation, and replace open-mmlab/mmdetection3d with samsunglabs/tr3d.

Most of the TR3D-related code is located in the following files: detectors/mink_single_stage.py, detectors/tr3d_ff.py, dense_heads/tr3d_head.py, necks/tr3d_neck.py.

Getting Started

Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.

Training

To start training, run tools/train.py with a TR3D config:

python tools/train.py configs/tr3d/tr3d_scannet-3d-18class.py

Testing

To test a pre-trained model, run tools/test.py with a TR3D config:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP
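The command above pairs a config with a checkpoint stored under a work directory named after the config file. The sketch below builds that argument list in Python so the pairing does not have to be typed by hand. The helper name build_test_command and its parameters are hypothetical and not part of this repository; the directory convention is only inferred from the example command.

```python
from pathlib import Path


def build_test_command(config_path, work_root="work_dirs", checkpoint_name="latest.pth"):
    """Return the argument list for tools/test.py for one config (hypothetical helper)."""
    config = Path(config_path)
    # Assumption: the checkpoint directory is named after the config file
    # without its .py extension, as in the example command.
    checkpoint = Path(work_root) / config.stem / checkpoint_name
    return [
        "python", "tools/test.py",
        config.as_posix(), checkpoint.as_posix(),
        "--eval", "mAP",
    ]


command = build_test_command("configs/tr3d/tr3d_scannet-3d-18class.py")
print(" ".join(command))
```

The returned list can be passed directly to subprocess.run when the script is launched from the repository root.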

Visualization

Visualizations can be created with the test script. For better visualizations, you may set score_thr in configs to 0.3:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP --show \
    --show-dir work_dirs/tr3d_scannet-3d-18class
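Setting score_thr can be done by editing the config file directly. As an alternative, here is a minimal sketch of overriding it on a nested config dictionary. The helper set_score_thr, the key layout of example_cfg, and its starting value are placeholders for illustration only. Check the actual TR3D config for where score_thr lives.

```python
import copy


def set_score_thr(cfg, value):
    """Return a copy of a nested config with every 'score_thr' key set to value."""
    cfg = copy.deepcopy(cfg)  # leave the caller's config untouched

    def _walk(node):
        if isinstance(node, dict):
            for key in node:
                if key == "score_thr":
                    node[key] = value
                else:
                    _walk(node[key])
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Hypothetical config fragment; the real key layout and default may differ.
example_cfg = dict(model=dict(test_cfg=dict(score_thr=0.01)))
vis_cfg = set_score_thr(example_cfg, 0.3)
print(vis_cfg["model"]["test_cfg"]["score_thr"])
```

A higher threshold drops low-confidence boxes, which is why the rendered scenes look cleaner.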

Models

The metrics are obtained in 5 training runs followed by 5 test runs. We report both the best and the average values (the latter are given in round brackets). Inference speed (scenes per second) is measured on a single NVIDIA RTX 4090.
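To make the "best (average)" notation concrete, the sketch below formats a list of per-run scores in that style. The function name summarize_runs is hypothetical, and the scores are placeholders, not results from the paper.

```python
from statistics import mean


def summarize_runs(scores):
    """Format per-run scores as 'best (average)' with one decimal place."""
    if not scores:
        raise ValueError("scores must not be empty")
    return f"{max(scores):.1f} ({mean(scores):.1f})"


# Placeholder numbers for illustration only, not results from the paper.
placeholder_scores = [70.0, 71.0, 72.0, 71.5, 70.5]
print(summarize_runs(placeholder_scores))  # prints "72.0 (71.0)"
```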

TR3D 3D Detection

| Dataset | mAP@0.25 | mAP@0.5 | Scenes per sec. | Download |
|:-------:|:--------:|:-------:|:---------------:|:--------:|
| ScanNet | 72.9 (72.0) | 59.3 (57.4) | 23.7 | model, log, config |
| SUN RGB-D | 67.1 (66.3) | 50.4 (49.6) | 27.5 | model, log, config |
| S3DIS | 74.5 (72.1) | 51.7 (47.6) | 21.0 | model, log, config |

RGB + PC 3D Detection on SUN RGB-D

| Model | mAP@0.25 | mAP@0.5 | Scenes per sec. | Download |
|:-----:|:--------:|:-------:|:---------------:|:--------:|
| ImVoteNet | 63.4 | - | 14.8 | instruction |
| VoteNet+FF | 64.5 (63.7) | 39.2 (38.1) | - | model, log, config |
| TR3D+FF | 69.4 (68.7) | 53.4 (52.4) | 17.5 | model, log, config |
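The "Scenes per sec." column is a throughput figure. As a rough illustration of how such a number can be measured, here is a minimal timing sketch. It is not the measurement protocol used for the tables. The function scenes_per_second, the warmup parameter, and the stand-in fake_infer are all hypothetical.

```python
import time


def scenes_per_second(infer, scenes, warmup=1):
    """Return the throughput of infer over scenes, in scenes per second."""
    scenes = list(scenes)
    if not scenes:
        raise ValueError("scenes must not be empty")
    for scene in scenes[:warmup]:
        infer(scene)  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for scene in scenes:
        infer(scene)
    elapsed = time.perf_counter() - start
    return len(scenes) / elapsed


def fake_infer(scene):
    """Stand-in for a model forward pass; sleeps for about one millisecond."""
    time.sleep(0.001)


rate = scenes_per_second(fake_infer, range(10))
print(f"{rate:.1f} scenes/s")
```

With a real GPU model, asynchronous kernels need to finish before the clock is read (in PyTorch, by calling torch.cuda.synchronize()); otherwise the measured rate is too optimistic.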

Citation

If you find this work useful for your research, please cite our paper:

@misc{rukhovich2023tr3d,
  doi = {10.48550/ARXIV.2302.02858},
  url = {https://arxiv.org/abs/2302.02858},
  author = {Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
  title = {TR3D: Towards Real-Time Indoor 3D Object Detection},
  publisher = {arXiv},
  year = {2023}
}
