# 3EED: Ground Everything Everywhere in 3D

[NeurIPS 2025 Datasets and Benchmarks Track]
<p align="center"> <img src="figs/teaser.png" alt="3EED Teaser" width="90%"> </p>
## 🎯 Highlights
- Cross-Platform: First 3D grounding dataset spanning vehicle, drone, and quadruped platforms
- Large-Scale: Extensive annotated samples across diverse real-world scenarios
- Multi-Modal: Synchronized RGB, LiDAR, and language annotations
- Challenging: Complex outdoor environments with varying object densities and viewpoints
- Reproducible: Unified evaluation protocols and baseline implementations
## 📚 Citation
If you find our work helpful, please consider citing:
```bibtex
@inproceedings{li2025_3eed,
  title     = {{3EED}: Ground Everything Everywhere in {3D}},
  author    = {Rong Li and Yuhao Dong and Tianshuai Hu and Ao Liang and Youquan Liu and Dongyue Lu and Liang Pan and Lingdong Kong and Junwei Liang and Ziwei Liu},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  volume    = {38},
  year      = {2025}
}
```
## Statistics
<p align="center"> <img src="figs/statics.jpg" alt="3EED Dataset Statistics" width="90%"> </p>

📄 For detailed dataset statistics and analysis, please refer to our paper.
## 📰 News
- [2025.10] Dataset and code are now publicly available on HuggingFace and GitHub! 📦
- [2025.09] 3EED has been accepted to NeurIPS 2025 Dataset and Benchmark Track! 🎉
## 📚 Table of Contents
- Highlights
- Statistics
- News
- Table of Contents
- Installation
- Pretrained Models
- Dataset
- Quick Start
- License
- Acknowledgements
## ⚙️ Installation

### Environment Setup
We support both CUDA 11 and CUDA 12 environments. Choose the one that matches your system:
<details> <summary><b>Option 1: CUDA 11.1 Environment</b></summary>

| Component   | Version      |
|-------------|--------------|
| CUDA        | 11.1         |
| cuDNN       | 8.0.5        |
| PyTorch     | 1.9.1+cu111  |
| torchvision | 0.10.1+cu111 |
| Python      | 3.10 / 3.11  |

</details>

<details> <summary><b>Option 2: CUDA 12.4 Environment</b></summary>

| Component   | Version      |
|-------------|--------------|
| CUDA        | 12.4         |
| cuDNN       | 8.0.5        |
| PyTorch     | 2.5.1+cu124  |
| torchvision | 0.20.1+cu124 |
| Python      | 3.10 / 3.11  |

</details>

### Custom CUDA Operators
```shell
cd ops/teed_pointnet/pointnet2_batch
python setup.py develop
cd ../roiaware_pool3d
python setup.py develop
```
## 📦 Pretrained Models

### Language Encoder

Download the RoBERTa-base checkpoint from HuggingFace and move it to `data/roberta_base`.
## 💾 Dataset

### Download

Download the 3EED dataset from HuggingFace:

🔗 Dataset Link: https://huggingface.co/datasets/RRRong/3EED

### Dataset Structure

After extraction, organize your dataset as follows:
```
data/3eed/
├── drone/              # Drone platform data
│   ├── scene-0001/
│   │   ├── 0000_0/
│   │   │   ├── image.jpg
│   │   │   ├── lidar.bin
│   │   │   └── meta_info.json
│   │   └── ...
│   └── ...
├── quad/               # Quadruped platform data
│   ├── scene-0001/
│   └── ...
├── waymo/              # Vehicle platform data
│   ├── scene-0001/
│   └── ...
├── roberta_base/       # Language model weights
└── splits/             # Train/val split files
    ├── drone_train.txt
    ├── drone_val.txt
    ├── quad_train.txt
    ├── quad_val.txt
    ├── waymo_train.txt
    └── waymo_val.txt
```
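Each frame directory pairs an `image.jpg`, a `lidar.bin`, and a `meta_info.json`. As a rough reference, a minimal per-frame loader might look like the sketch below; the assumption that `lidar.bin` stores float32 points with 4 channels (x, y, z, intensity) is ours — verify it against the released data before relying on it.

```python
import json
from pathlib import Path

import numpy as np

def load_sample(frame_dir):
    """Load one 3EED frame: point cloud + metadata.

    Assumes lidar.bin holds float32 values with 4 channels per point
    (x, y, z, intensity) -- check meta_info.json for the actual layout.
    """
    frame_dir = Path(frame_dir)
    points = np.fromfile(frame_dir / "lidar.bin", dtype=np.float32).reshape(-1, 4)
    with open(frame_dir / "meta_info.json") as f:
        meta = json.load(f)
    return points, meta
```

For example, `load_sample("data/3eed/drone/scene-0001/0000_0")` would return the point cloud alongside the frame's annotation dictionary.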
## 🚀 Quick Start

### Training
Train the baseline model on different platform combinations:
```shell
# Train on all platforms (recommended for best performance)
bash scripts/train_3eed.sh

# Train on a single platform
bash scripts/train_waymo.sh   # Vehicle only
bash scripts/train_drone.sh   # Drone only
bash scripts/train_quad.sh    # Quadruped only
```
Output:
- Checkpoints: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/`
- Training logs: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/log.txt`
- TensorBoard logs: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/tensorboard/`
### Evaluation
Evaluate trained models on validation sets:
Quick Evaluation:
```shell
# Evaluate on all platforms
bash scripts/val_3eed.sh

# Evaluate on a single platform
bash scripts/val_waymo.sh   # Vehicle
bash scripts/val_drone.sh   # Drone
bash scripts/val_quad.sh    # Quadruped
```
⚠️ Before running evaluation:
- Update `--checkpoint_path` in the script to point to your trained model
- Ensure the validation dataset is downloaded and properly structured
Output:
- Results saved to: `<checkpoint_dir>/evaluation/Val_<dataset>/<timestamp>/`
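3D grounding benchmarks typically report accuracy as the fraction of predictions whose 3D IoU with the ground-truth box exceeds a threshold (e.g. 0.25 or 0.5). As an illustrative sketch only — not the official 3EED metric code, which may use yaw-rotated boxes — here is an axis-aligned 3D IoU:

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (cx, cy, cz, dx, dy, dz).

    Simplified sketch: rotated (yaw-aligned) boxes would require a
    polygon intersection in the ground plane instead.
    """
    a_min = np.array(box_a[:3]) - np.array(box_a[3:]) / 2.0
    a_max = np.array(box_a[:3]) + np.array(box_a[3:]) / 2.0
    b_min = np.array(box_b[:3]) - np.array(box_b[3:]) / 2.0
    b_max = np.array(box_b[:3]) + np.array(box_b[3:]) / 2.0
    # Overlap extent along each axis, clipped at zero for disjoint boxes.
    inter = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0.0, None)
    inter_vol = inter.prod()
    union_vol = np.prod(box_a[3:]) + np.prod(box_b[3:]) - inter_vol
    return float(inter_vol / union_vol)
```

For instance, two unit-offset 2×2×2 cubes overlap in a 1×2×2 slab, giving IoU = 4 / 12 = 1/3.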
### Visualization
Visualize predictions with 3D bounding boxes overlaid on point clouds:
```shell
# Visualize prediction results
python utils/visualize_pred.py
```
Visualization Output:
- 🟢 Ground Truth: Green bounding box
- 🔴 Prediction: Red bounding box
Output Structure:
```
visualizations/
├── waymo/
│   ├── scene-0001_frame-0000/
│   │   ├── pointcloud.ply
│   │   ├── pred/gt_bbox.ply
│   │   └── info.txt
│   └── ...
├── drone/
└── quad/
```
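Rendering a box as a `.ply` mesh requires its 8 corner vertices. A minimal sketch of that computation is below; the convention assumed here — full extents `(dx, dy, dz)` and a yaw rotation about the z (up) axis — is ours, so check it against the dataset's annotation format.

```python
import numpy as np

def box_corners(cx, cy, cz, dx, dy, dz, yaw):
    """Return the 8 corners, shape (8, 3), of a yaw-rotated 3D box."""
    # All sign combinations of the half-extents form the unit corners.
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=np.float64)
    corners = signs * (np.array([dx, dy, dz]) / 2.0)
    # Rotate about the z axis, then translate to the box center.
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return corners @ rot.T + np.array([cx, cy, cz])
```

The resulting vertex array can be written out with any PLY writer (e.g. `open3d` or `trimesh`) to reproduce the green/red boxes above.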
### Baseline Checkpoints

Baseline models and predictions are available on HuggingFace.
## 📄 License
This repository is released under the Apache 2.0 License (see LICENSE).
## 🙏 Acknowledgements
We sincerely thank the following projects and teams that made this work possible:
### Codebase & Methods
- BUTD-DETR - Bottom-Up Top-Down DETR for visual grounding
- WildRefer - Wild referring expression comprehension
### Dataset Sources
- Waymo Open Dataset - Vehicle platform data
- M3ED - Drone and quadruped platform data
### Related Projects

| :sunglasses: Awesome | Projects |
|:-:|:-|
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldbench_survey.webp"> | 3D and 4D World Modeling: A Survey<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldlens.png"> | WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/lidarcrafter.png"> | LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/drivebench.png"> | Are VLMs Ready for Autonomous Driving? A Study from Reliability, Data & Metric Perspectives<br>[[GitHub Repo](https: |