# 3EED: Ground Everything Everywhere in 3D

[NeurIPS 2025 Datasets and Benchmarks Track]
<p align="center"> <img src="figs/teaser.png" alt="3EED Teaser" width="90%"> </p>
## 🎯 Highlights
- Cross-Platform: First 3D grounding dataset spanning vehicle, drone, and quadruped platforms
- Large-Scale: Extensive annotated samples across diverse real-world scenarios
- Multi-Modal: Synchronized RGB, LiDAR, and language annotations
- Challenging: Complex outdoor environments with varying object densities and viewpoints
- Reproducible: Unified evaluation protocols and baseline implementations
## 📚 Citation
If you find our work helpful, please consider citing:
```bibtex
@inproceedings{li2025_3eed,
  title     = {{3EED}: Ground Everything Everywhere in {3D}},
  author    = {Rong Li and Yuhao Dong and Tianshuai Hu and Ao Liang and Youquan Liu and Dongyue Lu and Liang Pan and Lingdong Kong and Junwei Liang and Ziwei Liu},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  volume    = {38},
  year      = {2025}
}
```
## Statistics
<p align="center"> <img src="figs/statics.jpg" alt="3EED Dataset Statistics" width="90%"> </p>

📄 For detailed dataset statistics and analysis, please refer to our paper.
## 📰 News
- [2025.10] Dataset and code are now publicly available on HuggingFace and GitHub! 📦
- [2025.09] 3EED has been accepted to NeurIPS 2025 Dataset and Benchmark Track! 🎉
## 📚 Table of Contents
- Highlights
- Statistics
- News
- Table of Contents
- Installation
- Pretrained Models
- Dataset
- Quick Start
- License
- Acknowledgements
## ⚙️ Installation

### Environment Setup
We support both CUDA 11 and CUDA 12 environments. Choose the one that matches your system:
<details> <summary><b>Option 1: CUDA 11.1 Environment</b></summary>

| Component   | Version      |
|-------------|--------------|
| CUDA        | 11.1         |
| cuDNN       | 8.0.5        |
| PyTorch     | 1.9.1+cu111  |
| torchvision | 0.10.1+cu111 |
| Python      | 3.10 / 3.11  |

</details>

<details> <summary><b>Option 2: CUDA 12.4 Environment</b></summary>

| Component   | Version      |
|-------------|--------------|
| CUDA        | 12.4         |
| cuDNN       | 8.0.5        |
| PyTorch     | 2.5.1+cu124  |
| torchvision | 0.20.1+cu124 |
| Python      | 3.10 / 3.11  |

</details>

### Custom CUDA Operators
```shell
cd ops/teed_pointnet/pointnet2_batch
python setup.py develop
cd ../roiaware_pool3d
python setup.py develop
```
## 📦 Pretrained Models

### Language Encoder

Download the RoBERTa-base checkpoint from HuggingFace and move it to `data/roberta_base`.
## 💾 Dataset

### Download

Download the 3EED dataset from HuggingFace:

🔗 Dataset Link: https://huggingface.co/datasets/RRRong/3EED

### Dataset Structure

After extraction, organize your dataset as follows:
```
data/3eed/
├── drone/              # Drone platform data
│   ├── scene-0001/
│   │   ├── 0000_0/
│   │   │   ├── image.jpg
│   │   │   ├── lidar.bin
│   │   │   └── meta_info.json
│   │   └── ...
│   └── ...
├── quad/               # Quadruped platform data
│   ├── scene-0001/
│   └── ...
├── waymo/              # Vehicle platform data
│   ├── scene-0001/
│   └── ...
├── roberta_base/       # Language model weights
└── splits/             # Train/val split files
    ├── drone_train.txt
    ├── drone_val.txt
    ├── quad_train.txt
    ├── quad_val.txt
    ├── waymo_train.txt
    └── waymo_val.txt
```
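Each frame directory pairs an `image.jpg`, a `lidar.bin`, and a `meta_info.json`. As a rough reference, a minimal per-frame loader might look like the sketch below; the assumption that `lidar.bin` stores float32 points with 4 channels (x, y, z, intensity) is ours — verify it against the released data before relying on it.

```python
import json
from pathlib import Path

import numpy as np

def load_sample(frame_dir):
    """Load one 3EED frame: point cloud + metadata.

    Assumes lidar.bin holds float32 values with 4 channels per point
    (x, y, z, intensity) -- check meta_info.json for the actual layout.
    """
    frame_dir = Path(frame_dir)
    points = np.fromfile(frame_dir / "lidar.bin", dtype=np.float32).reshape(-1, 4)
    with open(frame_dir / "meta_info.json") as f:
        meta = json.load(f)
    return points, meta
```

For example, `load_sample("data/3eed/drone/scene-0001/0000_0")` would return the point cloud alongside the frame's annotation dictionary.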
## 🚀 Quick Start

### Training
Train the baseline model on different platform combinations:
```shell
# Train on all platforms (recommended for best performance)
bash scripts/train_3eed.sh

# Train on a single platform
bash scripts/train_waymo.sh   # Vehicle only
bash scripts/train_drone.sh   # Drone only
bash scripts/train_quad.sh    # Quadruped only
```
Output:
- Checkpoints: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/`
- Training logs: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/log.txt`
- TensorBoard logs: `logs/Train_<datasets>_Val_<datasets>/<timestamp>/tensorboard/`
### Evaluation
Evaluate trained models on validation sets:
Quick Evaluation:
```shell
# Evaluate on all platforms
bash scripts/val_3eed.sh

# Evaluate on a single platform
bash scripts/val_waymo.sh   # Vehicle
bash scripts/val_drone.sh   # Drone
bash scripts/val_quad.sh    # Quadruped
```
⚠️ Before running evaluation:
- Update `--checkpoint_path` in the script to point to your trained model
- Ensure the validation dataset is downloaded and properly structured
Output:
- Results saved to: `<checkpoint_dir>/evaluation/Val_<dataset>/<timestamp>/`
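3D grounding benchmarks typically report accuracy as the fraction of predictions whose 3D IoU with the ground-truth box exceeds a threshold (e.g. 0.25 or 0.5). As an illustrative sketch only — not the official 3EED metric code, which may use yaw-rotated boxes — here is an axis-aligned 3D IoU:

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (cx, cy, cz, dx, dy, dz).

    Simplified sketch: rotated (yaw-aligned) boxes would require a
    polygon intersection in the ground plane instead.
    """
    a_min = np.array(box_a[:3]) - np.array(box_a[3:]) / 2.0
    a_max = np.array(box_a[:3]) + np.array(box_a[3:]) / 2.0
    b_min = np.array(box_b[:3]) - np.array(box_b[3:]) / 2.0
    b_max = np.array(box_b[:3]) + np.array(box_b[3:]) / 2.0
    # Overlap extent along each axis, clipped at zero for disjoint boxes.
    inter = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0.0, None)
    inter_vol = inter.prod()
    union_vol = np.prod(box_a[3:]) + np.prod(box_b[3:]) - inter_vol
    return float(inter_vol / union_vol)
```

For instance, two unit-offset 2×2×2 cubes overlap in a 1×2×2 slab, giving IoU = 4 / 12 = 1/3.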
### Visualization
Visualize predictions with 3D bounding boxes overlaid on point clouds:
```shell
# Visualize prediction results
python utils/visualize_pred.py
```
Visualization Output:
- 🟢 Ground Truth: Green bounding box
- 🔴 Prediction: Red bounding box
Output Structure:
```
visualizations/
├── waymo/
│   ├── scene-0001_frame-0000/
│   │   ├── pointcloud.ply
│   │   ├── pred/gt_bbox.ply
│   │   └── info.txt
│   └── ...
├── drone/
└── quad/
```
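Rendering a box as a `.ply` mesh requires its 8 corner vertices. A minimal sketch of that computation is below; the convention assumed here — full extents `(dx, dy, dz)` and a yaw rotation about the z (up) axis — is ours, so check it against the dataset's annotation format.

```python
import numpy as np

def box_corners(cx, cy, cz, dx, dy, dz, yaw):
    """Return the 8 corners, shape (8, 3), of a yaw-rotated 3D box."""
    # All sign combinations of the half-extents form the unit corners.
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                     dtype=np.float64)
    corners = signs * (np.array([dx, dy, dz]) / 2.0)
    # Rotate about the z axis, then translate to the box center.
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return corners @ rot.T + np.array([cx, cy, cz])
```

The resulting vertex array can be written out with any PLY writer (e.g. `open3d` or `trimesh`) to reproduce the green/red boxes above.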
### Baseline Checkpoints

Baseline models and predictions are available on HuggingFace.
## 📄 License
This repository is released under the Apache 2.0 License (see LICENSE).
## 🙏 Acknowledgements
We sincerely thank the following projects and teams that made this work possible:
### Codebase & Methods
- BUTD-DETR - Bottom-Up Top-Down DETR for visual grounding
- WildRefer - Wild referring expression comprehension
### Dataset Sources
- Waymo Open Dataset - Vehicle platform data
- M3ED - Drone and quadruped platform data
### Related Projects

| :sunglasses: Awesome | Projects |
|:-:|:-|
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldbench_survey.webp"> | 3D and 4D World Modeling: A Survey<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldlens.png"> | WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/lidarcrafter.png"> | LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences<br>[GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/drivebench.png"> | Are VLMs Ready for Autonomous Driving? A Study from Reliability, Data & Metric Perspectives<br>[[GitHub Repo](https: |