LiDARCrafter

[AAAI 2026 Oral] LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences


English | <a href="./README_CN.md">简体中文</a>

<img src="images/crane.gif" width="12.5%" align="center">
<h1 align="center"> LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences </h1>

<a href="https://alanliang.vercel.app/" target="_blank">Ao Liang</a>
<a href="" target="_blank">Youquan Liu</a>
<a href="https://yuyang-cloud.github.io/" target="_blank">Yu Yang</a>
<a href="https://dylanorange.github.io/" target="_blank">Dongyue Lu</a>
<a href="" target="_blank">Linfeng Li</a>
<a href="https://ldkong.com/" target="_blank">Lingdong Kong</a>
<a href="" target="_blank">Huaici Zhao</a>
<a href="https://www.comp.nus.edu.sg/~ooiwt/" target="_blank">Wei Tsang Ooi</a>

<a href="https://arxiv.org/abs/2508.03692" target='_blank'> <img src="https://img.shields.io/badge/Paper-%F0%9F%93%96-darkred"> </a>
<a href="https://lidarcrafter.github.io/" target='_blank'> <img src="https://img.shields.io/badge/Project-%F0%9F%94%97-blue"> </a>
<a href="https://huggingface.co/datasets/Pi3DET/data" target='_blank'> <img src="https://img.shields.io/badge/Dataset-%F0%9F%94%97-green"> </a>
<a href="" target='_blank'> <img src="https://visitor-badge.laobi.icu/badge?page_id=lidarcrafter.toolkit"> </a>

| <img src="images/teaser.png" alt="Teaser" width="100%"> |
| :-: |

In this work, we introduce LiDARCrafter, a unified framework for 4D LiDAR generation and editing. We contribute:

- The first 4D generative world model dedicated to LiDAR data, with superior controllability and spatiotemporal consistency.
- A tri-branch, 4D-layout-conditioned pipeline that turns language into an editable 4D layout and uses it to guide temporally stable LiDAR synthesis.
- A comprehensive evaluation suite for LiDAR sequence generation, encompassing scene-level, object-level, and sequence-level metrics.
- The best single-frame and sequence-level LiDAR point cloud generation performance on nuScenes, with improved foreground quality over existing methods.

:books: Citation

If you find this work helpful for your research, please consider citing our paper:

@inproceedings{liang2026lidarcrafter,
    title     = {{LiDARCrafter}: Dynamic {4D} World Modeling from {LiDAR} Sequences},
    author    = {Ao Liang and Youquan Liu and Yu Yang and Dongyue Lu and Linfeng Li and Lingdong Kong and Huaici Zhao and Wei Tsang Ooi},
    booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
    volume    = {40},
    year      = {2026},
}

Updates

- [11.2025] - LiDARCrafter has been accepted to AAAI 2026 for Oral Presentation. :tada:
- [10.2025] - We will soon start organizing the code. All pretrained weights for evaluation can be found at Hugging Face.
- [08.2025] - The technical report of LiDARCrafter is available on arXiv.

:gear: Installation

Please configure your environment according to the version information in environment.yml.

:hotsprings: Data Preparation

Prepare the nuScenes dataset in the same way as DrivingDiffusion, then link it into the data directory:

ln -s ${ROOT_DATA_PATH} ./data/nuscenes

Run bash scripts/create_data.sh to generate:

- Info files with track and state
- Updated pkl files with scene graphs
- CLIP features of the scene graphs

The resulting file tree of the data directory looks like this:

data
├── clips
│   └── nuscenes
│       ├── obj_text_feat.pkl
│       ├── train
│       └── val
├── infos
│   ├── needed_5_framed_token.pkl
│   ├── nuscenes_dbinfos_10sweeps_withvelo.pkl
│   ├── nuscenes_infos_10sweeps_train.pkl
│   ├── nuscenes_infos_10sweeps_val.pkl
│   ├── nuscenes_infos_lidargen_train.pkl
│   ├── nuscenes_infos_lidargen_val.pkl
│   ├── nuscenes_infos_train.pkl
│   ├── nuscenes_infos_val.pkl
│   ├── nuscenes_object_classification_train.pkl
│   └── nuscenes_object_classification_val.pkl
└── nuscenes
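
Before training or evaluation, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the LiDARCrafter codebase: the helper name `find_missing` and the choice to check only for existence are assumptions made for illustration, while the relative paths are copied from the file tree above.

```python
from pathlib import Path

# Relative paths copied from the file tree above.
# The checker itself is a hypothetical helper, not a repository script.
EXPECTED_PATHS = [
    "clips/nuscenes/obj_text_feat.pkl",
    "clips/nuscenes/train",
    "clips/nuscenes/val",
    "infos/needed_5_framed_token.pkl",
    "infos/nuscenes_dbinfos_10sweeps_withvelo.pkl",
    "infos/nuscenes_infos_10sweeps_train.pkl",
    "infos/nuscenes_infos_10sweeps_val.pkl",
    "infos/nuscenes_infos_lidargen_train.pkl",
    "infos/nuscenes_infos_lidargen_val.pkl",
    "infos/nuscenes_infos_train.pkl",
    "infos/nuscenes_infos_val.pkl",
    "infos/nuscenes_object_classification_train.pkl",
    "infos/nuscenes_object_classification_val.pkl",
    "nuscenes",
]


def find_missing(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing("data")
    if missing:
        print("Missing entries under ./data:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Data layout looks complete.")
```

Note that `Path.exists()` follows symbolic links, so a `data/nuscenes` link that points to a non-existent dataset root is reported as missing.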

:rocket: Getting Started

Evaluation

Train classification model
- python train/train_classification_pointmlp.py
Train uncertainty model
- python train/train_uncertainty_glenet.py

For each model's set of 1w (10,000) generated samples:

Extract foreground samples
- python evaluation/extract_foreground_samples.py --model ori
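
The evaluation steps above can be chained from a small driver. This is a sketch under stated assumptions rather than a script shipped with the repository: the function names `build_commands` and `run_all` are hypothetical, and looping over several `--model` values is an assumption (only `ori` appears above). The script paths and the `--model ori` flag are taken from the commands listed in this section.

```python
import subprocess
import sys

# Script paths and the "--model" flag are copied from the steps above.
TRAIN_SCRIPTS = [
    "train/train_classification_pointmlp.py",
    "train/train_uncertainty_glenet.py",
]
EXTRACT_SCRIPT = "evaluation/extract_foreground_samples.py"


def build_commands(models=("ori",), python=sys.executable):
    """Return the evaluation commands in the order listed above."""
    commands = [[python, script] for script in TRAIN_SCRIPTS]
    # One foreground-extraction run per model name (hypothetical loop;
    # "ori" is the only value shown in this README).
    for model in models:
        commands.append([python, EXTRACT_SCRIPT, "--model", model])
    return commands


def run_all(models=("ori",), dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(models):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all()  # dry run: prints the commands without executing them
```

Calling `run_all(dry_run=False)` from the repository root would execute the steps in order; the default dry run only prints them, which makes it easy to review the sequence first.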

:wrench: Generation Framework

Overall Framework

| <img src="images/framework.png" alt="Framework" width="100%"> |
| :-: |

4D Layout Generation

| <img src="images/gen-4d-layout.png" alt="Example" width="100%"> |
| :-: |

Single-Frame Generation

| <img src="images/gen-single-frame.png" alt="Example" width="100%"> |
| :-: |

:snake: Model Zoo

To be updated.

:memo: TODO List

[x] Initial release. 🚀
[x] Release the training code.
[x] Release the inference code.
[x] Release the evaluation code.
[ ] Refine the Readme.md

License

This work is released under the <a rel="license" href="https://www.apache.org/licenses/LICENSE-2.0">Apache License Version 2.0</a>, while some specific implementations in this codebase might be under other licenses. Please refer to LICENSE.md for details, especially if you are using our code for commercial purposes.

Acknowledgements

This work is developed based on the MMDetection3D codebase.

<img src="https://github.com/open-mmlab/mmdetection3d/blob/main/resources/mmdet3d-logo.png" width="31%"/> MMDetection3D is an open-source toolbox based on PyTorch, towards the next-generation platform for general 3D perception. It is a part of the OpenMMLab project developed by MMLab.

Some of the benchmarked models are from the OpenPCDet and 3DTrans projects.

Related Projects

| :sunglasses: Awesome | Projects |
|:-:|:-|
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldbench_survey.webp"> | 3D and 4D World Modeling: A Survey [GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/worldlens.png"> | WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World [GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/3eed.png"> | 3EED: Ground Everything Everywhere in 3D [GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/drivebench.png"> | Are VLMs Ready for Autonomous Driving? A Study from Reliability, Data & Metric Perspectives [GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/pi3det.png"> | Perspective-Invariant 3D Object Detection [GitHub Repo] - [Project Page] - [Paper] |
| <img width="95px" src="https://github.com/ldkong1205/ldkong1205/blob/master/Images/dynamiccity.webp"> | DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes [GitHub Repo] - [Project Page] - [Paper] |
