# [ICCV 2025] Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
<div align="center">
<a href='http://tqtqliu.github.io/' target='_blank'>Tianqi Liu</a><sup>1,2,*</sup> 
<a href='https://inso-13.github.io/' target='_blank'>Zihao Huang</a><sup>1,2,*</sup> 
<a href='https://frozenburning.github.io/' target='_blank'>Zhaoxi Chen</a><sup>2</sup> 
<a href='https://wanggcong.github.io/' target='_blank'>Guangcong Wang</a><sup>3</sup> 
<a href='https://skhu101.github.io/' target='_blank'>Shoukang Hu</a><sup>2</sup> <br>
<a href='https://leoshen917.github.io/' target='_blank'>Liao Shen</a><sup>1,2</sup> 
<a href='https://huiqiang-sun.github.io/' target='_blank'>Huiqiang Sun</a><sup>1,2</sup> 
<a href='http://english.aia.hust.edu.cn/info/1030/1072.htm' target='_blank'>Zhiguo Cao</a><sup>1</sup> 
<a href='https://weivision.github.io/' target='_blank'>Wei Li</a><sup>2,†</sup> 
<a href='https://liuziwei7.github.io/' target='_blank'>Ziwei Liu</a><sup>2,†</sup>
</div>
<div align="center">
<sup>1</sup>Huazhong University of Science and Technology  <sup>2</sup>Nanyang Technological University  <sup>3</sup>Great Bay University
</div>
<div align="center">
<sup>*</sup>Equal Contribution  <sup>†</sup>Corresponding Authors
</div>
<p align="center">
<a href="https://arxiv.org/abs/2503.20785" target='_blank'><img src="http://img.shields.io/badge/arXiv-2503.20785-b31b1b?logo=arxiv&logoColor=b31b1b" alt="ArXiv"></a>
<a href="https://free4d.github.io/" target='_blank'><img src="https://img.shields.io/badge/Project-Page-red?logo=googlechrome&logoColor=red"></a>
<a href="https://youtu.be/GpHnoSczlhA"><img src="https://img.shields.io/badge/YouTube-Video-blue?logo=youtube&logoColor=blue"></a>
<a href="#"><img src="https://visitor-badge.laobi.icu/badge?page_id=TQTQliu.Free4D" alt="Visitors"></a>
</p>

TL;DR: <em>Free4D is a tuning-free framework for 4D scene generation.</em>


## 🌟 Abstract
We present Free4D, a novel tuning-free framework for 4D scene generation from a single image. Existing methods either focus on object-level generation, making scene-level generation infeasible, or rely on large-scale multi-view video datasets for expensive training, with limited generalization ability due to the scarcity of 4D scene data. In contrast, our key insight is to distill pre-trained foundation models for consistent 4D scene representation, which offers advantages such as efficiency and generalizability. 1) To achieve this, we first animate the input image using image-to-video diffusion models, followed by 4D geometric structure initialization. 2) To turn this coarse structure into spatial-temporally consistent multi-view videos, we design an adaptive guidance mechanism with a point-guided denoising strategy for spatial consistency and a novel latent replacement strategy for temporal coherence. 3) To lift these generated observations into a consistent 4D representation, we propose a modulation-based refinement that mitigates inconsistencies while fully leveraging the generated information. The resulting 4D representation enables real-time, controllable spatial-temporal rendering, marking a significant advancement in single-image-based 4D scene generation.
<div style="text-align:center"> <img src="assets/teaser.jpg" width="100%" height="100%"> </div>

## 🛠️ Installation
**Clone Free4D**

```bash
git clone https://github.com/TQTQliu/Free4D.git
cd Free4D
```
**Set up the environment**

```bash
# Create conda environment
conda create -n free4d python=3.11
conda activate free4d
pip install -r requirements.txt
pip install torch==2.4.1 torchvision==0.19.1

# Softmax-splatting
pip install git+https://github.com/Free4D/splatting

# PyTorch3D
conda install https://anaconda.org/pytorch3d/pytorch3d/0.7.8/download/linux-64/pytorch3d-0.7.8-py311_cu121_pyt241.tar.bz2

# Gaussian Splatting renderer
pip install -e lib/submodules/depth-diff-gaussian-rasterization
pip install -e lib/submodules/simple-knn

# Install COLMAP (works on headless servers)
conda install conda-forge::colmap
```
**Download pretrained models**

```bash
sh scripts/download_ckpt.sh
```
## 🚀 Usage
For a single image or text input, first use an off-the-shelf video generation model (such as KLing, Wan, or CogVideo) to obtain a single-view video, then run the following commands for 4D generation. Some pre-generated video frames are provided in `data/vc`. Below, we take the `fox` scene as an example.
```bash
# Multi-View Video Generation
sh scripts/run_mst.sh fox

# Organize data
python lib/utils/organize_mst.py -i output/vc/fox/fox -o data/gs/fox

# COLMAP
sh ./scripts/colmap.sh fox

# 4DGS training
python train_mst.py -s data/gs/fox --expname fox

# 4DGS rendering
python render.py --model_path output/gs/fox
```
The rendered multi-view video will be saved in `output/gs/fox/test`. For camera trajectory setup, please refer here.
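To process several scenes in one go, the five steps above can be wrapped in a small shell loop. This is only a sketch, not part of the official scripts: the `run` helper and the `DRY_RUN` flag are assumptions added here so the full plan can be previewed before anything is executed.

```shell
#!/usr/bin/env sh
# Dry-run wrapper: with DRY_RUN=1 (the default) each pipeline command is
# printed instead of executed, so the plan can be inspected first.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Scene names here are assumptions; each one needs frames under data/vc/<scene>.
for scene in fox; do
  run sh scripts/run_mst.sh "$scene"
  run python lib/utils/organize_mst.py -i "output/vc/$scene/$scene" -o "data/gs/$scene"
  run sh ./scripts/colmap.sh "$scene"
  run python train_mst.py -s "data/gs/$scene" --expname "$scene"
  run python render.py --model_path "output/gs/$scene"
done
```

Set `DRY_RUN=0` once the printed commands look right for your scene layout.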
Here is an alternative pipeline that does not use MonST3R, instead employing DUSt3R and optical flow.
```bash
# Multi-View Generation for the first frame
sh scripts/run_dst.sh fox

# Organize data
python lib/utils/organize_dst.py -vd data/vc/fox -mv output/vc/fox_dst/0000 -o data/gs/fox_dst

# COLMAP
sh ./scripts/colmap.sh fox_dst

# 4DGS training and Flow-guided Multi-View Video Generation
python train_dst.py -s data/gs/fox_dst --expname fox_dst

# 4DGS rendering
python render.py --model_path output/gs/fox_dst
```
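The two pipelines differ only in which generation/training scripts run and in the `_dst` suffix on the output directories, so the choice can be folded into a mode flag. The helper below is illustrative (the `mode` argument and `tag` variable are names invented here); it only prints the commands it would derive, and the `organize_*` step is omitted because its flags differ per mode, as shown above.

```shell
#!/usr/bin/env sh
# Preview the per-scene commands for either pipeline:
#   mst = MonST3R-based, dst = DUSt3R + optical flow.
scene=${1:-fox}
mode=${2:-mst}

if [ "$mode" = "dst" ]; then
  tag="${scene}_dst"   # DUSt3R outputs use a _dst suffix
else
  tag="$scene"
fi

echo "sh scripts/run_${mode}.sh $scene"
echo "sh ./scripts/colmap.sh $tag"
echo "python train_${mode}.py -s data/gs/$tag --expname $tag"
echo "python render.py --model_path output/gs/$tag"
```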
## 📚 Citation
If you find our work useful for your research, please consider citing our paper:
```bibtex
@InProceedings{Liu_2025_ICCV,
    author    = {Liu, Tianqi and Huang, Zihao and Chen, Zhaoxi and Wang, Guangcong and Hu, Shoukang and Shen, Liao and Sun, Huiqiang and Cao, Zhiguo and Li, Wei and Liu, Ziwei},
    title     = {Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {25571-25582}
}
```
## ♥️ Acknowledgement

This work is built on many amazing open-source projects, including 4DGaussians, ViewCrafter, MonST3R, DUSt3R, and VistaDream. Thanks to all the authors for their excellent contributions!
## 📧 Contact
If you have any questions, please feel free to contact Tianqi Liu <b>(tq_liu at hust.edu.cn)</b>.
