# DressRecon: Freeform 4D Human Reconstruction from Monocular Video

Website | Paper

3DV 2025 (Oral)

Jeff Tan, Donglai Xiang, Shubham Tulsiani, Deva Ramanan, Gengshan Yang

## About
DressRecon is a method for freeform 4D human reconstruction, with support for dynamic clothing and human-object interactions. Given a monocular video as input, it reconstructs a time-consistent body model, including shape, appearance, articulation of body+clothing, and 3D tracks. The software is licensed under the MIT license.
## Release Plan
- [x] Training code
- [ ] Data preprocessing scripts
- [ ] Pretrained checkpoints
## Installation

- Clone DressRecon
```bash
git clone https://github.com/jefftan969/dressrecon
cd dressrecon
```
- Create the environment
```bash
conda create -y -n dressrecon -c conda-forge python=3.9
conda activate dressrecon
pip install torch==2.4.1
conda install -y -c conda-forge absl-py numpy==1.24.4 tqdm trimesh tensorboard opencv scipy scikit-image matplotlib urdfpy networkx=3 einops imageio-ffmpeg pyrender open3d
pip install pysdf geomloss
pip install -e .

# (Optional) Visualization dependencies
pip install viser
```
- Install third-party libraries
```bash
# CUDA kernels for fast dual-quaternion skinning
pip install -e lab4d/third_party/dqtorch

# CUDA kernels for 3D Gaussian refinement
pip install -e lab4d/diffgs/third_party/simple-knn
pip install git+https://github.com/gengshan-y/gsplat-dev.git
```
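After installation, a quick stdlib-only sanity check can confirm that the dependencies above resolve. This is a hypothetical helper, not part of DressRecon; the module names are inferred from the install commands (in particular, `simple_knn` is an assumption about how simple-knn imports):

```python
# Hypothetical post-install sanity check (not part of DressRecon):
# report which of the dependencies installed above cannot be imported.
import importlib.util

def missing_modules(names):
    """Return the subset of `names` with no importable module spec."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Names inferred from the install commands above; a package's import
# name can differ from its pip/conda name.
deps = ["torch", "trimesh", "dqtorch", "simple_knn", "gsplat"]
print(missing_modules(deps))  # an empty list means everything resolved
```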
## Data

We provide two ways to obtain data: download our preprocessed data (e.g. dna-0121_02.zip), or process your own data following the instructions (coming soon!).
<details><summary>Expand to download preprocessed data for sequences in the paper:</summary><p>

Each sequence is about 1.7 GB compressed and 2.3 GB uncompressed.
- dna-0008_01.zip
- dna-0047_01.zip
- dna-0047_12.zip
- dna-0102_02.zip
- dna-0113_06.zip
- dna-0121_02.zip
- dna-0123_02.zip
- dna-0128_04.zip
- dna-0133_07.zip
- dna-0152_01.zip
- dna-0166_04.zip
- dna-0188_02.zip
- dna-0206_04.zip
- dna-0239_01.zip
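As a back-of-envelope disk budget for downloading all 14 sequences above, using the approximate per-sequence sizes quoted:

```python
# Approximate storage needed for all preprocessed sequences, at
# ~1.7 GB per zip and ~2.3 GB unzipped (per-sequence figures quoted above).
num_sequences = 14
zipped_gb = num_sequences * 1.7
unzipped_gb = num_sequences * 2.3
print(f"~{zipped_gb:.0f} GB zipped, ~{unzipped_gb:.0f} GB unzipped")
```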
To unzip preprocessed data:
```bash
mkdir -p database/processed
cd database/processed
unzip {path_to_downloaded_zip}
cd ../..
```
</p></details>

## Demo

This example shows how to reconstruct a human from a monocular video. To begin, download the preprocessed data above or process your own videos.

### Training neural fields

To optimize a body model given an input monocular video:
```bash
python lab4d/train.py --num_rounds 240 --imgs_per_gpu 96 --seqname {data_sequence_name} --logname {name_of_this_experiment}
```
On a 4090 GPU, 240 optimization rounds should take ~8-9 hours. Checkpoints are saved to logdir/{seqname}-{logname}. For faster experiments, you can pass --num_rounds 40 to train a lower-quality model that is not fully converged yet. If GPU memory is limited, reduce the per-GPU batch size and accumulate gradients instead:
```bash
python lab4d/train.py --num_rounds 240 --imgs_per_gpu 32 --grad_accum 3 --seqname {data_sequence_name} --logname {name_of_this_experiment}
```
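The gradient-accumulation variant above preserves the effective batch size of the default command: gradients are summed over several smaller batches before each optimizer step, so:

```python
# Effective batch size of the gradient-accumulation variant above:
# --imgs_per_gpu 32 with --grad_accum 3 matches --imgs_per_gpu 96.
imgs_per_gpu = 32
grad_accum = 3
effective_batch = imgs_per_gpu * grad_accum
print(effective_batch)  # 96
```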
<details><summary>Expand for a description of checkpoint contents:</summary><p>
logdir/{seqname}-{logname}
- ckpt_*.pth => (Saved model checkpoints)
- metadata.pth => (Saved dataset metadata)
- opts.log => (Command-line options)
- params.txt => (Learning rates for each optimizable parameter)
- uncertainty/*.npy => (Per-pixel uncertainty cache for weighted pixel sampling during training)
- *-fg-gauss.ply => (Body Gaussians over all optimization iterations)
- *-fg-proxy.ply => (Body shape and cameras over all optimization iterations)
- *-fg-sdf.ply => (Range of influence of the deformation fields over all optimization iterations)
</p></details>
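Given the checkpoint naming above, a small helper (hypothetical, not part of the DressRecon codebase) can locate the most recent ckpt_*.pth in a log directory; zero-padded iteration numbers make lexicographic order match training order:

```python
# Hypothetical helper: find the newest ckpt_*.pth in logdir/{seqname}-{logname}.
# Relies on zero-padded iteration numbers sorting lexicographically.
from pathlib import Path

def latest_checkpoint(logdir):
    ckpts = sorted(Path(logdir).glob("ckpt_*.pth"))
    return ckpts[-1] if ckpts else None
```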
### Exporting meshes

To extract time-consistent meshes and render the shape and body+clothing Gaussians:
```bash
python lab4d/export.py --flagfile=logdir/{seqname}-{logname}/opts.log
```
Results are saved to logdir/{seqname}-{logname}/export_0000.
The output directory structure is as follows:

logdir/{seqname}-{logname}
- export_0000
  - render-shape-*.mp4 => (Rendered time-consistent body shapes)
  - render-boneonly-*.mp4 => (Rendered body+clothing Gaussians)
  - render-bone-*.mp4 => (Body+clothing Gaussians, overlaid on top of body shape)
  - fg-mesh.ply => (Canonical shape exported as a mesh)
  - camera.json => (Saved camera intrinsics)
  - fg
    - bone/*.ply => (Time-varying body+clothing Gaussians, exported as meshes)
    - mesh/*.ply => (Time-varying body shape, exported as time-consistent meshes)
    - motion.json => (Saved camera poses and time-varying articulations)
  - renderings_proxy
    - fg.mp4 => (Bird's-eye view of cameras and body shape over all optimization iterations)
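camera.json stores the camera intrinsics and motion.json the camera poses. The exact JSON schema is not documented here, but as a generic illustration of how pinhole intrinsics map a camera-frame 3D point to pixel coordinates (fx, fy, cx, cy are standard pinhole parameters, not confirmed field names):

```python
# Generic pinhole projection sketch (parameter names are assumptions,
# not the actual camera.json schema): camera-frame point -> pixels.
def project(point, fx, fy, cx, cy):
    x, y, z = point  # z is depth along the optical axis
    return (fx * x / z + cx, fy * y / z + cy)

print(project((0.1, -0.2, 2.0), fx=500.0, fy=500.0, cx=256.0, cy=256.0))
```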
<details><summary>Expand for scripts to visualize the canonical shape, deformation by body Gaussians only, or deformation by clothing Gaussians only:</summary><p>

```bash
python lab4d/export.py --flag canonical --flagfile=logdir/{seqname}-{logname}/opts.log
python lab4d/export.py --flag body_only --flagfile=logdir/{seqname}-{logname}/opts.log
python lab4d/export.py --flag cloth_only --flagfile=logdir/{seqname}-{logname}/opts.log
```

</p></details>
### Rendering neural fields

To render RGB, normals, masks, and the other modalities described below:
```bash
python lab4d/render.py --flagfile=logdir/{seqname}-{logname}/opts.log
```
On a 4090 GPU, rendering each frame at 512x512 resolution should take ~20 seconds. Results are saved to logdir/{seqname}-{logname}/renderings_0000. For faster rendering, you can render every N-th frame by passing --stride <N> above.
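As a rough planning aid, total render time scales with the number of frames divided by the stride. This sketch uses the ~20 s/frame figure quoted above, which will vary by GPU and scene:

```python
import math

# Rough wall-clock estimate for lab4d/render.py, assuming ~20 s/frame
# at 512x512 (figure quoted above; varies by GPU and scene).
def render_minutes(num_frames, stride=1, sec_per_frame=20):
    frames_rendered = math.ceil(num_frames / stride)
    return frames_rendered * sec_per_frame / 60

print(render_minutes(300, stride=10))  # 30 frames -> 10.0 minutes
```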
The output directory structure is as follows:

logdir/{seqname}-{logname}
- renderings_0000
  - ref
    - depth.mp4 => (Rendered depth, colorized as RGB)
    - feature.mp4 => (Rendered features)
    - mask.mp4 => (Rendered mask)
    - normal.mp4 => (Rendered normal)
    - rgb.mp4 => (Rendered RGB)
<details><summary>Expand to describe additional videos that are rendered for debugging purposes:</summary><p>
logdir/{seqname}-{logname}
- renderings_0000
  - ref
    - eikonal.mp4 => (Rendered magnitude of eikonal loss)
    - gauss_mask.mp4 => (Rendered silhouette of deformation field)
    - ref_*.mp4 => (Rendered input signals, after cropping to tight bounding box and reshaping)
    - sdf.mp4 => (Rendered magnitude of signed distance field)
    - vis.mp4 => (Rendered visibility field)
    - xyz.mp4 => (Rendered world-frame canonical XYZ coordinates)
    - xyz_cam.mp4 => (Rendered camera-frame XYZ coordinates)
    - xyz_t.mp4 => (Rendered world-frame time-t XYZ coordinates)
</p></details>
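depth.mp4 above shows rendered depth colorized as RGB. The project's actual colormap is not specified here, but colorizing a depth map generally starts with a min-max normalization like this generic sketch:

```python
# Min-max normalize depth values to [0, 1] before applying a colormap.
# Generic sketch, not DressRecon's actual colorization code.
def normalize_depth(depths):
    lo, hi = min(depths), max(depths)
    if hi == lo:  # constant depth: avoid division by zero
        return [0.0 for _ in depths]
    return [(d - lo) / (hi - lo) for d in depths]

print(normalize_depth([2.0, 3.0, 4.0]))  # [0.0, 0.5, 1.0]
```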
## 3D Gaussian refinement

### Training a refined 3D Gaussian model

This step requires a pretrained model from the previous section, which we assume is located at logdir/{seqname}-{logname}. To run refinement with 3D Gaussians:
```bash
bash scripts/train_diffgs_refine.sh {seqname} {logname}
```
On a 4090 GPU, 240 optimization rounds should take ~8-9 hours. Checkpoints are saved to logdir/{seqname}-diffgs-{logname}. For faster experiments, you can use --num_rounds 40 to train a lower-quality model that is not fully converged yet. Extra flags appended to the script invocation are forwarded to training, e.g.:
```bash
bash scripts/train_diffgs_refine.sh {seqname} {logname} --imgs_per
```