# DressRecon: Freeform 4D Human Reconstruction from Monocular Video

Website | Paper

3DV 2025 (Oral)

Jeff Tan, Donglai Xiang, Shubham Tulsiani, Deva Ramanan, Gengshan Yang

## About
DressRecon is a method for freeform 4D human reconstruction, with support for dynamic clothing and human-object interactions. Given a monocular video as input, it reconstructs a time-consistent body model, including shape, appearance, articulation of body+clothing, and 3D tracks. The software is licensed under the MIT license.
## Release Plan
- [x] Training code
- [ ] Data preprocessing scripts
- [ ] Pretrained checkpoints
## Installation

- Clone DressRecon
```bash
git clone https://github.com/jefftan969/dressrecon
cd dressrecon
```
- Create the environment
```bash
conda create -y -n dressrecon -c conda-forge python=3.9
conda activate dressrecon
pip install torch==2.4.1
conda install -y -c conda-forge absl-py numpy==1.24.4 tqdm trimesh tensorboard opencv scipy scikit-image matplotlib urdfpy networkx=3 einops imageio-ffmpeg pyrender open3d
pip install pysdf geomloss
pip install -e .

# (Optional) Visualization dependencies
pip install viser
```
- Install third-party libraries
```bash
# CUDA kernels for fast dual-quaternion skinning
pip install -e lab4d/third_party/dqtorch

# CUDA kernels for 3D Gaussian refinement
pip install -e lab4d/diffgs/third_party/simple-knn
pip install git+https://github.com/gengshan-y/gsplat-dev.git
```
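After installation, a quick stdlib-only sanity check can confirm that the dependencies above resolve. This is a hypothetical helper, not part of DressRecon; the module names are inferred from the install commands (in particular, `simple_knn` is an assumption about how simple-knn imports):

```python
# Hypothetical post-install sanity check (not part of DressRecon):
# report which of the dependencies installed above cannot be imported.
import importlib.util

def missing_modules(names):
    """Return the subset of `names` with no importable module spec."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Names inferred from the install commands above; a package's import
# name can differ from its pip/conda name.
deps = ["torch", "trimesh", "dqtorch", "simple_knn", "gsplat"]
print(missing_modules(deps))  # an empty list means everything resolved
```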
## Data

We provide two ways to obtain data: download our preprocessed data (e.g. dna-0121_02.zip), or process your own data following the instructions (coming soon!).
<details><summary>Expand to download preprocessed data for sequences in the paper:</summary><p>

Each sequence is about 1.7 GB compressed and 2.3 GB uncompressed.
- dna-0008_01.zip
- dna-0047_01.zip
- dna-0047_12.zip
- dna-0102_02.zip
- dna-0113_06.zip
- dna-0121_02.zip
- dna-0123_02.zip
- dna-0128_04.zip
- dna-0133_07.zip
- dna-0152_01.zip
- dna-0166_04.zip
- dna-0188_02.zip
- dna-0206_04.zip
- dna-0239_01.zip
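As a back-of-envelope disk budget for downloading all 14 sequences above, using the approximate per-sequence sizes quoted:

```python
# Approximate storage needed for all preprocessed sequences, at
# ~1.7 GB per zip and ~2.3 GB unzipped (per-sequence figures quoted above).
num_sequences = 14
zipped_gb = num_sequences * 1.7
unzipped_gb = num_sequences * 2.3
print(f"~{zipped_gb:.0f} GB zipped, ~{unzipped_gb:.0f} GB unzipped")
```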
To unzip preprocessed data:
```bash
mkdir -p database/processed
cd database/processed
unzip {path_to_downloaded_zip}
cd ../..
```
</p></details>

## Demo

This example shows how to reconstruct a human from a monocular video. To begin, download the preprocessed data above or process your own videos.

### Training neural fields

To optimize a body model given an input monocular video:
```bash
python lab4d/train.py --num_rounds 240 --imgs_per_gpu 96 --seqname {data_sequence_name} --logname {name_of_this_experiment}
```
On a 4090 GPU, 240 optimization rounds should take ~8-9 hours. Checkpoints are saved to logdir/{seqname}-{logname}. For faster experiments, you can pass --num_rounds 40 to train a lower-quality model that is not fully converged yet. If GPU memory is limited, reduce the per-GPU batch size and accumulate gradients instead:
```bash
python lab4d/train.py --num_rounds 240 --imgs_per_gpu 32 --grad_accum 3 --seqname {data_sequence_name} --logname {name_of_this_experiment}
```
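The gradient-accumulation variant above preserves the effective batch size of the default command: gradients are summed over several smaller batches before each optimizer step, so:

```python
# Effective batch size of the gradient-accumulation variant above:
# --imgs_per_gpu 32 with --grad_accum 3 matches --imgs_per_gpu 96.
imgs_per_gpu = 32
grad_accum = 3
effective_batch = imgs_per_gpu * grad_accum
print(effective_batch)  # 96
```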
<details><summary>Expand for a description of checkpoint contents:</summary><p>
logdir/{seqname}-{logname}
- ckpt_*.pth => (Saved model checkpoints)
- metadata.pth => (Saved dataset metadata)
- opts.log => (Command-line options)
- params.txt => (Learning rates for each optimizable parameter)
- uncertainty/*.npy => (Per-pixel uncertainty cache for weighted pixel sampling during training)
- *-fg-gauss.ply => (Body Gaussians over all optimization iterations)
- *-fg-proxy.ply => (Body shape and cameras over all optimization iterations)
- *-fg-sdf.ply => (Range of influence of the deformation fields over all optimization iterations)
</p></details>
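Given the checkpoint naming above, a small helper (hypothetical, not part of the DressRecon codebase) can locate the most recent ckpt_*.pth in a log directory; zero-padded iteration numbers make lexicographic order match training order:

```python
# Hypothetical helper: find the newest ckpt_*.pth in logdir/{seqname}-{logname}.
# Relies on zero-padded iteration numbers sorting lexicographically.
from pathlib import Path

def latest_checkpoint(logdir):
    ckpts = sorted(Path(logdir).glob("ckpt_*.pth"))
    return ckpts[-1] if ckpts else None
```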
### Exporting meshes

To extract time-consistent meshes and render the shape and body+clothing Gaussians:
```bash
python lab4d/export.py --flagfile=logdir/{seqname}-{logname}/opts.log
```
Results are saved to logdir/{seqname}-{logname}/export_0000.
The output directory structure is as follows:

logdir/{seqname}-{logname}
- export_0000
  - render-shape-*.mp4 => (Rendered time-consistent body shapes)
  - render-boneonly-*.mp4 => (Rendered body+clothing Gaussians)
  - render-bone-*.mp4 => (Body+clothing Gaussians, overlaid on top of body shape)
  - fg-mesh.ply => (Canonical shape exported as a mesh)
  - camera.json => (Saved camera intrinsics)
  - fg
    - bone/*.ply => (Time-varying body+clothing Gaussians, exported as meshes)
    - mesh/*.ply => (Time-varying body shape, exported as time-consistent meshes)
    - motion.json => (Saved camera poses and time-varying articulations)
  - renderings_proxy
    - fg.mp4 => (Bird's-eye view of cameras and body shape over all optimization iterations)
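camera.json stores the camera intrinsics and motion.json the camera poses. The exact JSON schema is not documented here, but as a generic illustration of how pinhole intrinsics map a camera-frame 3D point to pixel coordinates (fx, fy, cx, cy are standard pinhole parameters, not confirmed field names):

```python
# Generic pinhole projection sketch (parameter names are assumptions,
# not the actual camera.json schema): camera-frame point -> pixels.
def project(point, fx, fy, cx, cy):
    x, y, z = point  # z is depth along the optical axis
    return (fx * x / z + cx, fy * y / z + cy)

print(project((0.1, -0.2, 2.0), fx=500.0, fy=500.0, cx=256.0, cy=256.0))
```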
<details><summary>Expand for scripts to visualize the canonical shape, deformation by body Gaussians only, or deformation by clothing Gaussians only:</summary><p>

```bash
python lab4d/export.py --flag canonical --flagfile=logdir/{seqname}-{logname}/opts.log
python lab4d/export.py --flag body_only --flagfile=logdir/{seqname}-{logname}/opts.log
python lab4d/export.py --flag cloth_only --flagfile=logdir/{seqname}-{logname}/opts.log
```

</p></details>
### Rendering neural fields

To render RGB, normals, masks, and the other modalities described below:
```bash
python lab4d/render.py --flagfile=logdir/{seqname}-{logname}/opts.log
```
On a 4090 GPU, rendering each frame at 512x512 resolution should take ~20 seconds. Results are saved to logdir/{seqname}-{logname}/renderings_0000. For faster rendering, you can render every N-th frame by passing --stride <N> above.
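As a rough planning aid, total render time scales with the number of frames divided by the stride. This sketch uses the ~20 s/frame figure quoted above, which will vary by GPU and scene:

```python
import math

# Rough wall-clock estimate for lab4d/render.py, assuming ~20 s/frame
# at 512x512 (figure quoted above; varies by GPU and scene).
def render_minutes(num_frames, stride=1, sec_per_frame=20):
    frames_rendered = math.ceil(num_frames / stride)
    return frames_rendered * sec_per_frame / 60

print(render_minutes(300, stride=10))  # 30 frames -> 10.0 minutes
```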
The output directory structure is as follows:

logdir/{seqname}-{logname}
- renderings_0000
  - ref
    - depth.mp4 => (Rendered depth, colorized as RGB)
    - feature.mp4 => (Rendered features)
    - mask.mp4 => (Rendered mask)
    - normal.mp4 => (Rendered normal)
    - rgb.mp4 => (Rendered RGB)
<details><summary>Expand to describe additional videos that are rendered for debugging purposes:</summary><p>
logdir/{seqname}-{logname}
- renderings_0000
  - ref
    - eikonal.mp4 => (Rendered magnitude of eikonal loss)
    - gauss_mask.mp4 => (Rendered silhouette of deformation field)
    - ref_*.mp4 => (Rendered input signals, after cropping to tight bounding box and reshaping)
    - sdf.mp4 => (Rendered magnitude of signed distance field)
    - vis.mp4 => (Rendered visibility field)
    - xyz.mp4 => (Rendered world-frame canonical XYZ coordinates)
    - xyz_cam.mp4 => (Rendered camera-frame XYZ coordinates)
    - xyz_t.mp4 => (Rendered world-frame time-t XYZ coordinates)
</p></details>
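depth.mp4 above shows rendered depth colorized as RGB. The project's actual colormap is not specified here, but colorizing a depth map generally starts with a min-max normalization like this generic sketch:

```python
# Min-max normalize depth values to [0, 1] before applying a colormap.
# Generic sketch, not DressRecon's actual colorization code.
def normalize_depth(depths):
    lo, hi = min(depths), max(depths)
    if hi == lo:  # constant depth: avoid division by zero
        return [0.0 for _ in depths]
    return [(d - lo) / (hi - lo) for d in depths]

print(normalize_depth([2.0, 3.0, 4.0]))  # [0.0, 0.5, 1.0]
```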
## 3D Gaussian refinement

### Training a refined 3D Gaussian model

This step requires a pretrained model from the previous section, which we assume is located at logdir/{seqname}-{logname}. To run refinement with 3D Gaussians:
```bash
bash scripts/train_diffgs_refine.sh {seqname} {logname}
```
On a 4090 GPU, 240 optimization rounds should take ~8-9 hours. Checkpoints are saved to logdir/{seqname}-diffgs-{logname}. For faster experiments, you can use --num_rounds 40 to train a lower-quality model that is not fully converged yet. Extra flags appended to the script invocation are forwarded to training, e.g.:
```bash
bash scripts/train_diffgs_refine.sh {seqname} {logname} --imgs_per
```