IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing (CVPR 2024)
Paper | Project Page
<img src="assets/teaser.png" width="800"/>

This repository contains the implementation of our paper IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing.
You can find detailed usage instructions for installation, dataset preparation, training and testing below.
If you find our code useful, please cite:
@inproceedings{WangCVPR2024,
title = {IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing},
author = {Shaofei Wang and Bo\v{z}idar Anti\'{c} and Andreas Geiger and Siyu Tang},
booktitle = {IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
year = {2024}
}
Requirements
- This repository is tested on Ubuntu 20.04/CentOS 7.9.2009 with Python 3.10, PyTorch 1.13 and CUDA 11.6.
- NVIDIA GPU with at least 24GB VRAM. NVIDIA GeForce RTX 3090 is recommended.
- GCC/C++ 8 or higher.
- `openexr` library. Can be obtained on Ubuntu via `sudo apt install openexr`.
Install
Code and SMPL Setup
- Clone the repository
git clone --recursive https://github.com/taconite/IntrinsicAvatar.git
- Download SMPL v1.0 for Python 2.7 from the SMPL website (for the male and female models), and SMPLIFY_CODE_V2.ZIP from the SMPLify website (for the neutral model). After downloading, inside `SMPL_python_v.1.0.0.zip`, the male and female models are `smpl/models/basicmodel_m_lbs_10_207_0_v1.0.0.pkl` and `smpl/models/basicModel_f_lbs_10_207_0_v1.0.0.pkl`, respectively. Inside `mpips_smplify_public_v2.zip`, the neutral model is `smplify_public/code/models/basicModel_neutral_lbs_10_207_0_v1.0.0.pkl`. Rename these `.pkl` files and copy them into `./data/SMPLX/smpl/`. Eventually, the `./data` folder should have the following structure:
data
└-- SMPLX
└-- smpl
├-- SMPL_FEMALE.pkl
├-- SMPL_MALE.pkl
└-- SMPL_NEUTRAL.pkl
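A misplaced or misnamed model file will make training fail at startup. As a quick, hypothetical sanity check (not part of the repository), you can verify the expected layout before launching:

```python
from pathlib import Path

# Check that the renamed SMPL models sit where the code expects them.
# Directory and file names are taken from the layout above.
expected = ["SMPL_FEMALE.pkl", "SMPL_MALE.pkl", "SMPL_NEUTRAL.pkl"]
smpl_dir = Path("data/SMPLX/smpl")
missing = [name for name in expected if not (smpl_dir / name).is_file()]
if missing:
    print("Missing SMPL models:", ", ".join(missing))
else:
    print("All SMPL models found.")
```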
Environment Setup
- Create a Python virtual environment via either `venv` or `conda`
- Install PyTorch >= 1.13 here based on the package management tool you are using and your CUDA version (older PyTorch versions may work but have not been tested)
- Install the tiny-cuda-nn PyTorch extension: `pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch`
- Install other packages: `pip install -r requirements.txt`
- Set `PYTHONPATH` to the current working directory: `export PYTHONPATH=${PWD}`
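Setting `PYTHONPATH` makes the modules under the repository root importable when running `launch.py`. A minimal sketch of the same effect done programmatically, for the current process only:

```python
import os
import sys

# Equivalent of `export PYTHONPATH=${PWD}` for this process: prepend the
# working directory so imports resolve relative to the repository root.
repo_root = os.getcwd()
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)
print(repo_root in sys.path)  # → True
```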
Dataset Preparation
Please follow the steps in DATASET.md.
Training
PeopleSnapshot and RANA
Training and validation use wandb for logging, which is free to use but requires online registration. If you don't want to use it, append `logger.offline=true` to your command.
To train on the male-3-casual sequence of PeopleSnapshot, use the following command:
python launch.py dataset=peoplesnapshot/male-3-casual tag=IA-male-3-casual
Checkpoints, a code snapshot, and visualizations will be saved under the directory `exp/intrinsic-avatar-male-3-casual/male-3-casual@YYYYMMDD-HHMMSS`.
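The suffix follows a `YYYYMMDD-HHMMSS` timestamp. A sketch of how such a run-directory name is assembled (the exact naming is produced by the Hydra config; the names here are illustrative):

```python
from datetime import datetime

# Build a run directory in the exp/<experiment>/<dataset>@YYYYMMDD-HHMMSS pattern.
dataset = "male-3-casual"  # illustrative; matches the sequence trained above
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
run_dir = f"exp/intrinsic-avatar-{dataset}/{dataset}@{stamp}"
print(run_dir)
```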
ZJU-MoCap
Similarly, to train on the 377 sequence of ZJU-MoCap, use the following command:
python launch.py dataset=zju-mocap/377 sampler=balanced pose_correction.dataset_length=125 pose_correction.enable_pose_correction=true tag=IA-377
This default setting trains on the 377 sequence using 125 frames from a single camera. You can also train on longer sequences with 4 cameras (with 300 frames for each camera) via the following command:
python launch.py --config-name config_long dataset=zju-mocap/377_4cam_long sampler=balanced pose_correction.dataset_length=300 pose_correction.enable_pose_correction=true tag=IA-377
Testing
To test on the male-3-casual sequence for relighting on within-distribution poses, use the following command:
python launch.py mode=test \
    resume=${PATH_TO_CKPT} \
    dataset=peoplesnapshot/male-3-casual \
    dataset.hdri_filepath=hdri_images/city.hdr \
    light=envlight_tensor \
    model.render_mode=light \
    model.global_illumination=false \
    model.samples_per_pixel=1024 \
    model.resample_light=false \
    tag=IA-male-3-casual \
    model.add_emitter=true
Here `model.render_mode=light` enables light importance sampling. For quantitative evaluation, set `model.resample_light=true` and `model.add_emitter=false`.
To test on the male-3-casual sequence for relighting on out-of-distribution poses, use the following command:
python launch.py mode=test \
resume=${PATH_TO_CKPT} \
dataset=animation/male-3-casual \
dataset.hdri_filepath=hdri_images/city.hdr \
light=envlight_tensor \
model.render_mode=light \
model.global_illumination=false \
model.samples_per_pixel=1024 \
model.resample_light=false \
tag=IA-male-3-casual \
model.add_emitter=true
NOTE: if you encounter the error `mismatched input '=' expecting <EOF>`, it is most likely because your checkpoint path contains `=` (which is the default checkpoint naming format of this repo). In that case, quote the override twice, e.g. use `'resume="${PATH_TO_CKPT}"'`. For details, please check this Hydra issue.
TODO
- [ ] Blender script to render SyntheticHuman-relit from the SyntheticHuman dataset
- [ ] Proper mesh export code
- [ ] Unified dataset loader for PeopleSnapshot (monocular), RANA/SyntheticHuman (synthetic), and ZJU (multi-view)
Acknowledgement
Our code structure is based on instant-nsr-pl. The importance sampling code (lib/nerfacc) follows the structure of NeRFAcc. The SMPL mesh visualization code (utils/smpl_renderer.py) is borrowed from NeuralBody. The LBS-based deformer code (models/deformers/fast-snarf) is borrowed from Fast-SNARF and InstantAvatar. We thank the authors of these works, which greatly facilitated the development of our project.