# MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion

The official PyTorch implementation of the paper "MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion".
<img src="assets/video_teaser.gif" width="100%" height="auto"/>

## News
📢 5/Nov/23 - First release
## Getting started
This code was tested on Ubuntu 18.04.5 LTS and requires:
- Python 3.7
- conda3 or miniconda3
- CUDA capable GPU (one is enough)
### 1. Setup environment

Setup conda env:

```shell
conda env create -f environment.yml
conda activate mas
```
<!--If you would like to use text conditioning, you can use CLIP:
pip install git+https://github.com/openai/CLIP.git -->
### 2. Get data

If you would like to train or evaluate a model, download the relevant dataset. This is not required for using the pretrained models.
Download the zip file `<dataset_name>.zip`, then extract it into `dataset/<dataset_name>/` inside the project directory. Alternatively, you can extract it to some other path and edit `datapath` in `data_loaders/<dataset_name>/config.py` to your custom path.
Note: the NBA dataset is quite large, so make sure you have enough free space on your machine.
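The download-and-extract step above can be scripted; here is a minimal sketch using Python's standard `zipfile` module (the `extract_dataset` helper is hypothetical, not part of the repo — it just follows the `dataset/<dataset_name>/` layout convention described above):

```python
import zipfile
from pathlib import Path

def extract_dataset(zip_path: str, project_root: str = ".") -> Path:
    """Extract <dataset_name>.zip into dataset/<dataset_name>/ under the project root."""
    name = Path(zip_path).stem  # "nba.zip" -> "nba"
    dest = Path(project_root) / "dataset" / name
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest
```

If you extract to a custom location instead, remember to point `datapath` in `data_loaders/<dataset_name>/config.py` at it.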
<details>
<summary><b>3D pose estimations of the NBA dataset</b></summary>

You can download the results of 3D pose estimation on the 2D NBA dataset by the supervised MotionBert and the unsupervised Elepose. You can later use them for evaluation, visualization, or for your own needs.

Download the `<method>_predictions.zip` file and unzip it into `dataset/nba/<method>_predictions/`. You can use an alternative path and change the hard-coded paths defined at the beginning of `eval/evaluate.py`.

- Using MotionBert: `motionbert_predictions.zip`
- Using Elepose: `elepose_predictions.zip`
</details>
### 3. Download the pretrained models

We supply a pretrained diffusion model for each dataset.

Download the model(s) you wish to use, then unzip them. We suggest placing them in `save/<dataset>/`.

We also supply the VAE model used for evaluation on the NBA dataset. Download it, then unzip it. We suggest placing it in `save/evaluator/<dataset>/`.
## Motion Generation
Use the diffusion model to generate 2D motions, or use MAS to generate 3D motions.
<details>
<summary><b>General generation arguments</b></summary>

- `--model_path <path/to/checkpoint.pth>` - the checkpoint file from which we load the model. The model's properties are loaded from the `args.json` file found in the same directory as the checkpoint file.
- `--num_samples <number of samples>` - number of generated samples (defaults to 10).
- `--device <index>` - CUDA device index (defaults to 0).
- `--seed <seed>` - seed for all random processes (defaults to 0).
- `--motion_length <duration>` - motion duration in seconds (defaults to 6.0).
- `--use_data` - use motion lengths and conditions sampled from the data (recommended, but requires downloading the data).
- `--show_input_motions` - plot the sampled 2D motions to a video file (only when specifying `--use_data`).
- `--output_dir <directory path>` - the path of the output directory. If not specified, a directory named `<method_name>_seed_<seed>` will be created in the same directory as the `<model_path>`.
- `--overwrite` - if a directory already exists at `<output_dir>`, overwrite it with the new results.

Run `python -m sample.generate -h` for full details.
</details>
### 2D Motion Generation

```shell
python -m sample.generate --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth
```

Note: Make sure that the `args.json` file is in the same directory as the `<model_path>`.
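Since the generation scripts resolve `args.json` relative to the checkpoint, a quick sanity check can save a confusing failure later. A sketch (the `load_model_args` helper is hypothetical, not part of the repo):

```python
import json
from pathlib import Path

def load_model_args(model_path: str) -> dict:
    """Read the args.json that must sit next to the checkpoint file."""
    args_path = Path(model_path).parent / "args.json"
    if not args_path.exists():
        raise FileNotFoundError(
            f"Expected {args_path} next to the checkpoint; "
            "unzip the whole model directory, not just the .pth file."
        )
    return json.loads(args_path.read_text())
```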
### Use MAS to generate 3D motions

```shell
python -m sample.mas --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth
```
<details>
<summary><b>MAS arguments</b></summary>
- All of the general generation arguments are available here.
- `--num_views <number of views>` - number of views used for MAS (defaults to 7).

Run `python -m sample.mas -h` for full details.

To use MAS with new view angles sampled on each iteration (Appendix C in the paper):

```shell
python -m sample.ablations --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth --ablation_name random_angles
```
</details>
<details>
<summary><b>Alternative methods</b></summary>
Generate a 3D motion with alternative methods.

Specify `--ablation_name` to control the type of method to apply. Can be one of [`mas`, `dreamfusion`, `dreamfusion_annealed`, `no_3d_noise`, `prominent_angle`, `prominent_angle_no_3d_noise`, `random_angles`].

```shell
python -m sample.ablations --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth --ablation_name dreamfusion
```
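If you script a sweep over these methods, it helps to validate the name before launching a run; a small sketch mirroring the list above (`validate_ablation` is a hypothetical helper, not part of the repo):

```python
ABLATIONS = [
    "mas", "dreamfusion", "dreamfusion_annealed", "no_3d_noise",
    "prominent_angle", "prominent_angle_no_3d_noise", "random_angles",
]

def validate_ablation(name: str) -> str:
    """Fail fast on a typo instead of deep inside the sampling loop."""
    if name not in ABLATIONS:
        raise ValueError(f"unknown ablation {name!r}; choose one of {ABLATIONS}")
    return name
```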
</details>
## Output

Running one of the 2D or 3D motion generation scripts will create a new directory `<output_dir>`, which includes:

- `results.npy` - a file with all conditions and xyz positions of the generated motions.
- `result_<index>.mp4` - a stick figure animation for each generated motion. 3D motions are plotted with a rotating camera and repeat twice (this is why there is a sharp transition in the middle):
<img src="assets/nba_example.gif" width="30%" height="auto"/> <img src="assets/horse_example.gif" width="30%" height="auto"/> <img src="assets/gymnastics_example.gif" width="30%" height="auto"/>
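`results.npy` stores a pickled object, so it must be loaded with `allow_pickle=True`. A sketch of loading it for downstream use (the exact keys inside are not documented here, so inspect them yourself; `load_results` is a hypothetical helper):

```python
import numpy as np

def load_results(path: str):
    """Load results.npy; np.save wraps a dict in a 0-d object array, so unwrap it."""
    data = np.load(path, allow_pickle=True)
    if isinstance(data, np.ndarray) and data.dtype == object and data.shape == ():
        data = data.item()
    return data
```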
You can stop here, or generate an explicit mesh using the following script.
<details>
<summary><b>Generate a SMPL mesh</b></summary>

For human motions, it is possible to create a SMPL mesh for each frame.

Download dependencies:

```shell
bash prepare/download_smpl_files.sh
```

If the script fails, create a `body_models/` directory, then download the zip file `smpl.zip` and extract it into the directory you created.

Then run the script:

```shell
python -m visualize.render_mesh --input_path /path/to/mp4/video/result_<index>.mp4 --skeleton_type <type>
```

This script outputs:

- `<file_name>_smpl_params.npy` - SMPL parameters (thetas, root translations, vertices and faces).
- `<file_name>_obj/` - a directory with a mesh per frame in `.obj` format.
Notes:

- Important - make sure that the `results.npy` file is in the same directory as the video file.
- The `.obj` files can be imported into Blender/Maya/3DS-MAX and rendered there.
- This script runs SMPLify and needs a GPU as well (can be specified with the `--device` flag).
- You have two ways to animate the sequence:
  - Use the SMPL add-on and the theta parameters saved to `<file_name>_smpl_params.npy` (we always use beta=0 and the gender-neutral model).
  - A more straightforward way is to use the mesh data itself. All meshes have the same topology (SMPL), so you just need to keyframe vertex locations. Since the OBJs do not preserve vertex order, we also save this data to the `<file_name>_smpl_params.npy` file for your convenience.

Arguments:

- `--input_path </path/to/mp4/video/result_<index>.mp4>` - specify which sample to convert to a mesh.
- `--skeleton_type <type>` - specify the skeleton type of the motion. Can be one of [`nba`, `motionbert`, `elepose`, `gymnastics`].
</details>
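If you want to drive the per-frame meshes from your own tooling, the `.obj` files can be read with a few lines of standard Python. A minimal sketch (`read_obj_vertices` is a hypothetical helper; remember the caveat that the OBJs do not preserve vertex order across frames, so use the vertices saved in `<file_name>_smpl_params.npy` when you need frame-to-frame correspondence):

```python
from pathlib import Path

def read_obj_vertices(path: str):
    """Collect the 'v x y z' lines of one frame's .obj file as (x, y, z) tuples."""
    verts = []
    for line in Path(path).read_text().splitlines():
        if line.startswith("v "):  # vertex line; faces start with "f "
            _, x, y, z = line.split()[:4]
            verts.append((float(x), float(y), float(z)))
    return verts
```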
## Train Your Own 2D Motion Diffusion Model

```shell
python -m train.train_mdm --save_dir save/nba/my_nba_diffusion_model --dataset nba
```
<details>
<summary><b>Training with evaluation</b></summary>
Add evaluation during training using the `--eval_during_training` flag.

```shell
python -m train.train_mdm --save_dir save/nba/my_nba_diffusion_model --dataset nba --eval_during_training --evaluator_path save/evaluator/nba/no_aug/checkpoint_1000000.pth --subjects "model mas" --num_views 5 --num_eval_iterations 2
```
</details>
<details>
<summary><b>Training arguments</b></summary>
See the Evaluate section for the evaluation arguments.

- `--save_dir <path>` - the directory to save results to (required).
- `--device <index>` - CUDA device index (defaults to 0).
- `--seed <seed>` - seed for all random processes (defaults to 0).
- `--lr <learning rate>` - learning rate of the gradient descent algorithm (defaults to 1e-5).
- `--resume_checkpoint <path/to/checkpoint.pth>` - resume training from an existing checkpoint.
- `--train_batch_size` - batch size during training (defaults to 32).
</details>