# MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion

The official PyTorch implementation of the paper "MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion".
<img src="assets/video_teaser.gif" width="100%" height="auto"/>

## News
📢 5/Nov/23 - First release
## Getting started
This code was tested on Ubuntu 18.04.5 LTS and requires:
- Python 3.7
- conda3 or miniconda3
- CUDA capable GPU (one is enough)
### 1. Setup environment

Setup conda env:

```shell
conda env create -f environment.yml
conda activate mas
```
<!--If you would like to use text conditioning, you can use CLIP:
pip install git+https://github.com/openai/CLIP.git -->
### 2. Get data

If you would like to train or evaluate a model, download the relevant dataset. This is not required for using the pretrained models.
Download the zip file `<dataset_name>.zip`, then extract it into `dataset/<dataset_name>/` inside the project directory. Alternatively, you can extract it to some other path and edit `datapath` in `data_loaders/<dataset_name>/config.py` to your custom path.
Note: the NBA dataset is quite large, so make sure you have enough free space on your machine.
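The download-and-extract step above can be scripted; here is a minimal sketch using Python's standard `zipfile` module (the `extract_dataset` helper is hypothetical, not part of the repo — it just follows the `dataset/<dataset_name>/` layout convention described above):

```python
import zipfile
from pathlib import Path

def extract_dataset(zip_path: str, project_root: str = ".") -> Path:
    """Extract <dataset_name>.zip into dataset/<dataset_name>/ under the project root."""
    name = Path(zip_path).stem  # "nba.zip" -> "nba"
    dest = Path(project_root) / "dataset" / name
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest
```

If you extract to a custom location instead, remember to point `datapath` in `data_loaders/<dataset_name>/config.py` at it.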
<details>
<summary><b>3D pose estimations of the NBA dataset</b></summary>

You can download the results of 3D pose estimation on the 2D NBA dataset by the supervised MotionBert and the unsupervised Elepose. You can later use them for evaluation, visualization, or for your own needs.

Download the `<method>_predictions.zip` file and unzip it into `dataset/nba/<method>_predictions/`. You can use an alternative path and change the hard-coded paths defined at the beginning of `eval/evaluate.py`.

- Using MotionBert: `motionbert_predictions.zip`
- Using Elepose: `elepose_predictions.zip`
</details>
### 3. Download the pretrained models

We supply a pretrained diffusion model for each dataset.

Download the model(s) you wish to use, then unzip them. We suggest placing them in `save/<dataset>/`.

We also supply the VAE model used for evaluation on the NBA dataset. Download it, then unzip it. We suggest placing it in `save/evaluator/<dataset>/`.
## Motion Generation
Use the diffusion model to generate 2D motions, or use MAS to generate 3D motions.
<details>
<summary><b>General generation arguments</b></summary>

- `--model_path <path/to/checkpoint.pth>` - the checkpoint file from which we load the model. The model's properties are loaded from the `args.json` file found in the same directory as the checkpoint file.
- `--num_samples <number of samples>` - number of generated samples (defaults to 10).
- `--device <index>` - CUDA device index (defaults to 0).
- `--seed <seed>` - seed for all random processes (defaults to 0).
- `--motion_length <duration>` - motion duration in seconds (defaults to 6.0).
- `--use_data` - use motion lengths and conditions sampled from the data (recommended, but requires downloading the data).
- `--show_input_motions` - plot the sampled 2D motions to a video file (only when specifying `--use_data`).
- `--output_dir <directory path>` - the path of the output directory. If not specified, a directory named `<method_name>_seed_<seed>` will be created in the same directory as the `<model_path>`.
- `--overwrite` - if a directory already exists at `<output_dir>`, overwrite it with the new results.

Run `python -m sample.generate -h` for full details.
</details>
### 2D Motion Generation

```shell
python -m sample.generate --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth
```

Note: Make sure that the `args.json` file is in the same directory as the `<model_path>`.
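Since the generation scripts resolve `args.json` relative to the checkpoint, a quick sanity check can save a confusing failure later. A sketch (the `load_model_args` helper is hypothetical, not part of the repo):

```python
import json
from pathlib import Path

def load_model_args(model_path: str) -> dict:
    """Read the args.json that must sit next to the checkpoint file."""
    args_path = Path(model_path).parent / "args.json"
    if not args_path.exists():
        raise FileNotFoundError(
            f"Expected {args_path} next to the checkpoint; "
            "unzip the whole model directory, not just the .pth file."
        )
    return json.loads(args_path.read_text())
```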
### Use MAS to generate 3D motions

```shell
python -m sample.mas --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth
```
<details>
<summary><b>MAS arguments</b></summary>
- All of the general generation arguments are available here.
- `--num_views <number of views>` - number of views used for MAS (defaults to 7).

Run `python -m sample.mas -h` for full details.

To use MAS with new view angles sampled on each iteration (Appendix C in the paper):

```shell
python -m sample.ablations --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth --ablation_name random_angles
```
</details>
<details>
<summary><b>Alternative methods</b></summary>
Generate a 3D motion with alternative methods.

Specify `--ablation_name` to control the type of method to apply. Can be one of [`mas`, `dreamfusion`, `dreamfusion_annealed`, `no_3d_noise`, `prominent_angle`, `prominent_angle_no_3d_noise`, `random_angles`].

```shell
python -m sample.ablations --model_path save/nba/nba_diffusion_model/checkpoint_500000.pth --ablation_name dreamfusion
```
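If you script a sweep over these methods, it helps to validate the name before launching a run; a small sketch mirroring the list above (`validate_ablation` is a hypothetical helper, not part of the repo):

```python
ABLATIONS = [
    "mas", "dreamfusion", "dreamfusion_annealed", "no_3d_noise",
    "prominent_angle", "prominent_angle_no_3d_noise", "random_angles",
]

def validate_ablation(name: str) -> str:
    """Fail fast on a typo instead of deep inside the sampling loop."""
    if name not in ABLATIONS:
        raise ValueError(f"unknown ablation {name!r}; choose one of {ABLATIONS}")
    return name
```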
</details>
## Output

Running one of the 2D or 3D motion generation scripts will create a new directory `<output_dir>`, which includes:

- `results.npy` - a file with all conditions and xyz positions of the generated motions.
- `result_<index>.mp4` - a stick figure animation for each generated motion. 3D motions are plotted with a rotating camera and repeat twice (this is why there is a sharp transition in the middle):
<img src="assets/nba_example.gif" width="30%" height="auto"/> <img src="assets/horse_example.gif" width="30%" height="auto"/> <img src="assets/gymnastics_example.gif" width="30%" height="auto"/>
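`results.npy` stores a pickled object, so it must be loaded with `allow_pickle=True`. A sketch of loading it for downstream use (the exact keys inside are not documented here, so inspect them yourself; `load_results` is a hypothetical helper):

```python
import numpy as np

def load_results(path: str):
    """Load results.npy; np.save wraps a dict in a 0-d object array, so unwrap it."""
    data = np.load(path, allow_pickle=True)
    if isinstance(data, np.ndarray) and data.dtype == object and data.shape == ():
        data = data.item()
    return data
```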
You can stop here, or generate an explicit mesh using the following script.
<details>
<summary><b>Generate a SMPL mesh</b></summary>

For human motions, it is possible to create a SMPL mesh for each frame.

Download dependencies:

```shell
bash prepare/download_smpl_files.sh
```

If the script fails, create a `body_models/` directory, then download the zip file `smpl.zip` and extract it into the directory you created.

Then run the script:

```shell
python -m visualize.render_mesh --input_path /path/to/mp4/video/result_<index>.mp4 --skeleton_type <type>
```

This script outputs:

- `<file_name>_smpl_params.npy` - SMPL parameters (thetas, root translations, vertices and faces).
- `<file_name>_obj/` - a directory with a mesh per frame in `.obj` format.
Notes:

- Important - make sure that the `results.npy` file is in the same directory as the video file.
- The `.obj` files can be imported into Blender/Maya/3DS-MAX and rendered there.
- This script runs SMPLify and needs a GPU as well (can be specified with the `--device` flag).
- You have two ways to animate the sequence:
  - Use the SMPL add-on and the theta parameters saved to `<file_name>_smpl_params.npy` (we always use beta=0 and the gender-neutral model).
  - A more straightforward way is to use the mesh data itself. All meshes have the same topology (SMPL), so you just need to keyframe vertex locations. Since the OBJs do not preserve vertex order, we also save this data to the `<file_name>_smpl_params.npy` file for your convenience.

Arguments:

- `--input_path </path/to/mp4/video/result_<index>.mp4>` - specify which sample to convert to a mesh.
- `--skeleton_type <type>` - specify the skeleton type of the motion. Can be one of [`nba`, `motionbert`, `elepose`, `gymnastics`].
</details>
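If you want to drive the per-frame meshes from your own tooling, the `.obj` files can be read with a few lines of standard Python. A minimal sketch (`read_obj_vertices` is a hypothetical helper; remember the caveat that the OBJs do not preserve vertex order across frames, so use the vertices saved in `<file_name>_smpl_params.npy` when you need frame-to-frame correspondence):

```python
from pathlib import Path

def read_obj_vertices(path: str):
    """Collect the 'v x y z' lines of one frame's .obj file as (x, y, z) tuples."""
    verts = []
    for line in Path(path).read_text().splitlines():
        if line.startswith("v "):  # vertex line; faces start with "f "
            _, x, y, z = line.split()[:4]
            verts.append((float(x), float(y), float(z)))
    return verts
```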
## Train Your Own 2D Motion Diffusion Model

```shell
python -m train.train_mdm --save_dir save/nba/my_nba_diffusion_model --dataset nba
```
<details>
<summary><b>Training with evaluation</b></summary>
Add evaluation during training using the `--eval_during_training` flag.

```shell
python -m train.train_mdm --save_dir save/nba/my_nba_diffusion_model --dataset nba --eval_during_training --evaluator_path save/evaluator/nba/no_aug/checkpoint_1000000.pth --subjects "model mas" --num_views 5 --num_eval_iterations 2
```
</details>
<details>
<summary><b>Training arguments</b></summary>
See the Evaluate section for the evaluation arguments.

- `--save_dir <path>` - the directory to save results to (required).
- `--device <index>` - CUDA device index (defaults to 0).
- `--seed <seed>` - seed for all random processes (defaults to 0).
- `--lr <learning rate>` - learning rate of the gradient descent algorithm (defaults to 1e-5).
- `--resume_checkpoint <path/to/checkpoint.pth>` - resume training from an existing checkpoint.
- `--train_batch_size` - batch size during training (defaults to 32).
</details>