MotionGPT
The official PyTorch implementation of the paper "MotionGPT: Human Motion Synthesis with Improved Diversity and Realism via GPT-3 Prompting".
Bibtex
If you find this code useful in your research, please cite:
```bibtex
@inproceedings{ribeiro2024motiongpt,
  title={MotionGPT: Human Motion Synthesis with Improved Diversity and Realism via GPT-3 Prompting},
  author={Ribeiro-Gomes, Jose and Cai, Tianhui and Milacski, Zolt{\'a}n A and Wu, Chen and Prakash, Aayush and Takagi, Shingo and Aubel, Amaury and Kim, Daeil and Bernardino, Alexandre and De La Torre, Fernando},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={5070--5080},
  year={2024}
}
```
NOTE: WIP
This repository currently provides instructions for inference only. Training and data-preparation instructions will follow shortly.
Getting started
This code was tested on Ubuntu 18.04 LTS and requires:
- Python 3.7
- conda3 or miniconda3
- CUDA-capable GPU (tested on an NVIDIA RTX A4000 16GB)
1. Setup environment
Install ffmpeg (if not already installed):
```shell
sudo apt update
sudo apt install ffmpeg
```
Setup conda env:
```shell
conda env create -f environment.yml
conda activate motiongpt
python -m spacy download en_core_web_sm
pip install git+https://github.com/openai/CLIP.git
pip install sentence_transformers
```
Download dependencies:
```shell
bash prepare/download_smpl_files.sh
bash prepare/download_glove.sh
bash prepare/download_t2m_evaluators.sh
```
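To quickly verify the environment before moving on, the imports below can be run inside the `motiongpt` env. This is only a sanity-check sketch (not part of the official setup) and touches nothing beyond the packages installed above:

```python
# Minimal environment sanity check (not part of the official setup);
# it only exercises the packages installed in the steps above.
import torch
import spacy
import clip
import sentence_transformers

print("CUDA available:", torch.cuda.is_available())   # a CUDA-capable GPU is expected
spacy.load("en_core_web_sm")                           # installed via `python -m spacy download`
print("CLIP models:", clip.available_models())         # CLIP from the OpenAI repo
print("sentence-transformers:", sentence_transformers.__version__)
```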
<!-- ### 2. Get data
<details>
<summary><b>Text to Motion</b></summary>
There are two paths to get the data:
(a) **Go the easy way if** you just want to generate text-to-motion (excluding editing which does require motion capture data)
(b) **Get full data** to train and evaluate the model.
#### a. The easy way (text only)
**HumanML3D** - Clone HumanML3D, then copy the data dir to our repository:
```shell
cd ..
git clone https://github.com/EricGuo5513/HumanML3D.git
unzip ./HumanML3D/HumanML3D/texts.zip -d ./HumanML3D/HumanML3D/
cp -r HumanML3D/HumanML3D motion-diffusion-model/dataset/HumanML3D
cd motion-diffusion-model
```
#### b. Full data (text + motion capture)
**HumanML3D** - Follow the instructions in [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git),
then copy the result dataset to our repository:
```shell
cp -r ../HumanML3D/HumanML3D ./dataset/HumanML3D
```
**KIT** - Download from [HumanML3D](https://github.com/EricGuo5513/HumanML3D.git) (no processing needed this time) and the place result in `./dataset/KIT-ML`
</details> -->
2. Download the pretrained models
Download the model(s) you wish to use, then unzip and place them in `./save/`.
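As a quick check that a checkpoint ended up where the generation command below expects it (the path here matches the example command in the next section; adjust it for whichever model you downloaded):

```python
# Check that a downloaded checkpoint sits under ./save/ where the example
# generation command expects it; adjust the path for your model.
from pathlib import Path

ckpt = Path("./save/mini/model000600161.pt")
print(ckpt, "found" if ckpt.exists() else "missing - unzip the pretrained model into ./save/")
```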
Motion Synthesis
<!-- ### Generate from test set prompts
```shell
python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --num_samples 10 --num_repetitions 3
```
### Generate from your text file
```shell
python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --input_text ./assets/example_text_prompts.txt
``` -->
Generate a single prompt
```shell
python -m sample.generate --model_path ./save/mini/model000600161.pt --text_prompt "greet a friend" --babel_prompt "hug"
```
<!-- ```shell
python -m sample.generate --model_path ./save/humanml_trans_enc_512/model000200000.pt --text_prompt "the person walked forward and is picking up his toolbox."
``` -->
You may also define:
- `--device` id.
- `--seed` to sample different prompts.
- `--motion_length` (text-to-motion only) in seconds (maximum is 9.8[sec]).
- `--second_llm`
Running those will get you:
- `results.npy` file with text prompts and xyz positions of the generated animation
- `sample##_rep##.mp4` - a stick figure animation for each generated motion.
It will look something like this:

You can stop here, or render the SMPL mesh using the following script.
Render SMPL mesh
To create an SMPL mesh per frame, run:
```shell
python -m visualize.render_mesh --input_path /path/to/mp4/stick/figure/file
```
This script outputs:
- `sample##_rep##_smpl_params.npy` - SMPL parameters (thetas, root translations, vertices and faces)
- `sample##_rep##_obj` - Mesh per frame in `.obj` format.
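As with `results.npy`, the SMPL parameter file can be inspected from Python. The field names in this sketch are assumptions taken from the description above (thetas, root translations, vertices, faces), so check the printed keys:

```python
# Rough sketch for inspecting a sample##_rep##_smpl_params.npy file;
# the key names below are assumptions based on the description above.
import numpy as np

params = np.load("sample00_rep00_smpl_params.npy", allow_pickle=True).item()
print("keys:", list(params.keys()))
# Hypothetical keys - adjust to the printed names:
# thetas = params["thetas"]             # per-frame SMPL pose parameters
# root_t = params["root_translation"]   # per-frame root translations
# verts  = params["vertices"]           # per-frame mesh vertices
# faces  = params["faces"]              # mesh faces
```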
Acknowledgments
This code is heavily adapted from: