# Mojito

Official repository of the paper "Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens".
## 🚀 Getting Started

### 1. Environment Setup
We tested our environment on Ubuntu 20.04 LTS and Windows 11 with CUDA 12.1.

```shell
conda create python=3.10 --name mojito
conda activate mojito
conda install pytorch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

# skip the deepspeed installation if using Windows 11
DS_BUILD_OPS=1 DS_BUILD_CUTLASS_OPS=0 DS_BUILD_RAGGED_DEVICE_OPS=0 DS_BUILD_EVOFORMER_ATTN=0 pip install deepspeed

conda install -c fvcore -c iopath -c conda-forge fvcore iopath
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install "fastapi[standard]"
```
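After installation, a quick sanity check can confirm the key packages are importable. This is a minimal sketch using only the standard library; the package list simply mirrors the install commands above:

```python
import importlib.util

# Packages the setup commands above should make importable.
# deepspeed is listed separately since it is skipped on Windows 11.
REQUIRED = ["torch", "torchvision", "torchaudio", "pytorch3d", "fastapi"]
OPTIONAL = ["deepspeed"]

def check_imports(names):
    """Map each package name to whether the import system can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

missing = [n for n, ok in check_imports(REQUIRED).items() if not ok]
if missing:
    print(f"Missing packages: {missing}")
else:
    print("All required packages found.")
```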
### 2. Prepare Body Model and Weights

Download SMPL-H (the extended SMPL+H model) and put the models under the `body_model/` folder. The structure of the `body_model/` folder should be:

```
body_model/
|--body_model.py
|--utils.py
|--smplh/
|----info.txt
|----LICENSE.txt
|----female/
|------model.npz
|----male/
|------model.npz
|----neutral/
|------model.npz
```

### 3. Download the Pretrained IMU Tokenizer Model
We release the IMU tokenizer model `mojito_imu_tokenizer.pth`. To set it up:

- Download the model checkpoint.
- Create a `checkpoints/` directory in your project if it doesn't exist.
- Place the downloaded file at `checkpoints/mojito_imu_tokenizer.pth`.
### 4. Example

Run the processing script:

```shell
python -m example --cfg configs/config_imu_tokenizer.yaml --nodebug
```
## 🏄‍♂️ Contributors
- Ziwei Shan - koyui
- Yaoyu He - TropinoneH
- Chengfeng Zhao - AfterJourney00
- Jiashen Du - ALT-JS
## 📖 Citation

If you find our code or paper helpful, please consider citing:
```bibtex
@article{shan2025mojito,
  title   = {Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens},
  author  = {Shan, Ziwei and He, Yaoyu and Du, Jiashen and Zhao, Chengfeng and Zhang, Jingyan and
             Zhang, Qixuan and Yu, Jingyi and Xu, Lan},
  journal = {arXiv preprint arXiv:},
  year    = {2025}
}
```
## Acknowledgments

Thanks to the following works that we refer to and benefit from:

- MotionGPT: the overall framework;
- Qwen2: the causal language model;
- EgoEgo: the SMPL-H body model script;
- TransPose: the data pre-processing of the TotalCapture dataset;
- SmoothNet: the SMPL pose smoother.
## Licenses
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/80x15.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
