# Mojito
Official repo for the paper "Mojito: Motion Trajectory and Intensity Control for Video Generation"
## Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He, Shuohang Wang, Jianwei Yang, Xiaoxia Wu, Yiping Wang, Kuan Wang, Zheng Zhan, Olatunji Ruwase, Yelong Shen, Xin Eric Wang
<a href='https://arxiv.org/abs/2412.08948'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://sites.google.com/view/mojito-video'><img src='https://img.shields.io/badge/Project-Page-green'></a>

## TODO
- [ ] Gradio Demo
- [ ] Huggingface setup
- [x] Release training-free DMC module
- [ ] Release training code
- [x] Release inference code
## Directional Control
### Environment Setup

```bash
bash environment.bash
```
### Inference

1. Download the checkpoint from Google Drive and place it under the root of this folder.
2. Write your text prompt in `prompts.txt`.
3. Run inference:

```bash
bash configs/inference/run_text2video.sh
```
To apply directional/trajectory guidance, specify the object name and its spatial locations (the object name must appear in your text prompt):

```
--object_phrase   # Name of the object whose moving direction/trajectory is controlled in the generated video.
--bboxes          # Input bounding boxes specifying the object's locations.
```
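To get a sense of how a handful of user-specified boxes can define a motion trajectory, the sketch below linearly interpolates `(x1, y1, x2, y2)` boxes across frames. The box format and the helper name here are illustrative assumptions, not the actual format expected by `--bboxes`.

```python
# Hypothetical sketch: expand a start and an end bounding box into a
# per-frame trajectory by linear interpolation. The normalized
# (x1, y1, x2, y2) box format is an assumption for illustration.

def interpolate_bboxes(start, end, num_frames):
    """Return num_frames boxes moving linearly from start to end."""
    boxes = []
    for t in range(num_frames):
        a = t / (num_frames - 1) if num_frames > 1 else 0.0
        boxes.append(tuple(s + a * (e - s) for s, e in zip(start, end)))
    return boxes

# Example: a box sliding left to right across a normalized frame.
traj = interpolate_bboxes((0.1, 0.4, 0.3, 0.6), (0.7, 0.4, 0.9, 0.6), 16)
```

A denser trajectory (one box per frame) gives the guidance a smoother target than two distant keyframe boxes.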
You can also specify the number of guidance steps:

```
--guidance_step   # Number of steps over which guidance is performed; more steps give stronger guidance.
```

and the guidance scale directly:

```
--loss_scale      # Higher values give stronger guidance.
```
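Putting the flags together, a full invocation might look like the sketch below. The flag names come from this README, but the concrete values (object phrase, boxes, step count, scale) are placeholders, and the exact `--bboxes` syntax may differ in the actual scripts; the snippet only assembles and prints the command rather than running it.

```shell
# Illustrative invocation (printed, not executed): the flag values and
# the bbox syntax are placeholder assumptions, not the repo's spec.
CMD="bash configs/inference/run_text2video.sh \
  --object_phrase 'a red car' \
  --bboxes '0.1,0.4,0.3,0.6;0.7,0.4,0.9,0.6' \
  --guidance_step 10 \
  --loss_scale 2.0"
echo "$CMD"
```

Increasing `--guidance_step` or `--loss_scale` should push the object more strongly along the specified boxes, at the cost of extra computation per denoising step.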
## News

- **2025.6.10** Release the inference code of DMC.
- **2025.2.10** 🚀 Release the project website!
## Citation

```bibtex
@article{he2024mojito,
  title   = {Mojito: Motion Trajectory and Intensity Control for Video Generation},
  author  = {Xuehai He and Shuohang Wang and Jianwei Yang and Xiaoxia Wu and Yiping Wang and Kuan Wang and Zheng Zhan and Olatunji Ruwase and Yelong Shen and Xin Eric Wang},
  year    = {2024},
  journal = {arXiv preprint arXiv:2412.08948}
}
```
