GenMAC
[AAAI 2026] GenMAC for Compositional Text-to-Video Generation
<img src="assets/genmac_logo.png" alt="Icon" width="30"/> GENMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration
<a href='https://karine-h.github.io/GenMAC/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2412.04440'><img src='https://img.shields.io/badge/GenMAC-Arxiv-red'></a>
Method Overview

This repository provides the inference code for GenMAC, enabling compositional text-to-video generation. Follow the steps below to set up and generate your videos.
🔨 Installation
Step 1: Install Environment
Set up the required environment using Anaconda:
conda create -n genmac python=3.9 -y
conda activate genmac
pip install -r requirements.txt
Step 2: Download Required Files
- Pretrained Model
Download the pretrained model from Hugging Face and place it at checkpoints/base_512_v2/model.ckpt:
mkdir -p checkpoints/base_512_v2
huggingface-cli download VideoCrafter/VideoCrafter2 model.ckpt --local-dir checkpoints/base_512_v2
- Tokenizer
Download the tokenizer from the ali-vilab/text-to-video-ms-1.7b repository on Hugging Face and place it in checkpoints/tokenizer:
huggingface-cli download ali-vilab/text-to-video-ms-1.7b --include "tokenizer/*" --local-dir checkpoints
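Before moving on, it can help to confirm that both downloads landed where the steps above expect them. The following is a minimal sketch, not part of the GenMAC codebase: the helper name `missing_assets` is hypothetical, and because this README does not list the tokenizer files, the check only verifies that checkpoints/tokenizer exists and is non-empty.

```python
from pathlib import Path

# Paths taken from the download steps above. The individual tokenizer files
# are not listed in this README, so only the directory itself is checked.
EXPECTED_FILE = Path("checkpoints/base_512_v2/model.ckpt")
EXPECTED_DIR = Path("checkpoints/tokenizer")


def missing_assets(root="."):
    """Return the expected paths (relative, POSIX style) that are absent under `root`."""
    root = Path(root)
    missing = []
    if not (root / EXPECTED_FILE).is_file():
        missing.append(EXPECTED_FILE.as_posix())
    tokenizer_dir = root / EXPECTED_DIR
    # An empty tokenizer directory is treated as missing, since the download
    # is expected to place at least one file there.
    if not tokenizer_dir.is_dir() or not any(tokenizer_dir.iterdir()):
        missing.append(EXPECTED_DIR.as_posix())
    return missing


if __name__ == "__main__":
    problems = missing_assets()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All expected checkpoint files are in place.")
```

Run it from the repository root; an empty result means both paths are present.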
💡 Quick Start
Step 1: Prepare API Key
GenMAC uses GPT-4o for its multi-agent collaboration. To enable access to the model:
- Add your OpenAI API key to the file utils/api_key.py.
- Ensure that your system has internet access to connect to OpenAI's servers.
Step 2: Run Inference
To generate videos, use the following command:
python scripts/run_t2v.py
Note: This process requires approximately 78 GB of GPU memory (VRAM).
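Since the memory requirement is large, a quick pre-flight check can save a failed run. The sketch below is not part of GenMAC. It queries `nvidia-smi` for each GPU's total memory and lists the GPUs that meet the threshold. It assumes the 78 GB figure can be read as roughly 78 GiB and that the memory must be available on a single GPU; neither is stated in this README. The helper names are hypothetical.

```python
import shutil
import subprocess

REQUIRED_GIB = 78  # approximate figure quoted in this README, read here as GiB (assumption)


def parse_total_memory_mib(smi_output):
    """Parse per-GPU totals (MiB, one per line) from nvidia-smi CSV output."""
    values = []
    for line in smi_output.splitlines():
        token = line.strip()
        if token.isdigit():  # skip blank or non-numeric lines such as "[N/A]"
            values.append(int(token))
    return values


def gpus_with_enough_memory(totals_mib, required_gib=REQUIRED_GIB):
    """Return indices of GPUs whose total memory is at least `required_gib` GiB."""
    required_mib = required_gib * 1024
    return [i for i, mib in enumerate(totals_mib) if mib >= required_mib]


def query_total_memory_mib():
    """Return per-GPU total memory in MiB, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return parse_total_memory_mib(result.stdout)


if __name__ == "__main__":
    totals = query_total_memory_mib()
    if totals is None:
        print("nvidia-smi not available; cannot check GPU memory.")
    else:
        suitable = gpus_with_enough_memory(totals)
        print(f"GPUs with at least {REQUIRED_GIB} GiB: {suitable or 'none'}")
```

This reports total memory rather than free memory, so other processes on the same GPU can still cause an out-of-memory error.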
Additional Configurations
- Specify the Prompt
Update the prompt text in the following file:
assets/prompt/prompt.txt
- Adjust Iterations
Modify the maximum iteration number (MAX_ITER) in Line 22 of the script (scripts/run_t2v.py), e.g.,
MAX_ITER = 5
- Set Seed
Set the seed value in Line 13 of the script (scripts/run_t2v.py), e.g.,
seed = 12345678
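For repeated runs with different settings, hand-editing the script can be replaced by a small text rewrite. The sketch below is not part of GenMAC and `set_constant` is a hypothetical helper. It assumes that `MAX_ITER` and `seed` are assigned at the start of a line in scripts/run_t2v.py, as the examples above suggest, and it raises an error instead of silently doing nothing when no such assignment is found.

```python
import re


def set_constant(source, name, value):
    """Rewrite the first line-initial `name = ...` assignment in `source`.

    Raises ValueError when no such assignment exists, so a silent no-op
    cannot be mistaken for a successful edit.
    """
    pattern = re.compile(rf"^{re.escape(name)}[ \t]*=.*$", flags=re.MULTILINE)
    # A callable replacement avoids backslash-escape processing of the new text.
    new_source, count = pattern.subn(lambda m: f"{name} = {value!r}", source, count=1)
    if count == 0:
        raise ValueError(f"no assignment to {name!r} found")
    return new_source


# Demonstration on an in-memory stand-in for the script (not its real contents).
sample = "import os\nseed = 1\nMAX_ITER = 3\n"
patched = set_constant(set_constant(sample, "MAX_ITER", 5), "seed", 12345678)
print(patched)
```

To apply it to the real script, read scripts/run_t2v.py into a string, pass it through `set_constant`, and write the result back. Keeping a backup copy of the file first is advisable.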
Output
Generated videos will be saved in the following directory:
results/<seed>_<timestamp>/iter_<MAX_ITER-1>/video/results_t2v_baseline_0/base_512_v2/
Example: If the following parameters are used:
- MAX_ITER = 5
- seed = 12345678
- Script executed on 2025-07-25-22-21-24
The generated videos will be located in:
results/12345678_2025-07-25-22-21-24/iter_4/video/results_t2v_baseline_0/base_512_v2/
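The directory pattern above can be computed programmatically, for example when collecting results from several runs. This is a minimal sketch, not code from the repository: `output_dir` is a hypothetical helper, and the timestamp format `%Y-%m-%d-%H-%M-%S` is inferred from the example rather than taken from the script.

```python
from datetime import datetime
from pathlib import PurePosixPath


def output_dir(seed, max_iter, started_at):
    """Build the results directory for the final iteration of a run.

    The final iteration is indexed MAX_ITER - 1, following the pattern above.
    """
    stamp = started_at.strftime("%Y-%m-%d-%H-%M-%S")  # format inferred from the example
    return PurePosixPath(
        "results",
        f"{seed}_{stamp}",
        f"iter_{max_iter - 1}",
        "video",
        "results_t2v_baseline_0",
        "base_512_v2",
    )


# Reproduces the example above.
print(output_dir(12345678, 5, datetime(2025, 7, 25, 22, 21, 24)))
# → results/12345678_2025-07-25-22-21-24/iter_4/video/results_t2v_baseline_0/base_512_v2
```

In practice the timestamp is set by the script at run time, so listing the results/ directory is the reliable way to find the exact folder name.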
⭐ Acknowledgements
We would like to thank the following great open-source projects and research works: LVD, VideoCrafter2.
📚 Citation
@article{huang2024genmaccompositionaltexttovideogeneration,
title={GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration},
author={Kaiyi Huang and Yukun Huang and Xuefei Ning and Zinan Lin and Yu Wang and Xihui Liu},
year={2024},
eprint={2412.04440},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.04440},
}
