PIA
[CVPR 2024] PIA: your Personalized Image Animator. Animate your images with a text prompt, combined with DreamBooth, to achieve stunning videos.
CVPR 2024 | PIA: Personalized Image Animator
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
Yiming Zhang*, Zhening Xing*, Yanhong Zeng†, Youqing Fang, Kai Chen†
(*equal contribution, †corresponding author)
<a target="_blank" href="https://huggingface.co/spaces/Leoxing/PIA">
<img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg" alt="Open in Hugging Face"/>
</a>
PIA is a personalized image animation method which can generate videos with high motion controllability and strong text and image alignment.
If you find our project helpful, please give it a star :star: or cite it; we would be very grateful :sparkling_heart:.
<img src="__assets__/image_animation/teaser/teaser.gif">

What's New
- [x] 2024/01/03: Replicate Demo & API support!
- [x] 2024/01/03: Colab support from camenduru!
- [x] 2023/12/28: Support scaled_dot_product_attention for 1024x1024 images with just 16GB of GPU memory.
- [x] 2023/12/25: HuggingFace demo is available now! 🤗 Hub
- [x] 2023/12/22: Release the demo of PIA on OpenXLab and checkpoints on Google Drive or HuggingFace.
Setup
Prepare Environment
Use the following command to install a conda environment for PIA from scratch:
conda env create -f pia.yml
conda activate pia
If you prefer to build on an existing environment, you can use environment-pt2.yaml for PyTorch==2.0.0. If you want to use a lower PyTorch version (e.g. 1.13.1), use the following command:
conda env create -f environment.yaml
conda activate pia
We strongly recommend PyTorch==2.0.0, which supports scaled_dot_product_attention for memory-efficient image animation.
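Since scaled_dot_product_attention requires PyTorch 2.0.0 or newer, a quick check can confirm which code path your environment will use. This is a sketch, not a script shipped with the repo:

```shell
# Report whether the active environment's PyTorch is at least 2.0.0,
# the version that provides scaled_dot_product_attention.
required="2.0.0"
installed="$(python -c 'import torch; print(torch.__version__.split("+")[0])' 2>/dev/null || echo 0.0.0)"
# sort -V orders version strings; if the required version sorts first,
# the installed version is new enough.
if [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]; then
    echo "PyTorch $installed supports scaled_dot_product_attention"
else
    echo "PyTorch $installed < $required: consider environment-pt2.yaml"
fi
```

The same `sort -V` comparison works for any pair of dotted version strings.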
Download checkpoints
1. Download Stable Diffusion v1-5:

conda install git-lfs
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/
2. Download PIA:
git clone https://huggingface.co/Leoxing/PIA models/PIA/
3. Download personalized models:
bash download_bashscripts/1-RealisticVision.sh
bash download_bashscripts/2-RcnzCartoon.sh
bash download_bashscripts/3-MajicMix.sh
You can also download pia.ckpt manually via the links on Google Drive or HuggingFace.
Put checkpoints as follows:
└── models
    ├── DreamBooth_LoRA
    │   └── ...
    ├── PIA
    │   └── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
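Before running inference, you may want to confirm the checkpoints landed in the expected places. A minimal sanity-check sketch (not a script shipped with this repo; adjust the paths if your layout differs):

```shell
# Check that each expected checkpoint path exists under models/.
# Missing entries are reported on stderr so they stand out in logs.
for path in models/PIA/pia.ckpt models/StableDiffusion/vae models/StableDiffusion/unet; do
    if [ -e "$path" ]; then
        echo "found: $path"
    else
        echo "missing: $path" >&2
    fi
done
```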
Inference
Image Animation
Image-to-video results can be obtained by:
python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml
Run the commands above; the results will be saved in example/result:
<table class="center">
  <tr>
    <td><p style="text-align: center">Input Image</p></td>
    <td><p style="text-align: center">lightning, lighthouse</p></td>
    <td><p style="text-align: center">sun rising, lighthouse</p></td>
    <td><p style="text-align: center">fireworks, lighthouse</p></td>
  </tr>
  <tr>
    <td><img src="example/img/lighthouse.jpg"></td>
    <td><img src="__assets__/image_animation/real/1.gif"></td>
    <td><img src="__assets__/image_animation/real/2.gif"></td>
    <td><img src="__assets__/image_animation/real/3.gif"></td>
  </tr>
  <tr>
    <td><p style="text-align: center">Input Image</p></td>
    <td><p style="text-align: center">1boy smiling</p></td>
    <td><p style="text-align: center">1boy playing the magic fire</p></td>
    <td><p style="text-align: center">1boy is waving hands</p></td>
  </tr>
  <tr>
    <td><img src="example/img/harry.png"></td>
    <td><img src="__assets__/image_animation/rcnz/1.gif"></td>
    <td><img src="__assets__/image_animation/rcnz/2.gif"></td>
    <td><img src="__assets__/image_animation/rcnz/3.gif"></td>
  </tr>
  <tr>
    <td><p style="text-align: center">Input Image</p></td>
    <td><p style="text-align: center">1girl is smiling</p></td>
    <td><p style="text-align: center">1girl is crying</p></td>
    <td><p style="text-align: center">1girl, snowing</p></td>
  </tr>
  <tr>
    <td><img src="example/img/majic_girl.jpg"></td>
    <td><img src="__assets__/image_animation/majic/1.gif"></td>
    <td><img src="__assets__/image_animation/majic/2.gif"></td>
    <td><img src="__assets__/image_animation/majic/3.gif"></td>
  </tr>
</table>

Motion Magnitude
You can control the motion magnitude through the --magnitude parameter:
python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
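The three magnitude levels can also be swept in one go for a side-by-side comparison. A hypothetical helper loop (not part of the repo), assuming the lighthouse config from the earlier example:

```shell
# Render the same config at all three motion magnitudes:
# 0 = small, 1 = moderate, 2 = large.
config=example/config/lighthouse.yaml
for magnitude in 0 1 2; do
    python inference.py --config="$config" --magnitude="$magnitude"
done
```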
Examples:
python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml
<table class="center">
<tr>
<td><p style="text-align: center">Input Image<br>& Prompt</p></td>
<td><p style="text-align: center">Small Motion</p></td>
<td><p style="text-align: center">Moderate Motion</p></td>
<td><p style="text-align: center">Large Motion</p></td>
</tr>
<tr>
<td><img src="example/img/labrador.png" style="width: 220px">a golden labrador is running</td>
<td><img src="__assets__/image_animation/magnitude/labrador/1.gif"></td>
<td><img src="__assets__/image_animation/magnitude/labrador/2.gif"></td>
<td><img src="__assets__/image_animation/magnitude/labrador/3.gif"></td>
</tr>
<tr>
<td><img src="example/img/bear.jpg" style="width: 220px">1bear is walking, ...</td>
<td><img src="__assets__/image_animation/magnitude/bear/1.gif"></td>
<td><img src="__assets__/image_animation/magnitude/bear/2.gif"></td>
<td><img src="__assets__/image_animation/magnitude/bear/3.gif"></td>
</tr>
<tr>
<td><img src="example/img/genshin.jpg" style="width: 220px">cherry blossom, ...</td>
<td><img src="__assets__/image_animation/magnitude/genshin/1.gif"></td>
<td><img src="__assets__/image_animation/magnitude/genshin/2.gif"></td>
<td><img src="__assets__/image_animation/magnitude/genshin/3.gif"></td>
</tr>
</table>
Style Transfer
To achieve style transfer, run the command below (don't forget to set the base model in xxx.yaml):
Examples:
python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer
<table class="center">
<tr>
<td><p style="text-align: center">Input Image<br> & Base Model</p></td>
<td><p style="text-align: center">1man is smiling</p></td>
<td><p style="text-align: center">1man is crying</p></td>
<td><p style="text-align: center">1man is singing</p></td>
</tr>
<tr>
<td style="text-align: center"><img src="example/img/concert.png" style="width:220px">Realistic Vision</td>
<td><img src="__assets__/image_animation/style_transfer/concert/1.gif"></td>
<td><img src="__assets__/image_animation/style_transfer/concert/2.gif"></td>
<td><img src="__assets__/image_animation/style_transfer/concert/3.gif"></td>
</tr>
<tr>
<td style="text-align: center"><img src="example/img/concert.png" style="width:220px">RCNZ Cartoon 3d</td>
<td><img src="__assets__/image_animation/style_transfer/concert/4.gif"></td>
<td><img src="__assets__/image_animation/style_transfer/concert/5.gif"></td>
<td><img src="__assets__/image_animation/style_transfer/concert/6.gif"></td>
</tr>
<tr>
<td><p style="text-align: center"></p></td>
<td><p style="text-align: center">1girl smiling</p></td>
<td><p style="text-align: center">1girl open mouth</p></td>
<td><p style="text-align: center">1girl is crying, pout</p></td>
</tr>
<tr>
<td style="text-align: center"><img src="example/img/anya.jpg" style="width:220px">RCNZ Cartoon 3d</td>
<td><img src="__assets__/image_animation/style_transfer/anya/1.gif"></td>
<td><img src="__assets__/image_animation/style_transfer/anya/2.gif"></td>
    <td><img src="__assets__/image_animation/style_transfer/anya/3.gif"></td>
  </tr>
</table>
