
PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with a text prompt, combining with DreamBooth to achieve stunning videos.

Install / Use

/learn @open-mmlab/PIA

README

CVPR 2024 | PIA: Personalized Image Animator

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang*, Zhening Xing*, Yanhong Zeng†, Youqing Fang, Kai Chen†

(*equal contribution, †corresponding author)

arXiv | Project Page | Open in OpenXLab | Third-party Colab | HuggingFace Model | Open in HuggingFace Spaces | Replicate

PIA is a personalized image animation method which can generate videos with high motion controllability and strong text and image alignment.

If you find our project helpful, please give it a star :star: or cite it; we would be very grateful :sparkling_heart:.

<img src="__assets__/image_animation/teaser/teaser.gif">

What's New

  • [x] 2024/01/03 Replicate Demo & API support!
  • [x] 2024/01/03 Colab support from camenduru!
  • [x] 2023/12/28 Support scaled_dot_product_attention for 1024x1024 images with just 16GB of GPU memory.
  • [x] 2023/12/25 HuggingFace demo is available now! 🤗 Hub
  • [x] 2023/12/22 Release the demo of PIA on OpenXLab and checkpoints on Google Drive or Open in OpenXLab

Setup

Prepare Environment

Use the following commands to create a conda environment for PIA from scratch:

conda env create -f pia.yml
conda activate pia

You can also install PIA on top of an existing environment: use environment-pt2.yaml for PyTorch==2.0.0. If you want to use a lower version of PyTorch (e.g. 1.13.1), use the following commands instead:

conda env create -f environment.yaml
conda activate pia

We strongly recommend PyTorch==2.0.0, which supports scaled_dot_product_attention for memory-efficient image animation.
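Before launching a long run, it can help to confirm that the installed PyTorch is new enough for scaled_dot_product_attention, which first shipped in PyTorch 2.0. A minimal sketch of such a check follows; the helper is not part of the PIA repo.

```python
# Sketch (assumed helper, not part of the PIA repo): decide whether the
# installed PyTorch provides torch.nn.functional.scaled_dot_product_attention,
# which first shipped in PyTorch 2.0.
def supports_sdpa(torch_version: str) -> bool:
    base = torch_version.split("+")[0]                # drop local tags like "+cu118"
    major, minor = (int(p) for p in base.split(".")[:2])
    return (major, minor) >= (2, 0)

print(supports_sdpa("2.0.0"))   # True: memory-efficient attention available
print(supports_sdpa("1.13.1"))  # False: falls back to standard attention
```

In practice you would pass `torch.__version__` as the argument.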

Download checkpoints

1. Download Stable Diffusion v1-5:

   conda install git-lfs
   git lfs install
   git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

2. Download PIA:

   git clone https://huggingface.co/Leoxing/PIA models/PIA/

3. Download personalized models:

   bash download_bashscripts/1-RealisticVision.sh
   bash download_bashscripts/2-RcnzCartoon.sh
   bash download_bashscripts/3-MajicMix.sh

You can also download pia.ckpt manually via the links on Google Drive or HuggingFace.

Put checkpoints as follows:

└── models
    ├── DreamBooth_LoRA
    │   ├── ...
    ├── PIA
    │   ├── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
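A quick way to catch a missing download before inference is to check the layout above up front. This is a sketch of an assumed helper, not part of the repo; the paths come from the tree shown.

```python
from pathlib import Path

# Sketch (assumed helper, not part of the PIA repo): verify the checkpoint
# layout shown above so missing downloads fail fast.
EXPECTED = [
    "models/PIA/pia.ckpt",
    "models/StableDiffusion/vae",
    "models/StableDiffusion/unet",
    "models/DreamBooth_LoRA",
]

def missing_checkpoints(root: str = ".") -> list:
    """Return every expected path that does not exist under root."""
    return [rel for rel in EXPECTED if not (Path(root) / rel).exists()]
```

An empty return value means all expected checkpoint paths are in place.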

Inference

Image Animation

Image-to-video results can be obtained by running:

python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml

After running the commands above, you can find the results in example/result:

<table class="center"> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">lightning, lighthouse</p></td> <td><p style="text-align: center">sun rising, lighthouse</p></td> <td><p style="text-align: center">fireworks, lighthouse</p></td> </tr> <tr> <td><img src="example/img/lighthouse.jpg"></td> <td><img src="__assets__/image_animation/real/1.gif"></td> <td><img src="__assets__/image_animation/real/2.gif"></td> <td><img src="__assets__/image_animation/real/3.gif"></td> </tr> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">1boy smiling</p></td> <td><p style="text-align: center">1boy playing the magic fire</p></td> <td><p style="text-align: center">1boy is waving hands</p></td> </tr> <tr> <td><img src="example/img/harry.png"></td> <td><img src="__assets__/image_animation/rcnz/1.gif"></td> <td><img src="__assets__/image_animation/rcnz/2.gif"></td> <td><img src="__assets__/image_animation/rcnz/3.gif"></td> </tr> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">1girl is smiling</p></td> <td><p style="text-align: center">1girl is crying</p></td> <td><p style="text-align: center">1girl, snowing </p></td> </tr> <tr> <td><img src="example/img/majic_girl.jpg"></td> <td><img src="__assets__/image_animation/majic/1.gif"></td> <td><img src="__assets__/image_animation/majic/2.gif"></td> <td><img src="__assets__/image_animation/majic/3.gif"></td> </tr> </table> <!-- More results: <table class="center"> <tr> <td><p style="text-align: center">Input Image</p></td> </tr> <tr> </tr> </table> -->
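The runs above write their outputs under example/result. As a small sketch (only the directory name comes from the README; nothing is assumed about the output filenames), here is one way to enumerate whatever was generated:

```python
from pathlib import Path

# Sketch: list every file the inference runs wrote under example/result.
# The directory name comes from the README; filenames are not assumed.
def list_results(result_dir: str = "example/result") -> list:
    root = Path(result_dir)
    if not root.exists():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())
```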

Motion Magnitude

You can control the motion magnitude via the --magnitude argument:

python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
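The three invocations above differ only in the --magnitude flag, so a magnitude sweep is easy to script. A sketch of an assumed helper (not part of the repo) that builds the command lines shown above:

```python
# Sketch (assumed helper, not part of the PIA repo): build the inference
# command line for a given config and optional motion magnitude,
# matching the flags shown above.
def build_command(config, magnitude=None):
    cmd = ["python", "inference.py", "--config=" + config]
    if magnitude is not None:
        cmd.append("--magnitude=" + str(magnitude))
    return cmd

# Sweep small, moderate, and large motion for one config:
for m in range(3):
    print(" ".join(build_command("example/config/xxx.yaml", m)))
```

Each printed line could be passed to subprocess.run(...) to launch the corresponding job.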

Examples:

python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml
<table class="center"> <tr> <td><p style="text-align: center">Input Image<br>& Prompt</p></td> <td><p style="text-align: center">Small Motion</p></td> <td><p style="text-align: center">Moderate Motion</p></td> <td><p style="text-align: center">Large Motion</p></td> </tr> <tr> <td><img src="example/img/labrador.png" style="width: 220px">a golden labrador is running</td> <td><img src="__assets__/image_animation/magnitude/labrador/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/labrador/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/labrador/3.gif"></td> </tr> <tr> <td><img src="example/img/bear.jpg" style="width: 220px">1bear is walking, ...</td> <td><img src="__assets__/image_animation/magnitude/bear/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/bear/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/bear/3.gif"></td> </tr> <tr> <td><img src="example/img/genshin.jpg" style="width: 220px">cherry blossom, ...</td> <td><img src="__assets__/image_animation/magnitude/genshin/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/genshin/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/genshin/3.gif"></td> </tr> </table>

Style Transfer

To achieve style transfer, run the following command (don't forget to set the base model in xxx.yaml):

Examples:

python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer
<table class="center"> <tr> <td><p style="text-align: center">Input Image<br> & Base Model</p></td> <td><p style="text-align: center">1man is smiling</p></td> <td><p style="text-align: center">1man is crying</p></td> <td><p style="text-align: center">1man is singing</p></td> </tr> <tr> <td style="text-align: center"><img src="example/img/concert.png" style="width:220px">Realistic Vision</td> <td><img src="__assets__/image_animation/style_transfer/concert/1.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/2.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/3.gif"></td> </tr> <tr> <td style="text-align: center"><img src="example/img/concert.png" style="width:220px">RCNZ Cartoon 3d</td> <td><img src="__assets__/image_animation/style_transfer/concert/4.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/5.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/6.gif"></td> </tr> <tr> <td><p style="text-align: center"></p></td> <td><p style="text-align: center">1girl smiling</p></td> <td><p style="text-align: center">1girl open mouth</p></td> <td><p style="text-align: center">1girl is crying, pout</p></td> </tr> <tr> <td style="text-align: center"><img src="example/img/anya.jpg" style="width:220px">RCNZ Cartoon 3d</td> <td><img src="__assets__/image_animation/style_transfer/anya/1.gif"></td> <td><img src="__assets__/image_animation/style_transfer/anya/2.gif"></td> </tr> </table>
