
PIA

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with a text prompt, combining with DreamBooth to achieve stunning videos.

Install / Use

/learn @open-mmlab/PIA

README

CVPR 2024 | PIA: Personalized Image Animator

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang*, Zhening Xing*, Yanhong Zeng†, Youqing Fang, Kai Chen†

(*equal contribution, †corresponding author)

arXiv | Project Page | Open in OpenXLab | Third-party Colab | HuggingFace Model | Open in HuggingFace Spaces | Replicate

PIA is a personalized image animation method which can generate videos with high motion controllability and strong text and image alignment.

If you find our project helpful, please give it a star :star: or cite it; we would be very grateful :sparkling_heart:.

<img src="__assets__/image_animation/teaser/teaser.gif">

What's New

  • [x] 2024/01/03 Replicate Demo & API support!
  • [x] 2024/01/03 Colab support from camenduru!
  • [x] 2023/12/28 Support scaled_dot_product_attention for 1024x1024 images with just 16GB of GPU memory.
  • [x] 2023/12/25 HuggingFace demo is available now! 🤗 Hub
  • [x] 2023/12/22 Release the demo of PIA on OpenXLab and checkpoints on Google Drive or Open in OpenXLab

Setup

Prepare Environment

Use the following commands to create a conda environment for PIA from scratch:

conda env create -f pia.yml
conda activate pia

You can also install PIA on top of an existing environment: use environment-pt2.yaml for PyTorch==2.0.0. If you want to use a lower version of PyTorch (e.g. 1.13.1), use the following commands instead:

conda env create -f environment.yaml
conda activate pia

We strongly recommend PyTorch==2.0.0, which supports scaled_dot_product_attention for memory-efficient image animation.
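Before launching a long run, it can help to confirm that the installed PyTorch is new enough for scaled_dot_product_attention, which first shipped in PyTorch 2.0. A minimal sketch of such a check follows; the helper is not part of the PIA repo.

```python
# Sketch (assumed helper, not part of the PIA repo): decide whether the
# installed PyTorch provides torch.nn.functional.scaled_dot_product_attention,
# which first shipped in PyTorch 2.0.
def supports_sdpa(torch_version: str) -> bool:
    base = torch_version.split("+")[0]                # drop local tags like "+cu118"
    major, minor = (int(p) for p in base.split(".")[:2])
    return (major, minor) >= (2, 0)

print(supports_sdpa("2.0.0"))   # True: memory-efficient attention available
print(supports_sdpa("1.13.1"))  # False: falls back to standard attention
```

In practice you would pass `torch.__version__` as the argument.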

Download checkpoints

1. Download Stable Diffusion v1-5:

   conda install git-lfs
   git lfs install
   git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

2. Download PIA:

   git clone https://huggingface.co/Leoxing/PIA models/PIA/

3. Download personalized models:

   bash download_bashscripts/1-RealisticVision.sh
   bash download_bashscripts/2-RcnzCartoon.sh
   bash download_bashscripts/3-MajicMix.sh

You can also download pia.ckpt manually via the links on Google Drive or HuggingFace.

Put checkpoints as follows:

└── models
    ├── DreamBooth_LoRA
    │   ├── ...
    ├── PIA
    │   ├── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
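A quick way to catch a missing download before inference is to check the layout above up front. This is a sketch of an assumed helper, not part of the repo; the paths come from the tree shown.

```python
from pathlib import Path

# Sketch (assumed helper, not part of the PIA repo): verify the checkpoint
# layout shown above so missing downloads fail fast.
EXPECTED = [
    "models/PIA/pia.ckpt",
    "models/StableDiffusion/vae",
    "models/StableDiffusion/unet",
    "models/DreamBooth_LoRA",
]

def missing_checkpoints(root: str = ".") -> list:
    """Return every expected path that does not exist under root."""
    return [rel for rel in EXPECTED if not (Path(root) / rel).exists()]
```

An empty return value means all expected checkpoint paths are in place.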

Inference

Image Animation

Image-to-video results can be obtained by running:

python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml

After running the commands above, you can find the results in example/result:

<table class="center"> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">lightning, lighthouse</p></td> <td><p style="text-align: center">sun rising, lighthouse</p></td> <td><p style="text-align: center">fireworks, lighthouse</p></td> </tr> <tr> <td><img src="example/img/lighthouse.jpg"></td> <td><img src="__assets__/image_animation/real/1.gif"></td> <td><img src="__assets__/image_animation/real/2.gif"></td> <td><img src="__assets__/image_animation/real/3.gif"></td> </tr> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">1boy smiling</p></td> <td><p style="text-align: center">1boy playing the magic fire</p></td> <td><p style="text-align: center">1boy is waving hands</p></td> </tr> <tr> <td><img src="example/img/harry.png"></td> <td><img src="__assets__/image_animation/rcnz/1.gif"></td> <td><img src="__assets__/image_animation/rcnz/2.gif"></td> <td><img src="__assets__/image_animation/rcnz/3.gif"></td> </tr> <tr> <td><p style="text-align: center">Input Image</p></td> <td><p style="text-align: center">1girl is smiling</p></td> <td><p style="text-align: center">1girl is crying</p></td> <td><p style="text-align: center">1girl, snowing </p></td> </tr> <tr> <td><img src="example/img/majic_girl.jpg"></td> <td><img src="__assets__/image_animation/majic/1.gif"></td> <td><img src="__assets__/image_animation/majic/2.gif"></td> <td><img src="__assets__/image_animation/majic/3.gif"></td> </tr> </table> <!-- More results: <table class="center"> <tr> <td><p style="text-align: center">Input Image</p></td> </tr> <tr> </tr> </table> -->
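The runs above write their outputs under example/result. As a small sketch (only the directory name comes from the README; nothing is assumed about the output filenames), here is one way to enumerate whatever was generated:

```python
from pathlib import Path

# Sketch: list every file the inference runs wrote under example/result.
# The directory name comes from the README; filenames are not assumed.
def list_results(result_dir: str = "example/result") -> list:
    root = Path(result_dir)
    if not root.exists():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())
```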

Motion Magnitude

You can control the motion magnitude via the --magnitude argument:

python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
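The three invocations above differ only in the --magnitude flag, so a magnitude sweep is easy to script. A sketch of an assumed helper (not part of the repo) that builds the command lines shown above:

```python
# Sketch (assumed helper, not part of the PIA repo): build the inference
# command line for a given config and optional motion magnitude,
# matching the flags shown above.
def build_command(config, magnitude=None):
    cmd = ["python", "inference.py", "--config=" + config]
    if magnitude is not None:
        cmd.append("--magnitude=" + str(magnitude))
    return cmd

# Sweep small, moderate, and large motion for one config:
for m in range(3):
    print(" ".join(build_command("example/config/xxx.yaml", m)))
```

Each printed line could be passed to subprocess.run(...) to launch the corresponding job.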

Examples:

python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml
<table class="center"> <tr> <td><p style="text-align: center">Input Image<br>& Prompt</p></td> <td><p style="text-align: center">Small Motion</p></td> <td><p style="text-align: center">Moderate Motion</p></td> <td><p style="text-align: center">Large Motion</p></td> </tr> <tr> <td><img src="example/img/labrador.png" style="width: 220px">a golden labrador is running</td> <td><img src="__assets__/image_animation/magnitude/labrador/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/labrador/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/labrador/3.gif"></td> </tr> <tr> <td><img src="example/img/bear.jpg" style="width: 220px">1bear is walking, ...</td> <td><img src="__assets__/image_animation/magnitude/bear/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/bear/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/bear/3.gif"></td> </tr> <tr> <td><img src="example/img/genshin.jpg" style="width: 220px">cherry blossom, ...</td> <td><img src="__assets__/image_animation/magnitude/genshin/1.gif"></td> <td><img src="__assets__/image_animation/magnitude/genshin/2.gif"></td> <td><img src="__assets__/image_animation/magnitude/genshin/3.gif"></td> </tr> </table>

Style Transfer

To achieve style transfer, run the following command (don't forget to set the base model in xxx.yaml):

Examples:

python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer
<table class="center"> <tr> <td><p style="text-align: center">Input Image<br> & Base Model</p></td> <td><p style="text-align: center">1man is smiling</p></td> <td><p style="text-align: center">1man is crying</p></td> <td><p style="text-align: center">1man is singing</p></td> </tr> <tr> <td style="text-align: center"><img src="example/img/concert.png" style="width:220px">Realistic Vision</td> <td><img src="__assets__/image_animation/style_transfer/concert/1.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/2.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/3.gif"></td> </tr> <tr> <td style="text-align: center"><img src="example/img/concert.png" style="width:220px">RCNZ Cartoon 3d</td> <td><img src="__assets__/image_animation/style_transfer/concert/4.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/5.gif"></td> <td><img src="__assets__/image_animation/style_transfer/concert/6.gif"></td> </tr> <tr> <td><p style="text-align: center"></p></td> <td><p style="text-align: center">1girl smiling</p></td> <td><p style="text-align: center">1girl open mouth</p></td> <td><p style="text-align: center">1girl is crying, pout</p></td> </tr> <tr> <td style="text-align: center"><img src="example/img/anya.jpg" style="width:220px">RCNZ Cartoon 3d</td> <td><img src="__assets__/image_animation/style_transfer/anya/1.gif"></td> <td><img src="__assets__/image_animation/style_transfer/anya/2.gif"></td> </tr> </table>
