
DynamiCrafter

[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors



<!-- ![](./assets/logo_long.png#gh-light-mode-only){: width="50%"} --> <!-- ![](./assets/logo_long_dark.png#gh-dark-mode-only=100x20) --> <div align="center"> <img src='assets/logo_long.png' style="height:100px"></img>

<a href='https://arxiv.org/abs/2310.12190'><img src='https://img.shields.io/badge/arXiv-2310.12190-b31b1b.svg'></a>   <a href='https://doubiiu.github.io/projects/DynamiCrafter/'><img src='https://img.shields.io/badge/Project-Page-Green'></a>   <a href='https://huggingface.co/papers/2310.12190'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Page-blue'></a>   <a href='https://youtu.be/0NfmIsNAg-g'><img src='https://img.shields.io/badge/Youtube-Video-b31b1b.svg'></a><br> Open in OpenXLab   <a href='https://replicate.com/camenduru/dynami-crafter-576x1024'><img src='https://img.shields.io/badge/replicate-Demo-blue'></a>   <a href='https://github.com/camenduru/DynamiCrafter-colab'><img src='https://img.shields.io/badge/Colab-Demo-Green'></a>  <a href='https://huggingface.co/spaces/Doubiiu/DynamiCrafter'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face%20ImageAnimation-Demo-blue'></a>  <a href='https://huggingface.co/spaces/Doubiiu/DynamiCrafter_interp_loop'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face%20Interpolation/Looping-Demo-blue'></a>  <a href='https://openbayes.com/console/public/tutorials/XMVDVpXKN5o'><img src='https://img.shields.io/badge/Demo-OpenBayes贝式计算-blue'></a>

Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, <br>Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong <br><br> From CUHK and Tencent AI Lab.

<strong>at European Conference on Computer Vision (ECCV) 2024, Oral</strong>

</div>

🔆 Introduction

🔥🔥 Training / Fine-tuning code is available NOW!!!

🔥 Our 1024x576 version ranks 1st on the I2V benchmark list from VBench!<br> 🔥 Generative frame interpolation / looping video generation model weights (320x512) have been released!<br> 🔥 New Update Rolls Out for DynamiCrafter! Better Dynamics, Higher Resolution, and Stronger Coherence! <br> 🤗 DynamiCrafter can animate open-domain still images based on a <strong>text prompt</strong> by leveraging pre-trained video diffusion priors. Please check our project page and paper for more information. <br>

👀 Seeking comparisons with Stable Video Diffusion and PikaLabs? Click the image below.

1.1. Showcases (576x1024)

<table class="center"> <!-- <tr> <td colspan="1">"fireworks display"</td> <td colspan="1">"a robot is walking through a destroyed city"</td> </tr> --> <tr> <td> <img src=assets/showcase/firework03.gif width="340"> </td> <td> <img src=assets/showcase/robot01.gif width="340"> </td> </tr> <!-- <tr> <td colspan="1">"riding a bike under a bridge"</td> <td colspan="1">""</td> </tr> --> <tr> <td> <img src=assets/showcase/bike_chineseink.gif width="340"> </td> <td> <img src=assets/showcase/girl07.gif width="340"> </td> </tr> </table>

1.2. Showcases (320x512)

<table class="center"> <!-- <tr> <td colspan="1">"fireworks display"</td> <td colspan="1">"a robot is walking through a destroyed city"</td> </tr> --> <tr> <td> <img src=assets/showcase/bloom2.gif width="340"> </td> <td> <img src=assets/showcase/train_anime02.gif width="340"> </td> </tr> <!-- <tr> <td colspan="1">"riding a bike under a bridge"</td> <td colspan="1">""</td> </tr> --> <tr> <td> <img src=assets/showcase/pour_honey.gif width="340"> </td> <td> <img src=assets/showcase/lighthouse.gif width="340"> </td> </tr> </table>

1.3. Showcases (256x256)

<table class="center"> <tr> <td colspan="2">"bear playing guitar happily, snowing"</td> <td colspan="2">"boy walking on the street"</td> </tr> <tr> <td> <img src=assets/showcase/guitar0.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/guitar0.gif width="170"> </td> <td> <img src=assets/showcase/walk0.png_00.png width="170"> </td> <td> <img src=assets/showcase/walk0.gif width="170"> </td> </tr> <!-- <tr> <td colspan="2">"two people dancing"</td> <td colspan="2">"girl talking and blinking"</td> </tr> <tr> <td> <img src=assets/showcase/dance1.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/dance1.gif width="170"> </td> <td> <img src=assets/showcase/girl3.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/girl3.gif width="170"> </td> </tr> --> <!-- <tr> <td colspan="2">"zoom-in, a landscape, springtime"</td> <td colspan="2">"A blonde woman rides on top of a moving <br>washing machine into the sunset."</td> </tr> <tr> <td> <img src=assets/showcase/Upscaled_Aime_Tribolet_springtime_landscape_golden_hour_morning_pale_yel_e6946f8d-37c1-4ce8-bf62-6ba90d23bd93.mp4_00.png width="170"> </td> <td> <img src=assets/showcase/Upscaled_Aime_Tribolet_springtime_landscape_golden_hour_morning_pale_yel_e6946f8d-37c1-4ce8-bf62-6ba90d23bd93.gif width="170"> </td> <td> <img src=assets/showcase/Upscaled_Alex__State_Blonde_woman_riding_on_top_of_a_moving_washing_mach_c31acaa3-dd30-459f-a109-2d2eb4c00fe2.mp4_00.png width="170"> </td> <td> <img src=assets/showcase/Upscaled_Alex__State_Blonde_woman_riding_on_top_of_a_moving_washing_mach_c31acaa3-dd30-459f-a109-2d2eb4c00fe2.gif width="170"> </td> </tr> <tr> <td colspan="2">"explode colorful smoke coming out"</td> <td colspan="2">"a bird on the tree branch"</td> </tr> <tr> <td> <img src=assets/showcase/explode0.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/explode0.gif width="170"> </td> <td> <img src=assets/showcase/bird000.jpeg width="170"> </td> <td> <img src=assets/showcase/bird000.gif 
width="170"> </td> </tr> --> </table >

2. Applications

2.1 Storytelling video generation (see project page for more details)

<table class="center"> <!-- <tr style="font-weight: bolder;text-align:center;"> <td>Input</td> <td>Output</td> <td>Input</td> <td>Output</td> </tr> --> <tr> <td colspan="4"><img src=assets/application/storytellingvideo.gif width="250"></td> </tr> </table >

2.2 Generative frame interpolation

<table class="center"> <tr style="font-weight: bolder;text-align:center;"> <td>Input starting frame</td> <td>Input ending frame</td> <td>Generated video</td> </tr> <tr> <td> <img src=assets/application/gkxX0kb8mE8_input_start.png width="250"> </td> <td> <img src=assets/application/gkxX0kb8mE8_input_end.png width="250"> </td> <td> <img src=assets/application/gkxX0kb8mE8.gif width="250"> </td> </tr> <tr> <td> <img src=assets/application/smile_start.png width="250"> </td> <td> <img src=assets/application/smile_end.png width="250"> </td> <td> <img src=assets/application/smile.gif width="250"> </td> </tr> <tr> <td> <img src=assets/application/stone01_start.png width="250"> </td> <td> <img src=assets/application/stone01_end.png width="250"> </td> <td> <img src=assets/application/stone01.gif width="250"> </td> </tr> </table >

2.3 Looping video generation

<table class="center"> <tr> <td> <img src=assets/application/60.gif width="300"> </td> <td> <img src=assets/application/35.gif width="300"> </td> <td> <img src=assets/application/36.gif width="300"> </td> </tr> <!-- <tr> <td> <img src=assets/application/05.gif width="300"> </td> <td> <img src=assets/application/25.gif width="300"> </td> <td> <img src=assets/application/34.gif width="300"> </td> </tr> --> </table >

📝 Changelog

  • [2024.06.14]: 🔥🔥 Release training code for interpolation.
  • [2024.05.24]: Release WebVid10M-motion annotations.
  • [2024.05.05]: Release training code.
  • [2024.03.14]: Release generative frame interpolation and looping video models (320x512).
  • [2024.02.05]: Release high-resolution models (320x512 & 576x1024).
  • [2023.12.02]: Launch the local Gradio demo.
  • [2023.11.29]: Release the main model at a resolution of 256x256.
  • [2023.11.27]: Launch the project page and update the arXiv preprint.

🧰 Models

|Model|Resolution|GPU Mem. & Inference Time (A100, DDIM 50 steps)|Checkpoint|
|:---------|:---------|:--------|:--------|
|DynamiCrafter1024|576x1024|18.3GB & 75s (`perframe_ae=True`)|Hugging Face|
|DynamiCrafter512|320x512|12.8GB & 20s (`perframe_ae=True`)|Hugging Face|
|DynamiCrafter256|256x256|11.9GB & 10s (`perframe_ae=False`)|Hugging Face|
|DynamiCrafter512_interp|320x512|12.8GB & 20s (`perframe_ae=True`)|Hugging Face|
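As a rough guide to choosing a checkpoint, the sketch below encodes the table figures and picks the highest-resolution model whose reported memory fits a given budget. `ModelSpec` and `pick_model` are hypothetical helpers written for illustration and are not part of the DynamiCrafter codebase. The numbers are the A100 measurements from the table, so actual usage on other hardware may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    """One row of the model table (hypothetical helper, not a repo API)."""
    name: str
    height: int
    width: int
    gpu_mem_gb: float   # reported peak GPU memory on an A100
    seconds: int        # reported inference time, DDIM 50 steps
    perframe_ae: bool   # perframe_ae setting used for the measurement


# Figures copied from the model table (A100, DDIM 50 steps).
MODELS = [
    ModelSpec("DynamiCrafter1024", 576, 1024, 18.3, 75, True),
    ModelSpec("DynamiCrafter512", 320, 512, 12.8, 20, True),
    ModelSpec("DynamiCrafter256", 256, 256, 11.9, 10, False),
    ModelSpec("DynamiCrafter512_interp", 320, 512, 12.8, 20, True),
]


def pick_model(budget_gb: float, interpolation: bool = False) -> Optional[ModelSpec]:
    """Return the highest-resolution model whose reported memory fits the budget.

    Returns None when no listed model fits.
    """
    candidates = [
        m for m in MODELS
        if m.name.endswith("_interp") == interpolation and m.gpu_mem_gb <= budget_gb
    ]
    return max(candidates, key=lambda m: m.height * m.width, default=None)


if __name__ == "__main__":
    for budget in (24.0, 16.0, 8.0):
        choice = pick_model(budget)
        print(budget, choice.name if choice else "no model fits")
```

The selection treats the reported memory as a hard threshold, which is an assumption: leave headroom in practice, since the table reflects one GPU and one sampler setting.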

Currently, our DynamiCrafter can support
