DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
<!-- {: width="50%"} --> <!--  --> <div align="center"> <img src='assets/logo_long.png' style="height:100px"></img><a href='https://arxiv.org/abs/2310.12190'><img src='https://img.shields.io/badge/arXiv-2310.12190-b31b1b.svg'></a>
<a href='https://doubiiu.github.io/projects/DynamiCrafter/'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
<a href='https://huggingface.co/papers/2310.12190'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Page-blue'></a>
<a href='https://youtu.be/0NfmIsNAg-g'><img src='https://img.shields.io/badge/Youtube-Video-b31b1b.svg'></a><br>
<a href='https://replicate.com/camenduru/dynami-crafter-576x1024'><img src='https://img.shields.io/badge/replicate-Demo-blue'></a>
<a href='https://github.com/camenduru/DynamiCrafter-colab'><img src='https://img.shields.io/badge/Colab-Demo-Green'></a>
<a href='https://huggingface.co/spaces/Doubiiu/DynamiCrafter'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face%20ImageAnimation-Demo-blue'></a>
<a href='https://huggingface.co/spaces/Doubiiu/DynamiCrafter_interp_loop'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face%20Interpolation/Looping-Demo-blue'></a>
<a href='https://openbayes.com/console/public/tutorials/XMVDVpXKN5o'><img src='https://img.shields.io/badge/Demo-OpenBayes贝式计算-blue'></a>
Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, <br>Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong <br><br> From CUHK and Tencent AI Lab.
<strong>at European Conference on Computer Vision (ECCV) 2024, Oral</strong>
</div>

🔆 Introduction
🔥🔥 Training / Fine-tuning code is available NOW!!!
🔥 Our 1024x576 version ranks 1st on the I2V benchmark list from VBench!<br>
🔥 Generative frame interpolation / looping video generation model weights (320x512) have been released!<br>
🔥 New Update Rolls Out for DynamiCrafter! Better Dynamics, Higher Resolution, and Stronger Coherence! <br>
🤗 DynamiCrafter can animate open-domain still images based on a <strong>text prompt</strong> by leveraging pre-trained video diffusion priors. Please check our project page and paper for more information. <br>
👀 Seeking comparisons with Stable Video Diffusion and PikaLabs? Click the image below.

1.1. Showcases (576x1024)
<table class="center"> <!-- <tr> <td colspan="1">"fireworks display"</td> <td colspan="1">"a robot is walking through a destroyed city"</td> </tr> --> <tr> <td> <img src=assets/showcase/firework03.gif width="340"> </td> <td> <img src=assets/showcase/robot01.gif width="340"> </td> </tr> <!-- <tr> <td colspan="1">"riding a bike under a bridge"</td> <td colspan="1">""</td> </tr> --> <tr> <td> <img src=assets/showcase/bike_chineseink.gif width="340"> </td> <td> <img src=assets/showcase/girl07.gif width="340"> </td> </tr> </table>

1.2. Showcases (320x512)
<table class="center"> <!-- <tr> <td colspan="1">"fireworks display"</td> <td colspan="1">"a robot is walking through a destroyed city"</td> </tr> --> <tr> <td> <img src=assets/showcase/bloom2.gif width="340"> </td> <td> <img src=assets/showcase/train_anime02.gif width="340"> </td> </tr> <!-- <tr> <td colspan="1">"riding a bike under a bridge"</td> <td colspan="1">""</td> </tr> --> <tr> <td> <img src=assets/showcase/pour_honey.gif width="340"> </td> <td> <img src=assets/showcase/lighthouse.gif width="340"> </td> </tr> </table>

1.3. Showcases (256x256)
<table class="center"> <tr> <td colspan="2">"bear playing guitar happily, snowing"</td> <td colspan="2">"boy walking on the street"</td> </tr> <tr> <td> <img src=assets/showcase/guitar0.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/guitar0.gif width="170"> </td> <td> <img src=assets/showcase/walk0.png_00.png width="170"> </td> <td> <img src=assets/showcase/walk0.gif width="170"> </td> </tr> <!-- <tr> <td colspan="2">"two people dancing"</td> <td colspan="2">"girl talking and blinking"</td> </tr> <tr> <td> <img src=assets/showcase/dance1.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/dance1.gif width="170"> </td> <td> <img src=assets/showcase/girl3.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/girl3.gif width="170"> </td> </tr> --> <!-- <tr> <td colspan="2">"zoom-in, a landscape, springtime"</td> <td colspan="2">"A blonde woman rides on top of a moving <br>washing machine into the sunset."</td> </tr> <tr> <td> <img src=assets/showcase/Upscaled_Aime_Tribolet_springtime_landscape_golden_hour_morning_pale_yel_e6946f8d-37c1-4ce8-bf62-6ba90d23bd93.mp4_00.png width="170"> </td> <td> <img src=assets/showcase/Upscaled_Aime_Tribolet_springtime_landscape_golden_hour_morning_pale_yel_e6946f8d-37c1-4ce8-bf62-6ba90d23bd93.gif width="170"> </td> <td> <img src=assets/showcase/Upscaled_Alex__State_Blonde_woman_riding_on_top_of_a_moving_washing_mach_c31acaa3-dd30-459f-a109-2d2eb4c00fe2.mp4_00.png width="170"> </td> <td> <img src=assets/showcase/Upscaled_Alex__State_Blonde_woman_riding_on_top_of_a_moving_washing_mach_c31acaa3-dd30-459f-a109-2d2eb4c00fe2.gif width="170"> </td> </tr> <tr> <td colspan="2">"explode colorful smoke coming out"</td> <td colspan="2">"a bird on the tree branch"</td> </tr> <tr> <td> <img src=assets/showcase/explode0.jpeg_00.png width="170"> </td> <td> <img src=assets/showcase/explode0.gif width="170"> </td> <td> <img src=assets/showcase/bird000.jpeg width="170"> </td> <td> <img src=assets/showcase/bird000.gif 
width="170"> </td> </tr> --> </table >

2. Applications
2.1 Storytelling video generation (see project page for more details)
<table class="center"> <!-- <tr style="font-weight: bolder;text-align:center;"> <td>Input</td> <td>Output</td> <td>Input</td> <td>Output</td> </tr> --> <tr> <td colspan="4"><img src=assets/application/storytellingvideo.gif width="250"></td> </tr> </table >

2.2 Generative frame interpolation
<table class="center"> <tr style="font-weight: bolder;text-align:center;"> <td>Input starting frame</td> <td>Input ending frame</td> <td>Generated video</td> </tr> <tr> <td> <img src=assets/application/gkxX0kb8mE8_input_start.png width="250"> </td> <td> <img src=assets/application/gkxX0kb8mE8_input_end.png width="250"> </td> <td> <img src=assets/application/gkxX0kb8mE8.gif width="250"> </td> </tr> <tr> <td> <img src=assets/application/smile_start.png width="250"> </td> <td> <img src=assets/application/smile_end.png width="250"> </td> <td> <img src=assets/application/smile.gif width="250"> </td> </tr> <tr> <td> <img src=assets/application/stone01_start.png width="250"> </td> <td> <img src=assets/application/stone01_end.png width="250"> </td> <td> <img src=assets/application/stone01.gif width="250"> </td> </tr> </table >

2.3 Looping video generation
<table class="center"> <tr> <td> <img src=assets/application/60.gif width="300"> </td> <td> <img src=assets/application/35.gif width="300"> </td> <td> <img src=assets/application/36.gif width="300"> </td> </tr> <!-- <tr> <td> <img src=assets/application/05.gif width="300"> </td> <td> <img src=assets/application/25.gif width="300"> </td> <td> <img src=assets/application/34.gif width="300"> </td> </tr> --> </table >

📝 Changelog
- [2024.06.14]: 🔥🔥 Release training code for interpolation.
- [2024.05.24]: Release WebVid10M-motion annotations.
- [2024.05.05]: Release training code.
- [2024.03.14]: Release generative frame interpolation and looping video models (320x512).
- [2024.02.05]: Release high-resolution models (320x512 & 576x1024).
- [2023.12.02]: Launch the local Gradio demo.
- [2023.11.29]: Release the main model at a resolution of 256x256.
- [2023.11.27]: Launch the project page and update the arXiv preprint. <br>
🧰 Models
|Model|Resolution|GPU Mem. & Inference Time (A100, DDIM 50 steps)|Checkpoint|
|:---------|:---------|:--------|:--------|
|DynamiCrafter1024|576x1024|18.3GB & 75s (perframe_ae=True)|Hugging Face|
|DynamiCrafter512|320x512|12.8GB & 20s (perframe_ae=True)|Hugging Face|
|DynamiCrafter256|256x256|11.9GB & 10s (perframe_ae=False)|Hugging Face|
|DynamiCrafter512_interp|320x512|12.8GB & 20s (perframe_ae=True)|Hugging Face|
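
When choosing a checkpoint for a given GPU, the figures in the table above can be encoded as data and filtered by a memory budget. The snippet below is only an illustrative sketch: `ModelSpec`, `MODELS`, and `pick_model` are hypothetical names that are not part of this repository, and the numbers are copied from the table (measured on an A100 with 50 DDIM steps), so actual usage on other hardware may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelSpec:
    """One row of the Models table (hypothetical helper type)."""
    name: str
    height: int
    width: int
    gpu_mem_gb: float   # peak GPU memory reported in the table
    seconds: int        # inference time reported in the table
    perframe_ae: bool   # perframe_ae setting used for the measurement


# Values copied from the Models table above.
MODELS: List[ModelSpec] = [
    ModelSpec("DynamiCrafter1024", 576, 1024, 18.3, 75, True),
    ModelSpec("DynamiCrafter512", 320, 512, 12.8, 20, True),
    ModelSpec("DynamiCrafter256", 256, 256, 11.9, 10, False),
    ModelSpec("DynamiCrafter512_interp", 320, 512, 12.8, 20, True),
]


def pick_model(budget_gb: float,
               candidates: List[ModelSpec] = MODELS) -> Optional[ModelSpec]:
    """Return the highest-resolution model whose reported memory fits the budget.

    Ties are resolved in favor of the earlier entry in `candidates`.
    Returns None when no model fits.
    """
    fitting = [m for m in candidates if m.gpu_mem_gb <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda m: m.height * m.width)


if __name__ == "__main__":
    for budget in (8.0, 12.0, 16.0, 24.0):
        choice = pick_model(budget)
        print(budget, choice.name if choice else "no model fits")
```

The reported memory is a measured peak rather than a guarantee, so leaving some headroom below the card's capacity is a sensible precaution.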
Currently, our DynamiCrafter can support
