GIFfusion 💥
<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220882002-72cbfdef-876a-4cb2-9f41-e5989e769868.gif" width="256" title="hover text"> </p>

Giffusion is a Web UI for generating GIFs and Videos using Stable Diffusion.
To Run
In Colab
Open the Colab Notebook linked above and follow the instructions to start the Giffusion UI
On your local machine
Clone the Giffusion repository
git clone https://github.com/DN6/giffusion.git && cd giffusion
Install the requirements
pip install -r requirements.txt
Start the application
python app.py
Features
Saving and Loading Sessions
Giffusion uses the Hugging Face Hub to save output generations and settings. To save and load sessions, you will first need to set your Hugging Face access token using huggingface-cli login.
Once set, you can save your session by clicking on the Save Session button in the Session Settings. This will create a dataset repo on the Hub and save your settings and output generations to a folder with a randomly generated name. You can also set the Repo ID and Session Name manually in order to save your session to a specific repo.
Loading sessions works in a similar manner. Simply provide the Repo ID and Session Name of the session you would like to load and click on the Load Session button. You can filter the settings for the individual components in the UI using the dropdown selector.
Bring Your Own Pipeline
Giffusion supports using any pipeline and compatible checkpoint from the Diffusers library. Simply paste the checkpoint name and pipeline name into the Pipeline Settings.
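Under the hood, this roughly corresponds to resolving a pipeline class by name in Diffusers and calling its from_pretrained method. A minimal sketch, assuming the diffusers package is installed; the helper name and example checkpoint are illustrative, not Giffusion's actual code:

```python
def load_pipeline(checkpoint, pipeline_name="DiffusionPipeline"):
    """Load a Diffusers pipeline by class name and checkpoint id.

    The arguments mirror the Pipeline Settings fields, e.g.
    checkpoint="runwayml/stable-diffusion-v1-5" with
    pipeline_name="StableDiffusionPipeline".
    """
    import diffusers  # imported lazily; requires the diffusers package

    pipeline_cls = getattr(diffusers, pipeline_name)
    return pipeline_cls.from_pretrained(checkpoint)
```

Calling load_pipeline downloads the checkpoint from the Hugging Face Hub on first use.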
ControlNet Support
Giffusion allows you to use the StableDiffusionControlNetPipeline. Simply paste in the ControlNet checkpoint you would like to use when loading the pipeline.
MultiControlNets are also supported. Just paste in a comma-separated list of model checkpoint paths from the Hugging Face Hub
lllyasviel/control_v11p_sd15_softedge, lllyasviel/control_v11f1p_sd15_depth
Notes on Preprocessing: When using ControlNets, you need to preprocess your inputs before using them as conditioning signals for the model. The ControlNet Preprocessing Settings allow you to choose a set of preprocessing options to apply to your image. Be sure to select them in the same order as your ControlNet models. For example, for the checkpoint list above, you would have to select the softedge preprocessor before the depth one. If you are using a ControlNet model that requires no preprocessing in a MultiControlNet setting, a no-processing option is also provided.
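The ordering requirement boils down to the preprocessors and checkpoints being matched up positionally. A small illustrative sketch (the pairing code is hypothetical, not Giffusion's internals):

```python
# Checkpoints as entered in the MultiControlNet field, in order.
controlnet_checkpoints = [
    "lllyasviel/control_v11p_sd15_softedge",
    "lllyasviel/control_v11f1p_sd15_depth",
]

# Preprocessors selected in the UI; order must match the checkpoints.
preprocessors = ["softedge", "depth"]

# Each conditioning image is produced by the preprocessor paired with
# the checkpoint at the same position.
pairs = list(zip(controlnet_checkpoints, preprocessors))
# pairs[0] -> ("lllyasviel/control_v11p_sd15_softedge", "softedge")
```

Swapping the preprocessor order would feed a depth map to the softedge model and vice versa, which is why the selection order matters.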
Custom Pipeline Support
You can use your own custom pipelines with Giffusion as well. Simply paste in the path to your Pipeline file in the Custom Pipeline section. The Pipeline file must follow a format similar to the community pipelines found in Diffusers
Compel Prompt Weighting Support
Prompt Embeds are now generated via Compel and support the weighting syntax outlined here
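As a rough sketch of what this looks like with Compel directly, assuming a Stable Diffusion pipeline is already loaded (the helper name is illustrative, and the exact wiring inside Giffusion may differ):

```python
def weighted_prompt_embeds(pipe, prompt):
    """Build weighted prompt embeddings with Compel.

    `pipe` is assumed to be a loaded Stable Diffusion pipeline, and
    `prompt` may use Compel's weighting syntax,
    e.g. "a picture of a (corgi)1.3 on the beach".
    """
    from compel import Compel  # imported lazily; requires the compel package

    compel = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)
    return compel(prompt)
```

The returned tensor can be passed to the pipeline's prompt_embeds argument instead of a plain text prompt.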
Multiframe Generation
Giffusion follows a prompt syntax similar to the one used in Deforum Art's Stable Diffusion Notebook
0: a picture of a corgi
60: a picture of a lion
The first part of the prompt indicates a key frame number, while the text after the colon is the prompt used by the model to generate the image.
In the example above, we're asking the model to generate a picture of a Corgi at frame 0 and a picture of a lion at frame 60. So what about all the images in between these two key frames? How do they get generated?
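Parsing this syntax amounts to splitting each line on the first colon into a frame number and a prompt. A minimal sketch (the function name is illustrative, not Giffusion's actual parser):

```python
def parse_keyframe_prompts(text):
    """Parse Deforum-style key-frame prompts into {frame: prompt}."""
    prompts = {}
    for line in text.strip().splitlines():
        frame, _, prompt = line.partition(":")
        prompts[int(frame.strip())] = prompt.strip()
    return prompts

prompts = parse_keyframe_prompts("""
0: a picture of a corgi
60: a picture of a lion
""")
# prompts == {0: "a picture of a corgi", 60: "a picture of a lion"}
```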
You might recall that Diffusion Models work by turning noise into images. Stable Diffusion turns a noise tensor into a latent embedding in order to save time and memory when running the diffusion process. This latent embedding is fed into a decoder to produce the image.
The inputs to our model are a noise tensor and text embedding tensor. Using our key frames as our start and end points, we can produce images in between these frames by interpolating these tensors.
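In its simplest form, the in-between frames come from linear interpolation between the key-frame tensors. A minimal numpy sketch of the idea (Giffusion's actual interpolation, for example spherical interpolation for noise latents, may differ):

```python
import numpy as np

def interpolate(start, end, num_frames):
    """Linearly interpolate between two tensors (noise latents or
    prompt embeddings) to fill the frames between two key frames."""
    weights = np.linspace(0.0, 1.0, num_frames)
    return [(1 - w) * start + w * end for w in weights]

start = np.zeros(4)   # stand-in for the frame-0 embedding
end = np.ones(4)      # stand-in for the frame-60 embedding
frames = interpolate(start, end, 5)
# frames[0] equals start, frames[-1] equals end,
# and frames[2] sits exactly halfway between them.
```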
<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/204506200-49f91bd1-396f-4cf1-927c-c91b885f5c4a.gif" width="256" title="hover text"> </p>

Inspiration Button
Creating prompts can be challenging. Click the Give me some inspiration button to automatically generate prompts for you.
You can even provide a list of topics for the inspiration button to use as a starting point.
<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220324835-fbbae3be-9a9a-48f9-a773-5e45c6274ed2.gif" width="800" title="hover text"> </p>

Multimedia Support
Augment the image generation process with additional media inputs
<details> <summary>Image Input</summary>You can seed the generation process with an initial image. Upload your file using the Image Input dropdown.
</details> <details> <summary>Audio Input</summary>Drive your GIF and Video animations using audio.
https://user-images.githubusercontent.com/7529846/204550897-70777873-30ca-46a9-a74e-65b6ef429958.mp4
In order to use audio to drive your animations:

- Head over to the Audio Input dropdown and upload your audio file.
- Click Get Key Frame Information. This will extract key frames from the audio based on the Audio Component you have selected. You can extract key frames based on the percussive, harmonic, or combined audio components of your file.

Additionally, timestamp information for these key frames is also extracted for reference in case you would like to sync your prompts to a particular time in the audio.

Note: The key frames will change based on the frame rate that you have set in the UI.
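The frame-rate dependence follows directly from how timestamps map to frame numbers. A small sketch, assuming the onset timestamps have already been detected from the audio (the function name and sample values are illustrative):

```python
def timestamps_to_keyframes(timestamps, fps):
    """Convert audio event timestamps (in seconds) to key frame numbers.

    The same audio events land on different key frames at different
    frame rates, which is why the extracted key frames change when
    you change the FPS setting in the UI.
    """
    return [round(t * fps) for t in timestamps]

onsets = [0.0, 1.5, 3.0]  # e.g. percussive onsets detected in the audio
low = timestamps_to_keyframes(onsets, fps=10)   # -> [0, 15, 30]
high = timestamps_to_keyframes(onsets, fps=24)  # -> [0, 36, 72]
```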
</details> <details> <summary>Video Input</summary>You can use frames from an existing video as initial images in the diffusion process.
https://user-images.githubusercontent.com/7529846/204550451-5d2162dc-5d6b-4ecd-b1ed-c15cb56bc224.mp4
To use video initialization:

- Head over to the Video Input dropdown and upload your file.
- Click Get Key Frame Information to extract the maximum number of frames present in the video and to update the frame rate setting in the UI to match the frame rate of the input video.
</details>
Resampling Output Generations
You can resample videos and GIFs created in the output tab and send them either to the Image Input or Video Input.
<details> <summary>Resampling to Image Input</summary>To sample an image from a video, select the frame id you want to sample from your output video or GIF and click on Send to Image Input.</details>

To resample a video, click on Send to Video Input.
Saving to Comet
GIFfusion also supports saving prompts, generated GIFs/Videos, images, and settings to Comet so you can keep track of your generative experiments.
Check out an example project here with some of my GIFs!
Diffusion Settings
This section covers all the components in the Diffusion Settings dropdown.
- Use Fixed Latent: Use the same noise latent for every frame of the generation process. This is useful if you want to keep the noise latent fixed while interpolating over just the prompt embeddings.
- Use Prompt Embeds: By default, Giffusion converts your prompts into embeddings and interpolates between the prompt embeddings for the in-between frames. If you disable this option, Giffusion will forward fill the text prompts between frames instead. If you are using the ComposableDiffusion pipeline or would like to use the prompt embedding function of the pipeline directly, disable this option.
- Numerical Seed: Seed for the noise latent generation process. If Use Fixed Latent isn't set, this seed is used to generate a schedule that provides a unique seed for each key frame.
- Number of Iteration Steps: Number of steps to use in the generation process.
- Classifier Free Guidance Scale: A higher guidance scale encourages generated images that are closely linked to the text prompt, usually at the expense of lower image quality.
- Image Strength Schedule: Indicates how much to transform the reference image.
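The per-key-frame seed schedule described under Numerical Seed can be sketched as deriving deterministic seeds from the single base seed (the function name and exact derivation are illustrative, not Giffusion's actual code):

```python
import random

def seed_schedule(base_seed, key_frames):
    """Derive a deterministic seed for each key frame from one
    numerical seed, used when Use Fixed Latent is disabled."""
    rng = random.Random(base_seed)
    return {frame: rng.randrange(2**32) for frame in key_frames}

schedule = seed_schedule(42, [0, 60])
# The same base seed always yields the same per-frame seeds,
# so generations stay reproducible across runs.
```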