
GIFfusion 💥

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220882002-72cbfdef-876a-4cb2-9f41-e5989e769868.gif" width="256" title="hover text"> </p>

Giffusion is a Web UI for generating GIFs and Videos using Stable Diffusion.

Open In Colab Open In Comet

To Run

In Colab

Open the Colab Notebook linked above and follow the instructions to start the Giffusion UI.

On your local machine

Clone the Giffusion repository

git clone https://github.com/DN6/giffusion.git && cd giffusion

Install the requirements

pip install -r requirements.txt

Start the application

python app.py

Features

Saving and Loading Sessions

Giffusion uses the Hugging Face Hub to save output generations and settings. To save and load sessions, you will first need to set your Hugging Face access token using huggingface-cli login.

Once set, you can save your session by clicking on the Save Session button in the Session Settings. This will create a dataset repo on the Hub and save your settings and output generations to a folder with a randomly generated name. You can also set the Repo ID and Session Name manually in order to save your session to a specific repo.

Loading sessions works in a similar manner. Simply provide the Repo ID and Session Name of the session you would like to load and click on the Load Session button. You can filter the settings for the individual components in the UI using the dropdown selector.

Bring Your Own Pipeline

Giffusion supports using any pipeline and compatible checkpoint from the Diffusers library. Simply paste the checkpoint name and pipeline name into the Pipeline Settings.

ControlNet Support

Giffusion allows you to use the StableDiffusionControlNetPipeline. Simply paste in the ControlNet checkpoint you would like to use to load in the Pipeline.

MultiControlNets are also supported. Just paste in a comma-separated list of model checkpoint paths from the Hugging Face Hub:

lllyasviel/control_v11p_sd15_softedge, lllyasviel/control_v11f1p_sd15_depth

Notes on Preprocessing: When using ControlNets, you need to preprocess your inputs before using them as conditioning signals for the model. The Controlnet Preprocessing Settings allow you to choose a set of preprocessing options to apply to your image. Be sure to select them in the same order as your ControlNet models. For example, for the checkpoint list above, you would have to select the softedge preprocessor before the depth one. If one of the ControlNet models in a MultiControlNet setting requires no preprocessing, a no-processing option is also provided.
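The ordering rule above can be illustrated with a small sketch. This is not Giffusion's code: the function name and the preprocessor labels (`softedge`, `depth`) are hypothetical stand-ins for whatever the UI offers. It only shows that each checkpoint is matched with the preprocessor selected at the same position.

```python
from __future__ import annotations


def pair_controlnets_with_preprocessors(
    checkpoints: str, preprocessors: list[str]
) -> list[tuple[str, str]]:
    """Pair each ControlNet checkpoint with the preprocessor at the same position.

    Hypothetical helper for illustration only; not part of Giffusion.
    """
    # Split the comma-separated checkpoint list as it would be pasted into the UI.
    models = [name.strip() for name in checkpoints.split(",") if name.strip()]
    if len(models) != len(preprocessors):
        raise ValueError(
            f"Got {len(models)} checkpoints but {len(preprocessors)} preprocessors"
        )
    # Position i in one list corresponds to position i in the other.
    return list(zip(models, preprocessors))


pairs = pair_controlnets_with_preprocessors(
    "lllyasviel/control_v11p_sd15_softedge, lllyasviel/control_v11f1p_sd15_depth",
    ["softedge", "depth"],  # assumed labels, selected in the same order as the models
)
for model, preprocessor in pairs:
    print(f"{model} <- {preprocessor}")
```

Swapping the two preprocessor labels would feed a depth map to the softedge model and vice versa, which is why the selection order matters.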

<p align="center"> <img width="341" alt="Screenshot 2023-07-26 at 11 41 11 PM" src="https://user-images.githubusercontent.com/7529846/256476148-fc0dc1ad-ed26-435c-9850-8c9cb7f9a789.png"> </p>

Custom Pipeline Support

You can use your own custom pipelines with Giffusion as well. Simply paste in the path to your Pipeline file in the Custom Pipeline section. The Pipeline file must follow a format similar to the community pipelines found in Diffusers.

Compel Prompt Weighting Support

Prompt Embeds are now generated via Compel and support the weighting syntax outlined here.

Multiframe Generation

Giffusion follows a prompt syntax similar to the one used in Deforum Art's Stable Diffusion Notebook:

0: a picture of a corgi
60: a picture of a lion

The first part of the prompt indicates a key frame number, while the text after the colon is the prompt used by the model to generate the image.
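As a rough illustration of this syntax, the sketch below parses the same two-line prompt into a mapping from key frame number to prompt text. The function name `parse_key_frames` is an assumption made for this example and is not taken from the Giffusion codebase.

```python
from __future__ import annotations


def parse_key_frames(prompt_text: str) -> dict[int, str]:
    """Parse 'frame: prompt' lines into a {frame_number: prompt} mapping.

    Hypothetical helper for illustration only.
    """
    key_frames: dict[int, str] = {}
    for line in prompt_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so prompts may contain colons themselves.
        frame, sep, prompt = line.partition(":")
        if not sep:
            raise ValueError(f"Missing ':' in line: {line!r}")
        key_frames[int(frame.strip())] = prompt.strip()
    # Return the key frames ordered by frame number.
    return dict(sorted(key_frames.items()))


key_frames = parse_key_frames("0: a picture of a corgi\n60: a picture of a lion")
print(key_frames)
# {0: 'a picture of a corgi', 60: 'a picture of a lion'}
```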

In the example above, we're asking the model to generate a picture of a corgi at frame 0 and a picture of a lion at frame 60. So what about all the images in between these two key frames? How do they get generated?

You might recall that Diffusion Models work by turning noise into images. To save time and memory, Stable Diffusion runs the diffusion process in a compressed latent space: it turns a noise tensor into a latent embedding, which is then fed into a decoder to produce the image.

The inputs to our model are a noise tensor and text embedding tensor. Using our key frames as our start and end points, we can produce images in between these frames by interpolating these tensors.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/204506200-49f91bd1-396f-4cf1-927c-c91b885f5c4a.gif" width="256" title="hover text"> </p>
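The interpolation idea described above can be sketched with plain Python lists standing in for tensors. Two assumptions apply: the text does not say which interpolation method Giffusion uses, so this example uses simple linear interpolation (spherical interpolation is a common alternative for noise tensors), and the helper names are made up for the example.

```python
from __future__ import annotations


def lerp(start: list[float], end: list[float], t: float) -> list[float]:
    """Linearly interpolate between two equal-length vectors, with t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(start, end)]


def interpolate_key_frames(key_frames: dict[int, list[float]]) -> dict[int, list[float]]:
    """Fill in a vector for every frame between consecutive key frames.

    Illustrative only: a real pipeline would do this on noise and text
    embedding tensors rather than short Python lists.
    """
    if not key_frames:
        raise ValueError("At least one key frame is required")
    frames = sorted(key_frames)
    result: dict[int, list[float]] = {}
    for left, right in zip(frames, frames[1:]):
        span = right - left
        for frame in range(left, right):
            # t runs from 0 at the left key frame towards 1 at the right one.
            t = (frame - left) / span
            result[frame] = lerp(key_frames[left], key_frames[right], t)
    # The last key frame is copied unchanged.
    result[frames[-1]] = list(key_frames[frames[-1]])
    return result


# Toy two-dimensional "embeddings" for key frames 0 and 60.
embeddings = {0: [1.0, 0.0], 60: [0.0, 1.0]}
frames = interpolate_key_frames(embeddings)
print(len(frames))  # 61
print(frames[30])   # [0.5, 0.5]
```

Frame 30 sits halfway between the two key frames, so its vector is an equal blend of both endpoints.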

Inspiration Button

Creating prompts can be challenging. Click the Give me some inspiration button to automatically generate prompts for you.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220324203-444c1720-c71b-4ccf-b08f-5b20668b7f98.gif" width="800" title="hover text"> </p>

You can even provide a list of topics for the inspiration button to use as a starting point.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220324835-fbbae3be-9a9a-48f9-a773-5e45c6274ed2.gif" width="800" title="hover text"> </p>

Multimedia Support

Augment the image generation process with additional media inputs

<details> <summary>Image Input</summary>

You can seed the generation process with an initial image. Upload your file using the Image Input dropdown.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220880564-dba393c5-6023-4539-a59c-c33758769500.gif" width="800" title="hover text"> </p> <p align="center"> <a align="center" href="https://www.krea.ai/prompt/184bf3cf-ec0d-4ff8-b4f1-45577799700b">Image Source</a> </p> </details> <details> <summary>Audio Input</summary>

Drive your GIF and Video animations using audio.

https://user-images.githubusercontent.com/7529846/204550897-70777873-30ca-46a9-a74e-65b6ef429958.mp4

To use audio to drive your animations:

  1. Head over to the Audio Input dropdown and upload your audio file.
  2. Click Get Key Frame Information. This will extract key frames from the audio based on the Audio Component you have selected. You can extract key frames based on the percussive, harmonic or combined audio components of your file.

Additionally, timestamp information for these key frames is also extracted for reference in case you would like to sync your prompts to a particular time in the audio.

Note: The key frames will change based on the frame rate that you have set in the UI.
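To see why the frame rate matters, consider a minimal sketch that converts audio event timestamps into key frame numbers. How Giffusion performs this mapping is not described here, so rounding to the nearest frame and the function name `timestamps_to_key_frames` are assumptions for illustration.

```python
from __future__ import annotations


def timestamps_to_key_frames(timestamps_sec: list[float], fps: float) -> list[int]:
    """Map event times in seconds to frame numbers at the given frame rate.

    Hypothetical helper; rounds each timestamp to the nearest frame.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    # A set removes duplicates when two events land on the same frame.
    frames = {int(round(t * fps)) for t in timestamps_sec}
    return sorted(frames)


beats = [0.0, 0.5, 1.5, 2.0]  # example event times in seconds
print(timestamps_to_key_frames(beats, fps=10))  # [0, 5, 15, 20]
print(timestamps_to_key_frames(beats, fps=24))  # [0, 12, 36, 48]
```

The same audio events land on different frame numbers at 10 and 24 frames per second, which is why changing the frame rate in the UI changes the extracted key frames.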

</details> <details> <summary>Video Input</summary>

You can use frames from an existing video as initial images in the diffusion process.

https://user-images.githubusercontent.com/7529846/204550451-5d2162dc-5d6b-4ecd-b1ed-c15cb56bc224.mp4

To use video initialization:

  1. Head over to the Video Input dropdown

  2. Upload your file. Click Get Key Frame Information to extract the maximum number of frames present in the video and to update the frame rate setting in the UI to match the frame rate of the input video.

</details>

Resampling Output Generations

You can resample videos and GIFs created in the output tab and send them either to the Image Input or Video Input.

<details> <summary>Resampling to Image Input</summary>

To sample an image from a video, select the frame id you want to sample from your output video or GIF and click on Send to Image Input.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220325938-22438722-d4ac-4a35-995f-51d8dbafaa34.gif" width="800" title="hover text"> </p> </details> <details> <summary>Resampling to Video Input</summary>

To resample a video, click on Send to Video Input.

<p align="center"> <img src="https://user-images.githubusercontent.com/7529846/220322852-f2fab800-43dc-41b8-bdb4-c4057bb65a5f.gif" width="800" title="hover text"> </p> </details>

Saving to Comet

GIFfusion also supports saving prompts, generated GIFs/Videos, images, and settings to Comet so you can keep track of your generative experiments.

Check out an example project here with some of my GIFs!

Diffusion Settings

This section covers all the components in the Diffusion Settings dropdown.

  1. Use Fixed Latent: Use the same noise latent for every frame of the generation process. This is useful if you want to keep the noise latent fixed while interpolating over just the prompt embeddings.

  2. Use Prompt Embeds: By default, Giffusion converts your prompts into embeddings and interpolates between the prompt embeddings for the in between frames. If you disable this option, Giffusion will forward fill the text prompts between frames instead. If you are using the ComposableDiffusion pipeline or would like to use the prompt embedding function of the pipeline directly, disable this option.

  3. Numerical Seed: Seed for the noise latent generation process. If Use Fixed Latent isn't set, this seed is used to generate a schedule that provides a unique seed for each key frame.

  4. Number of Iteration Steps: Number of steps to use in the generation process.

  5. Classifier Free Guidance Scale: A higher guidance scale encourages generated images that are closely linked to the text prompt, usually at the expense of lower image quality.

  6. Image Strength Schedule: Indicates how much to transform
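Two of the settings above can be illustrated with a short sketch: forward filling text prompts when Use Prompt Embeds is disabled, and deriving one seed per key frame from the Numerical Seed. The function names are hypothetical, and the seed derivation via Python's `random` module is only an assumed stand-in, since the text does not say how Giffusion builds its seed schedule.

```python
from __future__ import annotations

import random


def forward_fill_prompts(key_frames: dict[int, str], num_frames: int) -> list[str]:
    """Repeat each key frame's prompt until the next key frame is reached."""
    if 0 not in key_frames:
        raise ValueError("A prompt for frame 0 is required")
    prompts: list[str] = []
    current = key_frames[0]
    for frame in range(num_frames):
        # Switch prompts only when this frame is itself a key frame.
        current = key_frames.get(frame, current)
        prompts.append(current)
    return prompts


def seed_schedule(seed: int, key_frames) -> dict[int, int]:
    """Derive a distinct, reproducible seed for every key frame (assumed scheme)."""
    rng = random.Random(seed)
    # sample() draws without replacement, so the seeds are guaranteed unique.
    seeds = rng.sample(range(2**32), k=len(key_frames))
    return dict(zip(sorted(key_frames), seeds))


print(forward_fill_prompts({0: "a corgi", 3: "a lion"}, 5))
# ['a corgi', 'a corgi', 'a corgi', 'a lion', 'a lion']

schedule = seed_schedule(42, [0, 60])
print(schedule)  # the same input seed always yields the same schedule
```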
