Ctrlora

[ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation"

To create your customized ControlNet in an easy and low-cost manner 🎉

<p align="center"> <img src="./assets/banner.jpg" alt="banner" style="width: 100%" /> </p> <p align="center"> <img src="./assets/style-transfer.gif" alt="style-transfer" style="width: 100%" /> </p>

The images are compressed for loading speed.

<h1 align="center"> <a href="https://arxiv.org/abs/2410.09400">CtrLoRA</a> </h1>

<a href="https://arxiv.org/abs/2410.09400"><img src="https://img.shields.io/badge/ICLR 2025-3A98B9?label=%F0%9F%93%9D&labelColor=FFFDD0" style="height: 28px" /></a> <a href="https://huggingface.co/xyfJASON/ctrlora/tree/main"><img src="https://img.shields.io/badge/Model-3A98B9?label=%F0%9F%A4%97&labelColor=FFFDD0" style="height: 28px" /></a>

CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
Yifeng Xu<sup>1,2</sup>, Zhenliang He<sup>1</sup>, Shiguang Shan<sup>1,2</sup>, Xilin Chen<sup>1,2</sup>
<sup>1</sup>Key Lab of AI Safety, Institute of Computing Technology, CAS, China
<sup>2</sup>University of Chinese Academy of Sciences, China

<p align="center"> <img src="./assets/overview.jpg" alt="base-conditions" style="width: 100%" /> </p>

We first train a Base ControlNet together with condition-specific LoRAs on the base conditions, using a large-scale dataset. The Base ControlNet can then be efficiently adapted to novel conditions by training new LoRAs with <mark>10% of the parameters, as few as 1,000 images, and less than 1 hour of training on a single GPU</mark>.

🎨 Visual Results

🎨 Controllable generation on "base conditions"

|<img src="./assets/base-conditions.jpg" alt="base-conditions" style="width: 100%" />|
|-|

🎨 Controllable generation on "novel conditions"

|<img src="./assets/novel-conditions.jpg" alt="novel-conditions" style="width: 100%" />|
|-|

🎨 Integration into community models & Multi-conditional generation

|<img src="./assets/community-multi.jpg" alt="integration" style="width: 100%" />|
|-|

🎨 Application to style transfer

|<img src="./assets/style-transfer.jpg" alt="style-transfer" style="width: 100%" />|
|-|

🎨 Spatial + style control (integrated with InstantStyle)

| <img src="./assets/instant-style.jpg" alt="style-transfer" style="width: 100%" /> |
|-|

🛠️ Installation

Clone this repo:

git clone --depth 1 https://github.com/xyfJASON/ctrlora.git
cd ctrlora

Create and activate a new conda environment:

conda create -n ctrlora python=3.10
conda activate ctrlora

Install pytorch and other dependencies:

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
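
To confirm that the pinned versions were actually picked up, something like the following can be run inside the new environment. This is a minimal sketch and not part of the repo: the helper name `check_install` is hypothetical, and the expected versions are simply copied from the pip command above. The `+cu117` local tag is ignored in the comparison, since pip may or may not report it for every package.

```python
from importlib import metadata

# Public versions pinned by the pip command above (local tags such as
# "+cu117" are stripped before comparing).
EXPECTED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
}


def check_install(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (found, matches)
    return report


# Print a one-line status per package.
for name, (found, ok) in check_install().items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: installed={found} expected={EXPECTED[name]} [{status}]")
```

A `MISMATCH` line does not necessarily mean the code will fail; it only flags that the environment differs from the one described here.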

🤖️ Download Pretrained Models

We provide our pretrained models here. Please put the Base ControlNet (ctrlora_sd15_basecn700k.ckpt) into ./ckpts/ctrlora-basecn and the LoRAs into ./ckpts/ctrlora-loras. The naming convention of the LoRAs is ctrlora_sd15_<basecn>_<condition>.ckpt for base conditions and ctrlora_sd15_<basecn>_<condition>_<images>_<steps>.ckpt for novel conditions.
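
The naming convention above can be checked mechanically when you organize many LoRA files. The sketch below is an illustration, not code from the repo: `parse_lora_name` is a hypothetical helper, and the optional `rank<N>` token is an assumption inferred from file names such as `ctrlora_sd15_basecn700k_inpainting_brush_rank128_1kimgs_1ksteps.ckpt` that appear in the Python API examples.

```python
import re
from pathlib import Path

# ctrlora_sd15_<basecn>_<condition>[_rank<N>][_<images>imgs_<steps>steps].ckpt
# The "_rank<N>" part is an assumption based on observed file names.
LORA_NAME = re.compile(
    r"^ctrlora_sd15_(?P<basecn>[^_]+)_(?P<condition>.+?)"
    r"(?:_rank(?P<rank>\d+))?"
    r"(?:_(?P<images>\d+k?)imgs_(?P<steps>\d+k?)steps)?"
    r"\.ckpt$"
)


def parse_lora_name(path):
    """Split a LoRA checkpoint file name into its fields, or return None."""
    match = LORA_NAME.match(Path(path).name)
    if match is None:
        return None
    info = match.groupdict()
    # Only novel-condition LoRAs carry the <images>/<steps> suffix.
    info["kind"] = "novel" if info["images"] is not None else "base"
    return info


print(parse_lora_name(
    "ckpts/ctrlora-loras/novel-conditions/"
    "ctrlora_sd15_basecn700k_inpainting_brush_rank128_1kimgs_1ksteps.ckpt"
))
```

Files that do not follow the convention yield `None`, which makes it easy to spot misnamed checkpoints before they show up in the demo's dropdowns.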

You also need to download the SD1.5-based Models and put them into ./ckpts/sd15. Models used in our work:

🚀 Gradio Demo

🚀 Spatial Control

python app/gradio_ctrlora.py

Generating a batch of one 512x512 image requires at least 9GB of GPU memory; a batch of four requires at least 21GB.
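
If you are unsure whether your GPU is large enough, the two figures above can be turned into a quick check. This is a rough sketch under stated assumptions: the helper names are hypothetical, only the two documented batch sizes (one and four) are covered, and the quoted "GB" figures are compared against free memory in GiB as reported by `torch.cuda.mem_get_info`.

```python
# Memory figures quoted above for 512x512 images: batch size -> minimum GB.
# Batch sizes 2 and 3 are not documented, so they are deliberately absent.
DOCUMENTED_MIN_GB = {1: 9, 4: 21}


def largest_documented_batch(free_gb):
    """Largest documented batch size that fits in `free_gb`, or 0 if none."""
    fitting = [n for n, need in DOCUMENTED_MIN_GB.items() if free_gb >= need]
    return max(fitting, default=0)


def free_gpu_memory_gb(device=0):
    """Free memory on a CUDA device in GiB (needs PyTorch and a GPU)."""
    import torch  # imported lazily so the helper above works without PyTorch

    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes / 1024**3
```

On a machine with a GPU, `largest_documented_batch(free_gpu_memory_gb())` gives a first estimate before launching the demo; a result of 0 suggests that even a single image may not fit.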

<details><summary><strong>Single-conditional generation</strong></summary>
  1. select the Stable Diffusion checkpoint, Base ControlNet checkpoint and LoRA checkpoint.
  2. write prompts and negative prompts. We provide several commonly used prompts.
  3. prepare a condition image
    • upload an image to the left of the "Condition" panel, select the preprocessor corresponding to the LoRA, and click "Detect".
    • or upload the condition image directly, select the "none" preprocessor, and click "Detect".
  4. click "Run" to generate images.
  5. if you upload any new checkpoints, restart gradio or click "Refresh".
<img src="./assets/gradio.jpg" alt="gradio" style="width: 100%;" />
</details>
<details><summary><strong>Multi-conditional generation</strong></summary>
<img src="./assets/gradio2.jpg" alt="gradio2" style="width: 100%;" />
</details>
<details><summary><strong>Application to style transfer</strong></summary>
  1. select a stylized Stable Diffusion checkpoint to specify the target style, e.g., Pixel.
  2. select the Base ControlNet checkpoint.
  3. select palette for the LoRA1 checkpoint and lineart for the LoRA2 checkpoint.
    • palette + canny or palette + hed also work; there may be more interesting combinations to discover.
  4. write prompts and negative prompts.
  5. upload the source image to the "Condition 1" panel, select the "none" preprocessor, and click "Detect".
  6. upload the source image to the "Condition 2" panel, select the "lineart" preprocessor, and click "Detect".
  7. adjust the weights for the two conditions in the "Basic options" panel.
  8. click "Run" to generate images.
<img src="./assets/gradio3.jpg" alt="gradio3" style="width: 100%;" /> </details>

🚀 Spatial + Style Control

Many thanks to Lianchen-li for integrating InstantStyle with CtrLoRA to support "spatial + style" control!

To run the gradio demo, first download IP Adapter from this repo. You need to download the models directory, rename it to ip-adapter, and put it into ./ckpts/. Then launch the gradio demo:

python app/gradio_ctrlora_style_transfer.py
<details><summary><strong>Instructions</strong></summary>
  1. select the Stable Diffusion, Base ControlNet, LoRA, and IP Adapter checkpoints.
  2. write prompts and negative prompts.
  3. prepare a condition image
    • upload an image to the "Content" panel in the "Reference images" block, select the preprocessor corresponding to the LoRA, and click "Detect".
    • or upload the condition image directly, select the "none" preprocessor, and click "Detect".
  4. upload a style image to the "Style" panel in the "Reference images" block.
    • optionally, you can choose a style mode and use a negative content prompt.
  5. click "Run" to generate images.
<img src="./assets/gradio4.jpg" alt="gradio" style="width: 100%;" /> </details>
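
The IP Adapter setup step described above (download the `models` directory, rename it to `ip-adapter`, place it under `./ckpts/`) could be scripted roughly as follows. This is a sketch only: `place_ip_adapter` is a hypothetical helper, and it assumes the `models` directory has already been downloaded to a local path.

```python
import shutil
from pathlib import Path


def place_ip_adapter(downloaded_models_dir, ckpts_dir="ckpts"):
    """Move a downloaded `models` directory to `<ckpts_dir>/ip-adapter`."""
    src = Path(downloaded_models_dir)
    dst = Path(ckpts_dir) / "ip-adapter"
    if not src.is_dir():
        raise FileNotFoundError(f"downloaded directory not found: {src}")
    if dst.exists():
        # Refuse to overwrite an existing installation.
        raise FileExistsError(f"target already exists: {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst
```

For example, `place_ip_adapter("path/to/models")` run from the repo root would leave the files under `./ckpts/ip-adapter`, where the demo is described as looking for them.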

🚗 Python API

Besides the Gradio demo, you can also sample images with the following Python code.

🚗 Single-conditional generation

from api import CtrLoRA

ctrlora = CtrLoRA(num_loras=1)
ctrlora.create_model(
    sd_file='ckpts/sd15/v1-5-pruned.ckpt',
    basecn_file='ckpts/ctrlora-basecn/ctrlora_sd15_basecn700k.ckpt',
    lora_files='ckpts/ctrlora-loras/novel-conditions/ctrlora_sd15_basecn700k_inpainting_brush_rank128_1kimgs_1ksteps.ckpt',
)
samples = ctrlora.sample(
    cond_image_paths='assets/test_images/inpaint_cat.png',
    prompt='A cat wearing a brown cowboy hat, best quality',
    n_prompt='worst quality',
    num_samples=1,
)
samples[0].show()
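
On a headless server, `show()` is of little use, so you may prefer to write the results to disk. The sketch below assumes that `sample` returns a list of PIL-style images (the `.show()` call above suggests this, but it is an assumption); `save_samples` is a hypothetical helper, not part of the `api` module.

```python
from pathlib import Path


def save_samples(samples, out_dir="outputs", prefix="sample"):
    """Save each image as <out_dir>/<prefix>_NNN.png and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, image in enumerate(samples):
        path = out / f"{prefix}_{index:03d}.png"
        image.save(path)  # assumes a PIL.Image-like object with .save()
        paths.append(path)
    return paths
```

With the `samples` list from the example above, `save_samples(samples, out_dir="outputs/inpaint_cat")` would write one PNG per generated image.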

🚗 Multi-conditional generation

from api import CtrLoRA

ctrlora = CtrLoRA(num_loras=2)
ctrlora.create_model(
    sd_file='ckpts/sd15/v1-5-pruned.ckpt',
    basecn_file='ckpts/ctrlora-basecn/ctrlora_sd15_basecn700k.ckpt',
    lora_files=('ckpts/ctrlora-loras/novel-conditions/ctrlora_sd15_basecn700k_lineart_rank128_1kimgs_1ksteps.ckpt',
                'ckpts/ctrlora-loras/novel-conditions/ctrlora_sd15_basecn700k_palette_rank128_100kimgs_100ksteps.ckpt'),
)
samples = ctrlora.sample(
    cond_image_paths=('assets/test_images/lineart_bird.png',
                      'assets/test_images/palette_bird.png'),
    prompt='Photo of a parrot, best quality',
    n_prompt='worst quality',
    num_samples=1,
    lora_weights=(1.0, 1.0),
)
samples[0].show()
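
The balance between the two conditions is set by `lora_weights`, and finding a good setting usually takes a few tries. One way to explore it is to sweep a small grid of weights. The sketch below only builds the grid; `weight_grid` is a hypothetical helper, and the commented usage assumes the `ctrlora` object and arguments from the example above.

```python
from itertools import product


def weight_grid(values, num_loras=2):
    """All combinations of the candidate weights, one weight per LoRA."""
    return list(product(values, repeat=num_loras))


print(weight_grid((0.5, 1.0)))

# Possible usage with the model created above (not executed here):
# for weights in weight_grid((0.5, 1.0)):
#     samples = ctrlora.sample(
#         cond_image_paths=('assets/test_images/lineart_bird.png',
#                           'assets/test_images/palette_bird.png'),
#         prompt='Photo of a parrot, best quality',
#         n_prompt='worst quality',
#         num_samples=1,
#         lora_weights=weights,
#     )
```

The grid grows as `len(values) ** num_loras`, so keep the candidate list short when each sample takes noticeable time.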

🚢 ComfyUI Workflows

Many thanks to Kosinkadink for his hard work in creating the CtrLoRA node! Many thanks to toyxyz for sharing his workflow that uses CtrLoRA with AnimateDiff!

| **[CtrLoRA-Canny](https://raw.githubusercontent.com/xyfJASON/ctrlora/refs/heads/main/assets/workflow-c
