# StyleStudio

[CVPR 2025] Official implementation of **StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements**
## News and Updates

- [2024.12.12] 🔥🔥 We release the code.
- [2024.12.19] 📝📝 We have summarized recent developments in style transfer, and we will continue to update the list.
## Abstract
Text-driven style transfer aims to merge the style of a reference image with content described by a text prompt. Recent advancements in text-to-image models have improved the nuance of style transformations, yet significant challenges remain, particularly overfitting to reference styles, limited stylistic control, and misalignment with textual content. In this paper, we propose three complementary strategies to address these issues. First, we introduce a cross-modal Adaptive Instance Normalization (AdaIN) mechanism for better integration of style and text features, enhancing alignment. Second, we develop a Style-based Classifier-Free Guidance (SCFG) approach that enables selective control over stylistic elements, reducing irrelevant influences. Finally, we incorporate a teacher model during the early generation stages to stabilize spatial layouts and mitigate artifacts. Our extensive evaluations demonstrate significant improvements in style transfer quality and alignment with textual prompts. Furthermore, our approach can be integrated into existing style transfer frameworks without fine-tuning.
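As background for the cross-modal AdaIN mechanism mentioned above, the sketch below shows the standard AdaIN operation it builds on: re-centering and re-scaling content features so they carry the style features' statistics. The `adain` helper and the plain-list features are illustrative only, not the repository's implementation (the paper's cross-modal variant operates on text and style feature maps inside the diffusion model).

```python
from statistics import mean, pstdev

def adain(content, style, eps=1e-5):
    """Standard AdaIN: shift/scale content features to match the
    style features' mean and standard deviation.
    Illustrative stand-in for the paper's cross-modal variant."""
    mu_c, sigma_c = mean(content), pstdev(content)
    mu_s, sigma_s = mean(style), pstdev(style)
    return [sigma_s * (x - mu_c) / (sigma_c + eps) + mu_s for x in content]

# After AdaIN, the output carries the style statistics (mean 20 here).
stylized = adain([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```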
## Getting Started
### 1. Clone the code and prepare the environment

```shell
git clone https://github.com/Westlake-AGI-Lab/StyleStudio
cd StyleStudio

# create the environment using conda
conda create -n StyleStudio python=3.10
conda activate StyleStudio

# install dependencies with pip
# for Linux and Windows users
pip install -r requirements.txt
```
### 2. Run StyleStudio

Please note: our approach is fine-tuning free and can be combined with different methods.
#### Parameter Explanation

| Parameter | Description |
| --- | --- |
| `adainIP` | use the cross-modal AdaIN |
| `fuSAttn` | hijack the Self-Attention Map in the Teacher Model |
| `fuAttn` | hijack the Cross-Attention Map in the Teacher Model |
| `end_fusion` | define when the Teacher Model stops participating |
| `prompt` | text prompt for generating the image |
| `style_path` | path to the style image or folder |
| `neg_style_path` | path to the negative style image |
### Integration with CSGO
Follow CSGO to download pre-trained checkpoints.
Usage example: as the value of `end_fusion` increases, the style gradually diminishes. If `num_inference_steps` is set to 50, we recommend setting `end_fusion` between 10 and 20. In general, `end_fusion` should fall within the first 1/5 to 1/3 of the total `num_inference_steps`.

If layout stability is unsatisfactory, consider increasing the duration of the Teacher Model's involvement.
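The schedule above can be sketched in a few lines: the Teacher Model participates only while the denoising step is below `end_fusion`, and the recommended band is the first 1/5 to 1/3 of the total steps. Both helper names below are hypothetical, chosen just to illustrate the rule.

```python
def teacher_active(step, end_fusion):
    """The Teacher Model participates only in the early denoising steps."""
    return step < end_fusion

def recommended_end_fusion(num_inference_steps):
    """Lower/upper bound for end_fusion: first 1/5 to 1/3 of the steps."""
    return (num_inference_steps // 5, num_inference_steps // 3)

lo, hi = recommended_end_fusion(50)  # band for the 50-step example above
```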
```shell
# Generate a single stylized image
# with a specific text prompt and style image path
python infer_StyleStudio.py \
    --prompt "A red apple" \
    --style_path "assets/style1.jpg" \
    --adainIP \              # Enable Cross-Modal AdaIN
    --fuSAttn \              # Enable Teacher Model with Self-Attention Map
    --end_fusion 20 \        # Define when the Teacher Model stops participating
    --num_inference_steps 50
```
```shell
# Check layout stability across different style images
# with the same text prompt and a set of style images
python infer_StyleStudio_layout_stability.py \
    --prompt "A red apple" \
    --style_path "path/to/style_images_folder" \
    --adainIP \              # Enable Cross-Modal AdaIN
    --fuSAttn \              # Enable Teacher Model with Self-Attention Map
    --end_fusion 20 \        # Define when the Teacher Model stops participating
    --num_inference_steps 50
```
#### Note

- As shown in Figure 15 of the paper, employing a Cross-Attention Map in the Teacher Model does not ensure layout stability. We nevertheless provide the `fuAttn` interface and encourage everyone to experiment with it.
- To ensure layout stability and consistency for the same prompt under different style images, keep the initial noise $z_0$ consistent across experiments. For details on this aspect, refer to `infer_StyleStudio_layout_stability.py`.
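The $z_0$ consistency point above boils down to seeding: draw the initial noise from a fixed-seed generator and reuse it for every style image, so layout differences come from the style alone. The sketch below uses the stdlib `random` module as a stand-in for the actual latent sampler; the `initial_noise` helper is illustrative, not the repo's code.

```python
import random

def initial_noise(seed, n=8):
    """Draw an initial noise vector z_0 from a seeded RNG
    (stand-in for sampling the diffusion latent with a fixed generator)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Reusing the same seed yields the same z_0 for every style image,
# so any layout change is attributable to the style, not the noise.
z0_style_a = initial_noise(seed=42)
z0_style_b = initial_noise(seed=42)
```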
This is an example of using Style-based Classifier-Free Guidance (SCFG).

```shell
python infer_StyleStudio.py \
    --prompt "A red apple" \
    --style_path "assets/style2.jpg" \
    --neg_style_path "assets/neg_style2.jpg"
```
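To give intuition for what the negative style image does, the sketch below shows one plausible CFG-style combination: extrapolating the noise prediction away from the negative style's prediction toward the reference style's. This is an illustrative form by analogy with standard classifier-free guidance, not the paper's exact SCFG formulation; `scfg` and the scalar `scale` are hypothetical names.

```python
def scfg(eps_style, eps_neg_style, scale):
    """CFG-style extrapolation: push the prediction away from the
    negative-style branch toward the reference-style branch.
    Illustrative only; see the paper for the actual SCFG definition."""
    return [n + scale * (s - n) for s, n in zip(eps_style, eps_neg_style)]

guided = scfg([1.0, 2.0], [0.5, 1.0], scale=2.0)  # → [1.5, 3.0]
```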
Some recommendations for generating negative style images:

- You can use ControlNet Canny for generation.
- To make the generated images more realistic, you can use weights from Civitai or Hugging Face that are better suited to realistic image generation. We use RealVisXL_V4.0.
To generate negative style images, we provide a reference implementation in `example_create_neg_style.py`.
### Integration with InstantStyle
Follow InstantStyle to download pre-trained checkpoints.
```shell
python infer_InstantStyle.py \
    --prompt "A red apple" \
    --style_path "assets/style1.jpg" \
    --adainIP \              # Enable Cross-Modal AdaIN
    --fuSAttn \              # Enable Teacher Model with Self-Attention Map
    --end_fusion 20 \        # Define when the Teacher Model stops participating
    --num_inference_steps 50
```
### Integration with StyleCrafter
Follow StyleCrafter to download pre-trained checkpoints.
We encourage you to integrate the Teacher Model with StyleCrafter. This combination, as shown in our experiments, not only helps maintain layout stability but also effectively reduces content leakage.
```shell
cd stylecrafter_sdxl

python stylecrafter_teacherModel.py \
    --config config/infer/style_crafter_sdxl.yaml \
    --style_path "../assets/style1.jpg" \
    --prompt "A red apple" \
    --scale 0.5 \
    --num_samples 2 \
    --end_fusion 10          # Define when the Teacher Model stops participating
```
### 3. Demo

To run a local demo of the project, run the following:

```shell
python gradio/app.py
```
## Related Links
- Style Transfer with Diffusion Models: A paper collection of recent style transfer methods with diffusion models.
- CSGO: Content-Style Composition in Text-to-Image Generation
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
- StyleCrafter-SDXL
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
## BibTeX

If you find our repo helpful, please consider leaving a star or citing our paper :)

```bibtex
@inproceedings{lei2025stylestudio,
  title={StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements},
  author={Lei, Mingkun and Song, Xue and Zhu, Beier and Wang, Hao and Zhang, Chi},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={23443--23452},
  year={2025}
}
```
## 📭 Contact
If you have any comments or questions, feel free to contact Mingkun Lei.