🚀 T2ITrainer
⚠️ Development Notice: Currently in active development - stability not guaranteed. Frequent updates - check changelogs regularly.
T2ITrainer is a diffusers-based training script. It aims to provide a simple yet complete implementation of LoRA training.
- ❗ Mandatory: Update diffusers to latest github version
pip install git+https://github.com/huggingface/diffusers.git -U
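After installing, you can confirm which diffusers build is present. This is an illustrative sketch, not part of T2ITrainer: the project only asks for the latest GitHub version and documents no minimum, so the snippet simply reports what is installed.

```python
# Sketch (not from the T2ITrainer codebase): report the installed diffusers
# version, or None if the package is missing.
from importlib.metadata import PackageNotFoundError, version

def diffusers_version():
    """Return the installed diffusers version string, or None if absent."""
    try:
        return version("diffusers")
    except PackageNotFoundError:
        return None

print(diffusers_version())
```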
📅 Major Updates
- 2026-01-04: Fix frontend template save and load issues.
🛡️ Prerequisites
💻 Supported Training Configurations
| Model Type         | VRAM Requirements               | Status       |
|--------------------|---------------------------------|--------------|
| LongCat Image/Edit | 24GB GPU                        | ✅ Supported |
| Qwen Edit          | 48GB GPU (bf16)                 | ✅ Supported |
| Qwen Image         | 24GB GPU (nf4), 48GB GPU (bf16) | ✅ Supported |
| Flux Fill, Kontext | 24GB GPU                        | ✅ Supported |
⚙️ Installation Guide
0. System Requirements
❗ Mandatory: Install Microsoft Visual C++ Redistributable if encountering DLL errors
0.1 Frontend Requirements
❗ Mandatory: Install Node.js (version 14 or higher) for the Node-Based Frontend UI
After installing Node.js, verify the installation:
node --version
npm --version
cd frontend
npm install
npm run build
cd ..
1. Automated Setup
Recommended Method
git clone https://github.com/lrzjason/T2ITrainer.git
cd T2ITrainer
setup.bat
- Handles: Virtual Environment • Dependency Installation • Model Downloads • Frontend Dependencies
The automated setup will:
- Create a Python virtual environment
- Install Python dependencies
- Install Node.js dependencies for the frontend
- Build the frontend UI
- Download required models
2. Manual Installation
Clone Repository 🌐
git clone https://github.com/lrzjason/T2ITrainer.git
cd T2ITrainer
Virtual Environment 🛠️
python -m venv venv
call venv\Scripts\activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Frontend Setup 🖥️
cd frontend
npm install
npm run build
cd ..
Backend Dependencies 📦
pip install -r requirements.txt
Model Downloads 📥
❗ Notice: Only download the models you want to train. Install huggingface-cli if you haven't (or update it if you have an old version). The download scripts can be found in download_xxx.txt
# NF4 Qwen Image
hf download "lrzjason/qwen_image_nf4" --local-dir qwen_models/qwen_image_nf4/
# NF4 Flux kontext
hf download "lrzjason/flux-kontext-nf4" --local-dir flux_models/kontext/
# NF4 Flux Fill for low gpu
hf download "lrzjason/flux-fill-nf4" --local-dir flux_models/fill/
# Kolors
hf download Kwai-Kolors/Kolors --local-dir kolors_models/
# SD3.5 Models
hf download "stabilityai/stable-diffusion-3.5-large" --local-dir "sd3.5L/"
# download original repo for lokr training
hf download "Qwen/Qwen-Image" --local-dir qwen_models/qwen_image/
hf download "Qwen/Qwen-Image-Edit" --local-dir qwen_models/qwen_image_edit/
🚀 Launch Options
Command Line Interface
| Model | Command | Special Notes |
|-----------------|--------------------------|-----------------------------------|
| Qwen Edit | python train_qwen_image_edit.py | 48GB VRAM Recommended for original model|
| Qwen Image | python train_qwen_image.py | 24GB VRAM Recommended for nf4, 48GB VRAM Recommended for original model|
| Flux Kontext | python ui_flux_fill.py | 24GB VRAM Recommended |
| Flux Fill | python ui_flux_fill.py | 24GB VRAM Recommended |
| LongCat Image | python train_longcat.py | 24GB VRAM Recommended |
| LongCat Image Edit | python train_longcat_edit.py | 24GB VRAM Recommended |
New Architecture Backend Services
The new architecture uses a distributed service approach:
| Service | Command | Port | Purpose |
|---------|---------|------|---------|
| API Service | python -m services.api_service.main | 8000 | Handles HTTP requests and job queuing |
| Worker Service | python -m services.worker_service.main | N/A | Executes training jobs |
| Streamer Service | python -m services.streamer_service.main | 8001 | Streams real-time output to WebSocket clients |
| Combined Services | python main_services.py | 8000, 8001 | Runs all services together |
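The API-service / worker-service split above follows a standard producer–consumer job-queue pattern: the API service enqueues training jobs, the worker dequeues and executes them. Below is a minimal single-process sketch of that pattern only; it is not the actual service code (the real services communicate over HTTP and WebSocket, with endpoints and a queue backend not documented here).

```python
# Illustrative producer/consumer sketch of the API-service / worker-service
# split. A real worker would launch a training run instead of appending a
# string; model/config names below are just examples.
import queue
import threading

jobs: "queue.Queue[dict]" = queue.Queue()
results: list = []

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel -> shut down the worker
            break
        results.append(f"trained {job['model']} with {job['config']}")
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
jobs.put({"model": "qwen_image", "config": "config_qwen_single.json"})  # API side: enqueue
jobs.put(None)                                                          # signal shutdown
t.join()
print(results)
```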
Node-Based Frontend UI (Recommended)
For the new Node-Based Frontend UI with visualization capabilities:
Development Mode (Fastest for development):
# Terminal 1: Start new architecture backend services
python main_services.py
# Terminal 2: Start frontend (auto-reloads on changes)
cd frontend
npm run dev
Access at: http://localhost:3000
Production Mode (Optimized for performance):
# Build and serve the frontend with backend
python main.py
Access at: http://localhost:7860
Preview Mode (Pre-built optimized version):
# Terminal 1: Start new architecture backend services
python main_services.py
# Terminal 2: Serve pre-built frontend (faster than main.py)
cd frontend
npm run preview
Access at: http://localhost:7860
Performance Note:
`npm run dev` provides the fastest experience with hot reloading, while `npm run preview` offers optimized performance similar to production. The `python main.py` approach uses `npm run preview` internally for better performance but still requires the backend to be running separately.
🔧 Parameter Configuration Guide
🌌 Qwen Model Management
| Config | Usage |
|--------|-------|
| config_qwen_single.json | Train Qwen Image with a single image; leave the suffix empty to use all images without a suffix. |
- Usage:
python train_qwen_image.py --config_path config_qwen_single.json
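The config files are plain JSON, so they can also be generated programmatically. The sketch below writes one; the keys shown are assumptions for illustration only, so consult the shipped config_qwen_single.json for the real schema.

```python
# Illustrative only -- key names are hypothetical; the authoritative schema
# is the repository's own config_qwen_single.json.
import json

config = {
    "train_data_dir": "datasets/my_subject",  # hypothetical key
    "suffix": "",        # empty suffix -> use all images without a suffix
    "rank": 32,
    "learning_rate": 1e-4,
}
with open("config_qwen_single_custom.json", "w") as f:
    json.dump(config, f, indent=2)
```

The generated file can then be passed to the trainer: `python train_qwen_image.py --config_path config_qwen_single_custom.json`.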
| Config | Usage |
|--------|-------|
| config_qwen_single.json | Train Qwen Image/Edit with a single image; leave the suffix empty to use all images without a suffix. |
| config_qwen_edit_pairs.json | Traditional Qwen Edit training using _T and _R suffixed images. |
| config_qwen_edit_pairs_multiple.json | Train with multiple reference images by setting suffixes like _T, _R, and _G. |
- Usage:
python train_qwen_image_edit.py --config_path config_qwen_single.json
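The suffix convention above (_T target, _R reference, _G additional reference) groups files by their shared base name. Here is a rough sketch of how such grouping could work; it is not taken from the T2ITrainer codebase, and the trainer's actual matching rules (and the meaning it assigns to each suffix) may differ.

```python
# Sketch of suffix-based pairing like the _T/_R/_G convention in
# config_qwen_edit_pairs_multiple.json. Not the trainer's real logic.
from collections import defaultdict
from pathlib import Path

SUFFIXES = ("_T", "_R", "_G")  # assumed: target, reference, extra reference

def group_pairs(filenames):
    groups = defaultdict(dict)
    for name in filenames:
        stem = Path(name).stem
        for suffix in SUFFIXES:
            if stem.endswith(suffix):
                groups[stem[: -len(suffix)]][suffix] = name
                break
    # Keep only groups that have at least a target (_T) image.
    return {base: parts for base, parts in groups.items() if "_T" in parts}

files = ["cat_T.png", "cat_R.png", "dog_T.png", "dog_R.png", "dog_G.png", "notes.txt"]
print(group_pairs(files))
```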
Qwen Model Installation
Inpainting Model Setup
hf download "lrzjason/qwen_image_nf4" --local-dir qwen_models/qwen_image_nf4/
For more details (example dataset):
- https://github.com/lrzjason/T2ITrainer/blob/main/doc/qwen.md
⚙️ Qwen Recommended Parameters
Qwen Image NF4
| Category           | Settings                        |
|--------------------|---------------------------------|
| Base Configuration | Rank 32, AdamW, Learn Rate 1e-4 |
| 24GB GPU           | 512 resolution, Batch Size 1    |
| Precision          | bf16                            |
Qwen Image Model
| Category           | Settings                           |
|--------------------|------------------------------------|
| Base Configuration | Rank 32~64, AdamW, Learn Rate 1e-4 |
| 48GB GPU           | 1024 resolution, Batch Size 1      |
| Precision          | bf16                               |
Qwen Edit Model
| Category           | Settings                           |
|--------------------|------------------------------------|
| Base Configuration | Rank 32~64, AdamW, Learn Rate 1e-4 |
| 48GB GPU           | 512 resolution, Batch Size 1       |
| Precision          | bf16                               |
💻 VRAM Usage (nf4, bs1, blocks_to_swap=20)
<div align="center"> <table> <tr> <td align="center"> <strong>VRAM Peak</strong><br> <img src="https://github.com/lrzjason/T2ITrainer/blob/main/doc/image/qwen_nf4_block_swap_20.png" width="400"> </td> </tr> </table> </div>

💻 VRAM Usage (nf4, bs1, blocks_to_swap=0)

<div align="center"> <table> <tr> <td align="center"> <strong>VRAM Peak</strong><br> <img src="https://github.com/lrzjason/T2ITrainer/blob/main/doc/image/qwen_nf4_block_swap_0.png" width="400"> </td> </tr> </table> </div>

💻 VRAM Usage (Original, bf16, bs1, blocks_to_swap=0)

<div align="center"> <table> <tr> <td align="center"> <strong>VRAM Peak</strong><br> <strong>Around 43GB</strong> </td> </tr> </table> </div>

🌌 Flux Model Management
| Config | Usage |
|--------|-------|
| config_new_single.json | Train Kontext with a single image; leave the suffix empty to use all images without a suffix. |
| config_new_pairs.json | Traditional Kontext training using _T and _R suffixed images. |
| config_new_pairs_multiple.json | Train with multiple reference images by setting suffixes like _T, _R, and _G. |
| config_new_mixed.json | Train Kontext using a mixed layout—e.g., combine traditional pair training with single-image training. |
- Usage:
python train_flux_lora_ui_kontext_new.py --config_path config_new_single.json
Kontext Model Installation
Inpainting Model Setup
hf download "lrzjason/flux-kontext-nf4" --local-dir flux_models/kontext/
For more details (example dataset):
- https://github.com/lrzjason/T2ITrainer/blob/main/doc/image/flux_kontext.md
- https://huggingface.co/datasets/lrzjason/object_removal_alpha_kontext
Fill Model Installation (skip if training Kontext)
Inpainting Model Setup
hf download "lrzjason/flux-fill-nf4" --local-dir flux_models/fill/