Iris - a C inference pipeline for image synthesis models

Iris is an inference pipeline that generates images from text prompts using open-weights diffusion transformer models. It is implemented entirely in C, with zero external dependencies beyond the C standard library. MPS and BLAS acceleration are optional but recommended. On macOS, a BLAS API (Accelerate) is part of the system, so nothing extra is required.

The name comes from the Greek goddess Iris, messenger of the gods and personification of the rainbow.

Supported model families:

  • FLUX.2 Klein (by Black Forest Labs):
    • 4B distilled (4 steps, auto guidance set to 1, very fast).
    • 4B base (50 steps for max quality, or fewer. Uses Classifier-Free Diffusion Guidance: much slower, but more generation variety).
    • 9B distilled (4 steps, larger model, higher quality. Non-commercial license).
    • 9B base (50 steps, CFG, highest quality. Non-commercial license).
  • Z-Image-Turbo (by Tongyi-MAI):
    • 6B (8 NFE / 9 scheduler steps, no CFG, fast).

Quick Start

# Build (choose your backend)
make mps       # Apple Silicon (fastest)
# or: make blas    # Intel Mac / Linux with OpenBLAS
# or: make generic # Pure C, no dependencies

# Download a model (~16GB) - pick one:
./download_model.sh 4b                   # using curl
# or: pip install huggingface_hub && python download_model.py 4b

# Generate an image
./iris -d flux-klein-4b -p "A woman wearing sunglasses" -o output.png

To try the base model instead of the distilled one (much slower, but higher quality), use the following instructions. If your computer is quite slow, use 10 steps instead of the default of 50; it will still work well enough for testing (about 10 seconds to generate a 256x256 image on a MacBook M3 Max).

./download_model.sh 4b-base
# or: pip install huggingface_hub && python download_model.py 4b-base
./iris -d flux-klein-4b-base -p "A woman wearing sunglasses" -o output.png

If you want to try the 9B model (higher quality, non-commercial license, ~30GB download):

# 9B is a gated model - you need a HuggingFace token
# 1. Accept the license at https://huggingface.co/black-forest-labs/FLUX.2-klein-9B
# 2. Get your token from https://huggingface.co/settings/tokens
./download_model.sh 9b --token YOUR_TOKEN
# or: python download_model.py 9b --token YOUR_TOKEN
# or: set HF_TOKEN env var
./iris -d flux-klein-9b -p "A woman wearing sunglasses" -o output.png

For Z-Image-Turbo:

# Download Z-Image-Turbo (~12GB)
pip install huggingface_hub && python download_model.py zimage-turbo
./iris -d zimage-turbo -p "a fish" -o fish.png

That's it. No Python runtime or CUDA toolkit required at inference time.

Example Output

Woman with sunglasses

Generated with: ./iris -d flux-klein-4b -p "A picture of a woman in 1960 America. Sunglasses. ASA 400 film. Black and White." -W 512 -H 512 -o woman.png

Image-to-Image Example

antirez to drawing

Generated with: ./iris -i antirez.png -o antirez_to_drawing.png -p "make it a drawing" -d flux-klein-4b

Features

  • Zero dependencies: Pure C implementation, works standalone. BLAS optional for ~30x speedup (Apple Accelerate on macOS, OpenBLAS on Linux)
  • Metal GPU acceleration: Automatic on Apple Silicon Macs. Performance matches PyTorch's optimized MPS pipeline
  • Runs where Python can't: Memory-mapped weights (default) enable inference on 8GB RAM systems where the Python ML stack cannot run at all
  • Text-to-image: Generate images from text prompts
  • Image-to-image: Transform existing images guided by prompts (Flux models)
  • Multi-reference: Combine multiple reference images (e.g., -i car.png -i beach.png for "car on beach")
  • Integrated text encoder: Qwen3 encoder built-in (4B or 8B depending on model), no external embedding computation needed
  • Memory efficient: Automatic encoder release after encoding (up to ~16GB freed)
  • Memory-mapped weights: Enabled by default. Reduces peak memory from ~16GB to ~4-5GB. Fastest mode on MPS; BLAS users with plenty of RAM may prefer --no-mmap for faster inference
  • Size-independent seeds: Same seed produces similar compositions at different resolutions. Explore at 256x256, then render at 512x512 with the same seed
  • Terminal image display: Watch the resulting image without leaving your terminal (Ghostty, Kitty, iTerm2, WezTerm, or Konsole)

Terminal Image Display

Kitty protocol example

Display generated images directly in your terminal with --show, or watch the denoising process step-by-step with --show-steps:

# Display final image in terminal (auto-detects Kitty/Ghostty/iTerm2/WezTerm/Konsole)
./iris -d flux-klein-4b -p "a cute robot" -o robot.png --show

# Display each denoising step (slower, but interesting to watch)
./iris -d flux-klein-4b -p "a cute robot" -o robot.png --show-steps

Requires a terminal supporting the Kitty graphics protocol (such as Kitty or Ghostty), the iTerm2 inline image protocol (iTerm2, WezTerm), or Konsole. Terminal type is auto-detected from environment variables.

Use --zoom N to adjust the display size (default: 2 for Retina displays, use 1 for non-HiDPI screens).

Usage

Text-to-Image

./iris -d flux-klein-4b -p "A fluffy orange cat sitting on a windowsill" -o cat.png

Image-to-Image

Transform an existing image based on a prompt:

./iris -d flux-klein-4b -p "oil painting style" -i photo.png -o painting.png

FLUX.2 uses in-context conditioning for image-to-image generation. Unlike traditional approaches that add noise to the input image, FLUX.2 passes the reference image as additional tokens that the model can attend to during generation. This means:

  • The model "sees" your input image and uses it as a reference
  • The prompt describes what you want the output to look like
  • Results tend to preserve the composition while applying the described transformation

Tips for good results:

  • Use descriptive prompts that describe the desired output, not instructions
  • Good: "oil painting of a woman with sunglasses, impressionist style"
  • Less good: "make it an oil painting" (instructional prompts may work less well)

Super Resolution: Since the reference image can be a different size than the output, you can use img2img for upscaling:

./iris -d flux-klein-4b -i small.png -W 1024 -H 1024 -o big.png -p "Create an exact copy of the input image."

The model will generate a higher-resolution version while preserving the composition and details of the input.

Multi-Reference Generation

Combine elements from multiple reference images:

./iris -d flux-klein-4b -i car.png -i beach.png -p "a sports car on the beach" -o result.png

Each reference image is encoded separately and passed to the transformer with different positional embeddings (T=10, T=20, T=30, ...). The model attends to all references during generation, allowing it to combine elements from each.

Example:

  • Reference 1: A red sports car
  • Reference 2: A tropical beach with palm trees
  • Prompt: "combine the two images"
  • Result: A red sports car on a tropical beach

You can specify up to 16 reference images with multiple -i flags. The prompt guides how the references are combined.

Interactive CLI Mode

Start without -p to enter interactive mode:

./iris -d flux-klein-4b

Generate images by typing prompts. Each image gets a $N reference ID:

iris> a red sports car
Done -> /tmp/iris-.../image-0001.png (ref $0)

iris> a tropical beach
Done -> /tmp/iris-.../image-0002.png (ref $1)

iris> $0 $1 combine them
Generating 256x256 (multi-ref, 2 images)...
Done -> /tmp/iris-.../image-0003.png (ref $2)

Prompt syntax:

  • prompt - text-to-image
  • 512x512 prompt - set size inline
  • $ prompt - img2img with last image
  • $N prompt - img2img with reference $N
  • $0 $3 prompt - multi-reference (combine images)

Commands: !help, !save, !load, !seed, !size, !steps, !guidance, !linear, !power, !explore, !show, !quit

Command Line Options

Required:

-d, --dir PATH        Path to model directory
-p, --prompt TEXT     Text prompt for generation
-o, --output PATH     Output image path (.png or .ppm)

Generation options:

-W, --width N         Output width in pixels (default: 256)
-H, --height N        Output height in pixels (default: 256)
-s, --steps N         Sampling steps (default: auto, 4 distilled / 50 base / 9 zimage)
-S, --seed N          Random seed for reproducibility
-g, --guidance N      CFG guidance scale (default: auto, 1.0 distilled / 4.0 base / 0.0 zimage)
    --linear          Use linear timestep schedule (see below)
    --power           Use power curve timestep schedule (see below)
    --power-alpha N   Set power schedule exponent (default: 2.0)
    --base            Force base model mode (undistilled, CFG enabled)

Image-to-image options:

-i, --input PATH      Reference image (can be specified multiple times)

Output options:

-q, --quiet           Silent mode, no output
-v, --verbose         Show detailed config and timing info
    --show            Display image in terminal (auto-detects Kitty/Ghostty/iTerm2/WezTerm/Konsole)
    --show-steps      Display each denoising step (slower)
    --zoom N          Terminal image zoom factor (default: 2 for Retina)

Other options:

-m, --mmap            Memory-mapped weights (default, fastest on MPS)
    --no-mmap         Disable mmap, load all weights upfront
    --n
