OpenTryOn: Open-source AI toolkit for fashion tech and virtual try-on
OpenTryOn is an open-source AI toolkit designed for fashion technology and virtual try-on applications. This project provides a comprehensive suite of tools for garment segmentation, human parsing, pose estimation, and virtual try-on using state-of-the-art diffusion models.
📚 Documentation: Comprehensive documentation is available at https://tryonlabs.github.io/opentryon/
🎯 Features
- Virtual Try-On:
- Amazon Nova Canvas virtual try-on using AWS Bedrock
- Kling AI virtual try-on using Kolors API
- Segmind Try-On Diffusion API integration
- Advanced diffusion-based virtual try-on capabilities using TryOnDiffusion
- Image Generation:
- Nano Banana (Gemini 2.5 Flash Image) for fast, efficient image generation
- Nano Banana Pro (Gemini 3 Pro Image Preview) for advanced 4K image generation with search grounding
- Nano Banana 2 (Gemini 3.1 Flash Image) for Pro capabilities at Flash speed (1K/2K/4K, subject consistency)
- FLUX.2 [PRO] high-quality image generation with text-to-image, image editing, and multi-image composition
- FLUX.2 [FLEX] flexible image generation with advanced controls (guidance, steps, prompt upsampling)
- Photon-Flash-1 (Luma AI): Fast, cost-efficient image generation, ideal for rapid iteration and scale
- Photon-1 (Luma AI): High-fidelity default model for professional-grade quality, creativity, and detailed prompt handling
- GPT-Image-1 & GPT-Image-1.5 (OpenAI): High-quality image generation with strong prompt understanding, consistent composition, and reliable visual accuracy. GPT-Image-1.5 offers enhanced quality and better consistency
- Video Generation:
- Luma AI Video Generation Model (Dream Machine): High-quality video generation with text-to-video and image-to-video modes.
- Google Veo 3 Video Generation Model: Generate high-quality, cinematic videos from text or images with realistic motion, temporal consistency, and fine-grained control over style and camera dynamics.
- Background Removal: Remove image backgrounds using BEN2 (Background Erase Network)
- Local Models (GPU Inference):
- FLUX.2-dev Turbo: 6x faster image generation with 8-step inference, supports text-to-image and image-to-image
- Automatic VRAM-based model selection (full, 8-bit, or 4-bit quantized)
- Datasets Module:
- Fashion-MNIST dataset loader with automatic download
- VITON-HD dataset loader with lazy loading via PyTorch DataLoader
- Class-based adapter pattern for easy dataset integration
- Support for both small and large datasets
- Garment Preprocessing:
- Garment segmentation using U2Net
- Garment extraction and preprocessing
- Human segmentation and parsing
- Pose Estimation: OpenPose-based pose keypoint extraction for garments and humans
- Outfit Generation: FLUX.1-dev LoRA-based outfit generation from text descriptions
- Model Swap: Swap garments on different models
- Interactive Demos: Gradio-based web interfaces for all features
- Preprocessing Pipeline: Complete preprocessing pipeline for training and inference
- AI Agents:
- Virtual Try-On Agent: LangChain-based agent for intelligent virtual try-on operations
- Model Swap Agent: AI agent for replacing models while preserving outfits using multiple AI models (Nano Banana, Nano Banana Pro, Nano Banana 2, FLUX 2 Pro, FLUX 2 Flex)
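The automatic VRAM-based model selection listed under Local Models can be sketched as a simple threshold check. The cutoff values below are illustrative assumptions, not OpenTryOn's actual thresholds; consult the documentation for the real behavior.

```python
def choose_model_variant(vram_gb: float) -> str:
    """Pick a model variant from available VRAM.

    Thresholds here are illustrative assumptions, not the
    values OpenTryOn actually uses.
    """
    if vram_gb >= 24:
        return "full"    # full-precision weights
    if vram_gb >= 12:
        return "8-bit"   # 8-bit quantized
    return "4-bit"       # 4-bit quantized fallback

# With PyTorch installed, available VRAM could be queried via
# torch.cuda.get_device_properties(0).total_memory / 1e9.
print(choose_model_variant(16))  # → 8-bit
```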
📋 Table of Contents
- Documentation
- Installation
- Quick Start
- Usage
- Datasets Module
- Virtual Try-On with Amazon Nova Canvas
- Virtual Try-On with Kling AI
- Virtual Try-On with Segmind
- Virtual Try-On Agent
- Model Swap Agent
- Image Generation with Nano Banana
- Image Generation with FLUX.2
- Image Generation with Luma AI
- Image Generation with OpenAI
- Video Generation with Luma AI
- Video Generation with Google Veo 3
- Remove Image Background with BEN2
- Local Models (GPU Inference)
- Preprocessing Functions
- Demos
- Project Structure
- TryOnDiffusion Roadmap
- Contributing
- License
📚 Documentation
Complete documentation for OpenTryOn is available at https://tryonlabs.github.io/opentryon/
The documentation includes:
- Getting Started guides
- API Reference for all modules
- Usage examples and tutorials
- Datasets documentation (Fashion-MNIST, VITON-HD)
- API adapters documentation (Segmind, Kling AI, Amazon Nova Canvas)
- Interactive demos and examples
- Advanced guides and troubleshooting
Visit the documentation site to explore all features, learn how to use OpenTryOn, and get started quickly!
🚀 Installation
Prerequisites
- Python 3.10
- CUDA-capable GPU (recommended)
- Conda or Miniconda
Step 1: Clone the Repository
```bash
git clone https://github.com/tryonlabs/opentryon.git
cd opentryon
```
Step 2: Create Conda Environment
```bash
conda env create -f environment.yml
conda activate opentryon
```

Alternatively, you can install dependencies using pip:

```bash
pip install -r requirements.txt
```
Step 3: Install Package
```bash
pip install -e .
```
Step 4: Environment Variables
Create a .env file in the project root with the following variables:

```env
U2NET_CLOTH_SEG_CHECKPOINT_PATH=cloth_segm.pth

# AWS Credentials for Amazon Nova Canvas (optional, can use AWS CLI default profile)
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AMAZON_NOVA_REGION=us-east-1  # Optional: us-east-1, ap-northeast-1, eu-west-1
AMAZON_NOVA_MODEL_ID=amazon.nova-canvas-v1:0  # Optional

# Kling AI Credentials (required for Kling AI virtual try-on)
KLING_AI_API_KEY=your_kling_api_key
KLING_AI_SECRET_KEY=your_kling_secret_key
KLING_AI_BASE_URL=https://api-singapore.klingai.com  # Optional, defaults to Singapore endpoint

# Segmind Credentials (required for Segmind virtual try-on)
SEGMIND_API_KEY=your_segmind_api_key

# Google Gemini Credentials (required for Nano Banana image generation and Google Veo 3 video generation)
GEMINI_API_KEY=your_gemini_api_key

# BFL API Credentials (required for FLUX.2 image generation)
BFL_API_KEY=your_bfl_api_key

# Luma AI Credentials (required for Luma AI image and video generation)
LUMA_AI_API_KEY=your_luma_ai_api_key

# OpenAI Credentials (required for OpenAI GPT-Image-1 image generation)
OPENAI_API_KEY=your_openai_api_key

# LLM Provider Credentials (required for Virtual Try-On Agent)
OPENAI_API_KEY=your_openai_api_key  # For OpenAI (default)
# OR
ANTHROPIC_API_KEY=your_anthropic_api_key  # For Anthropic Claude
# OR
GOOGLE_API_KEY=your_google_api_key  # For Google Gemini
```
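Since each provider fails at request time if its credentials are missing, it can help to verify the relevant variables up front. The variable names below come from the .env template above; the helper itself is a hypothetical convenience, not part of OpenTryOn's API.

```python
import os

# Required env vars per provider, per the .env template above.
# This mapping and helper are illustrative, not OpenTryOn API.
REQUIRED_FOR_PROVIDER = {
    "kling": ["KLING_AI_API_KEY", "KLING_AI_SECRET_KEY"],
    "segmind": ["SEGMIND_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "bfl": ["BFL_API_KEY"],
    "luma": ["LUMA_AI_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}

def missing_credentials(provider: str) -> list:
    """Return the required env vars that are unset or empty."""
    return [name for name in REQUIRED_FOR_PROVIDER.get(provider, [])
            if not os.getenv(name)]

missing = missing_credentials("segmind")
if missing:
    print(f"Set these variables before using Segmind: {missing}")
```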
Notes:
- Download the U2Net checkpoint file from the huggingface-cloth-segmentation repository
- For Amazon Nova Canvas, ensure you have AWS credentials configured (via the .env file or the AWS CLI) and Nova Canvas enabled in your AWS Bedrock console
- For Kling AI, obtain your API key and secret key from the Kling AI Developer Portal
- For Segmind, obtain your API key from the Segmind API Portal
- For Nano Banana and Google Veo 3, obtain your API key from Google AI Studio
- For FLUX.2 models, obtain your API key from BFL AI
- For Luma AI, obtain your API key from Luma Labs AI
- For OpenAI, obtain your API key from the OpenAI Platform
- For the Virtual Try-On Agent, obtain LLM API keys from:
  - OpenAI: OpenAI API Keys
  - Anthropic: Anthropic API Keys
  - Google: Google AI Studio
🎮 Quick Start
Basic Preprocessing
```python
from dotenv import load_dotenv
load_dotenv()

from tryon.preprocessing import segment_garment, extract_garment, segment_human

# Segment garment
segment_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/garment_segmented",
    cls="upper"  # Options: "upper", "lower", "all"
)

# Extract garment
extract_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/cloth",
    cls="upper",
    resize_to_width=400
)

# Segment human
segment_human(
    image_path="data/original_human/model.jpg",
    output_dir="data/human_segmented"
)
```
Command Line Interface
```bash
# Segment garment
python main.py --dataset data --action segment_garment --cls upper

# Extract garment
python main.py --dataset data --action extract_garment --cls upper

# Segment human
python main.py --dataset data --action segment_human
```
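The CLI commands above share a small set of flags. The argparse sketch below mirrors them to show how the flags relate; it is a hypothetical reconstruction, not the repository's actual main.py.

```python
import argparse

# Illustrative parser mirroring the CLI flags shown above;
# not the code actually used in main.py.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="OpenTryOn preprocessing CLI")
    parser.add_argument("--dataset", required=True,
                        help="root directory of the dataset")
    parser.add_argument("--action", required=True,
                        choices=["segment_garment", "extract_garment", "segment_human"])
    parser.add_argument("--cls", default="upper",
                        choices=["upper", "lower", "all"],
                        help="garment class (used by the garment actions)")
    return parser

args = build_parser().parse_args(
    ["--dataset", "data", "--action", "segment_garment", "--cls", "upper"]
)
print(args.action, args.cls)  # → segment_garment upper
```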
