OpenTryOn: Open-source AI toolkit for fashion tech and virtual try-on
OpenTryOn is an open-source AI toolkit designed for fashion technology and virtual try-on applications. This project provides a comprehensive suite of tools for garment segmentation, human parsing, pose estimation, and virtual try-on using state-of-the-art diffusion models.
📚 Documentation: Comprehensive documentation is available at https://tryonlabs.github.io/opentryon/
🎯 Features
- Virtual Try-On:
- Amazon Nova Canvas virtual try-on using AWS Bedrock
- Kling AI virtual try-on using Kolors API
- Segmind Try-On Diffusion API integration
- Advanced diffusion-based virtual try-on capabilities using TryOnDiffusion
- Image Generation:
- Nano Banana (Gemini 2.5 Flash Image) for fast, efficient image generation
- Nano Banana Pro (Gemini 3 Pro Image Preview) for advanced 4K image generation with search grounding
- Nano Banana 2 (Gemini 3.1 Flash Image) for Pro capabilities at Flash speed (1K/2K/4K, subject consistency)
- FLUX.2 [PRO] high-quality image generation with text-to-image, image editing, and multi-image composition
- FLUX.2 [FLEX] flexible image generation with advanced controls (guidance, steps, prompt upsampling)
- Photon-Flash-1 (Luma AI): Fast, cost-efficient image generation, ideal for rapid iteration and scale
- Photon-1 (Luma AI): High-fidelity default model for professional-grade quality, creativity, and detailed prompt handling
- GPT-Image-1 & GPT-Image-1.5 (OpenAI): High-quality image generation with strong prompt understanding, consistent composition, and reliable visual accuracy. GPT-Image-1.5 offers enhanced quality and better consistency
- Video Generation:
- Luma AI Video Generation Model (Dream Machine): High-quality video generation with text-to-video and image-to-video modes.
- Google Veo 3 Video Generation Model: Generate high-quality, cinematic videos from text or images with realistic motion, temporal consistency, and fine-grained control over style and camera dynamics.
- Background Removal: Remove image backgrounds using BEN2 (Background Erase Network)
- Local Models (GPU Inference):
- FLUX.2-dev Turbo: 6x faster image generation with 8-step inference, supports text-to-image and image-to-image
- Automatic VRAM-based model selection (full, 8-bit, or 4-bit quantized)
- Datasets Module:
- Fashion-MNIST dataset loader with automatic download
- VITON-HD dataset loader with lazy loading via PyTorch DataLoader
- Class-based adapter pattern for easy dataset integration
- Support for both small and large datasets
- Garment Preprocessing:
- Garment segmentation using U2Net
- Garment extraction and preprocessing
- Human segmentation and parsing
- Pose Estimation: OpenPose-based pose keypoint extraction for garments and humans
- Outfit Generation: FLUX.1-dev LoRA-based outfit generation from text descriptions
- Model Swap: Swap garments on different models
- Interactive Demos: Gradio-based web interfaces for all features
- Preprocessing Pipeline: Complete preprocessing pipeline for training and inference
- AI Agents:
- Virtual Try-On Agent: LangChain-based agent for intelligent virtual try-on operations
- Model Swap Agent: AI agent for replacing models while preserving outfits using multiple AI models (Nano Banana, Nano Banana Pro, Nano Banana 2, FLUX 2 Pro, FLUX 2 Flex)
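The automatic VRAM-based model selection listed under Local Models can be sketched as a simple threshold check. The cutoff values below are illustrative assumptions, not OpenTryOn's actual thresholds; consult the documentation for the real behavior.

```python
def choose_model_variant(vram_gb: float) -> str:
    """Pick a model variant from available VRAM.

    Thresholds here are illustrative assumptions, not the
    values OpenTryOn actually uses.
    """
    if vram_gb >= 24:
        return "full"    # full-precision weights
    if vram_gb >= 12:
        return "8-bit"   # 8-bit quantized
    return "4-bit"       # 4-bit quantized fallback

# With PyTorch installed, available VRAM could be queried via
# torch.cuda.get_device_properties(0).total_memory / 1e9.
print(choose_model_variant(16))  # → 8-bit
```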
📋 Table of Contents
- Documentation
- Installation
- Quick Start
- Usage
- Datasets Module
- Virtual Try-On with Amazon Nova Canvas
- Virtual Try-On with Kling AI
- Virtual Try-On with Segmind
- Virtual Try-On Agent
- Model Swap Agent
- Image Generation with Nano Banana
- Image Generation with FLUX.2
- Image Generation with Luma AI
- Image Generation with OpenAI
- Video Generation with Luma AI
- Video Generation with Google Veo 3
- Remove Image Background with BEN2
- Local Models (GPU Inference)
- Preprocessing Functions
- Demos
- Project Structure
- TryOnDiffusion Roadmap
- Contributing
- License
📚 Documentation
Complete documentation for OpenTryOn is available at https://tryonlabs.github.io/opentryon/
The documentation includes:
- Getting Started guides
- API Reference for all modules
- Usage examples and tutorials
- Datasets documentation (Fashion-MNIST, VITON-HD)
- API adapters documentation (Segmind, Kling AI, Amazon Nova Canvas)
- Interactive demos and examples
- Advanced guides and troubleshooting
Visit the documentation site to explore all features, learn how to use OpenTryOn, and get started quickly!
🚀 Installation
Prerequisites
- Python 3.10
- CUDA-capable GPU (recommended)
- Conda or Miniconda
Step 1: Clone the Repository
```bash
git clone https://github.com/tryonlabs/opentryon.git
cd opentryon
```
Step 2: Create Conda Environment
```bash
conda env create -f environment.yml
conda activate opentryon
```

Alternatively, you can install dependencies using pip:

```bash
pip install -r requirements.txt
```
Step 3: Install Package
```bash
pip install -e .
```
Step 4: Environment Variables
Create a .env file in the project root with the following variables:

```env
U2NET_CLOTH_SEG_CHECKPOINT_PATH=cloth_segm.pth

# AWS Credentials for Amazon Nova Canvas (optional, can use AWS CLI default profile)
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AMAZON_NOVA_REGION=us-east-1  # Optional: us-east-1, ap-northeast-1, eu-west-1
AMAZON_NOVA_MODEL_ID=amazon.nova-canvas-v1:0  # Optional

# Kling AI Credentials (required for Kling AI virtual try-on)
KLING_AI_API_KEY=your_kling_api_key
KLING_AI_SECRET_KEY=your_kling_secret_key
KLING_AI_BASE_URL=https://api-singapore.klingai.com  # Optional, defaults to Singapore endpoint

# Segmind Credentials (required for Segmind virtual try-on)
SEGMIND_API_KEY=your_segmind_api_key

# Google Gemini Credentials (required for Nano Banana image generation and Google Veo 3 video generation)
GEMINI_API_KEY=your_gemini_api_key

# BFL API Credentials (required for FLUX.2 image generation)
BFL_API_KEY=your_bfl_api_key

# Luma AI Credentials (required for Luma AI image and video generation)
LUMA_AI_API_KEY=your_luma_ai_api_key

# OpenAI Credentials (required for OpenAI GPT-Image-1 image generation)
OPENAI_API_KEY=your_openai_api_key

# LLM Provider Credentials (required for Virtual Try-On Agent)
OPENAI_API_KEY=your_openai_api_key  # For OpenAI (default)
# OR
ANTHROPIC_API_KEY=your_anthropic_api_key  # For Anthropic Claude
# OR
GOOGLE_API_KEY=your_google_api_key  # For Google Gemini
```
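Since each provider fails at request time if its credentials are missing, it can help to verify the relevant variables up front. The variable names below come from the .env template above; the helper itself is a hypothetical convenience, not part of OpenTryOn's API.

```python
import os

# Required env vars per provider, per the .env template above.
# This mapping and helper are illustrative, not OpenTryOn API.
REQUIRED_FOR_PROVIDER = {
    "kling": ["KLING_AI_API_KEY", "KLING_AI_SECRET_KEY"],
    "segmind": ["SEGMIND_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "bfl": ["BFL_API_KEY"],
    "luma": ["LUMA_AI_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}

def missing_credentials(provider: str) -> list:
    """Return the required env vars that are unset or empty."""
    return [name for name in REQUIRED_FOR_PROVIDER.get(provider, [])
            if not os.getenv(name)]

missing = missing_credentials("segmind")
if missing:
    print(f"Set these variables before using Segmind: {missing}")
```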
Notes:
- Download the U2Net checkpoint file from the huggingface-cloth-segmentation repository
- For Amazon Nova Canvas, ensure you have AWS credentials configured (via the .env file or the AWS CLI) and Nova Canvas enabled in your AWS Bedrock console
- For Kling AI, obtain your API key and secret key from the Kling AI Developer Portal
- For Segmind, obtain your API key from the Segmind API Portal
- For Nano Banana and Google Veo 3, obtain your API key from Google AI Studio
- For FLUX.2 models, obtain your API key from BFL AI
- For Luma AI, obtain your API key from Luma Labs AI
- For OpenAI, obtain your API key from the OpenAI Platform
- For the Virtual Try-On Agent, obtain LLM API keys from:
  - OpenAI: OpenAI API Keys
  - Anthropic: Anthropic API Keys
  - Google: Google AI Studio
🎮 Quick Start
Basic Preprocessing
```python
from dotenv import load_dotenv
load_dotenv()

from tryon.preprocessing import segment_garment, extract_garment, segment_human

# Segment garment
segment_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/garment_segmented",
    cls="upper"  # Options: "upper", "lower", "all"
)

# Extract garment
extract_garment(
    inputs_dir="data/original_cloth",
    outputs_dir="data/cloth",
    cls="upper",
    resize_to_width=400
)

# Segment human
segment_human(
    image_path="data/original_human/model.jpg",
    output_dir="data/human_segmented"
)
```
Command Line Interface
```bash
# Segment garment
python main.py --dataset data --action segment_garment --cls upper

# Extract garment
python main.py --dataset data --action extract_garment --cls upper

# Segment human
python main.py --dataset data --action segment_human
```
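The CLI commands above share a small set of flags. The argparse sketch below mirrors them to show how the flags relate; it is a hypothetical reconstruction, not the repository's actual main.py.

```python
import argparse

# Illustrative parser mirroring the CLI flags shown above;
# not the code actually used in main.py.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="OpenTryOn preprocessing CLI")
    parser.add_argument("--dataset", required=True,
                        help="root directory of the dataset")
    parser.add_argument("--action", required=True,
                        choices=["segment_garment", "extract_garment", "segment_human"])
    parser.add_argument("--cls", default="upper",
                        choices=["upper", "lower", "all"],
                        help="garment class (used by the garment actions)")
    return parser

args = build_parser().parse_args(
    ["--dataset", "data", "--action", "segment_garment", "--cls", "upper"]
)
print(args.action, args.cls)  # → segment_garment upper
```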
