[CVPR 2026] 🔥🔥 Official Repo of UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward


<h3 align="center"> <img src="assets/logo.png" alt="Logo" style="vertical-align: middle; width: 100px; height: auto;"> </br> UMO: Scaling Multi-Identity Consistency for Image Customization </br> via Matching Reward </h3> <p align="center"> <a href="https://github.com/bytedance/UMO"><img alt="Build" src="https://img.shields.io/github/stars/bytedance/UMO"></a> <a href="https://bytedance.github.io/UMO/"><img alt="Build" src="https://img.shields.io/badge/Project%20Page-UMO-blue"></a> <a href="https://huggingface.co/bytedance-research/UMO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=green"></a> <a href="https://arxiv.org/abs/2509.06818"><img alt="Build" src="https://img.shields.io/badge/arXiv%20paper-UMO-b31b1b.svg"></a> <a href="https://huggingface.co/spaces/bytedance-research/UMO_UNO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Demo&message=UMO-UNO&color=orange"></a> <a href="https://huggingface.co/spaces/bytedance-research/UMO_OmniGen2"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Demo&message=UMO-OmniGen2&color=orange"></a> </p>
<p align="center"> <span style="color:#137cf3; font-family: Gill Sans">Yufeng Cheng,</span><sup></sup></a> <span style="color:#137cf3; font-family: Gill Sans">Wenxu Wu,</span><sup></sup></a> <span style="color:#137cf3; font-family: Gill Sans">Shaojin Wu,</span><sup></sup></a> <span style="color:#137cf3; font-family: Gill Sans">Mengqi Huang,</span><sup></sup></a> <span style="color:#137cf3; font-family: Gill Sans">Fei Ding,</span><sup></sup></a> <span style="color:#137cf3; font-family: Gill Sans">Qian He</span></a> <br> <span style="font-size: 16px">UXO Team</span><br> <span style="font-size: 16px">Intelligent Creation Lab, Bytedance</span></p>

🔥 News

<p align="center"> <img src="comfyui/UNO/UMO_UNO_1.png" alt="image 1" height="400" width="auto"/> <img src="comfyui/OmniGen2/UMO_OmniGen2_1.png" alt="image 2" height="400" width="auto"/> </p>
  • 2025.09.09 🔥 The demos of UMO are released: UMO-UNO & UMO-OmniGen2
  • 2025.09.09 🔥 The paper of UMO is released.
  • 2025.09.08 🔥 The models of UMO based on UNO and OmniGen2 are released. The released versions of UMO are more stable than those reported in our paper.
  • 2025.09.08 🔥 The project page of UMO is created.
  • 2025.09.08 🔥 The inference and evaluation code of UMO is released.

📖 Introduction

Recent advances in image customization show broad application prospects thanks to stronger customization capabilities. However, because humans are especially sensitive to faces, a significant challenge remains: preserving consistent identity while avoiding identity confusion across multi-reference images, which limits the identity scalability of customization models. To address this, we present UMO, a Unified Multi-identity Optimization framework, designed to maintain high-fidelity identity preservation and alleviate identity confusion at scale. With a "multi-to-multi matching" paradigm, UMO reformulates multi-identity generation as a global assignment optimization problem, and generally unleashes multi-identity consistency for existing image customization methods through reinforcement learning on diffusion models. To facilitate training, we develop a scalable customization dataset with multi-reference images, consisting of both synthesized and real parts. Additionally, we propose a new metric to measure identity confusion. Extensive experiments demonstrate that UMO not only significantly improves identity consistency but also reduces identity confusion across several image customization methods, setting a new state of the art among open-source methods along the dimension of identity preservation.

<p align="center"> <img src="assets/showcase.jpg" width="1024"/> </p>
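As a rough illustration of the "multi-to-multi matching" paradigm above (a sketch, not the paper's implementation): multi-identity generation can be scored by solving a small global assignment problem over face-identity similarities. The similarity values below are invented for the example:

```python
# Conceptual sketch: UMO casts multi-identity consistency as a global
# assignment problem. Given pairwise similarities between faces detected
# in a generated image and the reference identities, the best one-to-one
# assignment maximizes total similarity, and the matched similarity can
# serve as a matching-style reward. Brute force suffices at this scale.
from itertools import permutations

# Hypothetical cosine similarities: rows = generated faces, cols = references.
sim = [
    [0.9, 0.4],
    [0.5, 0.8],
]

def best_assignment(sim):
    """Try every face->identity permutation; return the best one and its reward."""
    n = len(sim)
    best = max(permutations(range(n)), key=lambda p: sum(sim[i][p[i]] for i in range(n)))
    reward = sum(sim[i][best[i]] for i in range(n)) / n
    return list(best), reward

assignment, reward = best_assignment(sim)
print(assignment, round(reward, 2))  # a confusion-free matching scores higher
```

A swapped (confused) assignment here would total only 0.9 instead of 1.7, which is exactly the signal a matching reward penalizes during reinforcement learning.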

โšก๏ธ Quick Start

🔧 Requirements and Installation

# 1. Clone the repo with submodules: UNO & OmniGen2
git clone --recurse-submodules git@github.com:bytedance/UMO.git
cd UMO

UMO requirements based on UNO

# 2.1 (Optional, but recommended) Create a clean virtual Python 3.11 environment
python3 -m venv venv/UMO_UNO
source venv/UMO_UNO/bin/activate

# 3.1 Install the UNO submodule's requirements, following:
# https://github.com/bytedance/UNO?tab=readme-ov-file#-requirements-and-installation

# 4.1 Install UMO requirements
pip install -r requirements.txt

UMO requirements based on OmniGen2

# 2.2 (Optional, but recommended) Create a clean virtual Python 3.11 environment
python3 -m venv venv/UMO_OmniGen2
source venv/UMO_OmniGen2/bin/activate

# 3.2 Install the OmniGen2 submodule's requirements, following:
# https://github.com/VectorSpaceLab/OmniGen2?tab=readme-ov-file#%EF%B8%8F-environment-setup

# 4.2 Install UMO requirements
pip install -r requirements.txt

UMO checkpoints download

# pip install huggingface_hub hf-transfer
export HF_HUB_ENABLE_HF_TRANSFER=1 # use hf_transfer to speed up downloads
# export HF_ENDPOINT=https://hf-mirror.com # use mirror to speedup if necessary

repo_name="bytedance-research/UMO"
local_dir="models/"$repo_name

huggingface-cli download --resume-download $repo_name --local-dir $local_dir
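If you prefer Python over the CLI, the same download can be scripted with `huggingface_hub`'s `snapshot_download`, which resumes partial downloads automatically; a minimal sketch:

```python
# Download the UMO checkpoints via the huggingface_hub Python API.
# snapshot_download resumes interrupted downloads by default, mirroring
# the --resume-download flag in the shell snippet above.
from huggingface_hub import snapshot_download

REPO_NAME = "bytedance-research/UMO"

def download_umo(repo_name: str = REPO_NAME) -> str:
    """Fetch the checkpoint repo into models/<repo_name> and return the path."""
    local_dir = f"models/{repo_name}"
    snapshot_download(repo_id=repo_name, local_dir=local_dir)
    return local_dir

if __name__ == "__main__":
    print(download_umo())
```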

🌟 Gradio Demo

# UMO (based on UNO)
python3 demo/UNO/app.py --lora_path models/bytedance-research/UMO/UMO_UNO.safetensors

# UMO (based on OmniGen2)
python3 demo/OmniGen2/app.py --lora_path models/bytedance-research/UMO/UMO_OmniGen2.safetensors

โš™๏ธ ComfyUI Workflow

UMO (based on UNO)

Since ComfyUI supports USO, we build the UMO (based on UNO) workflow by removing the SigLIP style-feature nodes and extending it to multiple reference images.

We provide several example images. You can download an image and drag it into ComfyUI to load the workflow.

Example with Single Identity

Reference Image

<p align="center"> <img src="comfyui/UNO/UMO_UNO_1.png" width="auto"> </p>

Example with Multi-Identity

Reference Image 1, Reference Image 2

<p align="center"> <img src="comfyui/UNO/UMO_UNO_2.png" width="auto"> </p>

UMO (based on OmniGen2)

Since ComfyUI supports OmniGen2, we simply add a node that loads the UMO LoRA.

First, convert the UMO LoRA checkpoint to ComfyUI format:

python3 comfyui/OmniGen2/convert_ckpt.py
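For intuition only: ComfyUI LoRA loaders generally expect a different key prefix on the state dict, so conversion scripts of this kind mostly rename keys. The `diffusion_model.` prefix and the helper below are hypothetical illustrations, not the actual mapping in `comfyui/OmniGen2/convert_ckpt.py`:

```python
# Hypothetical sketch of a LoRA key conversion: re-prefix every state-dict
# key so a ComfyUI loader can find it. The exact prefix ComfyUI expects
# for OmniGen2 is an assumption here; see convert_ckpt.py for the real logic.
def to_comfyui_keys(state_dict: dict, prefix: str = "diffusion_model.") -> dict:
    """Return a copy of the LoRA state dict with renamed (re-prefixed) keys."""
    return {f"{prefix}{key}": tensor for key, tensor in state_dict.items()}

# Dummy stand-ins for tensors, just to show the key renaming.
lora = {"layers.0.lora_A.weight": "A", "layers.0.lora_B.weight": "B"}
print(sorted(to_comfyui_keys(lora)))
```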

Then, you can download the example images and drag them into ComfyUI to load the workflow.

Example with Single Identity

Reference Image

<p align="center"> <img src="comfyui/OmniGen2/UMO_OmniGen2_1.png" width="auto"> </p>

Example with Multi-Identity

Reference Image 1, Reference Image 2

<p align="center"> <img src="comfyui/OmniGen2/UMO_OmniGen2_2.png" width="auto"> </p>

โœ๏ธ Inference

UMO (based on UNO) inference on XVerseBench

# single subject
accelerate launch eval/UNO/inference_xversebench.py \
    --eval_json_path projects/XVerse/eval/tools/XVerseBench_single.json \
    --num_images_per_prompt 4 \
    --width 768 \
    --height 768 \
    --save_path output/XVerseBench/single/UMO_UNO \
    --lora_path models/bytedance-research/UMO/UMO_UNO.safetensors

# multi subject
accelerate launch eval/UNO/inference_xversebench.py \
    --eval_json_path projects/XVerse/eval/tools/XVerseBench_multi.json \
    --num_images_per_prompt 4 \
    --width 768 \
    --height 768 \
    --save_path output/XVerseBench/multi/UMO_UNO \
    --lora_path models/bytedance-research/UMO/UMO_UNO.safetensors

UMO (based on UNO) inference on OmniContext

accelerate launch eval/UNO/inference_omnicontext.py \
    --eval_json_path OmniGen2/OmniContext \
    --width 768 \
    --height 768 \
    --save_path output/OmniContext/UMO_UNO \
    --lora_path models/bytedance-research/UMO/UMO_UNO.safetensors

UMO (based on OmniGen2) inference on XVerseBench

# single subject
accelerate launch -m eval.OmniGen2.inference_xversebench \
    --model_path OmniGen2/OmniGen2 \
    --model_name UMO_OmniGen2 \
    --test_data projects/XVerse/eval/tools/XVerseBench_single.json \
    --result_dir output/XVerseBench/single \
    --num_images_per_prompt 4 \
    --disable_align_res \
    --lora_path models/bytedance-research/UMO/UMO_OmniGen2.safetensors

# multi subject
accelerate launch -m eval.OmniGen2.inference_xversebench \
    --model_path OmniGen2/OmniGen2 \
    --model_name UMO_OmniGen2 \
    --test_data projects/XVerse/eval/tools/XVerseBench_multi.json \
    --result_dir output/XVerseBench/multi \
    --num_images_per_prompt 4 \
    --disable_align_res \
    --lora_path models/bytedance-research/UMO/UMO_OmniGen2.safetensors

UMO (based on OmniGen2) inference on OmniContext

accelerate launch -m eval.OmniGen2.inference_omnicontext \
    --model_path OmniGen2/OmniGen2 \
    --model_name UMO_OmniGen2 \
    --test_data OmniGen2/OmniContext \
    --result_dir output/OmniContext \
    --num_images_per_prompt 1 \
    --disable_align_res \
    --lora_path models/bytedance-research/UMO/UMO_OmniGen2.safetensors

๐Ÿ” Evaluation

Evaluation on XVerseBench

To run the evaluation on XVerseBench, please get
