X2Edit

AAAI2026 X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning

<div align="center"> <h1>AAAI2026 X2Edit</h1> <a href='https://arxiv.org/abs/2508.07607'><img src='https://img.shields.io/badge/arXiv-2508.07607-b31b1b.svg'></a> &nbsp; <a href='https://huggingface.co/datasets/OPPOer/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit Dataset-ffd21f.svg'></a> <a href='https://huggingface.co/OPPOer/X2Edit'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit-ffd21f.svg'></a> <a href='https://www.modelscope.cn/datasets/AIGCer-OPPO/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤖%20ModelScope-X2Edit Dataset-purple.svg'></a> </div>

X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning <br> Jian Ma<sup>1</sup>, Xujie Zhu<sup>2</sup>, Zihao Pan<sup>2</sup>, Qirong Peng<sup>1</sup>, Xu Guo<sup>3</sup>, Chen Chen<sup>1</sup>, Haonan Lu<sup>1</sup>

<br>

<sup>1</sup>OPPO AI Center, <sup>2</sup>Sun Yat-sen University, <sup>3</sup>Tsinghua University <br>

X2Edit image generation results

<div align="center"> <img src="assets/X2Edit images.jpg"> </div>

News

  • 2025/09/16: We released a dataset built with Qwen-Image and Qwen-Image-Edit. This sub-dataset focuses on subject-driven generation with facial consistency, a key requirement for tasks that need a stable subject identity across generated content. See Asian-portrait and NonAsian-portrait.
  • 2025/08/25: Added Qwen-Image support for training and inference. Checkpoint available.

X2Edit image generation results with Qwen-Image

<div align="center"> <img src="assets/qwen-image1.png"> </div> <div align="center"> <img src="assets/qwen-image0.png"> </div>

Environment

Prepare the environment and install the required libraries:

$ cd X2Edit
$ conda create --name X2Edit python==3.11
$ conda activate X2Edit
$ pip install -r requirements.txt

Clone LaMa into data_pipeline and rename the directory to lama. Clone SAM and GroundingDINO into SAM, then rename them to segment_anything and GroundingDINO, respectively.
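Before running the data pipeline, it can help to confirm that the cloned repositories ended up where the scripts expect them. The following is a minimal sketch: the three directory paths are taken from the instructions above, while the `missing_dependencies` helper and the assumption that the paths are relative to the repository root are illustrative and not part of X2Edit.

```python
from pathlib import Path

# Directories the setup step above is expected to produce, relative to the
# repository root (assumption: the clones live directly under these folders).
EXPECTED_DIRS = (
    "data_pipeline/lama",
    "SAM/segment_anything",
    "SAM/GroundingDINO",
)


def missing_dependencies(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dependencies()` from the X2Edit directory returns an empty list once all three repositories have been cloned and renamed.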

Data Construction

<div align="center"> <img src="./assets/dataset_detail.jpg"> </div>

X2Edit provides executable scripts for each data construction workflow shown in the figure. The dataset is organized in the WebDataset format; replace the dataset path in the scripts with your own. The Qwen model used in these workflows can be chosen from Qwen2.5-VL-72B, Qwen3-8B, and Qwen2.5-VL-7B. We also use aesthetic scoring models for screening: download SigLIP and aesthetic-predictor-v2-5, then update the paths in siglip_v2_5.py.
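For readers unfamiliar with the WebDataset format mentioned above: a shard is a plain tar archive in which files sharing the same basename form one sample, and the file extension names the field. The sketch below writes and reads such a shard using only the standard library. The field names (`src.jpg`, `edit.jpg`, `json`) and the metadata keys are hypothetical and may not match the layout the X2Edit scripts actually use.

```python
import io
import json
import tarfile


def write_shard(shard_path, samples):
    """Write samples to a WebDataset-style tar shard.

    Each sample is (key, source_bytes, edited_bytes, metadata_dict). Files that
    share the same key are grouped into one sample by WebDataset readers.
    The field names below are hypothetical, chosen only for illustration.
    """
    with tarfile.open(shard_path, "w") as tar:
        for key, src, edit, meta in samples:
            fields = {
                "src.jpg": src,
                "edit.jpg": edit,
                "json": json.dumps(meta).encode("utf-8"),
            }
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


def read_shard(shard_path):
    """Group tar members back into samples keyed by their shared basename."""
    samples = {}
    with tarfile.open(shard_path, "r") as tar:
        for member in tar.getmembers():
            # Everything before the first dot is the sample key.
            key, ext = member.name.split(".", 1)
            samples.setdefault(key, {})[ext] = tar.extractfile(member).read()
    return samples
```

In practice, shards like this are usually consumed through the `webdataset` library rather than read by hand; the manual reader here only shows how the grouping by key works.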

Inference

We provide inference scripts for editing images at resolutions of 1024 and 512. The base model of X2Edit can be chosen from FLUX.1-Krea, FLUX.1-dev, FLUX.1-schnell, PixelWave, and shuttle-3-diffusion, and an additional LoRA can be integrated with the MoE-LoRA, including Turbo-Alpha, AntiBlur, Midjourney-Mix2, Super-Realism, and Chatgpt-Ghibli. Choose the models you like and download them. For the MoE-LoRA, we will open-source a unified checkpoint that can be used for both 512 and 1024 resolutions.

Before executing the script, download Qwen3-8B (used to select the task type for the input instruction), a base model (FLUX.1-Krea, FLUX.1-dev, FLUX.1-schnell, or shuttle-3-diffusion), the MLLM, and Alignet. All scripts follow the same command pattern: replace the script filename and keep the parameter configuration consistent.

$ python infer.py --device cuda --pixel 1024 --num_experts 12 --base_path BASE_PATH --qwen_path QWEN_PATH --lora_path LORA_PATH --extra_lora_path EXTRA_LORA_PATH
$ python infer_qwen.py --device cuda --pixel 1024 --num_experts 12 --base_path BASE_PATH --qwen_path QWEN_PATH --lora_path LORA_PATH --extra_lora_path EXTRA_LORA_PATH  ## for Qwen-Image backbone
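Because both scripts take the same flags, the commands above can be assembled programmatically when switching between backbones or resolutions. The sketch below only builds the argument list. The flag names and defaults are copied from the commands above, while the `build_infer_command` helper and the placeholder paths are hypothetical.

```python
import sys


def build_infer_command(script, base_path, qwen_path, lora_path,
                        extra_lora_path, pixel=1024, num_experts=12,
                        device="cuda"):
    """Return the argument list for one of the inference scripts.

    Only flags that appear in the commands above are used; the script itself
    is not inspected, so any other options it accepts are not covered here.
    """
    return [
        sys.executable, script,
        "--device", device,
        "--pixel", str(pixel),
        "--num_experts", str(num_experts),
        "--base_path", base_path,
        "--qwen_path", qwen_path,
        "--lora_path", lora_path,
        "--extra_lora_path", extra_lora_path,
    ]


# Placeholder paths, to be replaced with real local directories.
cmd = build_infer_command(
    "infer_qwen.py",
    base_path="BASE_PATH",
    qwen_path="QWEN_PATH",
    lora_path="LORA_PATH",
    extra_lora_path="EXTRA_LORA_PATH",
    pixel=512,
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the X2Edit directory. Passing `pixel=512` assumes the `--pixel` flag selects the 512 resolution mentioned above.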

device: The device used for inference. default: cuda<br> pixel: The resol
