
OmniAID

Official PyTorch Code for "OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild".


OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild

<div align="center">

Paper Hugging Face Models Hugging Face Spaces Hugging Face Dataset License

</div>

📖 Introduction

OmniAID is a universal AI-generated image detector designed for real-world, in-the-wild scenarios.

Most existing detectors collapse under distribution shift because they entangle high-level semantic flaws (e.g., distorted humans, inconsistent object logic) and low-level generator artifacts (e.g., diffusion-specific fingerprints), learning a single fused representation that generalizes poorly.

To address these fundamental limitations, OmniAID explicitly decouples semantic and artifact cues through a hybrid Mixture-of-Experts (MoE) architecture—paired with a new modern dataset, Mirage, which reflects contemporary generative models and realistic threats.

Method Framework

🌟 Key Features

🧠 Hybrid MoE Architecture

  • Routable Semantic Experts
    Specialized experts dedicated to specific semantic domains (Human, Animal, Object, Scene, Anime).

  • Fixed Universal Artifact Expert
    Always active, focusing solely on content-agnostic generative artifacts.

⚙️ Two-Stage Training Strategy

  1. Expert Specialization
    Each semantic expert is trained independently with domain-specific hard sampling.

  2. Router Training
    A lightweight router learns to dispatch inputs to the most relevant semantic experts, while the artifact expert is always included.
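The routing described above can be sketched as follows. This is a minimal NumPy illustration, not the actual OmniAID code: it assumes each expert emits a scalar real/fake logit, the router produces softmax weights over the five semantic experts, and the fixed artifact expert contributes with a constant weight of 1 (all values below are made up).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical per-expert logits for one image (illustrative values only):
# five routable semantic experts + one fixed universal artifact expert.
semantic_logits = np.array([2.1, -0.3, 0.5, -1.0, 0.2])  # Human, Animal, Object, Scene, Anime
artifact_logit = 1.4

# Stage 2: the router scores the input and weights the semantic experts...
router_scores = np.array([3.0, -2.0, 0.0, -3.0, -1.0])
weights = softmax(router_scores)

# ...while the artifact expert is always included (fixed unit weight here, an assumption).
fused_logit = weights @ semantic_logits + artifact_logit
probability_fake = 1.0 / (1.0 + np.exp(-fused_logit))
print(round(float(probability_fake), 3))
```

Here the router concentrates its mass on the "Human" expert, so the fused score is dominated by that expert plus the always-on artifact cue.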

📢 News

  • 🗓️ 2026/02/03: OmniAID now supports LoRA-style fine-tuning and the DINOv3 backbone. (Please update your environment according to requirements.txt to use the latest features.)
  • 🗓️ 2025/12/10: We have released a clean inference script for deploying OmniAID as a reward model.
  • 🗓️ 2025/11/30: We have released training and testing code, along with model weights.
  • 🗓️ 2025/11/11: OmniAID paper is now available on arXiv.

🚀 Online Demo

Experience OmniAID instantly in your browser. This demo is powered by the OmniAID checkpoint trained on Mirage-Train.

Try OmniAID on Hugging Face Spaces

Supported Modes:

  • 🤖 Auto (Router) Mode (Default): The lightweight router dynamically analyzes the input image and assigns optimal weights to specific semantic experts.
  • 🎛️ Manual Mode (Analysis): Lets you manually adjust expert weights to see how different semantic domains, or the universal artifact expert, contribute to the final detection score.

🦾 Act as a Reward Model

OmniAID has been successfully integrated as a reward model to guide the image generator RealGEN. By providing fine-grained feedback on both semantic plausibility and low-level generative artifacts, OmniAID enables RealGEN to produce images with significantly enhanced realism and fewer detectable AI fingerprints.

RealGEN

To facilitate its use in other generation pipelines, we provide a clean and self-contained test script for deploying OmniAID as a reward model. You can find the reference implementation in reward/clean_test.py. This script demonstrates how to load the pre-trained checkpoint and compute a detection score for a list of input images, which can be directly used as a reward signal for reinforcement learning-based generation.
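As a rough sketch of how such a reward signal could be consumed (the function names below are hypothetical stand-ins; the actual interface lives in reward/clean_test.py), one might wrap the detector score so that more realistic images receive higher reward:

```python
from typing import Callable, List

def make_realism_reward(detector: Callable[[str], float]) -> Callable[[List[str]], List[float]]:
    """Wrap a detector returning P(fake) per image path into a reward signal:
    reward = 1 - P(fake), so more realistic images score higher.
    `detector` is a hypothetical stand-in for the OmniAID inference call."""
    def reward_fn(image_paths: List[str]) -> List[float]:
        return [1.0 - detector(p) for p in image_paths]
    return reward_fn

# Toy detector for illustration only (not OmniAID):
toy_detector = lambda path: 0.8 if "fake" in path else 0.1
reward_fn = make_realism_reward(toy_detector)
rewards = reward_fn(["real_photo.png", "fake_render.png"])
print(rewards)
```

In an RL loop, `rewards` would then be fed back to the generator's policy update in place of (or alongside) a human-preference reward.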

📚 Dataset Download

🔸 GenImage-SD v1.4 (Classified)

A reorganized subset of GenImage-SD v1.4, classified into semantic categories (Human_Animal, Object_Scene) to train the Semantic Experts.

Download via Google Drive

🔸 GenImage-SD v1.4 Reconstruction

The real images from the GenImage-SD v1.4 subset, reconstructed using the SD1.4 VAE. We apply the reconstruction methodology from AlignedForensics to this specific dataset to serve as "purified" reference data for artifact learning.

Download via Google Drive

🔸 Mirage-Test

A challenging evaluation set containing images from held-out modern generators, optimized for high realism to rigorously test model generalization.

Download via Hugging Face
Download via Google Drive

📦 Model Zoo

We provide checkpoints trained on different datasets and different backbones. All models are hosted on Hugging Face.

| Model Variant | Training Data | Backbone | Filename | Download |
| :--- | :--- | :--- | :--- | :--- |
| OmniAID-DINO (Recommended) | Mirage-Train (Ours) | DINOv3 ViT-L/16 | checkpoint_omniaid_dino.pth | Link |
| OmniAID (Recommended) | Mirage-Train (Ours) | CLIP-ViT-L/14@336px | checkpoint_omniaid_mirage.pth | Link |
| OmniAID-GenImage | GenImage-SD v1.4 | CLIP-ViT-L/14@336px | checkpoint_omniaid_genimage_sd14.pth | Link |

Note:

  • OmniAID-DINO (Recommended) utilizes the DINOv3 ViT-L/16 backbone. It achieves a lower false positive rate on real images compared to OmniAID while maintaining robust detection generalization.
  • OmniAID (Recommended) is trained on our Mirage-Train, offering the best generalization for real-world "in-the-wild" detection.
  • OmniAID-GenImage is trained on the standard academic dataset GenImage-SD v1.4, primarily for fair comparison with previous baselines.

🛠️ Installation

git clone https://github.com/yunncheng/OmniAID.git
cd OmniAID
pip install -r requirements.txt

⚡ Quick Start

To reproduce our results or train on your own data, please follow the steps below.

1. Configuration

Modify the configuration file config.json to set model hyperparameters (e.g., number of experts, rank, hidden dimensions) and other global settings. By default, the parameter stage1_base_dir should be set to the same path as OUTPUT_DIR in both scripts/train.sh and scripts/eval.sh.

{
    "CLIP_path": "openai/clip-vit-large-patch14-336",
    "num_experts": 3,
    "rank_per_expert": 4,
    // ...
}
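Note that `// ...` above is shorthand for omitted fields; strict JSON has no comment syntax, so the actual config.json must be plain JSON. A minimal loading sketch (the parameter-count arithmetic is illustrative only, assuming each expert adds LoRA-style factors A of shape d×r and B of shape r×d per adapted projection, with d=1024 for a ViT-L hidden size):

```python
import json

cfg = json.loads("""
{
    "CLIP_path": "openai/clip-vit-large-patch14-336",
    "num_experts": 3,
    "rank_per_expert": 4
}
""")

# Illustrative only: extra parameters per adapted d x d projection if each
# expert contributes LoRA factors A (d x r) and B (r x d).
d, r = 1024, cfg["rank_per_expert"]
lora_params_per_layer = cfg["num_experts"] * (d * r + r * d)
print(lora_params_per_layer)
```

This back-of-the-envelope count shows why LoRA-style experts stay cheap: a few tens of thousands of parameters per adapted layer versus the millions in the frozen backbone projection.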

2. Training

We provide a shell script for training. Before running, please open scripts/train.sh and configure the necessary paths:

  • DATA_PATH: Path to your training dataset.
  • OUTPUT_DIR: Directory where checkpoints will be saved.
  • LOG_DIR: Directory where logs will be saved.
  • MOE_CONFIG_PATH: Path to your config.json.
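For example, the variables at the top of scripts/train.sh might be filled in like this (all paths below are placeholders, not real locations):

```shell
# Placeholder values -- substitute your own paths.
DATA_PATH="/data/datasets/mirage_train"   # training dataset root
OUTPUT_DIR="./output/omniaid"             # checkpoints are saved here
LOG_DIR="./logs/omniaid"                  # training logs
MOE_CONFIG_PATH="./config.json"           # model hyperparameters
```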

Once configured, start training:

bash scripts/train.sh

3. Evaluation

To evaluate the model on test sets, open scripts/eval.sh and set the following:

  • EVAL_DATA_PATH: Path to the validation/test dataset.
  • OUTPUT_DIR: Directory where results will be saved.
  • RESUME: Path to the trained model weights (.pth).
  • MOE_CONFIG_PATH: Path to your config.json.

Then run the evaluation script:

bash scripts/eval.sh

🙏 Acknowledgements

We gratefully acknowledge the outstanding open-source contributions that enabled this work.

🔸 Base Framework

Our main training/inference framework is developed on top of AIDE and ConvNeXt. We sincerely thank the authors for their robust codebase.

🔸 Reconstruction Code

The reconstruction scripts located in recon/ are adapted from AlignedForensics. We are grateful to the authors for their valuable contribution to artifact purification research.

📝 Citation

If you find this work useful for your research, please cite our paper:

@article{guo2025omniaid,
  title={OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild},
  author={Guo, Yuncheng and Ye, Junyan and Zhang, Chenjue and Kang, Hengrui and Fu, Haohuan and He, Conghui and Li, Weijia},
  journal={arXiv preprint arXiv:2511.08423},
  year={2025}
}