FakeShield

πŸ”₯ [ICLR 2025] FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models


<div align="center"> <img src="./assets/Logo.png" alt="Image Alt Text" width="150" height="150"> <h3> FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models </h3> <!-- <h4> CVPR 2024 </h4> -->

Zhipei Xu, Xuanyu Zhang, Runyi Li, Zecheng Tang, Qing Huang, Jian Zhang

School of Electronic and Computer Engineering, Peking University

[Badges: arXiv Β· License Β· Visitors Β· HF Space Β· Home Page Β· WeChat Β· Zhihu Β· CSDN]

</div>
<details open><summary>πŸ’‘ We also have other Copyright Protection projects that may interest you ✨. </summary><p> <!-- may -->

AvatarShield: Visual Reinforcement Learning for Human-Centric Video Forgery Detection <br> Zhipei Xu, Xuanyu Zhang, Xing Zhou, Jian Zhang <br> [GitHub Β· arXiv] <br>

EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection [CVPR 2024] <br> Xuanyu Zhang, Runyi Li, Jiwen Yu, Youmin Xu, Weiqi Li, Jian Zhang <br> [GitHub Β· arXiv] <br>

OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking [CVPR 2025] <br> Xuanyu Zhang, Zecheng Tang, Zhipei Xu, Runyi Li, Youmin Xu, Bin Chen, Feng Gao, Jian Zhang <br> [GitHub Β· arXiv] <br>

</p></details>

πŸ“° News

  • [2026.02.21] πŸ”₯πŸ”₯πŸ”₯ We have updated the SD_Inpaint dataset on Hugging Face, and you can access it from here.
  • [2025.04.23] πŸ€— We have open-sourced the MMTD-Set-34k dataset on Hugging Face, and you can access it from here.
  • [2025.02.14] πŸ€— We ~~are progressively open-sourcing~~ have open-sourced all code & pre-trained model weights. Welcome to watch πŸ‘€ this repository for the latest updates.
  • [2025.01.23] πŸŽ‰πŸŽ‰πŸŽ‰ Our FakeShield has been accepted at ICLR 2025!
  • [2024.10.03] πŸ”₯ We have released FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models. We introduce the explainable IFDL task, construct the MMTD-Set dataset, and propose the FakeShield framework. Check out the paper. The code and dataset are coming soon.

<img id="painting_icon" width="3%" src="https://cdn-icons-png.flaticon.com/128/1022/1022330.png"> FakeShield Overview

FakeShield is a novel multi-modal framework designed for explainable image forgery detection and localization (IFDL). Unlike traditional black-box IFDL methods, FakeShield integrates multi-modal large language models (MLLMs) to analyze manipulated images, generate tampered region masks, and provide human-understandable explanations based on pixel-level artifacts and semantic inconsistencies. To improve generalization across diverse forgery types, FakeShield introduces domain tags, which guide the model to recognize different manipulation techniques effectively. Additionally, we construct MMTD-Set, a richly annotated dataset containing multi-modal descriptions of manipulated images, fostering better interpretability. Through extensive experiments, FakeShield demonstrates superior performance in detecting and localizing various forgeries, including copy-move, splicing, removal, DeepFake, and AI-generated manipulations.


πŸ† Contributions

  • FakeShield Introduction. We introduce FakeShield, a multi-modal framework for explainable image forgery detection and localization, the first to leverage MLLMs for the IFDL task. We also propose the Domain Tag-guided Explainable Forgery Detection Module (DTE-FDM) and the Multi-modal Forgery Localization Module (MFLM) to improve the generalization and robustness of the model.

  • Novel Explainable-IFDL Task. We propose the first explainable image forgery detection and localization (e-IFDL) task, addressing the opacity of traditional IFDL methods by providing both pixel-level and semantic-level explanations.

  • MMTD-Set Dataset Construction. We create the MMTD-Set by enriching existing IFDL datasets using GPT-4o, generating high-quality β€œimage-mask-description” triplets for enhanced multimodal learning.

πŸ› οΈ Requirements and Installation

Note: If you want to reproduce the results from our paper, please prioritize using the Docker image to set up the environment. For more details, see this issue.

Installation via Pip

  1. Ensure your environment meets the following requirements:

    • Python == 3.9
    • PyTorch == 1.13.0
    • CUDA Version == 11.6
  2. Clone the repository:

    git clone https://github.com/zhipeixu/FakeShield.git
    cd FakeShield
    
  3. Install dependencies:

    apt update && apt install git
    pip install -r requirements.txt
    
    ## Install MMCV
    git clone https://github.com/open-mmlab/mmcv
    cd mmcv
    git checkout v1.4.7
    MMCV_WITH_OPS=1 pip install -e .
    
  4. Install DTE-FDM:

    cd ../DTE-FDM
    pip install -e .
    pip install -e ".[train]"
    pip install flash-attn --no-build-isolation
    
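As a quick sanity check (a sketch, not part of the repo), you can confirm that the versions pinned in step 1 are what your environment actually resolves to:

```python
# Sanity-check sketch (not part of FakeShield): verify pinned package versions.
import importlib


def version_matches(module_name, expected_prefix):
    """Import module_name and check that its __version__ starts with expected_prefix."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return False
    return getattr(mod, "__version__", "").startswith(expected_prefix)


# The paper's environment pins PyTorch 1.13.0 (with CUDA 11.6 and Python 3.9).
print("torch 1.13:", version_matches("torch", "1.13"))
```

If this prints `False`, recheck the pip environment (or fall back to the Docker image, as the note above recommends).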

Installation via Docker

  1. Pull the pre-built Docker image:

    docker pull zhipeixu/mflm:v1.0
    docker pull zhipeixu/dte-fdm:v1.0
    
  2. Clone the repository:

    git clone https://github.com/zhipeixu/FakeShield.git
    cd FakeShield
    
  3. Run the container:

    docker run --gpus all -it --rm \
        -v $(pwd):/workspace/FakeShield \
        zhipeixu/dte-fdm:v1.0 /bin/bash
    
    docker run --gpus all -it --rm \
        -v $(pwd):/workspace/FakeShield \
        zhipeixu/mflm:v1.0 /bin/bash
    
  4. Inside the container, navigate to the repository:

    cd /workspace/FakeShield
    
  5. Install MMCV:

    git clone https://github.com/open-mmlab/mmcv
    cd mmcv
    git checkout v1.4.7
    MMCV_WITH_OPS=1 pip install -e .
    

πŸ€– Prepare Model

  1. Download FakeShield weights from Hugging Face

    The model weights consist of three parts: DTE-FDM, MFLM, and DTG. For convenience, we have packaged them together and uploaded them to the Hugging Face repository.

    We recommend using huggingface_hub to download the weights:

    pip install huggingface_hub
    huggingface-cli download --resume-download zhipeixu/fakeshield-v1-22b --local-dir weight/
    
  2. Download pretrained SAM weight

    MFLM relies on the pre-trained SAM weights. You can download the sam_vit_h_4b8939.pth checkpoint with wget:

    wget https://huggingface.co/ybelkada/segment-anything/resolve/main/checkpoints/sam_vit_h_4b8939.pth -P weight/
    
  3. Ensure the weights are placed correctly

    Organize your weight/ folder as follows:

     FakeShield/
     β”œβ”€β”€ weight/
     β”‚   β”œβ”€β”€ fakeshield-v1-22b/
     β”‚   β”‚   β”œβ”€β”€ DTE-FDM/
     β”‚   β”‚   β”œβ”€β”€ MFLM/
     β”‚   β”‚   β”œβ”€β”€ DTG.pth
     β”‚   β”œβ”€β”€ sam_vit_h_4b8939.pth
    
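A small helper sketch (our own, not shipped with the repo) can verify that the layout above is in place before you launch anything:

```python
# Sketch: check that the weight/ directory matches the expected layout.
from pathlib import Path

EXPECTED = [
    "weight/fakeshield-v1-22b/DTE-FDM",
    "weight/fakeshield-v1-22b/MFLM",
    "weight/fakeshield-v1-22b/DTG.pth",
    "weight/sam_vit_h_4b8939.pth",
]


def missing_weights(root="."):
    """Return the expected weight paths that are absent under root."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


# Example: run from the FakeShield/ repo root; an empty list means all weights are in place.
print(missing_weights("."))
```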

πŸš€ Quick Start

CLI Demo

You can quickly run the demo script by executing:

bash scripts/cli_demo.sh

The cli_demo.sh script allows customization through the following environment variables:

  • WEIGHT_PATH: Path to the FakeShield weight directory (default: ./weight/fakeshield-v1-22b)
  • IMAGE_PATH: Path to the input image (default: `./play
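For example, the defaults can be overridden inline when launching the demo (the image path here is purely illustrative):

```shell
# Hypothetical invocation from the FakeShield/ repo root; adjust paths to your setup.
WEIGHT_PATH=./weight/fakeshield-v1-22b \
IMAGE_PATH=./examples/tampered.jpg \
bash scripts/cli_demo.sh
```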