MMTrustEval

A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)

<div align="center"> <img src="docs/structure/background.png" alt="background" style="width: 90%;"> </div> <div align="center" style="font-size: 16px;"> 🌐 <a href="https://multi-trust.github.io/">Project Page</a> &nbsp&nbsp 📖 <a href="https://arxiv.org/abs/2406.07057">arXiv Paper</a> &nbsp&nbsp 📜 <a href="https://thu-ml.github.io/MMTrustEval/">Documentation </a> &nbsp&nbsp 📊 <a href="https://docs.google.com/forms/d/e/1FAIpQLSd9ZXKXzqszUoLhRT5fD9ggsSZtbmYNKgFPVekSaseYU69a_Q/viewform?usp=sf_link">Dataset</a> &nbsp&nbsp 🤗 <a href="https://huggingface.co/datasets/thu-ml/MultiTrust">Hugging Face</a> &nbsp&nbsp 🏆 <a href="https://multi-trust.github.io/#leaderboard">Leaderboard</a> </div> <br> <div align="center"> <img src="https://img.shields.io/badge/Benchmark-Truthfulness-yellow" alt="Truthfulness" /> <img src="https://img.shields.io/badge/Benchmark-Safety-red" alt="Safety" /> <img src="https://img.shields.io/badge/Benchmark-Robustness-blue" alt="Robustness" /> <img src="https://img.shields.io/badge/Benchmark-Fairness-orange" alt="Fairness" /> <img src="https://img.shields.io/badge/Benchmark-Privacy-green" alt="Privacy" /> </div> <br>

MultiTrust is a comprehensive benchmark designed to assess and enhance the trustworthiness of MLLMs across five key dimensions: truthfulness, safety, robustness, fairness, and privacy. It integrates a rigorous evaluation strategy involving 32 diverse tasks to expose new trustworthiness challenges.

<div align="center"> <img src="docs/structure/framework.jpg" alt="framework" style="width: 90%;"> </div>

🚀 News

🛠️ Installation

The environment in this version has been updated to accommodate more recent models. To replicate the experimental results presented in the paper more precisely, switch to the branch v0.1.0.

  • Option A: UV install

    uv venv --python 3.9
    source .venv/bin/activate
    
    uv pip install setuptools
    uv pip install torch==2.3.0
    uv pip sync --no-build-isolation env/requirements.txt
    
  • Option B: Docker

    • How to install docker

      # Our docker version:
      #     Client: Docker Engine - Community
      #     Version:           27.0.0-rc.1
      #     API version:       1.46
      #     Go version:        go1.21.11
      #     OS/Arch:           linux/amd64
      
      distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
      curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
      curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
      
      sudo apt-get update
      sudo apt-get install -y nvidia-container-toolkit
      
      sudo systemctl restart docker
      sudo usermod -aG docker [your_username_here]
      
    • Get our image:

      • B.1: Pull image from DockerHub

        docker pull jankinfstmrvv/multitrust:latest
        
      • B.2: Build from scratch

        # Note:
        # [data] is the absolute path to the data directory.
        
        docker build --network=host -t multitrust:latest -f env/Dockerfile .
        
    • Start a container:

      docker run -it \
          --name multitrust \
          --gpus all \
          --privileged=true \
          --shm-size=10gb \
          -v $HOME/.cache/huggingface:/root/.cache/huggingface \
          -v $HOME/.cache/torch:/root/.cache/torch \
          -v [data]:/root/MMTrustEval/data \
          -w /root/MMTrustEval \
          -d multitrust:latest /bin/bash
      
      # entering the container
      docker exec -it multitrust /bin/bash
      
  • Several tasks require commercial APIs for auxiliary testing. To test all tasks, add the corresponding model API keys to env/apikey.yml.
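
After installing with Option A, a quick check can confirm that the active environment matches the versions pinned in those commands (Python 3.9 and torch==2.3.0). The following is a minimal sketch, not part of MMTrustEval: the helper names `installed_torch_version` and `environment_problems` are hypothetical, and only the two pinned versions come from the instructions above.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple

# Versions pinned by the Option A commands (uv venv --python 3.9, torch==2.3.0).
EXPECTED_PYTHON = (3, 9)
EXPECTED_TORCH = "2.3.0"


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def environment_problems(
    python_version: Tuple[int, int],
    torch_version: Optional[str],
) -> List[str]:
    """Hypothetical helper: list mismatches against the pinned versions."""
    problems = []
    if tuple(python_version[:2]) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python {found} found, expected 3.9")
    if torch_version is None:
        problems.append("torch is not installed")
    else:
        # Ignore a local build suffix such as "+cu121" when comparing.
        base_version = torch_version.split("+")[0]
        if base_version != EXPECTED_TORCH:
            problems.append(f"torch {base_version} found, expected {EXPECTED_TORCH}")
    return problems
```

Calling `environment_problems(sys.version_info[:2], installed_torch_version())` inside the activated environment returns an empty list when both versions match.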

✉️ Dataset

License

  • The codebase is licensed under the CC BY-SA 4.0 license.

  • MultiTrust is intended for academic research only. Commercial use in any form is prohibited.

  • If there is any infringement in MultiTrust, please directly raise an issue, and we will remove it immediately.

Data Preparation

Refer here for detailed instructions.

📚 Docs

Our documentation presents the interface definitions for the different modules and some tutorials on how to extend them. It is available online at: https://thu-ml.github.io/MMTrustEval/

Run the following command to view the docs locally.

mkdocs serve -f env/mkdocs.yml -a 0.0.0.0:8000

📈 Reproduce Results in Our Paper

Running the scripts under scripts/run generates the model outputs for specific tasks and the corresponding primary evaluation results, either globally or sample-wise.

📌 To Make Inference

# Description: Each run script requires a model_id to run its inference task.
# Usage: bash scripts/run/*/*.sh <model_id>

scripts/run
├── fairness_scripts
│   ├── f1-stereo-generation.sh
│   ├── f2-stereo-agreement.sh
│   ├── f3-stereo-classification.sh
│   ├── f3-stereo-topic-classification.sh
│   ├── f4-stereo-query.sh
│   ├── f5-vision-preference.sh
│   ├── f6-profession-pred.sh
│   └── f7-subjective-preference.sh
├── privacy_scripts
│   ├── p1-vispriv-recognition.sh
│   ├── p2-vqa-recognition-vispr.sh
│   ├── p3-infoflow.sh
│   ├── p4-pii-query.sh
│   ├── p5-visual-leakage.sh
│   └── p6-pii-leakage-in-conversation.sh
├── robustness_scripts
│   ├── r1-ood-artistic.sh
│   ├── r2-ood-sensor.sh
│   ├── r3-ood-text.sh
│   ├── r4-adversarial-untarget.sh
│   ├── r5-adversarial-target.sh
│   └── r6-adversarial-text.sh
├── safety_scripts
│   ├── s1-nsfw-image-description.sh
│   ├── s2-risk-identification.sh
│   ├── s3-toxic-content-generation.sh
│   ├── s4-typographic-jailbreaking.sh
│   ├── s5-multimodal-jailbreaking.sh
│   └── s6-crossmodal-jailbreaking.sh
└── truthfulness_scripts
    ├── t1-basic.sh
    ├── t2-advanced.sh
    ├── t3-instruction-enhancement.sh
    ├── t4-visual-assistance.sh
    ├── t5-text-misleading.sh
    ├── t6-visual-confusion.sh
    └── t7-visual-misleading.sh
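
The usage pattern above (`bash scripts/run/*/*.sh <model_id>`) can be applied to a whole dimension at once. The following is a minimal sketch assuming the directory layout shown in the listing. The functions `inference_commands` and `run_dimension` are hypothetical helpers, not part of the toolbox, and the model ID is whatever identifier the run scripts accept.

```python
import subprocess
from pathlib import Path
from typing import List


def inference_commands(
    model_id: str,
    dimension: str,
    run_root: str = "scripts/run",
) -> List[List[str]]:
    """Build one `bash <script> <model_id>` command per task script.

    `dimension` is one of the names in the listing, e.g. "truthfulness",
    which maps to the directory "truthfulness_scripts".
    """
    script_dir = Path(run_root) / f"{dimension}_scripts"
    scripts = sorted(script_dir.glob("*.sh"))
    if not scripts:
        raise FileNotFoundError(f"no run scripts found under {script_dir}")
    return [["bash", str(script), model_id] for script in scripts]


def run_dimension(model_id: str, dimension: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in inference_commands(model_id, dimension):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(command, check=True)
```

Run from the repository root, `run_dimension("<model_id>", "truthfulness")` only prints the commands. Passing `dry_run=False` launches them one after another.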

📌 To Evaluate Results

After inference, the scripts under scripts/score calculate statistical results from the outputs and display the results as reported in the paper.

# Description: Each score script requires a model_id to calculate statistical results.
# Usage: python scripts/score/*/*.py --model_id <model_id>

scripts/score
├── fairness
│   ├── f1-stereo-generation.py
│   ├── f2-stereo-agreement.py
│   ├── f3-stereo-classification.py
│   ├── f3-stereo-topic-classification.py
│   ├── f4-stereo-query.py
│   ├── f5-vision-preference.py
│   ├── f6-profession-pred.py
│   └── f7-subjective-preference.py
├── privacy
│   ├── p1-vispriv-recognition.py
│   ├── p2-vqa-recognition-vispr.py
│   ├── p3-infoflow.py
│   ├── p4-pii-query.py
│   ├── p5-visual-leakage.py
│   └── p6-pii-leakage-in-conversation.py
├── robustness
│   ├── r1-ood_artistic.py
│   ├── r2-ood_sensor.py
│   ├── r3-ood_text.py
│   ├── r4-adversarial_untarget.py
│   ├── r5-adversarial_target.py
│   └── r6-adversarial_text.py
├── safefy
│   ├── s1-nsfw-image-description.py
│   ├── s2-risk-identification.py
│   ├── s3-toxic-content-generation.py
│   ├── s4-typographic-jailbreaking.py
│   ├── s5-multimodal-jailbreaking.py
│   └── s6-crossmodal-jailbreaking.py
└── truthfulness
    ├── t1-basic.py
    ├── t2-advanced.py
    ├── t3-instruction-enhancement.py
    ├── t4-visual-assistance.py
    ├── t5-text-misleading.py
    ├── t6-visual-confusion.py
    └── t7-visual-misleading.py
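
The run and score listings do not use identical names: the score directories drop the `_scripts` suffix, the safety directory is listed as `safefy`, and the robustness score files use underscores where the run scripts use hyphens. The following sketch pairs a run script with its score script under those assumptions. The mapping is copied from the two listings and may need adjusting if a checkout differs, and `score_script_for` and `score_command` are hypothetical helpers.

```python
from pathlib import Path
from typing import List

# Directory names copied from the run and score listings, including the
# "safefy" spelling shown for the safety score directory.
SCORE_DIRS = {
    "fairness_scripts": "fairness",
    "privacy_scripts": "privacy",
    "robustness_scripts": "robustness",
    "safety_scripts": "safefy",
    "truthfulness_scripts": "truthfulness",
}


def score_script_for(run_script: str, score_root: str = "scripts/score") -> Path:
    """Find the score script whose name matches a run script.

    Names are compared after replacing underscores with hyphens, so
    "r1-ood-artistic.sh" matches "r1-ood_artistic.py".
    """
    run_path = Path(run_script)
    score_dir = Path(score_root) / SCORE_DIRS[run_path.parent.name]
    wanted = run_path.stem.replace("_", "-")
    for candidate in sorted(score_dir.glob("*.py")):
        if candidate.stem.replace("_", "-") == wanted:
            return candidate
    raise FileNotFoundError(f"no score script matching {run_path.name} in {score_dir}")


def score_command(
    run_script: str,
    model_id: str,
    score_root: str = "scripts/score",
) -> List[str]:
    """Build `python <score script> --model_id <model_id>` for a run script."""
    script = score_script_for(run_script, score_root)
    return ["python", str(script), "--model_id", model_id]
```

Matching on the full normalized name rather than the task prefix keeps the two `f3-*` scripts distinct.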

📌 Task List

All 32 tasks are listed here. ○: rule-based evaluation (e.g., keyword matching); ●: automatic evaluation by GPT-4 or other classif
