
MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation

<p align="center"> <img src="https://img.shields.io/badge/python-3.9+-blue.svg" alt="Python 3.9+"> <img src="https://img.shields.io/badge/license-Apache%202.0-green.svg" alt="License"> <img src="https://img.shields.io/pypi/v/mirage-benchmark.svg" alt="PyPI"> </p>

MiRAGE is a multi-agent framework for generating high-quality, multimodal, multihop question-answer datasets for evaluating Retrieval-Augmented Generation (RAG) systems.

<p align="center"> <img src="assets/MiRAGE_15s.gif" alt="MiRAGE Process Demo" width="80%"> </p>

Interactive Process Flow

Explore the step-by-step multihop QA generation process:

View Interactive Visualization

Multiagent Architecture

<p align="center"> <img src="assets/mirage_framework.png" alt="MiRAGE Framework Architecture" width="100%"> </p>

Sample QA Pair

<p align="center"> <img src="assets/ample question-answer pair generated.png" alt="Sample QA Pair Generated" width="100%"> </p>

Key Features

  • Multi-hop Context Completion: Iteratively expands incomplete chunks with relevant context
  • Domain and Expert Role Detection: Automatic domain identification using BERTopic + LLM
  • Multi-stage QA Pipeline: Generate, Select, Verify, Correct for quality assurance
  • Multimodal Support: Handles text, tables, figures, and images
  • Multiple Backend Support: Gemini, OpenAI, and local Ollama models
  • Fully Parallelized: Thread and process pools for maximum throughput
  • Token Usage Tracking: Automatic tracking of input/output tokens across all LLM calls
  • Checkpoint & Resume: Interrupt and resume long-running pipelines without losing progress

Installation

From PyPI

pip install mirage-benchmark

From Source

git clone https://github.com/ChandanKSahu/MiRAGE.git
cd MiRAGE
pip install -e .

With Optional Dependencies

pip install mirage-benchmark[eval]  # Evaluation metrics (ragas, langchain)
pip install mirage-benchmark[all]   # All optional dependencies

Note: As of v1.2.7, all core dependencies (PDF processing, embeddings, OCR, visualization) are included in the base install. Only evaluation metrics (ragas, langchain) are optional.

GPU Support (FAISS-GPU)

For GPU-accelerated similarity search, install FAISS-GPU via conda:

# Create conda environment (recommended)
conda create -n mirage python=3.11
conda activate mirage

# Install FAISS-GPU
conda install -c pytorch faiss-gpu

# Then install MiRAGE
pip install mirage-benchmark

Quick Start

Step 1: Install the Package

pip install mirage-benchmark

Step 2: Python Library API (Recommended)

Use MiRAGE directly in your Python scripts, just like HuggingFace Transformers or the OpenAI client:

from mirage import MiRAGE

# Create and run pipeline
pipeline = MiRAGE(
    input_dir="data/my_documents",
    output_dir="output/my_dataset",
    backend="gemini",
    api_key="your-gemini-api-key",
    num_qa_pairs=50,
)
results = pipeline.run()

# Access results
print(f"Generated {len(results)} QA pairs")
for qa in results:
    print(f"Q: {qa['question']}")
    print(f"A: {qa['answer']}\n")

# Save results
results.save("my_dataset.json")

# Convert to pandas DataFrame
df = results.to_dataframe()
df.to_csv("my_dataset.csv")

Advanced Configuration:

from mirage import MiRAGE

# Full control over pipeline
pipeline = MiRAGE(
    input_dir="data/papers",
    output_dir="output/papers_qa",
    backend="gemini",
    api_key="your-key",
    num_qa_pairs=200,
    max_depth=2,
    max_breadth=5,
    embedding_model="nomic",        # "auto", "nomic", "bge_m3", "gemini"
    reranker_model="gemini_vlm",    # "gemini_vlm", "monovlm", "text_embedding"
    device="cuda:0",                # "cuda", "cuda:0", "cpu", or None (auto)
    max_workers=8,
    run_deduplication=True,
    run_evaluation=True,
)

# Or load from config file
pipeline = MiRAGE.from_config("config.yaml", num_qa_pairs=100)

# Method chaining
results = pipeline.configure(num_qa_pairs=50).run()

Load existing results:

from mirage import MiRAGEResults

results = MiRAGEResults.load("output/qa_multihop_pass.json")
print(f"Loaded {len(results)} QA pairs")
df = results.to_dataframe()
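Once in a DataFrame, QA pairs can be filtered with ordinary pandas operations. The sketch below uses a stand-in frame with the same `question`/`answer` columns shown in the Python API example above; your actual frame may carry additional columns depending on the MiRAGE version:

```python
import pandas as pd

# Stand-in for results.to_dataframe(); the real frame exposes the same
# 'question' and 'answer' columns shown in the Python API example.
df = pd.DataFrame([
    {"question": "What is RAG?", "answer": "Retrieval-Augmented Generation."},
    {"question": "Which backends exist?", "answer": "Gemini, OpenAI, Ollama."},
])

# Keep only QA pairs whose answer mentions a given backend
gemini_rows = df[df["answer"].str.contains("Gemini")]
print(len(gemini_rows))
```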

Step 3: CLI Usage (Alternative)

You can also use MiRAGE from the command line:

# Set API key
export GEMINI_API_KEY="your-gemini-key"

# Basic usage
run_mirage --input data/my_documents --output output/my_dataset --num-qa-pairs 10

# With API key as argument
run_mirage -i data/my_documents -o output/my_dataset --backend gemini --api-key YOUR_GEMINI_KEY

# Using OpenAI
run_mirage -i data/my_documents -o output/my_dataset --backend openai --api-key YOUR_OPENAI_KEY

# Using local Ollama (no API key needed)
run_mirage -i data/my_documents -o output/my_dataset --backend ollama

# Generate a config file for full customization
run_mirage --init-config

Note: When using --api-key, always specify --backend to indicate which service the key is for.

Step 4: Check Results

ls output/my_dataset/
# qa_multihop_pass.json  - Generated QA pairs (always created)
# chunks.json            - Semantic chunks (always created)
# multihop_visualization.html - Interactive visualization (always created)
# embeddings/            - FAISS index and embeddings

# Optional outputs (if --deduplication and --evaluation flags used):
# qa_deduplicated.json   - Deduplicated QA pairs (with --deduplication)
# evaluation_report.json - Quality metrics (with --evaluation)
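To sanity-check `qa_multihop_pass.json` without loading the full library, the file can be inspected with the standard `json` module. The helper below is illustrative, not part of the `mirage` package; it assumes each entry carries the `question`/`answer` keys shown in the Python API example, while any other fields depend on your version:

```python
import json
from pathlib import Path

def summarize_qa(pairs):
    """Return simple stats over a list of QA dicts with
    'question' and 'answer' keys (illustrative helper)."""
    n = len(pairs)
    avg_q = sum(len(p["question"]) for p in pairs) / n if n else 0.0
    return {"count": n, "avg_question_chars": avg_q}

path = Path("output/my_dataset/qa_multihop_pass.json")
if path.exists():
    print(summarize_qa(json.loads(path.read_text())))
```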

Quick Test

# Verify installation
run_mirage --version

# Run preflight checks
run_mirage --preflight

# Generate 1 QA pair for testing
run_mirage --input data/sample --output results/test --num-qa-pairs 1

Usage (CLI)

Basic Usage (QA Generation Only)

By default, MiRAGE runs the core pipeline: document processing, chunking, embedding, and QA generation/verification. Deduplication and evaluation are OFF by default.

# Default: Generates QA pairs without deduplication or evaluation
run_mirage --input <INPUT_DIR> --output <OUTPUT_DIR> --num-qa-pairs 100

With Deduplication

To merge similar QA pairs and remove duplicates:

run_mirage -i data/documents -o output/results --num-qa-pairs 100 --deduplication

With Evaluation Metrics

To compute quality metrics (faithfulness, relevancy, etc.):

run_mirage -i data/documents -o output/results --num-qa-pairs 100 --evaluation

Full Pipeline (Deduplication + Evaluation)

run_mirage -i data/documents -o output/results --num-qa-pairs 100 --deduplication --evaluation

With All Options

run_mirage \
    --input data/documents \
    --output output/results \
    --backend gemini \
    --api-key YOUR_GEMINI_KEY \
    --num-qa-pairs 100 \
    --max-workers 4 \
    --max-depth 2 \
    --embedding-model auto \
    --reranker-model gemini_vlm \
    --deduplication \
    --evaluation \
    --verbose

Auto-Selected Reranker

The reranker is automatically selected based on your backend/API keys:

  • Gemini backend/key -> Uses Gemini VLM reranker (fast, API-based, uses same model as VLM config)
  • OpenAI backend -> Uses Gemini VLM if Gemini key available, else MonoVLM
  • No API keys -> Falls back to MonoVLM (local model, slower)

You can override with --reranker-model flag (options: gemini_vlm, monovlm, text_embedding).
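The selection rules above can be written down as a short sketch; this mirrors the documented behavior only and is not the library's actual internals:

```python
def select_reranker(backend, gemini_key=None):
    """Sketch of the documented auto-selection rules: a Gemini backend
    or an available Gemini key selects the Gemini VLM reranker;
    otherwise MiRAGE falls back to the local MonoVLM model."""
    if backend == "gemini" or gemini_key:
        return "gemini_vlm"
    return "monovlm"

print(select_reranker("openai"))  # OpenAI backend with no Gemini key
```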

Backend Options:

  • gemini (default) - Requires GEMINI_API_KEY or --api-key
  • openai - Requires OPENAI_API_KEY or --api-key
  • ollama - No API key needed (runs locally)

Pipeline Steps:

| Step | Description | Default |
|------|-------------|---------|
| 1. Document Processing | PDF/HTML to Markdown | Mandatory |
| 2. Chunking | Semantic chunking | Mandatory |
| 3. Embedding | FAISS index creation | Mandatory |
| 4. Domain Detection | Expert persona extraction | Mandatory |
| 5. QA Generation | Multi-hop QA with verification | Mandatory |
| 6. Deduplication | Merge similar QA pairs | OFF (use --deduplication) |
| 7. Evaluation | Quality metrics | OFF (use --evaluation) |

Run Preflight Checks

Before running the full pipeline, verify your setup:

run_mirage --preflight

Using Sample Dataset

# Prepare sample data (if you have it)
mkdir -p data/sample
cp /path/to/your/documents/*.pdf data/sample/

# Run on sample
run_mirage -i data/sample -o output/sample_results --num-qa-pairs 10

API Keys Setup

Google Gemini

  1. Get API key from: https://makersuite.google.com/app/apikey
  2. Set environment variable:
export GEMINI_API_KEY="your-key-here"

Or create a file:

mkdir -p ~/.config/gemini
echo "your-key-here" > ~/.config/gemini/api_key.txt
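The two lookup locations above can be resolved with a small helper; this is an illustrative sketch, not part of the `mirage` package:

```python
import os
from pathlib import Path

def load_gemini_api_key(env=None):
    """Resolve the Gemini key as described above: the GEMINI_API_KEY
    environment variable first, then the ~/.config/gemini/api_key.txt
    fallback file. Returns None if neither is set."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if key:
        return key
    key_file = Path.home() / ".config" / "gemini" / "api_key.txt"
    if key_file.exists():
        return key_file.read_text().strip()
    return None
```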

OpenAI

  1. Get API key from: https://platform.openai.com/api-keys
  2. Set environment variable:
export OPENAI_API_KEY="your-key-here"

Ollama (Local - Free)

No API key needed! Just install Ollama:

# Install
curl -fsSL https://ollama.com/install.sh | sh

# Start server
ollama serve

# Pull models
ollama pull llama3      # For text
ollama pull llava       # For vision

Conf

No findings