MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation
<p align="center"> <img src="https://img.shields.io/badge/python-3.9+-blue.svg" alt="Python 3.9+"> <img src="https://img.shields.io/badge/license-Apache%202.0-green.svg" alt="License"> <img src="https://img.shields.io/pypi/v/mirage-benchmark.svg" alt="PyPI"> </p>MiRAGE is a multi-agent framework for generating high-quality, multimodal, multihop question-answer datasets for evaluating Retrieval-Augmented Generation (RAG) systems.
<p align="center"> <img src="assets/MiRAGE_15s.gif" alt="MiRAGE Process Demo" width="80%"> </p>Interactive Process Flow
Explore the step-by-step multihop QA generation process:
View Interactive Visualization
Multiagent Architecture
<p align="center"> <img src="assets/mirage_framework.png" alt="MiRAGE Framework Architecture" width="100%"> </p>Sample QA Pair
<p align="center"> <img src="assets/ample question-answer pair generated.png" alt="Sample QA Pair Generated" width="100%"> </p>Key Features
- Multi-hop Context Completion: Iteratively expands incomplete chunks with relevant context.
- Domain and Expert Role Detection: Automatic domain identification using BERTopic + LLM
- Multi-stage QA Pipeline: Generate, Select, Verify, Correct for quality assurance
- Multimodal Support: Handles text, tables, figures, and images
- Multiple Backend Support: Gemini, OpenAI, and local Ollama models
- Fully Parallelized: Thread and process pools for maximum throughput
- Token Usage Tracking: Automatic tracking of input/output tokens across all LLM calls
- Checkpoint & Resume: Interrupt and resume long-running pipelines without losing progress
Table of Contents
- Installation
- Quick Start
- Python API Reference
- Examples
- Usage (CLI)
- API Keys Setup
- Configuration
- Command Line Options
- Output Format
- Project Structure
- Contributing
- License
Installation
From PyPI
pip install mirage-benchmark
From Source
git clone https://github.com/ChandanKSahu/MiRAGE.git
cd MiRAGE
pip install -e .
With Optional Dependencies
pip install mirage-benchmark[eval] # Evaluation metrics (ragas, langchain)
pip install mirage-benchmark[all] # All optional dependencies
Note: As of v1.2.7, all core dependencies (PDF processing, embeddings, OCR, visualization) are included in the base install. Only evaluation metrics (ragas, langchain) are optional.
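If you are unsure whether the evaluation extras landed in your environment, a quick standard-library check (nothing MiRAGE-specific) looks like this:
# Check whether the optional evaluation dependencies are importable.
# "ragas" and "langchain" are the extras listed above.
from importlib.util import find_spec

for pkg in ("ragas", "langchain"):
    status = "installed" if find_spec(pkg) else "missing (pip install mirage-benchmark[eval])"
    print(f"{pkg}: {status}")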
GPU Support (FAISS-GPU)
For GPU-accelerated similarity search, install FAISS-GPU via conda:
# Create conda environment (recommended)
conda create -n mirage python=3.11
conda activate mirage
# Install FAISS-GPU
conda install -c pytorch faiss-gpu
# Then install MiRAGE
pip install mirage-benchmark
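To confirm that a GPU-enabled FAISS build is actually in use (rather than a CPU-only wheel), a small sanity check, assuming nothing beyond the faiss package itself:
# A GPU-enabled FAISS build exposes get_num_gpus(); CPU-only builds may not,
# hence the guarded lookup.
import faiss

num_gpus = getattr(faiss, "get_num_gpus", lambda: 0)()
print(f"FAISS sees {num_gpus} GPU(s)")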
Quick Start
Step 1: Install the Package
pip install mirage-benchmark
Step 2: Python Library API (Recommended)
Use MiRAGE directly in your Python scripts - just like HuggingFace Transformers or OpenAI:
from mirage import MiRAGE

# Create and run pipeline
pipeline = MiRAGE(
    input_dir="data/my_documents",
    output_dir="output/my_dataset",
    backend="gemini",
    api_key="your-gemini-api-key",
    num_qa_pairs=50,
)
results = pipeline.run()

# Access results
print(f"Generated {len(results)} QA pairs")
for qa in results:
    print(f"Q: {qa['question']}")
    print(f"A: {qa['answer']}\n")

# Save results
results.save("my_dataset.json")

# Convert to pandas DataFrame
df = results.to_dataframe()
df.to_csv("my_dataset.csv")
Advanced Configuration:
from mirage import MiRAGE

# Full control over pipeline
pipeline = MiRAGE(
    input_dir="data/papers",
    output_dir="output/papers_qa",
    backend="gemini",
    api_key="your-key",
    num_qa_pairs=200,
    max_depth=2,
    max_breadth=5,
    embedding_model="nomic",       # "auto", "nomic", "bge_m3", "gemini"
    reranker_model="gemini_vlm",   # "gemini_vlm", "monovlm", "text_embedding"
    device="cuda:0",               # "cuda", "cuda:0", "cpu", or None (auto)
    max_workers=8,
    run_deduplication=True,
    run_evaluation=True,
)

# Or load from config file
pipeline = MiRAGE.from_config("config.yaml", num_qa_pairs=100)

# Method chaining
results = pipeline.configure(num_qa_pairs=50).run()
Load existing results:
from mirage import MiRAGEResults
results = MiRAGEResults.load("output/qa_multihop_pass.json")
print(f"Loaded {len(results)} QA pairs")
df = results.to_dataframe()
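Once loaded, the DataFrame can be sliced with ordinary pandas; the column names below are assumed to mirror the per-pair keys ("question", "answer") shown earlier, so check df.columns if your version exposes a different schema:
# Slice the DataFrame with ordinary pandas. Column names are assumed to
# mirror the per-pair keys ("question", "answer"); inspect df.columns first.
print(df.columns.tolist())
long_questions = df[df["question"].str.len() > 100]
print(f"{len(long_questions)} QA pairs have questions over 100 characters")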
Step 3: CLI Usage (Alternative)
You can also use MiRAGE from the command line:
# Set API key
export GEMINI_API_KEY="your-gemini-key"
# Basic usage
run_mirage --input data/my_documents --output output/my_dataset --num-qa-pairs 10
# With API key as argument
run_mirage -i data/my_documents -o output/my_dataset --backend gemini --api-key YOUR_GEMINI_KEY
# Using OpenAI
run_mirage -i data/my_documents -o output/my_dataset --backend openai --api-key YOUR_OPENAI_KEY
# Using local Ollama (no API key needed)
run_mirage -i data/my_documents -o output/my_dataset --backend ollama
# Generate a config file for full customization
run_mirage --init-config
Note: When using --api-key, always specify --backend to indicate which service the key is for.
Step 4: Check Results
ls output/my_dataset/
# qa_multihop_pass.json - Generated QA pairs (always created)
# chunks.json - Semantic chunks (always created)
# multihop_visualization.html - Interactive visualization (always created)
# embeddings/ - FAISS index and embeddings
# Optional outputs (if --deduplication and --evaluation flags used):
# qa_deduplicated.json - Deduplicated QA pairs (with --deduplication)
# evaluation_report.json - Quality metrics (with --evaluation)
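If you prefer to inspect the raw files without the library API, a minimal standard-library sketch (the exact JSON layout and per-record fields depend on the MiRAGE version):
# Inspect the raw output without the library API. Only sizes and keys are
# reported here, since the JSON layout may vary between versions.
import json
from pathlib import Path

out = Path("output/my_dataset")
qa_pairs = json.loads((out / "qa_multihop_pass.json").read_text())
print(f"qa_multihop_pass.json: {len(qa_pairs)} top-level entries")
if isinstance(qa_pairs, list) and qa_pairs:
    print("Fields per record:", sorted(qa_pairs[0].keys()))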
Quick Test
# Verify installation
run_mirage --version
# Run preflight checks
run_mirage --preflight
# Generate 1 QA pair for testing
run_mirage --input data/sample --output results/test --num-qa-pairs 1
Usage (CLI)
Basic Usage (QA Generation Only)
By default, MiRAGE runs the core pipeline: document processing, chunking, embedding, and QA generation/verification. Deduplication and evaluation are OFF by default.
# Default: Generates QA pairs without deduplication or evaluation
run_mirage --input <INPUT_DIR> --output <OUTPUT_DIR> --num-qa-pairs 100
With Deduplication
To merge similar QA pairs and remove duplicates:
run_mirage -i data/documents -o output/results --num-qa-pairs 100 --deduplication
With Evaluation Metrics
To compute quality metrics (faithfulness, relevancy, etc.):
run_mirage -i data/documents -o output/results --num-qa-pairs 100 --evaluation
Full Pipeline (Deduplication + Evaluation)
run_mirage -i data/documents -o output/results --num-qa-pairs 100 --deduplication --evaluation
With All Options
run_mirage \
--input data/documents \
--output output/results \
--backend gemini \
--api-key YOUR_GEMINI_KEY \
--num-qa-pairs 100 \
--max-workers 4 \
--max-depth 2 \
--embedding-model auto \
--reranker-model gemini_vlm \
--deduplication \
--evaluation \
--verbose
Auto-Selected Reranker
The reranker is automatically selected based on your backend/API keys:
- Gemini backend/key -> Uses Gemini VLM reranker (fast, API-based, uses same model as VLM config)
- OpenAI backend -> Uses Gemini VLM if Gemini key available, else MonoVLM
- No API keys -> Falls back to MonoVLM (local model, slower)
You can override with --reranker-model flag (options: gemini_vlm, monovlm, text_embedding).
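The same override is available in the Python API through the reranker_model argument shown earlier; for example, a sketch forcing the local MonoVLM reranker on a key-free Ollama run:
from mirage import MiRAGE

# Force the local MonoVLM reranker instead of the auto-selected one.
# Backend, reranker, and device values come from the options documented above.
pipeline = MiRAGE(
    input_dir="data/my_documents",
    output_dir="output/monovlm_run",
    backend="ollama",            # local backend, no API key required
    reranker_model="monovlm",    # override the auto-selected reranker
    device="cpu",
    num_qa_pairs=20,
)
results = pipeline.run()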
Backend Options:
- gemini (default) - Requires GEMINI_API_KEY or --api-key
- openai - Requires OPENAI_API_KEY or --api-key
- ollama - No API key needed (runs locally)
Pipeline Steps:
| Step | Description | Default |
|------|-------------|---------|
| 1. Document Processing | PDF/HTML to Markdown | Mandatory |
| 2. Chunking | Semantic chunking | Mandatory |
| 3. Embedding | FAISS index creation | Mandatory |
| 4. Domain Detection | Expert persona extraction | Mandatory |
| 5. QA Generation | Multi-hop QA with verification | Mandatory |
| 6. Deduplication | Merge similar QA pairs | OFF (use --deduplication) |
| 7. Evaluation | Quality metrics | OFF (use --evaluation) |
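In the Python API, the two optional steps correspond to the run_deduplication and run_evaluation constructor arguments shown earlier:
from mirage import MiRAGE

# Python equivalent of the CLI flags --deduplication and --evaluation:
# steps 6 and 7 are switched on through constructor arguments.
pipeline = MiRAGE(
    input_dir="data/documents",
    output_dir="output/results",
    backend="gemini",
    api_key="your-gemini-api-key",
    num_qa_pairs=100,
    run_deduplication=True,   # step 6: merge similar QA pairs
    run_evaluation=True,      # step 7: compute quality metrics
)
results = pipeline.run()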
Run Preflight Checks
Before running the full pipeline, verify your setup:
run_mirage --preflight
Using Sample Dataset
# Prepare sample data (if you have it)
mkdir -p data/sample
cp /path/to/your/documents/*.pdf data/sample/
# Run on sample
run_mirage -i data/sample -o output/sample_results --num-qa-pairs 10
API Keys Setup
Google Gemini
- Get API key from: https://makersuite.google.com/app/apikey
- Set environment variable:
export GEMINI_API_KEY="your-key-here"
Or create a file:
mkdir -p ~/.config/gemini
echo "your-key-here" > ~/.config/gemini/api_key.txt
OpenAI
- Get API key from: https://platform.openai.com/api-keys
- Set environment variable:
export OPENAI_API_KEY="your-key-here"
Ollama (Local - Free)
No API key needed! Just install Ollama:
# Install
curl -fsSL https://ollama.com/install.sh | sh
# Start server
ollama serve
# Pull models
ollama pull llama3 # For text
ollama pull llava # For vision
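With the models pulled and ollama serve running, the local backend works from Python just like the hosted ones; a minimal sketch:
from mirage import MiRAGE

# Key-free local run against the Ollama server started above.
pipeline = MiRAGE(
    input_dir="data/my_documents",
    output_dir="output/ollama_run",
    backend="ollama",    # no api_key needed for the local backend
    num_qa_pairs=10,
)
results = pipeline.run()
print(f"Generated {len(results)} QA pairs locally")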
