M3: MIMIC-IV + MCP + Models 🏥🤖

<div align="center"> <img src="webapp/public/m3_logo_transparent.png" alt="M3 Logo" width="300"/> </div>

Query MIMIC-IV medical data using natural language through MCP clients

<a href="https://www.python.org/downloads/"><img alt="Python" src="https://img.shields.io/badge/Python-3.10+-blue?logo=python&logoColor=white"></a> <a href="https://modelcontextprotocol.io/"><img alt="MCP" src="https://img.shields.io/badge/MCP-Compatible-green?logo=ai&logoColor=white"></a> <a href="https://github.com/rafiattrach/m3/actions/workflows/tests.yaml"><img alt="Tests" src="https://github.com/rafiattrach/m3/actions/workflows/tests.yaml/badge.svg"></a> <a href="https://github.com/rafiattrach/m3/actions/workflows/pre-commit.yaml"><img alt="Code Quality" src="https://github.com/rafiattrach/m3/actions/workflows/pre-commit.yaml/badge.svg"></a> <a href="https://github.com/rafiattrach/m3/pulls"><img alt="PRs Welcome" src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg"></a>

Transform medical data analysis with AI! Ask questions about MIMIC-IV data in plain English and get instant insights. Choose between local demo data (free) or full cloud dataset (BigQuery).

Features

  • 🔍 Natural Language Queries: Ask questions about MIMIC-IV data in plain English
  • 🏠 Local DuckDB + Parquet: Fast local queries for demo and full dataset using Parquet files with DuckDB views
  • ☁️ BigQuery Support: Access full MIMIC-IV dataset on Google Cloud
  • 🔒 Enterprise Security: OAuth2 authentication with JWT tokens and rate limiting
  • 🛡️ SQL Injection Protection: Read-only queries with comprehensive validation
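The read-only guarantee above can be approximated with a small validator. The following is a hedged sketch of the idea, not M3's actual implementation (M3's real validation is more comprehensive); `is_read_only` and the keyword set are illustrative names:

```python
import re

# Hypothetical keyword denylist: an illustration of the read-only idea,
# NOT the validation logic M3 actually ships.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter", "create",
             "truncate", "grant", "revoke", "attach", "pragma"}

def is_read_only(sql: str) -> bool:
    """Return True if the statement looks like a pure read query."""
    tokens = set(re.findall(r"[a-z_]+", sql.lower()))
    starts_ok = sql.lstrip().lower().startswith(("select", "with"))
    return starts_ok and not (tokens & FORBIDDEN)
```

A real server would parse the SQL properly rather than scan keywords, but the shape is the same: reject anything that is not provably a read.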

🚀 Quick Start

📺 Prefer video tutorials? Check out step-by-step video guides covering setup, PhysioNet configuration, and more.

Install uv (required for uvx)

We use uvx to run the MCP server. Install uv from the official installer, then verify with uv --version.

macOS:

brew install uv

Linux (or macOS without Homebrew):

curl -LsSf https://astral.sh/uv/install.sh | sh
# macOS - enable for GUI apps like Claude Desktop:
sudo ln -s $(which uv) $(which uvx) /usr/local/bin/

Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Verify installation:

uv --version

BigQuery Setup (Optional - Full Dataset)

Skip this if you're using the DuckDB demo database.

  1. Install Google Cloud SDK:

    • macOS: brew install google-cloud-sdk
    • Windows/Linux: https://cloud.google.com/sdk/docs/install
  2. Authenticate:

    gcloud auth application-default login
    

    Opens your browser - choose the Google account with BigQuery access to MIMIC-IV.

M3 Initialization

Supported clients: Claude Desktop, Cursor, Goose, and more.

<table> <tr> <td width="50%">

DuckDB (Demo or Full Dataset)

Create an m3 directory and navigate into it:

mkdir m3 && cd m3

If you want the full dataset, download it manually from PhysioNet and place it in m3/m3_data/raw. For the demo dataset, continue and run:

uv init && uv add m3-mcp && \
uv run m3 init DATASET_NAME && uv run m3 config --quick

Replace DATASET_NAME with mimic-iv-demo or mimic-iv-full and copy & paste the output of this command into your client config JSON file.

Demo dataset (16MB raw download size) downloads automatically on first query.

Full dataset (10.6GB raw download size) needs to be downloaded manually.

</td> <td width="50%">

BigQuery (Full Dataset)

Requires GCP credentials and PhysioNet access.

Paste this into your client config JSON file:

{
  "mcpServers": {
    "m3": {
      "command": "uvx",
      "args": ["m3-mcp"],
      "env": {
        "M3_BACKEND": "bigquery",
        "M3_PROJECT_ID": "your-project-id"
      }
    }
  }
}

Replace your-project-id with your Google Cloud project ID.

</td> </tr> </table>

That's it! Restart your MCP client and ask:

  • "What tools do you have for MIMIC-IV data?"
  • "Show me patient demographics from the ICU"
  • "What is the race distribution in admissions?"
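Behind the scenes, questions like these are translated into SQL against MIMIC-IV tables. Here is a hedged illustration using a synthetic admissions table (the `hadm_id` and `race` column names follow MIMIC-IV's schema, but the data is made up, since the real data is access-restricted):

```python
import sqlite3

# Tiny synthetic stand-in for MIMIC-IV's admissions table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (hadm_id INTEGER, race TEXT)")
conn.executemany("INSERT INTO admissions VALUES (?, ?)",
                 [(1, "WHITE"), (2, "WHITE"), (3, "BLACK/AFRICAN AMERICAN")])

# The kind of SQL a question like "What is the race distribution
# in admissions?" resolves to under the hood.
rows = conn.execute(
    "SELECT race, COUNT(*) AS n FROM admissions GROUP BY race ORDER BY n DESC"
).fetchall()
print(rows)  # [('WHITE', 2), ('BLACK/AFRICAN AMERICAN', 1)]
```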

Backend Comparison

| Feature | DuckDB (Demo) | DuckDB (Full) | BigQuery (Full) |
|---------|---------------|---------------|-----------------|
| Cost | Free | Free | BigQuery usage fees |
| Setup | Zero config | Manual download | GCP credentials required |
| Data Size | 100 patients, 275 admissions | 365k patients, 546k admissions | 365k patients, 546k admissions |
| Speed | Fast (local) | Fast (local) | Network latency |
| Use Case | Learning, development | Research (local) | Research, production |


Alternative Installation Methods

Already have Docker or prefer pip? Here are other ways to run m3:

🐳 Docker (No Python Required)

<table> <tr> <td width="50%">

DuckDB (Local):

git clone https://github.com/rafiattrach/m3.git && cd m3
docker build -t m3:lite --target lite .
docker run -d --name m3-server m3:lite tail -f /dev/null
</td> <td width="50%">

BigQuery:

git clone https://github.com/rafiattrach/m3.git && cd m3
docker build -t m3:bigquery --target bigquery .
docker run -d --name m3-server \
  -e M3_BACKEND=bigquery \
  -e M3_PROJECT_ID=your-project-id \
  -v $HOME/.config/gcloud:/root/.config/gcloud:ro \
  m3:bigquery tail -f /dev/null
</td> </tr> </table>

MCP config (same for both):

{
  "mcpServers": {
    "m3": {
      "command": "docker",
      "args": ["exec", "-i", "m3-server", "python", "-m", "m3.mcp_server"]
    }
  }
}

Stop: docker stop m3-server && docker rm m3-server

pip Install + CLI Tools

pip install m3-mcp

💡 CLI commands: Run m3 --help to see all available options.

Useful CLI commands:

  • m3 init mimic-iv-demo - Download demo database
  • m3 config - Generate MCP configuration interactively
  • m3 config claude --backend bigquery --project-id YOUR_PROJECT_ID - Quick BigQuery setup

Example MCP config:

{
  "mcpServers": {
    "m3": {
      "command": "m3-mcp-server",
      "env": {
        "M3_BACKEND": "duckdb"
      }
    }
  }
}
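If you hand-edit config JSON like the examples above, a quick stdlib sanity check can save a restart cycle. This is a generic sketch (not an M3 command); `check_mcp_config` is a hypothetical helper:

```python
import json

def check_mcp_config(text: str) -> list[str]:
    """Parse an MCP client config string and return launchable server names."""
    config = json.loads(text)  # raises ValueError on malformed JSON
    servers = config.get("mcpServers", {})
    # Each server entry needs at least a "command" to launch.
    return [name for name, spec in servers.items() if "command" in spec]
```

For example, `check_mcp_config(open("claude_desktop_config.json").read())` should return `["m3"]` if the server entry is present and launchable.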

Local Development

For contributors:

git clone https://github.com/rafiattrach/m3.git && cd m3
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install

MCP config:

{
  "mcpServers": {
    "m3": {
      "command": "/path/to/m3/.venv/bin/python",
      "args": ["-m", "m3.mcp_server"],
      "cwd": "/path/to/m3",
      "env": {
        "M3_BACKEND": "duckdb"
      }
    }
  }
}

Using UV (Recommended)

Assuming you already have uv installed.

Step 1: Clone and Navigate

# Clone the repository
git clone https://github.com/rafiattrach/m3.git
cd m3

Step 2: Create UV Virtual Environment

# Create virtual environment
uv venv

Step 3: Install M3

uv sync
# Prefix subsequent commands with `uv run` to ensure they use the uv virtual environment

🗄️ Database Configuration

After installation, choose your data source:

Option A: Local Demo (DuckDB + Parquet)

Perfect for learning and development - completely free!

  1. Initialize demo dataset:

    m3 init mimic-iv-demo
    
  2. Setup MCP Client:

    m3 config
    

    Alternative: For Claude Desktop specifically:

    m3 config claude --backend duckdb --db-path /Users/you/path/to/m3_data/databases/mimic_iv_demo.duckdb
    
  3. Restart your MCP client and ask:

    • "What tools do you have for MIMIC-IV data?"
    • "Show me patient demographics from the ICU"

Option B: Local Full Dataset (DuckDB + Parquet)

Run the entire MIMIC-IV dataset locally with DuckDB views over Parquet.

  1. Acquire CSVs (requires PhysioNet credentials):

    • Download the official MIMIC-IV CSVs from PhysioNet and place them under:
      • /Users/you/path/to/m3/m3_data/raw_files/mimic-iv-full/hosp/
      • /Users/you/path/to/m3/m3_data/raw_files/mimic-iv-full/icu/
    • Note: m3 init's auto-download function currently only supports the demo dataset. Use your browser or wget to obtain the full dataset.
  2. Initialize full dataset:

    m3 init mimic-iv-full
    
    • This may take up to 30 minutes depending on your system (e.g. about 10 minutes on a MacBook Pro M3)
    • Performance knobs (optional):
      export M3_CONVERT_MAX_WORKERS=6   # number of parallel files (default=4)
      export M3_DUCKDB_MEM=4GB          # DuckDB memory limit per worker (default=3GB)
      export M3_DUCKDB_THREADS=4        # DuckDB threads per worker (default=2)
      
      Check your system specifications before raising these, especially whether you have enough memory.
  3. Select dataset and verify:

    m3 use full   # optional; init already sets the active dataset to full
    m3 status
    
    • Status prints active dataset, local DB path, Parquet presence, quick row counts and total Parquet size.
  4. Configure MCP client (uses the full local DB):

    m3 config
    # or
    m3 config claude --backend duckdb --db-path /Users/you/path/to/m3/m3_data/databases/mimic_iv_full.duckdb
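The performance knobs from step 2 can be mirrored in a small helper. This sketch only reflects the variable names and defaults quoted above; how m3 consumes them internally is an assumption, and `convert_settings` is a hypothetical name:

```python
import os

def convert_settings() -> dict:
    """Read the documented conversion knobs, falling back to README defaults."""
    return {
        "max_workers": int(os.environ.get("M3_CONVERT_MAX_WORKERS", "4")),
        "duckdb_mem": os.environ.get("M3_DUCKDB_MEM", "3GB"),
        "duckdb_threads": int(os.environ.get("M3_DUCKDB_THREADS", "2")),
    }
```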
    

Option C: BigQuery (Full Dataset)

For researchers who need the complete MIMIC-IV dataset.

Prerequisites
  • Google Cloud account and project with billing enabled
  • Access to MIMIC-IV on BigQuery (requires PhysioNet credentialing)
Setup Steps
  1. Install Google Cloud CLI:

    macOS (with Homebrew):

    brew install google-cloud-sdk
    

    **Windows/Linux**: follow the install instructions at https://cloud.google.com/sdk/docs/install
