# XLR8
High-performance read-acceleration layer for MongoDB. XLR8 decomposes large range queries into parallel chunks and executes them with a memory-bounded execution model, using a Rust backend for CPU-intensive processing. It streams compressed Parquet output for analytics and data-lake ingestion while keeping the familiar PyMongo API.
## Minimal Code Changes

```python
# Before: PyMongo
df = pd.DataFrame(collection.find(query))

# After: XLR8 - just wrap and go!
xlr8_collection = accelerate(collection, schema, mongodb_uri)  # mongodb_uri: Union(str, callback)
df = xlr8_collection.find(query).to_dataframe()
```

That's it. Same query syntax, same DataFrame output, just faster.
## The Problem

When running analytical queries over large MongoDB collections, you encounter two fundamental bottlenecks:
```mermaid
flowchart LR
    subgraph Bottleneck1["I/O Bottleneck"]
        A1[Python] -->|"Single cursor"| B1[MongoDB]
        B1 -->|"Network RTT"| C1[Wait...]
        C1 -->|"Next batch"| A1
    end
    subgraph Bottleneck2["CPU Bottleneck"]
        A2[Python GIL] -->|"Holds lock"| B2[BSON decode]
        B2 -->|"Still locked"| C2[Build dict]
        C2 -->|"Still locked"| D2[Next doc]
    end
```
**I/O bound:** PyMongo uses a single cursor, fetching documents one batch at a time. Your CPU sits idle waiting on network round trips.

**CPU/GIL bound:** Even with the data in hand, Python's Global Interpreter Lock (GIL) means BSON decoding and DataFrame construction happen on a single core.

These aren't PyMongo limitations; they're inherent to Python's design. XLR8 provides a way around both.
## How XLR8 Solves It
```mermaid
flowchart LR
    subgraph Solution["XLR8: Rust Backend (GIL-Free) + Tokio Async + Cache-First"]
        direction LR
        Q["Your Query<br/>cursor.to_dataframe(...)"] --> PLAN["Execution plan<br/>chunking + worker count + RAM budget"]
        PLAN --> GIL["Python releases GIL<br/>(py.allow_threads)"]
        GIL --> RT["Rust Backend<br/>Tokio async runtime"]
        RT --> W1["Worker 1<br/>async fetch + BSON→Arrow"]
        RT --> W2["Worker 2<br/>async fetch + BSON→Arrow"]
        RT --> W3["Worker 3<br/>async fetch + BSON→Arrow"]
        RT --> WN["Worker N<br/>async fetch + BSON→Arrow"]
        W1 --> M1{"RAM limit reached?<br/>flush_ram_limit_mb"}
        W2 --> M2{"RAM limit reached?<br/>flush_ram_limit_mb"}
        W3 --> M3{"RAM limit reached?<br/>flush_ram_limit_mb"}
        WN --> MN{"RAM limit reached?<br/>flush_ram_limit_mb"}
        M1 -->|flush| C1["Write Parquet shard<br/>.cache/<hash>/part_0001.parquet"]
        M2 -->|flush| C2["Write Parquet shard<br/>.cache/<hash>/part_0002.parquet"]
        M3 -->|flush| C3["Write Parquet shard<br/>.cache/<hash>/part_0003.parquet"]
        MN -->|flush| CN["Write Parquet shard<br/>.cache/<hash>/part_00NN.parquet"]
        C1 --> READ["Read shards (Arrow/DuckDB)"]
        C2 --> READ
        C3 --> READ
        CN --> READ
        READ --> DF["Assemble final DataFrame"]
    end
```
XLR8 releases Python's GIL and hands execution to a Rust backend powered by Tokio's async runtime. Multiple workers fetch from MongoDB in parallel, convert BSON to Arrow, and write Parquet shards, all without touching the GIL.

The result? Your analytical queries run up to ~4x faster, especially for large result sets.
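The chunking step in the execution plan can be sketched in plain Python. This is an illustrative reconstruction, not XLR8's actual planner: it splits a time-range filter into `chunking_granularity`-sized sub-ranges, each of which a worker can fetch independently.

```python
from datetime import datetime, timedelta, timezone

def plan_chunks(start, end, granularity):
    """Split [start, end) into contiguous sub-ranges of at most `granularity`.

    Each sub-range becomes an independent MongoDB range filter that one
    worker can fetch in parallel with the others.
    """
    chunks = []
    lo = start
    while lo < end:
        hi = min(lo + granularity, end)
        chunks.append({"timestamp": {"$gte": lo, "$lt": hi}})
        lo = hi
    return chunks

start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 2, 1, tzinfo=timezone.utc)
chunks = plan_chunks(start, end, timedelta(days=7))
# 31 days at 7-day granularity -> 5 chunks (the last one covers 3 days)
print(len(chunks))  # 5
```

Because the sub-ranges are disjoint and cover the full interval, the per-worker results can be concatenated without deduplication.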
## Installation

```bash
pip install xlr8
```

XLR8 requires Python 3.11+ and includes pre-compiled Rust extensions.
## Quick Start
```python
from pymongo import MongoClient
from xlr8 import accelerate, Schema, Types
from datetime import datetime, timezone, timedelta
from bson import ObjectId

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017")
collection = client["iot"]["sensor_readings"]

# Define your schema
schema = Schema(
    time_field="timestamp",
    fields={
        "timestamp": Types.Timestamp("ms", tz="UTC"),
        "device_id": Types.ObjectId(),
        "reading": Types.Any(),  # Handles int, float, string dynamically
    },
    avg_doc_size_bytes=200,
)

# Wrap the collection with XLR8
xlr8_col = accelerate(collection, schema=schema, mongo_uri="mongodb://localhost:27017")

# Query like normal PyMongo
cursor = xlr8_col.find({
    "device_id": ObjectId("507f1f77bcf86cd799439011"),
    "timestamp": {"$gte": datetime(2024, 1, 1, tzinfo=timezone.utc),
                  "$lt": datetime(2024, 6, 1, tzinfo=timezone.utc)},
}).sort("timestamp", 1)

# Get a DataFrame - parallel fetch, cached for reuse
df = cursor.to_dataframe(
    chunking_granularity=timedelta(days=7),
    max_workers=8,
)
```
## Key Features
<table>
<tr>
<td width="50%" valign="top">

**🦀 GIL-Free Rust Backend**

Python's GIL is released via `py.allow_threads()`. Rust's Tokio runtime handles async I/O and CPU-intensive work across all cores.

</td>
<td width="50%" valign="top">

**⚡ Parallel MongoDB Fetching**

Queries are split into time-based chunks. Each worker maintains its own MongoDB connection, fetching in parallel.

</td>
</tr>
<tr>
<td width="50%" valign="top">

**💾 Query-Aware Cache**

Shards are stored in a per-query-hash folder, and cursors can be given a start and end date to read only the matching slice of the cache.

</td>
<td width="50%" valign="top">

**🧩 Parallel `$or` Brackets**

`$or` queries are automatically split into independent "brackets" that execute in parallel: each branch becomes its own bracket, while shared filters are kept as global constraints. `$in` stays intact within each bracket, since MongoDB handles it efficiently with index scans. Before execution, XLR8 builds a plan that detects overlapping brackets (cases where multiple brackets could match the same document) and keeps results correct and deterministic; this behavior is covered by extensive tests to prevent duplicate or missing rows.

</td>
</tr>
<tr>
<td width="50%" valign="top">

**🔀 DuckDB K-Way Merge**

When sorting is required, DuckDB performs a GIL-free K-way merge across sorted shards, with O(N log K) complexity.

</td>
<td width="50%" valign="top">

**🐻❄️ Pandas & Polars Support**

`to_dataframe()` returns pandas. `to_polars()` returns native Polars. Choose based on your downstream analytics.

</td>
</tr>
<tr>
<td width="50%" valign="top">

**📊 Memory-Controlled Execution**

Set `flush_ram_limit_mb` to control RAM per worker. Process large datasets without OOM errors.

</td>
<td width="50%" valign="top">

**📤 Stream to Data Lakes**

`stream_to_callback()` partitions data by time and custom fields, perfect for S3/GCS ingestion pipelines.

</td>
</tr>
</table>
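The `$or` bracket splitting described above can be modeled with a few lines of Python. This is a hedged sketch of the idea, not XLR8's internals: each `$or` branch is merged with the filters shared by all branches.

```python
def split_brackets(query: dict) -> list[dict]:
    """Illustrative sketch of $or "bracket" splitting (not XLR8's code).

    Each $or branch becomes its own bracket; top-level filters shared by
    all branches are applied as global constraints on every bracket.
    """
    branches = query.get("$or")
    if not branches:
        return [query]  # no $or: the whole query is a single bracket
    shared = {k: v for k, v in query.items() if k != "$or"}
    return [{**shared, **branch} for branch in branches]

query = {
    "timestamp": {"$gte": "2024-01-01"},          # shared constraint
    "$or": [{"device_id": {"$in": ["a", "b"]}},   # $in stays intact
            {"site": "london"}],
}
brackets = split_brackets(query)
print(len(brackets))  # 2 brackets, each keeping the timestamp filter
```

A real planner must additionally detect overlapping brackets (two brackets matching the same document) to avoid duplicates, which is the correctness check the feature table mentions.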
## Cloud & Container Benefits

XLR8's architecture provides specific advantages in cloud environments:
```mermaid
flowchart TB
    subgraph Benefits["Compute savings"]
        direction LR
        subgraph Speed["Faster Queries"]
            S1[Parallel fetch] --> S2[Reduced container uptime]
            S2 --> S3[Lower cloud billable time]
        end
        subgraph Memory["Memory Control"]
            M1[Predictable memory usage]
            M1 --> M2[Smaller container instances]
        end
    end
```
| Benefit | How XLR8 Helps |
|---------|----------------|
| Reduced container runtime | Parallel execution finishes faster → lower billable seconds |
| Cache-first processing | Fetch once, process many times without hitting MongoDB |
| Smaller instances | Memory control via flush_ram_limit_mb allows smaller container sizes |
| Predictable costs | Consistent memory footprint = consistent billing |
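The memory control the table attributes to `flush_ram_limit_mb` can be modeled simply. The sketch below is an illustrative approximation, not XLR8's implementation: a worker buffers rows and flushes a shard whenever the estimated buffer size crosses the budget, so peak RAM stays bounded no matter how large the result set is.

```python
class BoundedBuffer:
    """Illustrative model of per-worker flushing (not XLR8's code).

    Rows accumulate in memory; once the estimated size crosses the
    budget, the buffer is flushed to a shard and emptied, keeping peak
    RAM per worker near flush_ram_limit_mb regardless of result size.
    """
    def __init__(self, flush_ram_limit_mb: int, avg_doc_size_bytes: int):
        self.limit_bytes = flush_ram_limit_mb * 1024 * 1024
        self.avg_doc_size_bytes = avg_doc_size_bytes
        self.rows = []
        self.flushes = 0

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) * self.avg_doc_size_bytes >= self.limit_bytes:
            self.flush()

    def flush(self):
        if self.rows:
            # A real worker writes .cache/<hash>/part_NNNN.parquet here.
            self.flushes += 1
            self.rows.clear()

buf = BoundedBuffer(flush_ram_limit_mb=1, avg_doc_size_bytes=200)
for i in range(10_000):  # ~2 MB of 200-byte docs against a 1 MB budget
    buf.add({"i": i})
buf.flush()  # flush the remainder
print(buf.flushes)  # 2
```

This is why a container can be sized to `max_workers * flush_ram_limit_mb` plus overhead rather than to the full dataset.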
## Performance Benchmarks

Real-world benchmarks comparing XLR8 against vanilla PyMongo + pandas on a production-like workload.
### Test Environment
| Component | Specification |
|-----------|---------------|
| MongoDB | Atlas M30 (General), GCP europe-west2 (London) |
| Compute | GCP Cloud Run Jobs, 8 vCPU / 32 GB RAM, europe-west2 |
| Dataset | Forex candlestick data, 27 currency pairs, ~54K docs/day |
| Query | Time-range filter + $in on 27 instruments |
### Methodology

- **PyMongo baseline:** stream the cursor → build DataFrames in 300k-row batches → `pd.concat()`
- **XLR8:** `cursor.to_dataframe(max_workers=14, chunking_granularity=timedelta(days=4), cache_read=False)`
- Each test runs sequentially to avoid database contention
### Results
| Period | Rows | PyMongo Time | XLR8 Time | Speedup |
|--------|-----:|-------------:|----------:|:-------:|
| 3 months | 4.8M | 89.5s | 31.1s | 2.9x |
| 6 months | 9.8M | 177.4s | 54.1s | 3.3x |
| 1 year | 19.7M | 371.2s | 109.3s | 3.4x |
| 1.5 years | 29.8M | 555.5s | 157.4s | 3.5x |
| 2 years | 39.7M | 760.7s | 204.0s | 3.7x |
| 2.5 years | 49.7M | 949.5s | 252.6s | 3.8x |
### Visualization

<p align="center"> <img src="https://raw.githubusercontent.com/XLR8-DB/xlr8/main/.github/benchmark_results.png" alt="XLR8 Benchmark Results" width="900"/> </p>

### Key Takeaways
- **Consistent 3-4x speedup** across all data sizes
- **Throughput:** XLR8 sustains ~180-195K rows/sec vs PyMongo's ~52-55K rows/sec
- **Scales well:** speedup improves with larger datasets as parallelism amortizes overhead
- **Memory bounded:** RAM stays within the configured `flush_ram_limit_mb` per worker, even for multi-year ranges