Memflux
A high-performance, multi-model, asynchronous, in-memory database written in Rust. It combines the speed and API style of a Redis-like key-value store with built-in SQL and Cypher query engines, making it a versatile hybrid database.
MemFlux - In-Memory Multi-Model Database with SQL
:warning: Alpha Phase Notice :warning:
This project is in an incredibly early, pre-alpha stage of development. It is currently a solo project. As such, you should expect:
- Bugs and Instability: The server may crash, commands might not work as expected, and data could be corrupted.
- Inconsistencies: The API and feature set are subject to frequent and breaking changes without notice.
- Incomplete Features: Many SQL features and Redis commands are either missing or only partially implemented.
This is not ready for any form of production use. It is a learning and experimentation project. Contributions are very welcome!
What is MemFlux?
MemFlux is an experimental, high-performance, in-memory, multi-model database engine built in Rust. It aims to blend the speed and simplicity of key-value stores like Redis with the power and flexibility of SQL databases.
MemFlux is designed with a dual-purpose architecture: it can run as a standalone server (compatible with the Redis protocol) or be embedded directly into your applications as a library via a C-compatible FFI. This allows MemFlux to function as both a fast, networked database and an in-process database for languages like Python, C++, and more.
Core Features:
- Multi-Model Data: Natively supports:
  - Bytes/Strings: Classic key-value operations.
  - JSON Documents: Rich, schemaless JSON manipulation at the key or sub-path level.
  - Lists & Sets: Redis-compatible list and set operations.
  - Property Graph: A complete property graph model with nodes (labels, properties) and relationships (types, properties).
  - Low-Level Table/Row Commands: A direct, Redis-style command interface (TABLE.CREATE, ROW.SET, etc.) for manipulating tabular data, complementing the SQL engine.
- Integrated Query Engines:
  - SQL Query Engine: A feature-rich SQL engine for querying JSON and tabular data. Supports complex SELECTs, JOINs, CTEs (WITH RECURSIVE), DML, DDL, and advanced constraints.
  - Cypher Query Engine: A powerful, from-scratch engine for querying the property graph, supporting MATCH, CREATE, MERGE, RETURN, DELETE, SET, path variables, variable-length traversals (-[:KNOWS*1..3]->), and functions like shortestPath().
- Seamless SQL & Graph Interoperability:
  - Query graph data with SQL: Graph nodes and relationships are automatically exposed as virtual SQL tables.
  - Query SQL data with Cypher: SQL tables with foreign keys can be traversed as if they were graph nodes and relationships.
  - GRAPH_MATCH in SQL: Embed Cypher queries directly inside your SQL FROM clause to perform hybrid queries that join graph results with SQL tables.
- Transactional Integrity with MVCC: Provides ACID-like properties with Snapshot Isolation using a Multi-Version Concurrency Control (MVCC) architecture. This allows for non-blocking reads and safe, concurrent writes across all data models.
- Dual-Mode Operation:
  - Standalone Server: Run as a TCP server with a Redis-compatible (RESP) protocol.
  - Embedded Library: Integrate directly into your application via a C-compatible Foreign Function Interface (FFI) for in-process database operations with no network round trip.
- Configurable Durability & Persistence: Persistence can be enabled or disabled. When enabled, durability is achieved through a Write-Ahead Log (WAL) and periodic snapshotting, with configurable durability levels (fsync, full).
- Secondary Indexing: Create indexes on JSON fields to dramatically accelerate SQL query performance.
- Configurable Memory Management: Set a maxmemory limit and choose from multiple eviction policies (LRU, LFU, ARC, LFRU, Random) to control memory usage.
- TLS Encryption: Secure client connections with TLS when running in server mode.
Note: As this is an alpha project, many of the features listed above are still under heavy development and may be incomplete or unstable.
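To make the memory-management feature more concrete, here is a minimal Python sketch of how an LRU eviction policy under a byte limit can work in general. It is an illustration only, not MemFlux's implementation: the class name LRUStore, the max_bytes parameter, and the byte-counting rule are all assumptions made for this example.

```python
from collections import OrderedDict


class LRUStore:
    """Hypothetical toy key-value store with a byte budget and LRU eviction."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a read marks the key as recently used
        return self.items[key]

    def set(self, key, value: bytes):
        if key in self.items:
            self.used -= len(self.items.pop(key))
        self.items[key] = value
        self.used += len(value)
        # Evict from the least recently used end until we fit the budget.
        # A single value larger than the budget ends up evicting itself.
        while self.used > self.max_bytes and self.items:
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)


store = LRUStore(max_bytes=10)
store.set("a", b"aaaa")
store.set("b", b"bbbb")
store.get("a")           # touch "a", so "b" is now the oldest entry
store.set("c", b"cccc")  # 12 bytes > 10, so "b" is evicted
print(list(store.items))  # ['a', 'c']
```

Other policies from the list above (LFU, ARC, and so on) differ mainly in how they choose which entry to evict; the budget check stays the same.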
Full Documentation
While this README provides a quick start, the complete documentation contains a detailed reference for every command, SQL feature, and internal system.
---> Start with the Documentation Index <---
Key sections include:
- Configuration: How to configure the server, including memory limits, persistence, and TLS.
- Python Library Guide: A guide for using MemFlux as an embedded library in Python.
- Commands: Detailed reference for all non-SQL, Redis-style commands.
- SQL Reference: A comprehensive guide to the SQL engine, from DDL to complex SELECT queries.
How to Use It
MemFlux can be used in two primary ways: as a standalone server or as an embedded library.
1. As a Standalone Server
In this mode, MemFlux runs as a background process and accepts client connections over the network using the Redis (RESP) protocol.
Running the Server:
- Build the server:
    cargo build --release
- Run the server binary:
    ./target/release/memflux-server
The server will start and listen on 127.0.0.1:8360.
Connecting to the Server: You can connect using any Redis-compatible client. For interactive use, the included Python script is recommended:
# Run the interactive client
python3 test.py
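If you prefer to see what goes over the wire, the following Python sketch shows how a client encodes a command in RESP, the Redis serialization protocol, as an array of bulk strings. The encoding itself follows the RESP format; the helper names and the SET command used in the demo are assumptions for illustration, and send_command is a simplification that reads a single chunk instead of parsing the full reply.

```python
import socket


def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def send_command(*args: str, host: str = "127.0.0.1", port: int = 8360) -> bytes:
    """Hypothetical helper: send one command to a running server.

    Returns the raw bytes of the first chunk of the reply, undecoded.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(encode_resp_command(*args))
        return sock.recv(4096)


# Encoding needs no running server:
print(encode_resp_command("SET", "greeting", "hello"))
# b'*3\r\n$3\r\nSET\r\n$8\r\ngreeting\r\n$5\r\nhello\r\n'
```

In practice an existing Redis client library handles this framing for you; the sketch is only meant to show what "Redis-compatible" means at the protocol level.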
2. As an Embedded Library (via FFI)
In this mode, the database engine is loaded directly into your application's process, eliminating network overhead and providing direct, high-performance access.
The primary interface for this is the Python library included in libs/python/.
Using the Python Library:
- Build the dynamic library:
    cargo build --release
- Use the memflux Python module in your script:
    import libs.python.memflux as memflux  # Provided Python lib, relative to the project root
    import sys

    # Path to the compiled shared library
    if sys.platform == "win32":
        LIB_PATH = "./target/release/memflux.dll"
    elif sys.platform == "darwin":
        LIB_PATH = "./target/release/libmemflux.dylib"
    else:
        LIB_PATH = "./target/release/libmemflux.so"

    # Configuration for the database instance
    DB_CONFIG = {
        "persistence": True,
        "durability": "fsync",
        "wal_file": "memflux.wal",
        "wal_overflow_file": "memflux.wal.overflow",
        "snapshot_file": "memflux.snapshot",
        "snapshot_temp_file": "memflux.snapshot.tmp",
        "wal_size_threshold_mb": 128,
        "maxmemory_mb": 0,
        "eviction_policy": "lru",
        "isolation_level": "serializable",
    }

    # Connect to the database (loads it in-process)
    conn = memflux.connect(config=DB_CONFIG, lib=LIB_PATH)

    with conn.cursor() as cur:
        cur.execute("SQL CREATE TABLE products (id INT, name TEXT, price REAL)")
        cur.execute("SQL INSERT INTO products VALUES (?, ?, ?)", (1, 'Laptop', 1200.50))
        cur.execute("SQL SELECT name, price FROM products WHERE price > ?", (1000,))
        product = cur.fetchone()
        print(product)
        # Output: {'name': 'Laptop', 'price': 1200.5}

    conn.close()
For a more detailed guide, see the Python Library Guide.
How It Works (A Basic Overview)
Note: This is a simplified explanation of an alpha-stage project. The implementation details are subject to change.
- Core Library (src/lib.rs): The core database logic is encapsulated in a Rust library. This library manages the in-memory storage (DashMap), persistence, indexing, and the SQL query engine. It exposes a high-level MemFluxDB struct.
- Server Binary (src/main.rs): The memflux-server binary is a lightweight wrapper around the core library. It handles TCP connections and TLS, and uses the RESP protocol to parse client requests, which it passes to the MemFluxDB instance.
- FFI Layer (src/ffi.rs): A C-compatible Foreign Function Interface exposes the core library's functionality, allowing it to be loaded and used directly by other languages (for example through Python's ctypes module) for in-process execution.
- Persistence Engine (MVCC): To prevent data loss and enable transactions, MemFlux uses a Multi-Version Concurrency Control (MVCC) model combined with a Write-Ahead Log (WAL).
  - Versioning: Instead of overwriting data, every write operation creates a new version of a value, tagged with a transaction ID. Each key points to a chain of these versions.
  - WAL: Every data-modifying command is first serialized and written to the WAL file (memflux.wal) on disk, so that acknowledged writes can be recovered after a crash, subject to the configured durability level.
  - Snapshots & Compaction: When the WAL file grows too large, a non-blocking snapshot of the current state of the database is written to disk. A background vacuum process cleans up old, no-longer-visible data versions to reclaim memory.
  - Recovery: On startup, the server restores the last snapshot and replays any subsequent entries from the WAL to bring the database to a consistent, up-to-date state.
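The interplay of version chains, the log, vacuum, and recovery can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MemFlux's actual data structures: TinyMVCCStore and all of its methods are hypothetical names, the "WAL" is an in-memory list of JSON lines rather than a file, and on-disk snapshots are left out entirely.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Version:
    txid: int      # transaction that wrote this version
    value: object


@dataclass
class TinyMVCCStore:
    """Hypothetical toy store: per-key version chains plus an append-only log."""
    chains: dict = field(default_factory=dict)  # key -> list[Version], oldest first
    wal: list = field(default_factory=list)     # serialized log records
    next_txid: int = 1

    def write(self, key, value):
        txid = self.next_txid
        self.next_txid += 1
        # Log first, then apply: the log record is what recovery replays.
        self.wal.append(json.dumps({"txid": txid, "key": key, "value": value}))
        self._apply(txid, key, value)
        return txid

    def _apply(self, txid, key, value):
        self.chains.setdefault(key, []).append(Version(txid, value))

    def read(self, key, snapshot_txid):
        # A reader sees the newest version written at or before its snapshot.
        for version in reversed(self.chains.get(key, [])):
            if version.txid <= snapshot_txid:
                return version.value
        return None

    def vacuum(self, oldest_active_snapshot):
        # Keep the newest version the oldest reader can see, plus anything newer.
        for key, chain in self.chains.items():
            visible = [v for v in chain if v.txid <= oldest_active_snapshot]
            keep_from = len(visible) - 1 if visible else 0
            self.chains[key] = chain[keep_from:]

    @classmethod
    def recover(cls, wal_records):
        # Rebuild in-memory state by replaying the log in order.
        store = cls()
        for line in wal_records:
            rec = json.loads(line)
            store._apply(rec["txid"], rec["key"], rec["value"])
            store.next_txid = max(store.next_txid, rec["txid"] + 1)
        store.wal = list(wal_records)
        return store


store = TinyMVCCStore()
t1 = store.write("user:1", "alice")
t2 = store.write("user:1", "alicia")
print(store.read("user:1", snapshot_txid=t1))  # alice
print(store.read("user:1", snapshot_txid=t2))  # alicia

recovered = TinyMVCCStore.recover(store.wal)
print(recovered.read("user:1", snapshot_txid=t2))  # alicia
```

Readers never block writers in this model because a write only appends a new version; an older snapshot keeps seeing the version it started with until vacuum decides no reader can need it anymore.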
- SQL Query Engine Pipeline:
  - Parser: The raw SQL string is tokenized and parsed into an Abstract Syntax Tree (AST), which is a tree-like representation of the query structure.
  - Logical Planner: The AST is converted into a Logical Plan. This plan represents the "what" of the query
