Reindexer

Reindexer is an embeddable, in-memory, document-oriented database with a high-level Query builder interface.

Reindexer's goal is to provide fast search with complex queries. We at Restream weren't happy with Elasticsearch and created Reindexer as a more performant alternative.

The core is written in C++ and the application-level API is in Go.

This document describes the Go connector and its API. For information about the Reindexer server and its HTTP API, refer to the Reindexer documentation.

Features
Usage
- SQL compatible interface
Installation
- Installation for server mode
  - Official docker image
- Installation for embedded mode
Advanced Usage
Events subscription
Logging, debug, profiling and tracing
Integration with other program languages
Limitations and known issues
Getting help
References

Features

Key features:

Sortable indices
Aggregation queries
Indices on array fields
Complex primary keys
Composite indices
Join operations
Full-text search
Up to 256 indexes (255 user indexes + 1 internal index) per namespace
ORM-like query interface
SQL queries

Performance

Performance has been our top priority from the start, and we think we managed to get it pretty good. Benchmarks show that Reindexer's performance is on par with a typical key-value database. On a single CPU core, we get:

up to 500K queries/sec for queries SELECT * FROM items WHERE id='?'
up to 50K queries/sec for queries SELECT * FROM items WHERE year > 2010 AND name = 'string' AND id IN (....)
up to 20K queries/sec for queries SELECT * FROM items WHERE year > 2010 AND name = 'string' JOIN subitems ON ...

See benchmarking results and more details in the benchmarking repo.

Memory Consumption

Reindexer aims to consume as little memory as possible; most queries are processed without any memory allocation at all.

To achieve that, several optimizations are employed, both on the C++ and Go level:

Documents and indices are stored in dense binary C++ structs, so they don't impose any load on Go's garbage collector.
String duplicates are merged.
Memory overhead is about 32 bytes per document plus ≈4-16 bytes per search index.
There is an object cache on the Go level for deserialized documents produced after query execution. Future queries use pre-deserialized documents, which cuts repeated deserialization and allocation costs.
The Query interface uses sync.Pool for reusing internal structures and buffers. The combination of these technologies allows Reindexer to handle most queries without any allocations.
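
The buffer-reuse idea behind sync.Pool can be sketched with the standard library alone. This is a minimal illustration, not Reindexer's internal code: `bufPool` and `encodeCondition` are hypothetical names introduced here, and the "field op value" text format is made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable byte buffers, so encoding a condition
// does not have to allocate a fresh buffer on every call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeCondition is a hypothetical helper (not part of Reindexer).
// It formats one "field op value" condition using a pooled buffer.
func encodeCondition(field, op string, value int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from an earlier use
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "%s %s %d", field, op, value)
	// String copies the bytes, so the buffer can safely go back to the pool.
	return buf.String()
}

func main() {
	fmt.Println(encodeCondition("year", ">", 2010))
	fmt.Println(encodeCondition("id", "=", 42))
}
```

The pool only removes the buffer allocation. The returned string is still a new allocation in this sketch, which is why the claim above is about most queries rather than all of them.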

Full text search

Reindexer has an internal full-text search engine. Full-text search usage documentation and examples are here

Vector indexes (ANN/KNN)

Reindexer has an internal k-nearest neighbors search engine. Its usage documentation and examples are here

Hybrid search

Reindexer has an internal hybrid full-text and k-nearest neighbors search engine. Its usage documentation and examples are here

Disk Storage

Reindexer can store documents to and load documents from disk via LevelDB. Documents are written to the storage backend asynchronously, in large batches, automatically in the background.

When a namespace is created, all its documents are stored in RAM, so queries on these documents run entirely in memory.
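
To make the asynchronous, batched write pattern concrete, here is a small standard-library sketch. It is not how Reindexer's C++ storage layer is implemented: `batchWriter`, `newBatchWriter`, and the `flush` callback are hypothetical names, and a real backend would write each batch to LevelDB instead of collecting it in memory.

```go
package main

import "fmt"

// batchWriter is a hypothetical type: it queues documents on a channel and
// passes them to flush in batches from a single background goroutine.
type batchWriter struct {
	in   chan string
	done chan struct{}
}

func newBatchWriter(batchSize int, flush func([]string)) *batchWriter {
	w := &batchWriter{
		in:   make(chan string, batchSize),
		done: make(chan struct{}),
	}
	go func() {
		defer close(w.done)
		batch := make([]string, 0, batchSize)
		for doc := range w.in {
			batch = append(batch, doc)
			if len(batch) == batchSize {
				flush(batch)
				// Start a new slice because flush may keep the old one.
				batch = make([]string, 0, batchSize)
			}
		}
		if len(batch) > 0 {
			flush(batch) // write out the final partial batch
		}
	}()
	return w
}

// Write returns as soon as the document is queued.
// It blocks only while the queue is full.
func (w *batchWriter) Write(doc string) { w.in <- doc }

// Close flushes what is still queued and waits for the goroutine to exit.
func (w *batchWriter) Close() {
	close(w.in)
	<-w.done
}

func main() {
	var batches [][]string
	w := newBatchWriter(3, func(b []string) { batches = append(batches, b) })
	for i := 1; i <= 7; i++ {
		w.Write(fmt.Sprintf("doc-%d", i))
	}
	w.Close() // after Close returns, reading batches is safe
	for _, b := range batches {
		fmt.Println(len(b), b)
	}
}
```

With a batch size of 3 and 7 documents, the flush callback runs three times, with 3, 3, and 1 documents. The caller never waits for a flush unless the queue fills up, which is the point of writing in the background.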

Replication

Reindexer supports synchronous and asynchronous replication. See the replication documentation here

Sharding

Reindexer has some basic support for sharding. See the sharding documentation here

Usage

Here is a complete example of basic Reindexer usage:

package main

// Import package
import (
	"fmt"
	"math/rand"

	"github.com/restream/reindexer/v5"
	// choose how Reindexer binds to the app (in this case "builtin", which means Reindexer is linked as a static library)
	_ "github.com/restream/reindexer/v5/bindings/builtin"

	// OR use Reindexer as standalone server and connect to it via TCP or unix domain socket (if available).
	// _ "github.com/restream/reindexer/v5/bindings/cproto"

	// OR link Reindexer as static library with bundled server.
	// _ "github.com/restream/reindexer/v5/bindings/builtinserver"
	// "github.com/restream/reindexer/v5/bindings/builtinserver/config"

)

// Define a struct with reindex tags. Fields must be exported: unexported fields cannot be written to Reindexer
type Item struct {
	ID       int64  `reindex:"id,,pk"`    // 'id' is primary key
	Name     string `reindex:"name"`      // add index by 'name' field
	Articles []int  `reindex:"articles"`  // add index by 'articles' array
	Year     int    `reindex:"year,tree"` // add sortable index by 'year' field
}

func main() {
	// Init a database instance and choose the binding (builtin)
	db, err := reindexer.NewReindex("builtin:///tmp/reindex/testdb")

	// OR - Init a database instance and choose the binding (connect to server via TCP sockets)
	// Database should be created explicitly via reindexer_tool or via WithCreateDBIfM
