Reindexer
Reindexer is an embeddable, in-memory, document-oriented database with a high-level Query builder interface.
Reindexer's goal is to provide fast search with complex queries. We at Restream weren't happy with Elasticsearch and created Reindexer as a more performant alternative.
The core is written in C++ and the application level API is in Go.
This document describes the Go connector and its API. For information about the Reindexer server and HTTP API, refer to the Reindexer documentation.
Table of contents:
- Features
- Usage
- Installation
- Advanced Usage
- Index Types and Their Capabilities
- Nested Structs
- Sort
- Functions
- Counting
- Text pattern search with LIKE condition
- Update queries
- Delete queries
- Truncate queries
- Transactions and batch update
- Join
- Subqueries (nested queries)
- Complex Primary Keys and Composite Indexes
- Aggregations
- Search in array fields
- Atomic on update functions
- Expire Data from Namespace by Setting TTL
- Direct JSON operations
- Using object cache
- Events subscription
- Logging, debug, profiling and tracing
- Integration with other program languages
- Limitations and known issues
- Getting help
- References
Features
Key features:
- Sortable indices
- Aggregation queries
- Indices on array fields
- Complex primary keys
- Composite indices
- Join operations
- Full-text search
- Up to 256 indexes (255 user indexes + 1 internal index) per namespace
- ORM-like query interface
- SQL queries
Performance
Performance has been our top priority from the start, and we think we managed to get it pretty good. Benchmarks show that Reindexer's performance is on par with a typical key-value database. On a single CPU core, we get:
- up to 500K queries/sec for queries
SELECT * FROM items WHERE id='?'
- up to 50K queries/sec for queries
SELECT * FROM items WHERE year > 2010 AND name = 'string' AND id IN (....)
- up to 20K queries/sec for queries
SELECT * FROM items WHERE year > 2010 AND name = 'string' JOIN subitems ON ...
See the benchmarking results and more details in the benchmarking repo.
Memory Consumption
Reindexer aims to consume as little memory as possible; most queries are processed without any memory allocation at all.
To achieve that, several optimizations are employed, both on the C++ and Go level:
- Documents and indices are stored in dense binary C++ structs, so they don't impose any load on Go's garbage collector.
- String duplicates are merged.
- Memory overhead is about 32 bytes per document + ≈4-16 bytes for each search index.
- There is an object cache on the Go level for deserialized documents produced after query execution. Future queries use pre-deserialized documents, which cuts repeated deserialization and allocation costs.
- The Query interface uses sync.Pool for reusing internal structures and buffers. The combination of these technologies allows Reindexer to handle most queries without any allocations.
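The buffer-reuse idea from the last point can be sketched with the standard library alone. This is a minimal illustration, not Reindexer's actual internals: `bufPool` and `encodeCondition` are hypothetical names introduced here, and the "field op value" encoding is made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable byte buffers. New runs only when the pool
// has no free buffer to return.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeCondition is a hypothetical helper (not part of Reindexer) that
// serializes a "field op value" condition using a pooled buffer.
func encodeCondition(field, op string, value int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from a previous use
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "%s %s %d", field, op, value)
	// String copies the bytes, so handing the buffer back to the pool is safe.
	return buf.String()
}

func main() {
	fmt.Println(encodeCondition("year", ">", 2010))
	fmt.Println(encodeCondition("id", "=", 42))
}
```

The second call can reuse the buffer released by the first one instead of allocating a new one, which is the general effect described above. The returned string is still allocated here; a real implementation would go further to avoid that.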
Full text search
Reindexer has an internal full-text search engine. Full-text search usage documentation and examples are here.
Vector indexes (ANN/KNN)
Reindexer has an internal k-nearest neighbors search engine. Usage documentation and examples for k-nearest neighbors search are here.
Hybrid search
Reindexer has an internal hybrid search engine that combines full-text and k-nearest neighbors search. Its usage documentation and examples are here.
Disk Storage
Reindexer can store documents to and load documents from disk via LevelDB. Documents are written to the storage backend asynchronously, in large batches, automatically in the background.
When a namespace is created, all its documents are loaded into RAM, so queries on these documents run entirely in memory.
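To make the idea of asynchronous, batched background writes concrete, here is a minimal standard-library sketch. It does not reflect how Reindexer or LevelDB implement this; `batchWriter`, `newBatchWriter`, the batch size, and the in-memory `flushed` slice standing in for disk are all assumptions made for illustration.

```go
package main

import "fmt"

// batchWriter is a hypothetical type (not Reindexer's API): it queues
// documents on a channel and flushes them in batches from a goroutine.
type batchWriter struct {
	in      chan string
	done    chan struct{}
	flushed [][]string // stands in for the on-disk storage backend
}

func newBatchWriter(batchSize int) *batchWriter {
	w := &batchWriter{in: make(chan string, 64), done: make(chan struct{})}
	go func() {
		defer close(w.done)
		batch := make([]string, 0, batchSize)
		for doc := range w.in {
			batch = append(batch, doc)
			if len(batch) == batchSize {
				w.flushed = append(w.flushed, batch)
				batch = make([]string, 0, batchSize)
			}
		}
		if len(batch) > 0 {
			// Write the final partial batch on shutdown.
			w.flushed = append(w.flushed, batch)
		}
	}()
	return w
}

// Write enqueues a document; it blocks only if the queue is full.
func (w *batchWriter) Write(doc string) { w.in <- doc }

// Close stops accepting documents, waits for the background goroutine
// to finish, and returns the batches that were flushed.
func (w *batchWriter) Close() [][]string {
	close(w.in)
	<-w.done
	return w.flushed
}

func main() {
	w := newBatchWriter(2)
	for _, doc := range []string{"a", "b", "c"} {
		w.Write(doc)
	}
	batches := w.Close()
	fmt.Println(len(batches), batches)
}
```

The caller never waits for a flush, only for queue space, so write latency stays decoupled from storage speed. The trade-off is that documents still in the queue are not yet durable.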
Replication
Reindexer supports synchronous and asynchronous replication. See the replication documentation here.
Sharding
Reindexer has basic support for sharding. See the sharding documentation here.
Usage
Here is a complete example of basic Reindexer usage:
package main

// Import package
import (
	"fmt"
	"math/rand"

	"github.com/restream/reindexer/v5"
	// choose how the Reindexer binds to the app (in this case "builtin," which means link Reindexer as a static library)
	_ "github.com/restream/reindexer/v5/bindings/builtin"

	// OR use Reindexer as standalone server and connect to it via TCP or unix domain socket (if available).
	// _ "github.com/restream/reindexer/v5/bindings/cproto"

	// OR link Reindexer as static library with bundled server.
	// _ "github.com/restream/reindexer/v5/bindings/builtinserver"
	// "github.com/restream/reindexer/v5/bindings/builtinserver/config"
)
// Define a struct with reindex tags. Fields must be exported: private fields cannot be written into Reindexer
type Item struct {
	ID       int64  `reindex:"id,,pk"`    // 'id' is the primary key
	Name     string `reindex:"name"`      // add index by 'name' field
	Articles []int  `reindex:"articles"`  // add index by 'articles' array
	Year     int    `reindex:"year,tree"` // add sortable index by 'year' field
}
func main() {
	// Init a database instance and choose the binding (builtin)
	db, err := reindexer.NewReindex("builtin:///tmp/reindex/testdb")
	// OR - Init a database instance and choose the binding (connect to server via TCP sockets)
	// Database should be created explicitly via reindexer_tool or via WithCreateDBIfM
