Kolibrie
A SPARQL database, RDF Stream Processing engine, and RDF toolkit supporting Neurosymbolic Stream Reasoning
<p align="center"> <img src="docs/logo/kolibrie.jfif" width="400" height="400" /> </p> [ English | Nederlands | Deutsch | Español | Français | 日本語 ]
Kolibrie is a high-performance, concurrent, and feature-rich SPARQL query engine implemented in Rust. Designed for scalability and efficiency, it leverages Rust's robust concurrency model and advanced optimizations, including SIMD (Single Instruction, Multiple Data) and parallel processing with Rayon, to handle large-scale RDF (Resource Description Framework) datasets seamlessly.
With a comprehensive API, Kolibrie facilitates parsing, storing, and querying RDF data using SPARQL, Turtle, and N3 formats. Its advanced filtering, aggregation, join operations, and sophisticated optimization strategies make it a suitable choice for applications requiring complex semantic data processing. Additionally, the integration of the Volcano Optimizer and Knowledge Graph capabilities empowers users to perform cost-effective query planning and leverage rule-based inference for enhanced data insights.
Research Context
Kolibrie is developed within the Stream Intelligence Lab at KU Leuven, under the supervision of Prof. Pieter Bonte. The Stream Intelligence Lab focuses on Stream Reasoning, an emerging research field that integrates logic-based techniques from artificial intelligence with data-driven machine learning approaches to derive timely and actionable insights from continuous data streams. Our research emphasizes applications in the Internet of Things (IoT) and Edge processing, enabling real-time decision-making in dynamic environments such as autonomous vehicles, robotics, and web analytics.
For more information about our research and ongoing projects, please visit the Stream Intelligence Lab website.
Features
- Efficient RDF Parsing: Supports parsing RDF/XML, Turtle, and N3 formats with robust error handling and prefix management.
- Concurrent Processing: Utilizes Rayon and Crossbeam for parallel data processing, ensuring optimal performance on multi-core systems.
- SIMD Optimizations: Implements SIMD instructions for accelerated query filtering and aggregation.
- Flexible Querying: Supports complex SPARQL queries, including SELECT, INSERT, FILTER, GROUP BY, and VALUES clauses.
- Volcano Optimizer: Incorporates a cost-based query optimizer based on the Volcano model to determine the most efficient execution plans.
- Reasoner: Provides robust support for building and querying knowledge graphs, including ABox (instance-level) and TBox (schema-level) assertions, dynamic rule-based inference, and backward chaining.
- Streaming and Sliding Windows: Handles timestamped triples and sliding window operations for time-based data analysis.
- Machine Learning Integration: Seamlessly integrates with Python ML frameworks through PyO3 bindings.
- Extensible Dictionary Encoding: Efficiently encodes and decodes RDF terms using a customizable dictionary.
- Comprehensive API: Offers a rich set of methods for data manipulation, querying, and result processing.
- Python Support: Usable from Python through PyO3 bindings.
[!WARNING] CUDA support is experimental and under development.
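The dictionary-encoding feature above maps each RDF term to a compact integer id so triples can be stored and compared as fixed-size ids rather than strings. The following is a minimal stdlib-only sketch of that idea; it is illustrative and not Kolibrie's actual implementation.

```rust
use std::collections::HashMap;

/// Minimal term dictionary: each distinct RDF term string is assigned
/// a stable u32 id, so a triple can be stored as (u32, u32, u32).
struct Dictionary {
    to_id: HashMap<String, u32>,
    to_term: Vec<String>,
}

impl Dictionary {
    fn new() -> Self {
        Dictionary { to_id: HashMap::new(), to_term: Vec::new() }
    }

    /// Return the id for `term`, inserting it on first sight.
    fn encode(&mut self, term: &str) -> u32 {
        if let Some(&id) = self.to_id.get(term) {
            return id;
        }
        let id = self.to_term.len() as u32;
        self.to_id.insert(term.to_string(), id);
        self.to_term.push(term.to_string());
        id
    }

    /// Look an id back up; None if the id was never issued.
    fn decode(&self, id: u32) -> Option<&str> {
        self.to_term.get(id as usize).map(|s| s.as_str())
    }
}

fn main() {
    let mut dict = Dictionary::new();
    let s = dict.encode("http://example.org/alice");
    let p = dict.encode("http://xmlns.com/foaf/0.1/name");
    // Re-encoding the same term yields the same id.
    assert_eq!(s, dict.encode("http://example.org/alice"));
    assert_eq!(dict.decode(p), Some("http://xmlns.com/foaf/0.1/name"));
}
```

Because equal terms get equal ids, joins and filters can compare integers instead of strings, which is also what makes SIMD-style batch comparisons practical.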
Installation
Native Installation
Ensure you have Rust installed (version 1.60 or higher).
Clone the repository:
git clone https://github.com/StreamIntelligenceLab/Kolibrie.git
cd Kolibrie
Build the project:
cargo build --release
Then, include it in your project:
use kolibrie::SparqlDatabase;
WebUI
To run the web UI:
cargo run --bin kolibrie-http-server
Then open http://localhost:8080 (or http://0.0.0.0:8080) in your browser.
Docker Installation
Kolibrie provides Docker support with multiple configurations optimized for different use cases. The Docker setup automatically handles all dependencies, including Rust, CUDA (for GPU builds), and the Python ML frameworks integrated into Kolibrie.
Prerequisites
- Docker installed
- Docker Compose installed
- For GPU support: NVIDIA Docker runtime installed
Quick Start with Docker Compose
Kolibrie offers three deployment profiles:
1. CPU Build (Default - Recommended for Most Users)
Runs with web UI on port 8080:
docker compose up --build
# or explicitly:
docker compose --profile cpu up --build
Access the web UI at http://localhost:8080
2. GPU Build (Requires NVIDIA GPU)
GPU-accelerated build with CUDA support:
docker compose --profile gpu up --build
Access the web UI at http://localhost:8080
3. Development Build
Interactive shell for development (auto-detects GPU):
docker compose --profile dev up --build
This drops you into a bash shell with full access to Kolibrie tools.
Running Without Docker Compose
If you prefer using Docker directly:
CPU Build with Web UI:
Build:
docker build \
--build-arg GPU_VENDOR=none \
--build-arg ENABLE_WEB_UI=true \
-t kolibrie:cpu \
.
Run:
docker run -d \
--name kolibrie \
-p 8080:8080 \
-v $(pwd)/data:/app/data \
-v $(pwd)/models:/app/ml/examples/models \
kolibrie:cpu
GPU Build with Web UI:
Build:
docker build \
--build-arg GPU_VENDOR=nvidia \
--build-arg CUDA_VERSION=11.8 \
--build-arg BASE_IMAGE=nvidia/cuda:11.8-devel-ubuntu22.04 \
--build-arg ENABLE_WEB_UI=true \
-t kolibrie:gpu \
.
Run:
docker run -d \
--name kolibrie-gpu \
--gpus all \
-p 8080:8080 \
-v $(pwd)/data:/app/data \
-v $(pwd)/models:/app/ml/examples/models \
kolibrie:gpu
Development Build (Shell Access):
Build:
docker build \
--build-arg GPU_VENDOR=none \
--build-arg ENABLE_WEB_UI=false \
-t kolibrie:dev \
.
Run:
docker run -it \
--name kolibrie-dev \
-v $(pwd):/app \
kolibrie:dev \
bash
For GPU-enabled development shell:
docker run -it \
--name kolibrie-gpu-dev \
--gpus all \
-v $(pwd):/app \
kolibrie:gpu \
bash
Usage
Initializing the Database
Create a new instance of the SparqlDatabase:
use kolibrie::SparqlDatabase;
fn main() {
let mut db = SparqlDatabase::new();
// Your code here
}
Parsing RDF Data
Kolibrie supports parsing RDF data from files or strings in various formats.
Parsing RDF/XML from a File
db.parse_rdf_from_file("data.rdf");
Parsing RDF/XML from a String
let rdf_data = r#"
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<rdf:Description rdf:about="http://example.org/alice">
<foaf:name>Alice</foaf:name>
<foaf:age>30</foaf:age>
</rdf:Description>
</rdf:RDF>
"#;
db.parse_rdf(rdf_data);
Parsing Turtle Data from a String
let turtle_data = r#"
@prefix ex: <http://example.org/> .
ex:Alice ex:knows ex:Bob .
ex:Bob ex:knows ex:Charlie .
"#;
db.parse_turtle(turtle_data);
Parsing N3 Data from a String
let n3_data = r#"
@prefix ex: <http://example.org/> .
ex:Alice ex:knows ex:Bob .
ex:Bob ex:knows ex:Charlie .
"#;
db.parse_n3(n3_data);
Parsing N-Triples from a String
let ntriples_data = r#"
<http://example.org/john> <http://example.org/hasFriend> <http://example.org/jane> .
<http://example.org/jane> <http://example.org/name> "Jane Doe" .
<http://example.org/john> <http://example.org/age> "30"^^<http://www.w3.org/2001/XMLSchema#integer> .
"#;
db.parse_ntriples_and_add(ntriples_data);
Adding Triples Programmatically
Add individual triples directly to the database:
db.add_triple_parts(
"http://example.org/alice",
"http://xmlns.com/foaf/0.1/name",
"Alice"
);
db.add_triple_parts(
"http://example.org/alice",
"http://xmlns.com/foaf/0.1/age",
"30"
);
Executing SPARQL Queries
Execute SPARQL queries to retrieve and manipulate data.
Basic SELECT Query
use kolibrie::execute_query::execute_query;
let sparql_query = r#"
PREFIX ex: <http://example.org/>
SELECT ?s ?o
WHERE {
?s ex:knows ?o .
}
"#;
let results = execute_query(sparql_query, &mut db);
for row in results {
println!("Subject: {}, Object: {}", row[0], row[1]);
}
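Conceptually, evaluating the single pattern `?s ex:knows ?o` amounts to scanning the stored triples for a matching predicate and binding the subject and object. The following stdlib-only sketch illustrates that idea; it is not Kolibrie's engine, which works over dictionary-encoded ids with optimized joins.

```rust
/// Match the pattern `?s <predicate> ?o` against an in-memory triple
/// list, returning one (subject, object) binding per matching triple.
fn match_pattern<'a>(
    triples: &[(&'a str, &'a str, &'a str)],
    predicate: &str,
) -> Vec<(&'a str, &'a str)> {
    let mut rows = Vec::new();
    for &(s, p, o) in triples {
        if p == predicate {
            rows.push((s, o));
        }
    }
    rows
}

fn main() {
    let triples = [
        ("ex:Alice", "ex:knows", "ex:Bob"),
        ("ex:Bob", "ex:knows", "ex:Charlie"),
        ("ex:Bob", "ex:age", "30"),
    ];
    for (s, o) in match_pattern(&triples, "ex:knows") {
        println!("Subject: {}, Object: {}", s, o);
    }
}
```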
Query with FILTER
let sparql = r#"
PREFIX ex: <http://example.org/vocab#>
SELECT ?name ?attendees
WHERE {
?event ex:name ?name .
?event ex:attendees ?attendees .
FILTER (?attendees > 50)
}
"#;
let results = execute_query(sparql, &mut db);
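The features list also mentions sliding windows over timestamped triples. As a rough illustration of the concept, here is a stdlib-only sketch of a time-based window that evicts triples older than a fixed width; all names here are hypothetical and do not reflect Kolibrie's API.

```rust
use std::collections::VecDeque;

/// Sketch of a time-based sliding window over timestamped triples
/// (triples given as dictionary-encoded ids). Illustrative only.
struct SlidingWindow {
    width: u64,                            // window width in time units
    buf: VecDeque<(u64, (u32, u32, u32))>, // (timestamp, triple)
}

impl SlidingWindow {
    fn new(width: u64) -> Self {
        SlidingWindow { width, buf: VecDeque::new() }
    }

    /// Append a triple stamped `t`, then evict triples older than `t - width`.
    fn push(&mut self, t: u64, triple: (u32, u32, u32)) {
        self.buf.push_back((t, triple));
        while let Some(&(oldest, _)) = self.buf.front() {
            if t.saturating_sub(oldest) > self.width {
                self.buf.pop_front();
            } else {
                break;
            }
        }
    }

    /// Triples currently inside the window.
    fn contents(&self) -> Vec<(u32, u32, u32)> {
        self.buf.iter().map(|&(_, triple)| triple).collect()
    }
}

fn main() {
    let mut window = SlidingWindow::new(10);
    window.push(0, (1, 2, 3));
    window.push(5, (4, 5, 6));
    window.push(20, (7, 8, 9)); // evicts the two earlier triples
    assert_eq!(window.contents(), vec![(7, 8, 9)]);
}
```

Continuous queries over a stream then re-evaluate against only the window's contents each time it slides, which is what keeps time-based analysis bounded in memory.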