SkillAgentSearch skills...

Orionbelt Analytics

Ontology-based MCP server that analyzes database schemas (PostgreSQL, Snowflake, ClickHouse, Dremio) and generates RDF/OWL ontologies with SQL mappings for fan-trap-free Text-to-SQL.

Install / Use

/learn @ralfbecher/Orionbelt Analytics

README

<!-- mcp-name: io.github.ralfbecher/orionbelt-analytics --> <p align="center"> <img src="assets/ORIONBELT Logo.png" alt="OrionBelt Logo" width="400"> </p> <h1 align="center">OrionBelt Analytics</h1> <p align="center"><strong>The Ontology-based MCP server for your Text-2-SQL convenience.</strong></p>

Version 1.1.0 Python 3.13+ License: BSL 1.1 FastMCP RDF/OWL

BigQuery PostgreSQL Snowflake ClickHouse Dremio Databricks DuckDB MySQL

This project provides a production-ready Python-based MCP (Model Context Protocol) server that analyzes relational database schemas (PostgreSQL, MySQL, Snowflake, ClickHouse, Dremio, BigQuery, DuckDB/MotherDuck, and Databricks SQL) and automatically generates comprehensive ontologies in RDF/Turtle format with direct SQL mappings.

Better Together: Combine with OrionBelt Semantic Layer for a complete AI-powered analytics stack. The Semantic Layer compiles declarative YAML models into dialect-specific, optimized SQL — ensuring correct joins, aggregations, and fan-trap-free queries across PostgreSQL, MySQL, Snowflake, ClickHouse, Dremio, BigQuery, DuckDB, and Databricks. Run both MCP servers side-by-side in Claude Desktop for schema-aware ontology generation and guaranteed-correct SQL compilation.

GraphRAG: The Foundation for OBML Model Creation

OrionBelt Analytics provides the schema intelligence that powers semantic modeling:

When creating OBML models (OrionBelt Modeling Language) for the Semantic Layer, users and LLMs need to understand database structure, relationships, and join paths. GraphRAG serves as the foundational discovery engine that makes intelligent semantic modeling possible.

The Workflow:

  1. Schema Discovery (Analytics + GraphRAG)

    • analyze_schema() automatically initializes GraphRAG
    • Graph structure built from foreign keys (up to 12 hops)
    • Vector embeddings created for semantic similarity search
  2. OBML Model Creation (Analytics + GraphRAG + LLM)

    • LLM queries GraphRAG: "What tables contain revenue data?"
    • LLM queries GraphRAG: "How do customers connect to orders?"
    • LLM creates OBML model with metrics, dimensions, and join paths
    • User/LLM loads OBML into Semantic Layer
  3. Business Queries (Semantic Layer)

    • Natural language queries via pre-built OBML models
    • Guaranteed-correct SQL compilation with fan-trap prevention
  4. Ad-hoc Technical Queries (Analytics + GraphRAG)

    • Direct SQL execution for queries not in OBML model
    • GraphRAG-assisted discovery for new relationships

Why GraphRAG is Essential:

  • Zero-setup intelligence - No manual modeling required for schema discovery
  • Intelligent relationship discovery - Graph algorithms find complex join paths automatically
  • Semantic similarity - Vector search identifies related concepts across tables
  • Foundation for OBML - Provides the raw intelligence needed to create accurate semantic models

Without GraphRAG: Creating OBML models requires manual schema inspection, tedious relationship discovery, and extensive user guidance.

With GraphRAG: LLMs can intelligently discover schema structure and create accurate OBML models automatically, while still supporting ad-hoc technical queries that fall outside the semantic layer.

Key Philosophy: Automatic Ontology Integration

Our main analysis tool get_analysis_context() automatically includes ontology generation, making semantic context readily available for every query.

Architecture Overview

<p align="center"> <img src="assets/architecture.png" alt="OrionBelt Analytics Architecture" width="900"> </p>

OrionBelt Analytics combines multiple AI-powered technologies to provide intelligent database understanding:

  • GraphRAG - Graph-based relationship discovery with 12-hop traversal
  • Vector Search - Semantic similarity via ChromaDB embeddings
  • RDF Store - Persistent ontology storage with SPARQL query interface
  • Connection Scoping - Isolated storage per database connection
  • FastMCP - Model Context Protocol server for Claude Desktop integration

🌟 Key Features

🔗 Database Connectivity

  • PostgreSQL, MySQL, Snowflake, ClickHouse, Dremio, BigQuery, DuckDB/MotherDuck, and Databricks SQL support with connection pooling
  • Environment variable fallback - parameters optional, uses .env when not provided
  • Enhanced connection management with retry logic and timeout handling
  • Automatic dependency management for all database connectors
  • Cloud-native support - BigQuery, MotherDuck, and Databricks SQL for cloud data warehouses
  • Local analytics - DuckDB for fast local OLAP workloads

🎯 13 Essential Tools

  • Streamlined workflow with focused, purpose-built tools
  • Interactive charting (generate_chart) with direct image rendering
  • Comprehensive schema analysis with automatic ontology generation
  • Semantic name resolution for business-friendly ontologies
  • Custom ontology loading from external files
  • Built-in workflow guidance via FastMCP Context integration
  • Focus on results - maximum effectiveness with minimum complexity

🧠 Automatic Ontology Generation

  • Self-sufficient ontologies with direct database references (db:sqlReference, db:sqlJoinCondition)
  • Business context inference from table and column naming patterns
  • Complete SQL mappings embedded directly in ontology
  • Fan-trap detection and query safety validation

🗺️ R2RML Mapping Generation

  • W3C-compliant R2RML mappings auto-generated alongside schema analysis
  • SQL query templates with rr:sqlQuery and rr:sqlVersion rr:SQL2008
  • XSD datatype mapping from SQL types to RDF datatypes
  • Configurable base IRI via environment variable (R2RML_BASE_IRI)

🔍 SPARQL Query Support

  • Persistent RDF store via Oxigraph with SPARQL 1.1 support
  • Semantic schema queries using the db: namespace ontology annotations
  • Connection-scoped storage - each database gets isolated RDF store
  • 7 SPARQL tools for semantic exploration:
    • query_sparql() - Execute custom SPARQL SELECT queries
    • query_sparql_ask() - Execute SPARQL ASK queries (true/false)
    • list_tables_sparql() - List tables via semantic query
    • find_columns_by_type_sparql() - Find columns by data type
    • add_rdf_knowledge() - Add custom metadata triples
    • store_ontology_in_rdf() - Persist ontologies for SPARQL access
    • get_rdf_store_stats() - RDF store statistics

Key ontology properties for SPARQL queries:

  • db:sqlReference - Maps properties to SQL columns (e.g., "customers.customer_id")
  • db:sqlJoinCondition - Stores JOIN conditions (e.g., "orders.customer_id = customers.customer_id")
  • db:hasColumn, db:columnName, db:dataType - Column metadata
  • db:tableName, db:relationshipType - Schema structure

Example SPARQL query:

PREFIX db: <http://example.com/db#>
SELECT ?table ?column ?dataType
WHERE {
    ?table db:hasColumn ?column .
    ?column db:columnName ?columnName .
    ?column db:dataType ?dataType .
    FILTER(CONTAINS(LCASE(?columnName), "revenue"))
}

🛡️ Advanced SQL Safety

  • Fan-trap prevention protocols with mandatory relationship analysis
  • Query pattern validation to prevent data multiplication errors
  • Safe aggregation patterns (UNION, separate CTEs, window functions)
  • Comprehensive SQL validation before execution

⚡ Performance & Reliability

  • Concurrent processing with thread pool management
  • Connection pooling and resource optimization
  • Comprehensive error handling with structured responses
  • Production-ready logging and monitoring

Installation

# Install all dependencies (recommended)
uv sync

Key dependencies: FastMCP, SQLAlchemy, database connectors (PostgreSQL, MySQL, Snowflake, ClickHouse, Dremio, BigQuery, DuckDB, Databricks), RDFLib, Pydantic

Project Structure

orionbelt-analytics/
├── src/
│   ├── __init__.py                 # Package initialization
│   ├── main.py                     # FastMCP server entry point (13 tools)
│   ├── database_manager.py         # Database connection and analysis
│   ├── ontology_generator.py       # RDF ontology generation with SQL mappings
│   ├── r2rml_generator.py          # W3C R2RML mapping generation
│   ├── dremio_client.py            # Dremio database client
│   ├── security.py                 # Security and validation utilities
│   ├── chart_u
View on GitHub
GitHub Stars7
CategoryDevelopment
Updated4h ago
Forks2

Languages

Python

Security Score

75/100

Audited on Mar 26, 2026

No findings