
Semantica

Semantica 🧠 — A framework for building semantic layers, context graphs, and decision intelligence systems with explainability and provenance.

Install / Use

/learn @Hawksight-AI/Semantica

README

<div align="center"> <img src="Semantica Logo.png" alt="Semantica Logo" width="420"/>

🧠 Semantica

A Framework for Building Context Graphs and Decision Intelligence Layers for AI

Python 3.8+ License: MIT PyPI Version Total Downloads CI Discord X

⭐ Give us a Star • 🍴 Fork us • 💬 Join our Discord • 🐦 Follow on X

Transform Chaos into Intelligence. Build AI systems with context graphs, decision tracking, and advanced knowledge engineering that are explainable, traceable, and trustworthy — not black boxes.

</div>

The Problem

AI agents today are capable but not trustworthy:

  • No memory structure — agents store embeddings, not meaning. Retrieval is fuzzy; there's no way to ask why something was recalled.
  • No decision trail — agents make decisions continuously but record nothing. When something goes wrong, there's no history to debug or audit.
  • No provenance — outputs cannot be traced back to source facts. In regulated industries, this is a compliance blocker.
  • No reasoning transparency — black-box answers with no explanation of how a conclusion was reached.
  • No conflict detection — contradictory facts silently coexist in vector stores, producing unpredictable answers.

These aren't edge cases. They are the reason AI cannot be deployed in healthcare, finance, legal, and government without custom guardrails built from scratch.

The Solution

Semantica is the context and intelligence layer you add to your AI stack:

  • Context Graphs — structured graph of entities, relationships, and decisions your agent builds as it works. Queryable, traceable, persistent.
  • Decision Intelligence — every decision is a first-class object: recorded, linked causally, searchable by precedent, and analyzable for downstream impact.
  • Provenance — every fact links to its source. W3C PROV-O compliant. Full lineage from ingestion to inference.
  • Reasoning engines — forward chaining, Rete networks, deductive, abductive, and SPARQL reasoning. Explainable inference paths, not black-box answers.
  • Deduplication & QA — conflict detection, entity resolution, and validation built into the pipeline.

Works alongside LangChain, LlamaIndex, AutoGen, CrewAI, and any LLM provider — Semantica is not a replacement, it's the accountability layer on top.

⚡ Quick Installation

pip install semantica

What's New in v0.3.0

First stable release — Production/Stable on PyPI. Ships across three stages: 0.3.0-alpha, 0.3.0-beta, and 0.3.0 stable.

| Area | Highlights |
|------|------------|
| Context Graphs | Temporal validity windows (valid_from/valid_until), weighted BFS (min_weight), cross-graph navigation (link_graph, navigate_to, resolve_links) with full save/load persistence |
| Decision Intelligence | Complete lifecycle: record_decision → trace_decision_chain → analyze_decision_impact → find_similar_decisions; hybrid precedent search; PolicyEngine with versioned rules |
| KG Algorithms | PageRank, betweenness, community detection (Louvain), Node2Vec embeddings, link prediction, path finding — all returning structured dicts |
| Semantic Extraction | LLM relation extraction fixed (no silent drops); _match_pattern rewritten; duplicate-relation bug removed; "llm_typed" metadata corrected |
| Deduplication v2 | blocking_v2/hybrid_v2 candidate generation (63.6% faster); two-stage prefilter (18–25% faster); semantic dedup v2 (6.98x faster) |
| Delta Processing | SPARQL-based incremental diff; delta_mode pipelines; snapshot versioning with prune_versions() |
| Export | RDF format aliases ("ttl", "json-ld", etc.); ArangoDB AQL export; Apache Parquet export (Spark/BigQuery/Databricks ready) |
| Pipeline | FailureHandler with LINEAR/EXPONENTIAL/FIXED backoff; PipelineValidator returning ValidationResult; retry loop fixed |
| Graph Backends | Apache AGE (SQL injection fixed), AWS Neptune, FalkorDB, PgVector (HNSW/IVFFlat indexing) |
| Tests | 886+ passing, 0 failures — 335 context, ~430 KG, 70 semantic extraction, 85 real-world E2E |
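As a concrete illustration of the temporal validity windows mentioned above, here is a minimal sketch of window checking on an edge. The field names mirror valid_from/valid_until, but the half-open (inclusive start, exclusive end) bounds are an assumption for this example, not Semantica's documented semantics:

```python
from datetime import datetime

def edge_valid_at(edge: dict, at: datetime) -> bool:
    """Return True if the edge's validity window contains `at`.

    A valid_from/valid_until of None means unbounded on that side.
    """
    start = edge.get("valid_from")
    end = edge.get("valid_until")
    if start is not None and at < start:
        return False
    if end is not None and at >= end:
        return False
    return True

# An employment relationship that was only true for a fixed period.
employment = {
    "src": "alice", "rel": "works_at", "dst": "AcmeCorp",
    "valid_from": datetime(2020, 1, 1),
    "valid_until": datetime(2023, 6, 30),
}

print(edge_valid_at(employment, datetime(2021, 5, 1)))   # True
print(edge_valid_at(employment, datetime(2024, 1, 1)))   # False
```

Filtering a traversal through such a predicate is what lets a graph answer "what was true at time T" rather than only "what is true now".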

See RELEASE_NOTES.md for the full per-contributor breakdown and CHANGELOG for the complete diff.


Unreleased / Coming Next

| Area | Highlights |
|------|------------|
| SHACL Constraints | OntologyEngine.to_shacl() auto-derives SHACL shapes from any OWL ontology; validate_graph() returns a structured SHACLValidationReport with plain-English violation explanations; three quality tiers ("basic", "standard", "strict"); three output formats (Turtle, JSON-LD, N-Triples); 3-level inheritance propagation |


Features

Context & Decision Intelligence

  • Context Graphs — structured graph of entities, relationships, and decisions; queryable, causal, persistent
  • Decision tracking — record, link, and analyze every agent decision with add_decision(), record_decision()
  • Causal chains — link decisions with add_causal_relationship(), trace lineage with trace_decision_chain()
  • Precedent search — hybrid similarity search over past decisions with find_similar_decisions()
  • Influence analysis — analyze_decision_impact(), analyze_decision_influence() — understand downstream effects
  • Policy engine — enforce business rules with check_decision_rules(); automated compliance validation
  • Agent memory — AgentMemory with short/long-term storage, conversation history, and statistics
  • Cross-system context capture — capture_cross_system_inputs() for multi-agent pipelines
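
The decision tracking and causal chain features above can be sketched with plain data structures. This is a concept illustration only; the class and method names here (DecisionLog, record, trace_chain) are invented for the example and are not Semantica's API:

```python
# Minimal illustration of decisions as first-class records with causal links.

class DecisionLog:
    def __init__(self):
        self.decisions = {}   # decision id -> {"action", "rationale"}
        self.caused_by = {}   # decision id -> id of the decision that caused it

    def record(self, decision_id, action, rationale, caused_by=None):
        self.decisions[decision_id] = {"action": action, "rationale": rationale}
        if caused_by is not None:
            self.caused_by[decision_id] = caused_by

    def trace_chain(self, decision_id):
        """Walk causal links back to the root, returning root-first order."""
        chain = [decision_id]
        while chain[-1] in self.caused_by:
            chain.append(self.caused_by[chain[-1]])
        return list(reversed(chain))

log = DecisionLog()
log.record("d1", "ingest report", "new quarterly filing arrived")
log.record("d2", "flag anomaly", "revenue delta exceeded threshold", caused_by="d1")
log.record("d3", "escalate to analyst", "anomaly unresolved", caused_by="d2")

print(log.trace_chain("d3"))  # ['d1', 'd2', 'd3']
```

Because every decision carries a rationale and a causal parent, "why did the agent escalate?" becomes a graph query instead of an archaeology exercise.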

Knowledge Graphs

  • Knowledge graph construction — entities, relationships, properties, typed edges
  • Graph algorithms — PageRank, betweenness centrality, clustering coefficient, community detection
  • Node embeddings — Node2Vec embeddings via NodeEmbedder
  • Similarity — cosine similarity via SimilarityCalculator
  • Link prediction — score potential new edges via LinkPredictor
  • Temporal graphs — time-aware nodes and edges
  • Incremental / delta processing — update graphs without full recompute
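
To make the graph-algorithm output concrete, here is a toy PageRank computed by power iteration over an edge list. It illustrates the kind of structured dict the KG algorithms return, but it is a sketch, not Semantica's implementation:

```python
# Toy PageRank by power iteration: rank mass flows along out-edges,
# damped toward a uniform teleport distribution.

def pagerank(edges, damping=0.85, iterations=50):
    nodes = {n for e in edges for n in e}
    out = {n: [dst for src, dst in edges if src == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out[src] or list(nodes)   # dangling nodes spread evenly
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new[dst] += share
        rank = new
    return rank

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
ranks = pagerank(edges)
print(max(ranks, key=ranks.get))  # "c" accumulates the most link mass
```

The returned dict maps node id to score, which is the shape you want when feeding centrality results into downstream filters or reports.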

Semantic Extraction

  • Entity extraction — named entity recognition, normalization, classification
  • Relation extraction — triplet generation from raw text using LLMs or rule-based methods
  • LLM-typed extraction — extraction with typed relation metadata
  • Deduplication v1 — Jaro-Winkler similarity, basic blocking
  • Deduplication v2 — blocking_v2, hybrid_v2, semantic_v2 strategies with max_candidates_per_entity
  • Triplet deduplication — dedup_triplets() for removing duplicate (subject, predicate, object) triples
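
A minimal sketch of the triplet deduplication idea, under the simplifying assumption that duplicates differ only in case and whitespace; the real pipeline's dedup strategies (Jaro-Winkler, blocking, semantic matching, per the bullets above) are fuzzier than this:

```python
# Drop duplicate (subject, predicate, object) triples, preserving
# first-seen order, with light normalization for case/whitespace.

def dedup_triplets(triplets):
    seen = set()
    unique = []
    for s, p, o in triplets:
        key = (s.lower().strip(), p.lower().strip(), o.lower().strip())
        if key not in seen:
            seen.add(key)
            unique.append((s, p, o))
    return unique

triples = [
    ("Marie Curie", "won", "Nobel Prize"),
    ("marie curie", "won", "nobel prize"),   # case-variant duplicate
    ("Marie Curie", "born_in", "Warsaw"),
]
print(dedup_triplets(triples))
# [('Marie Curie', 'won', 'Nobel Prize'), ('Marie Curie', 'born_in', 'Warsaw')]
```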

Reasoning Engines

  • Forward chaining — Reasoner with IF/THEN string rules and dict facts
  • Rete network — ReteEngine for high-throughput production rule matching
  • Deductive reasoning — DeductiveReasoner for classical inference
  • Abductive reasoning — AbductiveReasoner for hypothesis generation from observations
  • SPARQL reasoning — SPARQLReasoner for query-based inference over RDF graphs
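
Forward chaining with IF/THEN string rules over dict facts can be sketched in a few lines. The rule grammar below is invented for illustration and is not Reasoner's actual syntax:

```python
# Fire "IF cond [AND cond...] THEN conclusion" rules against a fact dict
# until no new facts are derived (a fixed point is reached).

def forward_chain(facts, rules):
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            cond, conclusion = rule[3:].split(" THEN ")   # strip leading "IF "
            if all(facts.get(c.strip()) for c in cond.split(" AND ")):
                if not facts.get(conclusion.strip()):
                    facts[conclusion.strip()] = True
                    changed = True
    return facts

rules = [
    "IF has_fever AND has_cough THEN flu_suspected",
    "IF flu_suspected THEN recommend_test",
]
result = forward_chain({"has_fever": True, "has_cough": True}, rules)
print(result["recommend_test"])  # True
```

The key property is that every derived fact can be replayed from the rules that produced it, which is what makes the inference path explainable rather than a black box.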

Provenance & Auditability

  • Entity provenance — ProvenanceTracker.track_entity(id, source_url, metadata)
  • Algorithm provenance — AlgorithmTrackerWithProvenance tracks computation lineage
  • Graph builder provenance — GraphBuilderWithProvenance records entity source lineage from URLs
  • W3C PROV-O compliant — lineage tracking across all modules
  • Change management — version control with checksums, audit trails, compliance support
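
The entity-provenance idea can be illustrated with a small store. The track_entity signature echoes the feature list above, but the storage model and class name here are assumptions for the sketch:

```python
# Sketch of entity-level provenance: every fact carries a pointer back
# to the source it was extracted from, plus a capture timestamp.

from datetime import datetime, timezone

class ProvenanceStore:
    def __init__(self):
        self._records = {}

    def track_entity(self, entity_id, source_url, metadata=None):
        self._records[entity_id] = {
            "source_url": source_url,
            "metadata": metadata or {},
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        }

    def lineage(self, entity_id):
        """Answer 'where did this fact come from?' for an entity."""
        return self._records.get(entity_id)

store = ProvenanceStore()
store.track_entity("drug:aspirin", "https://example.org/fda-label.pdf",
                   {"page": 3, "extractor": "docling"})
print(store.lineage("drug:aspirin")["source_url"])
```

In a PROV-O-style model the same information is expressed as entities, activities, and agents in RDF, which is what makes the lineage queryable alongside the knowledge graph itself.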

Vector Store

  • Backends — FAISS, Pinecone, Weaviate, Qdrant, Milvus, PgVector, in-memory
  • Semantic search — top-k retrieval by embedding similarity
  • Hybrid search — vector + keyword with configurable weights
  • Filtered search — metadata-based filtering on any field
  • Custom similarity weights — tune retrieval per use case
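
Hybrid search with configurable weights boils down to blending two scores per document. This toy version combines cosine similarity with keyword overlap; the scoring formula and tiny 2-d vectors are illustrative only:

```python
# Toy hybrid retrieval: weighted blend of vector similarity and
# keyword overlap, returning the top-k document ids.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_search(query, query_vec, docs, w_vector=0.7, w_keyword=0.3, top_k=2):
    scored = [
        (w_vector * cosine(query_vec, d["vec"])
         + w_keyword * keyword_score(query, d["text"]), d["id"])
        for d in docs
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:top_k]]

docs = [
    {"id": "d1", "text": "graph database tuning", "vec": [0.9, 0.1]},
    {"id": "d2", "text": "vector search basics", "vec": [0.2, 0.8]},
    {"id": "d3", "text": "graph search at scale", "vec": [0.6, 0.4]},
]
print(hybrid_search("graph search", [0.7, 0.3], docs))  # ['d3', 'd1']
```

Tuning w_vector/w_keyword per use case is the "custom similarity weights" knob: keyword-heavy weights favor exact terminology, vector-heavy weights favor paraphrases.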

🌐 Graph Database Support

  • AWS Neptune — Amazon Neptune graph database with IAM authentication
  • Apache AGE — PostgreSQL graph extension with openCypher via SQL
  • FalkorDB — native support; DecisionQuery and CausalChainAnalyzer work directly with FalkorDB row/header shapes

Data Ingestion

  • File formats — PDF, DOCX, HTML, JSON, CSV, Excel, PPTX, archives
  • Web crawl — WebIngestor with configurable depth
  • Databases — DBIngestor with SQL query support
  • Snowflake — SnowflakeIngestor with table/query ingestion, pagination, and key-pair/OAuth auth
  • Docling — advanced document parsing with table and layout extraction (PDF, DOCX, PPTX, XLSX)
  • Media — image OCR, audio/video metadata extraction

Export Formats

  • RDF — Turtle (.ttl), JSON-LD, N-Triples (.nt), XML via RDFExporter
  • Parquet — ParquetExporter for entities, relationships, and full KG export
  • ArangoDB AQL — export graphs as AQL statements for ArangoDB