Knwler

Turn documents into structured knowledge.

Knwler is a lightweight Python tool that extracts structured knowledge graphs from documents using AI. Feed it a PDF, URL, or text file and receive a richly connected network of entities, relationships, and topics — complete with an interactive HTML report and exports ready for your favorite graph analytics platform.

Built for compliance teams, legal departments, research analysts, and anyone who needs to rapidly understand the structure hidden inside dense documents.

No heavy package dependencies, runs locally if you wish, no licenses, no fuss.

Why Knwler?
What's New in v1.0
Key Features
Supported Backends
Cost & Performance
Quick Start
CLI Overview
Examples
Integration & Export
Documentation
Benchmarking
Disclaimer

Why Knwler?

| Challenge | How Knwler Solves It |
| --- | --- |
| Manually mapping entities and relationships in 100+ page regulatory documents | Automated extraction produces a navigable knowledge graph in minutes |
| Expensive vendor lock-in for document intelligence | Runs fully local with Ollama or LM Studio (zero data leaves your machine) or via cloud providers for speed |
| Documents in multiple languages across jurisdictions | Auto-detects language and adapts all prompts; supports English, German, French, Spanish, Dutch, Italian, Portuguese, and Chinese |
| Results trapped inside one tool | Exports to HTML, GML, GraphML, JSONLD/RDF, and raw JSON; import directly into Neo4j, Gephi, yEd, SurrealDB, GraphDB, Neptune, or any graph platform |
| High per-document processing costs | ~$0.20 per 20-page PDF with OpenAI; completely free when running locally; LLM response caching means re-runs cost nothing |
| Processing many documents at scale | Batch processing for OpenAI and Gemini with SQLite-based resume support, plus directory ingestion |
| Understanding what matters most in the graph | Built-in graph analytics and reporting: find the most important entities, chunks, and clusters |

Knwler does not implement graph RAG — it focuses on one thing: turning unstructured text into a knowledge graph. You decide how to embed, which graph database to use, and what agentic framework to layer on top.

What's New in v1.0

This is a major release with many new features and improvements. See the full CHANGELOG for details.

Highlights:

Google Gemini added as an LLM backend (including batch processing)
Anthropic added as an LLM backend (Claude Sonnet/Haiku)
LM Studio support for local inference
Directory ingestion — process a whole directory of files with --directory
Multi-document consolidation — merge knowledge graphs from multiple extractions into one
Batch processing for OpenAI and Gemini with SQLite-based resume
Graph analytics CLI commands — convert graphs to GML/GraphML/JSONLD and generate analytical reports
JSONLD/RDF export for triple stores (GraphDB, Stardog, etc.)
AWS Neptune import script
Fetch command — give a URL (PDF or webpage) or a Wikipedia topic and Knwler fetches, parses, and extracts automatically
Benchmark suite to compare speed and quality across models and providers
Dark/light theme toggle and a new 3-column report template
Entity disambiguation — same name, different type (e.g. Apple the company vs. apple the fruit)
Additional languages: Italian, Portuguese, and Simplified Chinese
Async throughout the pipeline
Higher-level Python API (knwler.api) for downstream integrations
Cache and results directories moved to ~/.knwler/
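
The SQLite-based resume mentioned in the highlights can be pictured with a small sketch. This is not Knwler's actual implementation: the table layout, the `chunks` name, and the helper functions are all hypothetical, and the `process` callback stands in for a real LLM call. It only illustrates the general idea of recording progress in SQLite so that a second run skips finished work.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open (or create) the progress database. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(id TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
    )
    return conn


def register(conn, chunk_ids):
    """Record chunks as pending; chunks already known keep their status."""
    conn.executemany(
        "INSERT OR IGNORE INTO chunks (id, status) VALUES (?, 'pending')",
        [(chunk_id,) for chunk_id in chunk_ids],
    )
    conn.commit()


def run_pending(conn, process):
    """Process only chunks not done yet, committing after each one."""
    rows = conn.execute(
        "SELECT id FROM chunks WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    pending = [row[0] for row in rows]
    for chunk_id in pending:
        result = process(chunk_id)  # stand-in for the real extraction call
        conn.execute(
            "UPDATE chunks SET status = 'done', result = ? WHERE id = ?",
            (result, chunk_id),
        )
        conn.commit()
    return pending


conn = open_state()
register(conn, ["c1", "c2", "c3"])
first_run = run_pending(conn, str.upper)

# A later run registers the same chunks plus one new one.
register(conn, ["c1", "c2", "c3", "c4"])
second_run = run_pending(conn, str.upper)
print(first_run, second_run)
```

Committing after every chunk means an interrupted run loses at most the chunk that was in flight.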

Key Features

Multiple LLM Backends — Cloud or Fully Local

Choose between OpenAI, Anthropic, or Google Gemini for cloud speed, or Ollama / LM Studio for fully offline, air-gapped operation. You can switch backends between runs and incrementally augment the same graph. See Models & Providers.

Automatic Schema Discovery

The pipeline analyzes a sample of your document and infers entity types and relation types that fit it, so no manual ontology engineering is required. You can also supply your own schema.

Entity Disambiguation

Apple as a company or apple as a fruit? Knwler identifies nodes by name and type together, so entities that share a name but differ in type remain separate nodes. Disambiguated entities are highlighted with type badges in the exported report.
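
As a rough illustration of identifying nodes by name and type, the sketch below keys an in-memory index on the pair of both values. The `Entity` class, `node_key`, and `add_mention` are hypothetical names, and the case-insensitive matching is an assumption made for the example; Knwler's own matching rules may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    type: str
    descriptions: list[str] = field(default_factory=list)


def node_key(name: str, entity_type: str) -> tuple[str, str]:
    """Build an identity key from name and type (case-insensitive by assumption)."""
    return (name.strip().lower(), entity_type.strip().lower())


def add_mention(index: dict, name: str, entity_type: str, description: str) -> Entity:
    """Attach a mention to the node with the same (name, type), creating it if needed."""
    key = node_key(name, entity_type)
    entity = index.setdefault(key, Entity(name=name, type=entity_type))
    entity.descriptions.append(description)
    return entity


index: dict[tuple[str, str], Entity] = {}
add_mention(index, "Apple", "Company", "Designs consumer electronics.")
add_mention(index, "apple", "Fruit", "Edible fruit of the apple tree.")
add_mention(index, "Apple", "Company", "Headquartered in Cupertino.")

print(len(index))  # 2: the company and the fruit stay separate
```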

Multilingual by Design

Language is auto-detected on every run. All prompts and console/UI output are localized. Currently supported: English, German, French, Spanish, Dutch, Italian, Portuguese, and Simplified Chinese. Adding a new language means extending a single JSON file. See Language.

Multi-Document Consolidation

Process multiple documents and consolidate the resulting knowledge graphs into a single unified graph. Entity descriptions from multiple sources are intelligently merged via LLM-powered summarization. Works both as a post-processing step and as a standalone command.
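
The consolidation step can be sketched as a merge keyed on name and type. The graph layout used here (`nodes` with `name`, `type`, `description`; `edges` with `source`, `relation`, `target`) is a hypothetical shape chosen for the example, not the documented `graph.json` format, and the LLM-powered summarization is replaced by plain concatenation to keep the sketch self-contained.

```python
def consolidate(graphs):
    """Merge several graphs into one, treating (name, type) as node identity."""
    nodes = {}
    edges = set()
    for graph in graphs:
        for node in graph["nodes"]:
            key = (node["name"], node["type"])
            merged = nodes.setdefault(
                key, {"name": node["name"], "type": node["type"], "descriptions": []}
            )
            if node["description"] not in merged["descriptions"]:
                merged["descriptions"].append(node["description"])
        for edge in graph["edges"]:
            # A set drops edges that several documents state identically.
            edges.add((edge["source"], edge["relation"], edge["target"]))
    for merged in nodes.values():
        # Placeholder for the summarization step: simple concatenation.
        merged["description"] = " ".join(merged.pop("descriptions"))
    return {
        "nodes": list(nodes.values()),
        "edges": [
            {"source": s, "relation": r, "target": t} for s, r, t in sorted(edges)
        ],
    }


doc_a = {
    "nodes": [
        {"name": "GDPR", "type": "Regulation", "description": "EU data protection regulation."},
        {"name": "Data Controller", "type": "Role", "description": "Decides how personal data is processed."},
    ],
    "edges": [{"source": "GDPR", "relation": "governs", "target": "Data Controller"}],
}
doc_b = {
    "nodes": [
        {"name": "GDPR", "type": "Regulation", "description": "Applicable since 25 May 2018."},
        {"name": "Supervisory Authority", "type": "Role", "description": "Monitors compliance."},
    ],
    "edges": [{"source": "Supervisory Authority", "relation": "enforces", "target": "GDPR"}],
}

merged_graph = consolidate([doc_a, doc_b])
print(len(merged_graph["nodes"]), len(merged_graph["edges"]))  # 3 2
```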

Cluster Detection & Topic Assignment

The Louvain algorithm automatically discovers clusters of related entities and an LLM labels each cluster with human-readable topics — giving you instant thematic insight into the document's structure.
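
Louvain works by greedily moving nodes between communities to increase a score called modularity. The sketch below does not implement Louvain itself; it only computes that score for a given partition of a small undirected, unweighted graph, which shows why a sensible clustering scores higher than lumping everything together. The function name and the toy graph are made up for the example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints inside the community
    degree = defaultdict(int)  # summed node degrees per community
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra[community_of[u]] += 1
    # Q = sum over communities of (intra / m) - (degree / 2m)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (c, d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

q_split = modularity(edges, two_clusters)   # 5/14, about 0.357
q_single = modularity(edges, one_cluster)   # 0.0
print(round(q_split, 3), q_single)
```

Splitting the graph at the bridge gives a higher score than a single cluster, which is the kind of partition a modularity-maximizing method is designed to find.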

Self-Contained HTML Report

Export a single HTML file with interactive network visualization (with adjustable degree-threshold slider), entity index, topic overview, and rephrased text chunks — shareable without any server or dependencies. Markdown in rephrased chunks and descriptions is rendered properly.

Multiple templates out of the box:

A standard report with network visualization
A 3-column report without graph viz
A graph-viz focused report with custom layout algorithm

See Templates and HTML Export.
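
The degree-threshold slider in the network view can be thought of as a filter on node degree. How the report implements it is not described here, so the following is only one plausible reading: degrees are computed once on the full graph, and nodes below the threshold are hidden together with their edges. The function name and sample data are hypothetical.

```python
from collections import Counter


def filter_by_degree(edges, threshold):
    """Keep nodes whose degree in the full graph is >= threshold, plus the edges between them."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    visible = {node for node, d in degree.items() if d >= threshold}
    kept = [(u, v) for u, v in edges if u in visible and v in visible]
    return visible, kept


edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
visible, kept = filter_by_degree(edges, threshold=2)
print(sorted(visible))  # ['a', 'b', 'hub']
print(kept)             # [('hub', 'a'), ('hub', 'b'), ('a', 'b')]
```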

Rich Export Ecosystem

JSON — the canonical graph.json output for any downstream use
GML / GraphML — open directly in yEd, Gephi, or any standards-compliant graph tool
JSONLD / RDF — for triple stores like GraphDB, Stardog, and AWS Neptune. See JSONLD
HTML — standalone interactive report with customizable templates
Neo4j — direct import with constraints and indexes. See [Neo4j](https://knwler.com/docs/neo4j.
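
To give a feel for what a GraphML export contains, here is a minimal sketch that serializes a tiny graph using only the Python standard library. It does not use Knwler's exporter; the `to_graphml` function, the tuple-based input, and the `type` and `relation` attribute names are assumptions made for the example. The element and attribute names themselves follow the GraphML format.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize (id, type) nodes and (source, target, relation) edges to GraphML."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare the data attributes carried by nodes and edges.
    ET.SubElement(root, "key", {
        "id": "type", "for": "node", "attr.name": "type", "attr.type": "string",
    })
    ET.SubElement(root, "key", {
        "id": "relation", "for": "edge", "attr.name": "relation", "attr.type": "string",
    })
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    for node_id, node_type in nodes:
        node = ET.SubElement(graph, "node", id=node_id)
        ET.SubElement(node, "data", key="type").text = node_type
    for source, target, relation in edges:
        edge = ET.SubElement(graph, "edge", source=source, target=target)
        ET.SubElement(edge, "data", key="relation").text = relation
    return ET.tostring(root, encoding="unicode")


xml_text = to_graphml(
    nodes=[("Apple", "Company"), ("Cupertino", "Location")],
    edges=[("Apple", "Cupertino", "headquartered_in")],
)
print(xml_text)
```

A file written this way should open in tools that read GraphML, such as yEd or Gephi, although layout and styling information is not included.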
