Knwler
Turn documents into structured knowledge.
Knwler is a lightweight Python tool that extracts structured knowledge graphs from documents using AI. Feed it a PDF, URL, or text file and receive a richly connected network of entities, relationships, and topics — complete with an interactive HTML report and exports ready for your favorite graph analytics platform.
Built for compliance teams, legal departments, research analysts, and anyone who needs to rapidly understand the structure hidden inside dense documents.
No big package dependencies, runs locally if you wish, no licenses, no fuss.
<div style="display:grid; grid-template-columns:repeat(4,1fr); gap:2px;"> <a href="https://knwler.com/benchmark" target="_blank"><img src="https://knwler.com/images/benchmarks.png" style="height:200px;"/></a> <a href="https://knwler.com/examples/ogma/index.html" target="_blank"><img src="https://knwler.com/images/ogma.jpg" style="height:200px;"/></a> <a href="https://knwler.com/docs/surreal.html" target="_blank"><img src="https://knwler.com/images/SurrealExport.jpg" style="height:200px;"/></a> <a href="https://knwler.com/docs/visualization.html" target="_blank"><img src="https://knwler.com/images/yEd.png" style="height:200px;"/></a> <a href="https://knwler.com/examples/yfiles/index.html" target="_blank"><img src="https://knwler.com/images/yFilesKnwler.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/yfiles/index.html" target="_blank"><img src="https://knwler.com/images/yFilesKnwlerDark.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/ogma/index.html" target="_blank"><img src="https://knwler.com/images/OgmaKnwler.jpg" style="height:200px;"/></a> <a href="https://knwler.com/docs/neo4j.html" target="_blank"><img src="https://knwler.com/images/Neo4jExport.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/HRCustom/index.html" target="_blank"><img src="https://knwler.com/images/HumanRightsFancy.jpg" style="height:200px;"/></a> <a href="https://knwler.com/docs/graphdb.html" target="_blank"><img src="https://knwler.com/images/GraphDBVisualGraph.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/NIST/index.html" target="_blank"><img src="https://knwler.com/images/ColumnsTemplate.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/Deloitte/index.html" target="_blank"><img src="https://knwler.com/images/DefaultTemplate.jpg" style="height:200px;"/></a> <a href="https://knwler.com/examples/3d/index.html" target="_blank"><img src="https://knwler.com/images/3d.png" 
style="height:200px;"/></a> </div>

Table of Contents
- Why Knwler?
- What's New in v1.0
- Key Features
- Supported Backends
- Cost & Performance
- Quick Start
- CLI Overview
- Examples
- Integration & Export
- Documentation
- Benchmarking
- Disclaimer
Why Knwler?
| Challenge | How Knwler Solves It |
| --- | --- |
| Manually mapping entities and relationships in 100+ page regulatory documents | Automated extraction produces a navigable knowledge graph in minutes |
| Expensive vendor lock-in for document intelligence | Runs fully local with Ollama or LM Studio (zero data leaves your machine) or via cloud providers for speed |
| Documents in multiple languages across jurisdictions | Auto-detects language and adapts all prompts; supports English, German, French, Spanish, Dutch, Italian, Portuguese, and Chinese |
| Results trapped inside one tool | Exports to HTML, GML, GraphML, JSONLD/RDF, and raw JSON; import directly into Neo4j, Gephi, yEd, SurrealDB, GraphDB, Neptune, or any graph platform |
| High per-document processing costs | ~$0.20 per 20-page PDF with OpenAI; completely free when running locally; LLM response caching means re-runs cost nothing |
| Processing many documents at scale | Batch processing for OpenAI and Gemini with SQLite-based resume support, plus directory ingestion |
| Understanding what matters most in the graph | Built-in graph analytics and reporting: find the most important entities, chunks, and clusters |
Knwler does not implement graph RAG — it focuses on one thing: turning unstructured text into a knowledge graph. You decide how to embed, which graph database to use, and what agentic framework to layer on top.
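To illustrate why cached LLM responses make re-runs free, here is a minimal sketch of a prompt-keyed file cache. It is not Knwler's actual implementation: the function name `cached_completion`, the on-disk layout, and the `call_llm` callback are all assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cached_completion(
    cache_dir: Path,
    model: str,
    prompt: str,
    call_llm: Callable[[str, str], str],  # hypothetical backend callback
) -> str:
    """Return a stored response if one exists, otherwise call the LLM once and store it."""
    # The key covers everything that influences the answer: model and prompt.
    key = hashlib.sha256(json.dumps([model, prompt]).encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["response"]
    response = call_llm(model, prompt)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"response": response}), encoding="utf-8")
    return response
```

With a cache like this, a second run over the same document with the same model issues no paid requests; changing the model or the prompt produces a new key and a fresh call.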
What's New in v1.0
This is a major release with many new features and improvements. See the full CHANGELOG for details.
Highlights:
- Google Gemini added as an LLM backend (including batch processing)
- Anthropic added as an LLM backend (Claude Sonnet/Haiku)
- LM Studio support for local inference
- Directory ingestion: process a whole directory of files with `--directory`
- Multi-document consolidation: merge knowledge graphs from multiple extractions into one
- Batch processing for OpenAI and Gemini with SQLite-based resume
- Graph analytics CLI commands — convert graphs to GML/GraphML/JSONLD and generate analytical reports
- JSONLD/RDF export for triple stores (GraphDB, Stardog, etc.)
- AWS Neptune import script
- Fetch command — give a URL (PDF or webpage) or a Wikipedia topic and Knwler fetches, parses, and extracts automatically
- Benchmark suite to compare speed and quality across models and providers
- Dark/light theme toggle and a new 3-column report template
- Entity disambiguation — same name, different type (e.g. Apple the company vs. apple the fruit)
- Additional languages: Italian, Portuguese, and Simplified Chinese
- Async throughout the pipeline
- Higher-level Python API (`knwler.api`) for downstream integrations
- Cache and results directories moved to `~/.knwler/`
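The SQLite-based resume listed in the highlights can be pictured with a small sketch. This is an illustration of the general pattern only, not Knwler's code: the table layout, the function name `run_with_resume`, and the `process` callback are hypothetical.

```python
import sqlite3
from typing import Callable, Iterable


def run_with_resume(
    db_path: str,
    chunks: Iterable[tuple[str, str]],
    process: Callable[[str], str],  # hypothetical per-chunk extraction step
) -> dict[str, str]:
    """Process (chunk_id, text) pairs, skipping ids already stored in SQLite."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results "
            "(chunk_id TEXT PRIMARY KEY, output TEXT NOT NULL)"
        )
        for chunk_id, text in chunks:
            done = conn.execute(
                "SELECT 1 FROM results WHERE chunk_id = ?", (chunk_id,)
            ).fetchone()
            if done:
                continue  # finished in an earlier run
            output = process(text)
            # Commit per chunk so an interruption loses at most the chunk in flight.
            conn.execute(
                "INSERT INTO results (chunk_id, output) VALUES (?, ?)",
                (chunk_id, output),
            )
            conn.commit()
        return dict(conn.execute("SELECT chunk_id, output FROM results"))
    finally:
        conn.close()
```

Calling the function again with the same database file after a crash reprocesses only the chunks that never reached the `results` table.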
Key Features
Multiple LLM Backends — Cloud or Fully Local
Choose between OpenAI, Anthropic, or Google Gemini for cloud speed, or Ollama / LM Studio for fully offline, air-gapped operation. You can switch backends between runs and incrementally augment the same graph. See Models & Providers.
Automatic Schema Discovery
The pipeline analyzes a sample of your document and infers the optimal entity types and relation types — no manual ontology engineering required. You can also supply your own schema.
Entity Disambiguation
Apple as a company or apple as a fruit? Knwler identifies each node by the combination of name and type, so entities that share a name but differ in type remain separate nodes. Disambiguated entities are highlighted with type badges in the exported report.
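The idea of keying nodes on name plus type can be sketched in a few lines. The `Entity` class, the helper names, and the case-insensitive normalization below are assumptions for illustration; the README does not describe how Knwler normalizes names internally.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    type: str
    descriptions: list[str] = field(default_factory=list)


def entity_key(name: str, type_: str) -> tuple[str, str]:
    """Case- and whitespace-insensitive identity for a node (assumed normalization)."""
    return (" ".join(name.split()).casefold(), type_.strip().casefold())


def add_mention(
    index: dict[tuple[str, str], Entity], name: str, type_: str, description: str
) -> Entity:
    """Attach a mention to the node with the same (name, type), creating it if needed."""
    entity = index.setdefault(entity_key(name, type_), Entity(name=name, type=type_))
    entity.descriptions.append(description)
    return entity


index: dict[tuple[str, str], Entity] = {}
add_mention(index, "Apple", "Company", "Maker of the iPhone")
add_mention(index, "apple", "Fruit", "Edible fruit of the apple tree")
add_mention(index, " APPLE ", "company", "Headquartered in Cupertino")
# The index now holds two nodes: one Company with two descriptions, one Fruit with one.
```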
Multilingual by Design
Language is auto-detected on every run. All prompts and console/UI output are localized. Currently supported: English, German, French, Spanish, Dutch, Italian, Portuguese, and Simplified Chinese. Adding a new language means extending a single JSON file. See Language.
Multi-Document Consolidation
Process multiple documents and consolidate the resulting knowledge graphs into a single unified graph. Entity descriptions from multiple sources are intelligently merged via LLM-powered summarization. Works both as a post-processing step and as a standalone command.
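A rough sketch of what consolidation involves, under stated assumptions: the graph layout (`nodes` with `name`, `type`, `description`; `edges` with `source`, `relation`, `target`) is hypothetical and not the documented `graph.json` schema, and `summarize` stands in for the LLM-powered merge step.

```python
from typing import Callable


def consolidate(graphs: list[dict], summarize: Callable[[list[str]], str]) -> dict:
    """Merge graphs: nodes sharing (name, type) collapse into one, edges are deduplicated."""
    descriptions: dict[tuple[str, str], list[str]] = {}
    edges: set[tuple[str, str, str]] = set()
    for graph in graphs:
        for node in graph["nodes"]:
            texts = descriptions.setdefault((node["name"], node["type"]), [])
            if node["description"] not in texts:
                texts.append(node["description"])
        for edge in graph["edges"]:
            # Edges reference nodes by name only, to keep the sketch short.
            edges.add((edge["source"], edge["relation"], edge["target"]))
    nodes = [
        {
            "name": name,
            "type": type_,
            # Only call the (potentially expensive) summarizer when sources differ.
            "description": texts[0] if len(texts) == 1 else summarize(texts),
        }
        for (name, type_), texts in descriptions.items()
    ]
    return {
        "nodes": nodes,
        "edges": [{"source": s, "relation": r, "target": t} for s, r, t in sorted(edges)],
    }
```

Passing the summarizer in as a callable keeps the merge logic testable without a model: a plain string join works for experiments, and an LLM call can be swapped in later.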
Cluster Detection & Topic Assignment
The Louvain algorithm automatically discovers clusters of related entities and an LLM labels each cluster with human-readable topics — giving you instant thematic insight into the document's structure.
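Louvain works by greedily moving nodes between communities to increase a score called modularity. The sketch below does not implement Louvain itself and is not taken from Knwler; it only computes the modularity of a given partition for an undirected, unweighted graph, which shows what a "good" clustering means to the algorithm. The function name and input shapes are assumptions for this example.

```python
from collections import defaultdict


def modularity(edges: list[tuple[str, str]], community: dict[str, int]) -> float:
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge share - expected share at random).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}
# Splitting at the bridge scores higher than lumping everything together.
assert modularity(edges, two_clusters) > modularity(edges, one_cluster)
```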
Self-Contained HTML Report
Export a single HTML file with interactive network visualization (with adjustable degree-threshold slider), entity index, topic overview, and rephrased text chunks — shareable without any server or dependencies. Markdown in rephrased chunks and descriptions is rendered properly.
Multiple templates out of the box:
- A standard report with network visualization
- A 3-column report without graph viz
- A graph-viz focused report with custom layout algorithm
See Templates and HTML Export.
Rich Export Ecosystem
- JSON: the canonical `graph.json` output for any downstream use
- GML / GraphML: open directly in yEd, Gephi, or any standards-compliant graph tool
- JSONLD / RDF: for triple stores like GraphDB, Stardog, and AWS Neptune. See JSONLD.
- HTML — standalone interactive report with customizable templates
- Neo4j — direct import with constraints and indexes. See [Neo4j](https://knwler.com/docs/neo4j.
