Hedl
Token-efficient data serialization for LLM/AI. 50% fewer tokens than JSON, 93% better value/token. Rust, schema validation, LSP.
README
The Problem
You're building AI applications and sending structured data to LLMs. Like everyone else, you're probably using JSON.
But have you actually looked at what you're paying for?
{"id": "u1", "name": "Alice", "email": "alice@company.com", "role": "admin"}
{"id": "u2", "name": "Bob", "email": "bob@company.com", "role": "user"}
{"id": "u3", "name": "Carol", "email": "carol@company.com", "role": "user"}
{"id": "u4", "name": "Dave", "email": "dave@company.com", "role": "user"}
{"id": "u5", "name": "Eve", "email": "eve@company.com", "role": "user"}
See those "id":, "name":, "email":, "role": strings? Each key appears five times, once per record. That's not your data. That's overhead. Pure waste.
At Claude's pricing ($3/million tokens), a 10,000-user dataset costs $15 just in repeated key names. Every single API call. The more records you have, the more you pay to say the same words over and over.
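You can measure the repeated-key overhead yourself. A minimal sketch, using byte counts as a rough stand-in for tokens (exact token counts depend on the model's tokenizer):

```rust
fn main() {
    // Key syntax that every record repeats: "id":, "name":, "email":, "role":
    let keys = [r#""id":"#, r#""name":"#, r#""email":"#, r#""role":"#];
    let overhead_per_record: usize = keys.iter().map(|k| k.len()).sum();

    // The overhead grows linearly with record count.
    let records = 10_000;
    println!("{} bytes of key syntax per record", overhead_per_record);
    println!(
        "{} bytes of pure key repetition across {} records",
        overhead_per_record * records,
        records
    );
}
```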
The Solution
What if you could declare your structure once and then just send the data?
%V:2.0
%NULL:~
%QUOTE:"
%S:User:[id, name, email, role]
---
users: @User
|u1,Alice,alice@company.com,admin
|u2,Bob,bob@company.com,user
|u3,Carol,carol@company.com,user
|u4,Dave,dave@company.com,user
|u5,Eve,eve@company.com,user
Same data. 56% fewer tokens.
The schema declaration (%S:) lets you define your structure once, then send only the values. No repeated keys, no brackets, no quotes around simple strings.
This is HEDL: Hierarchical Entity Data Language. A data format designed from the ground up for the economics of LLM applications.
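The saving is easy to reproduce. A sketch comparing the two snippets above at the byte level (byte counts understate the tokenizer-level difference, so treat the printed percentage as a conservative floor, not the benchmarked 56%):

```rust
fn main() {
    // The five JSON records from above.
    let json = r#"{"id": "u1", "name": "Alice", "email": "alice@company.com", "role": "admin"}
{"id": "u2", "name": "Bob", "email": "bob@company.com", "role": "user"}
{"id": "u3", "name": "Carol", "email": "carol@company.com", "role": "user"}
{"id": "u4", "name": "Dave", "email": "dave@company.com", "role": "user"}
{"id": "u5", "name": "Eve", "email": "eve@company.com", "role": "user"}"#;

    // The same data as HEDL: schema declared once, then values only.
    let hedl = r#"%V:2.0
%NULL:~
%QUOTE:"
%S:User:[id, name, email, role]
---
users: @User
|u1,Alice,alice@company.com,admin
|u2,Bob,bob@company.com,user
|u3,Carol,carol@company.com,user
|u4,Dave,dave@company.com,user
|u5,Eve,eve@company.com,user"#;

    let saved = 100.0 * (1.0 - hedl.len() as f64 / json.len() as f64);
    println!(
        "JSON: {} bytes, HEDL: {} bytes (~{:.0}% smaller)",
        json.len(),
        hedl.len(),
        saved
    );
}
```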
flowchart LR
subgraph Input["Your Data"]
JSON["JSON"]
XML["XML"]
YAML["YAML"]
CSV["CSV"]
More["..."]
end
subgraph MCP["HEDL MCP Server"]
Convert["Auto-Convert"]
end
subgraph LLM["LLM"]
AI["Claude / GPT / etc."]
end
JSON --> Convert
XML --> Convert
YAML --> Convert
CSV --> Convert
More --> Convert
Convert -->|"56% fewer tokens"| AI
AI -->|"Response"| Convert
Convert -->|"Back to original format"| JSON
style Convert fill:#ff9,stroke:#333
style AI fill:#9ff,stroke:#333
The MCP server handles everything automatically. Your AI agent sends JSON like it always did, the server converts to HEDL (saving you 56% on tokens), the LLM processes it, and responses come back in your original format. Zero code changes on your end.
Quickstart
Option 1: MCP Server (Recommended)
The fastest way to start saving tokens is the MCP server. Add HEDL to your AI agent with zero code changes.
{
  "mcpServers": {
    "hedl": {
      "command": "hedl-mcp",
      "args": ["--auto-convert"]
    }
  }
}
That's literally it. Your agent now uses 56% fewer tokens automatically.
<p align="center"> <a href="https://dweve-ai.github.io/hedl-playground/"><strong>Try the Live Demo</strong></a> - Convert JSON to HEDL in your browser </p>

Option 2: CLI
If you want to experiment with HEDL from the command line, install the CLI:
cargo install hedl-cli
# Convert your existing JSON to HEDL
echo '{"users": [{"name": "Alice"}, {"name": "Bob"}]}' | hedl from-json
# Convert back to JSON when you need it
echo '%V:2.0
%NULL:~
%QUOTE:"
---
greeting: Hello, World!' | hedl to-json
Option 3: Rust Library
For full control, use the library directly:
cargo add hedl
use hedl::{parse, to_json, from_json};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let doc = parse(r#"
%V:2.0
%NULL:~
%QUOTE:"
%S:User:[id,name,role]
---
users: @User
|alice,Alice Smith,admin
|bob,Bob Jones,user
"#)?;

    // Convert to JSON for APIs that need it
    let json = to_json(&doc)?;

    // Convert JSON back to HEDL for your LLM prompts
    let hedl = from_json(&json)?;

    Ok(())
}
Why HEDL
"But will LLMs actually understand it?"
This was the first question we asked ourselves. We didn't assume the answer. We tested it.
We ran 571 structured data extraction questions across 7 real-world datasets, testing Mistral Large, DeepSeek Chat, and NVIDIA GLM-4.7. Real questions. Real data. Rigorous methodology.
| Format | Accuracy | Tokens/Question | Accuracy per 1K Tokens |
|--------|:--------:|:---------------:|:----------------------:|
| HEDL | 80.4% | 6,912 | 0.12 |
| JSON | 70.1% | 15,697 | 0.05 |
| YAML | 69.8% | 13,535 | 0.05 |
| TOON | 68.2% | 7,320 | 0.09 |
| XML | 68.6% | 18,164 | 0.04 |
| CSV | 67.3% | 8,049 | 0.08 |
HEDL delivers 2.4x more correct answers per token than JSON.
HEDL wins on both accuracy (+10.3 percentage points over JSON) and efficiency (56% fewer tokens). At scale, this compounds dramatically: for the same budget, HEDL lets you send 2x the context while getting more correct answers.
CSV is efficient but falls apart on complex queries. YAML is nearly as verbose as JSON. XML is worst on both metrics. TOON is another token-efficient format, but HEDL beats it by +12.2 accuracy points with similar token usage.
HEDL is the only format that's both compact AND comprehensible to LLMs.
The Token Economics
Here's what real benchmarks look like. Real data. Real savings.
| Dataset Type | JSON Tokens | HEDL Tokens | Savings |
|--------------|:-----------:|:-----------:|:-------:|
| Flat user records | 15,697 | 6,912 | 56.0% |
| Product catalog | 15,623 | 6,842 | 56.2% |
| Nested blog posts | 15,771 | 6,981 | 55.7% |
| Order history | 15,698 | 6,912 | 56.0% |
| Config files | 476 | 210 | 55.9% |
Average savings: 56%
At scale, this adds up fast. A service processing 1 billion tokens monthly saves $1,680/month by switching from JSON to HEDL. Same data. Same comprehension. Half the cost.
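The arithmetic behind that figure, as a quick check:

```rust
fn main() {
    let monthly_tokens: f64 = 1_000_000_000.0; // 1B tokens/month
    let savings_rate = 0.56;                   // average HEDL savings
    let price_per_million = 3.0;               // $3 per million tokens

    let tokens_saved = monthly_tokens * savings_rate;       // 560M tokens
    let dollars_saved = tokens_saved / 1_000_000.0 * price_per_million;
    println!("${:.0}/month saved", dollars_saved);
}
```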
Beyond Token Savings
HEDL isn't just about compression. It's about building better AI applications.
Schema Validation catches malformed data before it hits your LLM:
%S:Product:[sku, name, price]
---
products: @Product
|ABC-123,Widget,29.99
|DEF-456,Gadget,not_a_price # Error caught at parse time
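The hedl parser performs this check for you; the idea can be sketched in plain Rust (validate_product_row is a hypothetical helper for illustration, not the library's API):

```rust
// Sketch: validate one Product row against the schema [sku, name, price].
// The hedl crate does the equivalent at parse time.
fn validate_product_row(row: &str) -> Result<(), String> {
    let fields: Vec<&str> = row.trim_start_matches('|').split(',').collect();
    if fields.len() != 3 {
        return Err(format!("expected 3 fields, got {}", fields.len()));
    }
    // price must parse as a number
    fields[2]
        .trim()
        .parse::<f64>()
        .map_err(|_| format!("'{}' is not a valid price", fields[2]))?;
    Ok(())
}

fn main() {
    assert!(validate_product_row("|ABC-123,Widget,29.99").is_ok());
    assert!(validate_product_row("|DEF-456,Gadget,not_a_price").is_err());
}
```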
Type-Safe References let you link entities without duplicating data:
users: @User
|alice,Alice Smith,alice@company.com
orders: @Order
|ord-001,@User:alice,2024-01-15,299.99
# ^^^^^^^^^^^^ validated at parse time
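The resolution step amounts to a set lookup. A minimal sketch (reference_is_valid is a hypothetical helper, not the crate's API; hedl validates references as part of parsing):

```rust
use std::collections::HashSet;

// Sketch: resolve an @User:id reference against the ids declared under users:.
fn reference_is_valid(reference: &str, users: &HashSet<&str>) -> bool {
    match reference.strip_prefix("@User:") {
        Some(id) => users.contains(id),
        None => false,
    }
}

fn main() {
    let users: HashSet<&str> = ["alice", "bob"].into_iter().collect();
    assert!(reference_is_valid("@User:alice", &users));
    assert!(!reference_is_valid("@User:mallory", &users)); // dangling reference
}
```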
List Literals use (...) syntax for ordered sequences:
%S:Article:[id,title,tags,score]
---
articles: @Article
|art-1,Introduction to HEDL,(tutorial,beginner,data),4.5
|art-2,Advanced Patterns,(advanced,optimization),4.8
|art-3,No Tags,(),3.2
Lists use (...) for any scalar values (strings, references, etc.), distinct from tensors [...] which are for numeric data only.
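Reading a (...) literal back into items is straightforward. A sketch (parse_list is a hypothetical helper for illustration; it ignores quoting and nesting, which the real parser handles):

```rust
// Sketch: parse a (...) list literal into its scalar items.
fn parse_list(lit: &str) -> Option<Vec<String>> {
    let inner = lit.strip_prefix('(')?.strip_suffix(')')?;
    if inner.is_empty() {
        return Some(Vec::new()); // () is an empty list
    }
    Some(inner.split(',').map(|s| s.trim().to_string()).collect())
}

fn main() {
    assert_eq!(parse_list("(tutorial,beginner,data)").unwrap().len(), 3);
    assert_eq!(parse_list("()").unwrap().len(), 0);
    assert!(parse_list("[1,2,3]").is_none()); // tensors use [...], not (...)
}
```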
LSP Integration gives you real-time validation and autocomplete in your editor: syntax highlighting, auto-completion (@Us → @User:alice), hover documentation, go-to-definition, and error squiggles before you even save the file.
Headers and Metadata
Every HEDL document starts with headers:
%V:2.0 # Version
%NULL:~ # Null character
%QUOTE:" # Quote character
Count Metadata helps LLMs understand your data without scanning all rows:
%V:2.0
%NULL:~
%QUOTE:"
%S:Order:[id,customer,status,total]
%C:Order.total=1247
%C:Order.status:delivered=892,shipped=234,pending=121
---
orders: @Order
|o1,cust-001,delivered,99.99
# ... 1246 more orders
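The %C breakdown is mechanical to consume. A sketch (parse_counts is a hypothetical helper, not the crate's API) showing that the per-status counts add up to the declared total:

```rust
use std::collections::HashMap;

// Sketch: parse a %C breakdown like "delivered=892,shipped=234,pending=121".
fn parse_counts(spec: &str) -> HashMap<String, u64> {
    spec.split(',')
        .filter_map(|pair| {
            let (key, value) = pair.split_once('=')?;
            Some((key.to_string(), value.parse::<u64>().ok()?))
        })
        .collect()
}

fn main() {
    let counts = parse_counts("delivered=892,shipped=234,pending=121");
    // 892 + 234 + 121 = 1247, matching %C:Order.total=1247
    assert_eq!(counts.values().sum::<u64>(), 1247);
}
```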
1-Space Indentation keeps things clean:
%V:2.0
%NULL:~
%QUOTE:"
---
root:
child: # Exactly 1 space
grandchild: # Exactly 1 space per level
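Because each level is exactly one space, nesting depth can be read directly off the leading whitespace. A sketch (depth is an illustrative helper, not part of the hedl crate):

```rust
// Sketch: with 1-space indentation, depth equals the count of leading spaces.
fn depth(line: &str) -> usize {
    line.len() - line.trim_start_matches(' ').len()
}

fn main() {
    assert_eq!(depth("root:"), 0);
    assert_eq!(depth(" child:"), 1);
    assert_eq!(depth("  grandchild:"), 2);
}
```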
Benchmarks
Performance (2026-02-02, release build)
| Operation | Latency (p50) | Size |
|-----------|:-------------:|:----:|
| Parsing | 37.1 µs | Tiny |
| Parsing | 396 µs | Small |
| Parsing | 12.1 ms | Medium |
| JSON Conversion | 10.0 µs | Tiny |
| JSON Conversion | 115 µs | Small |
| JSON Conversion | 1.10 ms | Medium |
| Validation | 23.7 µs | Small |
| Canonicalization | 83.5 µs | Tiny |
| Full Pipeline | 1.04 ms | Small |
Scaling Characteristics
HEDL scales linearly: O(n) parsing, O(depth) for nesting. Median latencies stay under 15ms for all document sizes, and tails are predictable (p99 latencies available in benchmark baselines). For really large files, hedl-stream provides streaming support with bounded memory usage.
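The bounded-memory idea is the familiar line-streaming pattern: because every row starts with |, you can handle one row at a time and drop it. A generic sketch (hedl-stream's actual API may differ):

```rust
use std::io::{BufRead, BufReader, Cursor};

fn main() {
    // Stand-in for a large file: any BufRead source works the same way.
    let doc = "%V:2.0\n---\nusers: @User\n|u1,Alice\n|u2,Bob\n|u3,Carol\n";
    let reader = BufReader::new(Cursor::new(doc));

    let mut rows = 0;
    for line in reader.lines() {
        let line = line.unwrap();
        if line.starts_with('|') {
            rows += 1; // process one row, then let it go out of scope
        }
    }
    assert_eq!(rows, 3);
}
```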
Test Coverage
We take testing seriously: 10,000+ tests across 19 crates. Unit tests, integration tests, property-based testing with proptest, and fuzz testing. Zero unsafe code in the core parser.
Ecosystem
HEDL plays well with others. Use it alongside your existing tools.
Format Conversion
| Format | Import | Export | Streaming | Use Case |
|--------|:------:|:------:|:---------:|----------|
| *
