TeaLeaf Data Format

Rust CI .NET CI crates.io NuGet codecov License: MIT

A schema-aware data format with human-readable text and compact binary representation.

~51% fewer input tokens than JSON for LLM applications, with zero accuracy loss.

Overview

TeaLeaf is a data format that combines:

  • Human-readable text (.tl) for editing and version control
  • Compact binary (.tlbx) for storage and transmission
  • Inline schemas for validation and compression
  • JSON interoperability for easy integration

Motivation

The existing data format landscape presents trade-offs that TeaLeaf attempts to bridge. TeaLeaf is not meant to replace any of the formats listed below; it offers a different set of trade-offs that users can compare against their own use cases.

| Format | Observation |
|--------|-------------|
| JSON | Verbose, no comments, no schema |
| YAML | Indentation-sensitive, error-prone at scale |
| Protobuf | Schema external, binary-only, requires codegen |
| Avro | Schema embedded but not human-readable |
| CSV/TSV/TOON | Too simple for nested or typed data |
| MessagePack/CBOR | Compact but schemaless |

Converting some formats to binary yielded marginal benefits. Schema information was almost always external, requiring coordination between files.

TeaLeaf was designed to unify these concerns: a single file that humans can read and edit, that compiles to an efficient binary, with schemas inline rather than external. Although the format is general-purpose, LLM context-engineering use cases can benefit from its significantly lower token count compared to JSON.

Quick Compare: JSON vs TeaLeaf

The same data — TeaLeaf uses schemas so field names are defined once, not repeated per record:

<table>
<tr>
<th>TeaLeaf (schemas with nested structures)</th>
<th>JSON (no schema, names repeated)</th>
</tr>
<tr>
<td valign="top">

```tl
# Schema: define structure once
@struct Location (city: string, country: string)
@struct Department (name: string, location: Location)
@struct Employee (
  id: int,
  name: string,
  role: string,
  department: Department,
  skills: []string,
)

# Data: field names not repeated
employees: @table Employee [
  (1, "Alice", "Engineer",
    ("Platform", ("Seattle", "USA")),
    ["rust", "python"])
  (2, "Bob", "Designer",
    ("Product", ("Austin", "USA")),
    ["figma", "css"])
  (3, "Carol", "Manager",
    ("Platform", ("Seattle", "USA")),
    ["leadership", "agile"])
]
```

</td>
<td valign="top">

```json
{
  "employees": [
    {
      "id": 1,
      "name": "Alice",
      "role": "Engineer",
      "department": {
        "name": "Platform",
        "location": {
          "city": "Seattle",
          "country": "USA"
        }
      },
      "skills": ["rust", "python"]
    },
    {
      "id": 2,
      "name": "Bob",
      "role": "Designer",
      "department": {
        "name": "Product",
        "location": {
          "city": "Austin",
          "country": "USA"
        }
      },
      "skills": ["figma", "css"]
    },
    {
      "id": 3,
      "name": "Carol",
      "role": "Manager",
      "department": {
        "name": "Platform",
        "location": {
          "city": "Seattle",
          "country": "USA"
        }
      },
      "skills": ["leadership", "agile"]
    }
  ]
}
```

</td>
</tr>
</table>
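
To make the correspondence between the two columns concrete, here is a minimal Python sketch that turns one of the JSON records into the positional tuple shown on the TeaLeaf side. The `SCHEMAS` dictionary and the `to_row` helper are hypothetical names made up for this illustration; they are not part of any TeaLeaf library, and the real `from-json` conversion may work differently.

```python
import json

# Field order mirrors the @struct definitions in the comparison above.
# Each entry is (field name, nested struct name or None for plain values).
# This layout is an assumption for illustration, not a TeaLeaf API.
SCHEMAS = {
    "Location": [("city", None), ("country", None)],
    "Department": [("name", None), ("location", "Location")],
    "Employee": [
        ("id", None),
        ("name", None),
        ("role", None),
        ("department", "Department"),
        ("skills", None),
    ],
}


def to_row(record, struct_name):
    """Turn a JSON object into a positional tuple, recursing into nested structs."""
    row = []
    for field, nested in SCHEMAS[struct_name]:
        value = record[field]
        row.append(to_row(value, nested) if nested else value)
    return tuple(row)


alice = json.loads("""
{
  "id": 1, "name": "Alice", "role": "Engineer",
  "department": {"name": "Platform",
                 "location": {"city": "Seattle", "country": "USA"}},
  "skills": ["rust", "python"]
}
""")

row = to_row(alice, "Employee")
print(row)
```

The field names appear once, in `SCHEMAS`, and each record carries only values in schema order, which is the property the TeaLeaf column relies on.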

Why This Matters:

| Aspect | JSON | TeaLeaf |
|--------|------|---------|
| Field names | Repeated for every record | Defined once in schema |
| Types | Implicit, inferred at runtime | Explicit in schema, structural checks at parse |
| Binary size | Large (names + values) | Compact (positional data only) |
| LLM tokens | 9,829 tokens (retail example shown below) | 5,632 tokens (43% fewer) |
| Validation | External tools needed | Field count validation via schema |

The schema approach means:

  • Text format is human-readable with explicit types
  • Binary format stores only values (field names in schema table)
  • String deduplication — "Seattle", "USA", "Platform" stored once, referenced by index
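
String deduplication can be pictured with a shared string table. The following Python sketch shows the general technique only; it is not TeaLeaf's actual binary layout, and `build_string_table` is a hypothetical helper written for this example.

```python
def build_string_table(rows):
    """Replace every string in nested rows with an index into a shared table."""
    table = []      # unique strings in first-seen order
    index_of = {}   # string -> position in table

    def encode(value):
        if isinstance(value, str):
            if value not in index_of:
                index_of[value] = len(table)
                table.append(value)
            return index_of[value]
        if isinstance(value, (tuple, list)):
            # Preserve the container type while encoding its items.
            return type(value)(encode(v) for v in value)
        # Ints and other scalars pass through unchanged. In this sketch the
        # schema is what tells a reader which positions hold string indices.
        return value

    return table, [encode(r) for r in rows]


rows = [
    (1, "Alice", "Engineer", ("Platform", ("Seattle", "USA")), ["rust", "python"]),
    (3, "Carol", "Manager", ("Platform", ("Seattle", "USA")), ["leadership", "agile"]),
]
table, encoded = build_string_table(rows)
print(table)
print(encoded)
```

Both rows end up pointing at the same table entries for "Platform", "Seattle", and "USA", so each of those strings is stored once no matter how many records use it.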

Workflow Real Example

A complete retail orders dataset demonstrating the full TeaLeaf workflow:

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           RETAIL ORDERS WORKFLOW                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   retail_orders.json ──────► retail_orders.tl ───────► retail_orders.tlbx   │
│        36.8 KB       from-json     14.5 KB     compile       6.9 KB         │
│      9,829 tokens                5,632 tokens (43% fewer)                   │
│                                                                             │
│   • 10 orders            • 11 schemas defined      • 81% size reduction     │
│   • 4 products           • Human-readable          • 43% fewer LLM tokens   │
│   • 3 customers          • Comments & formatting   • Fast transmission      │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                              LLM ANALYSIS                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   test_retail_analysis.ps1                                                  │
│         │                                                                   │
│         ▼                                                                   │
│   Anthropic API (retail_orders.tl) ──────► responses/retail_analysis.tl     │
│                                                                             │
│   • Sends TeaLeaf-formatted order data  • Business intelligence insights    │
│   • Schema-first = fewer tokens         • Revenue analysis                  │
│   • Structured prompts                  • Customer segmentation             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

Try it yourself:

| File | Description |
|------|-------------|
| examples/retail_orders.json | Original JSON (36.8 KB, 9,829 tokens) |
| examples/retail_orders.tl | TeaLeaf text format (14.5 KB, 5,632 tokens) |
| examples/retail_orders.tlbx | TeaLeaf binary (6.9 KB) |
| examples/test_retail_analysis.ps1 | Send to Anthropic API |
| examples/responses/retail_analysis.tl | Anthropic's analysis |
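
The percentages quoted for the retail example follow directly from the listed sizes and token counts. A quick Python check, where `reduction_pct` is just an illustrative helper:

```python
def reduction_pct(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


# Figures taken from the retail_orders example files.
token_saving = reduction_pct(9_829, 5_632)   # JSON tokens -> .tl tokens
binary_saving = reduction_pct(36.8, 6.9)     # JSON KB -> .tlbx KB
text_saving = reduction_pct(36.8, 14.5)      # JSON KB -> .tl KB (derived here)

print(f"tokens: {round(token_saving)}% fewer")
print(f"binary: {round(binary_saving)}% smaller")
print(f"text:   {round(text_saving)}% smaller")
```

The first two results round to the 43% and 81% figures quoted for this example; the text-size reduction is not stated in the README and is derived from the same numbers.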


Installation

Pre-built Binaries

Download the latest release from GitHub Releases.

| Platform | Architecture | Download |
|----------|--------------|----------|
| Windows | x64 | tealeaf-windows-x64.zip |
| Windows | ARM64 | tealeaf-windows-arm64.zip |
| Linux | x64 | tealeaf-linux-x64.tar.gz |
| Linux | ARM64 | tealeaf-linux-arm64.tar.gz |
| Linux (musl) | x64 | tealeaf-linux-musl-x64.tar.gz |
| macOS | x64 (Intel) | tealeaf-macos-x64.tar.gz |
| macOS | ARM64 (Apple Silicon) | tealeaf-macos-arm64.tar.gz |
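
When scripting the download, the table above can be encoded as a lookup. The sketch below is a convenience under stated assumptions, not an official installer: `asset_name` and `download_url` are hypothetical helpers, the alias tables cover only common `platform.system()` and `platform.machine()` values, and musl detection is left to the caller. The archive names and the `releases/latest/download` URL pattern are the ones used elsewhere in this README.

```python
import platform

BASE_URL = "https://github.com/krishjag/tealeaf/releases/latest/download"

# (os, arch) -> archive name, copied from the download table.
ASSETS = {
    ("windows", "x64"): "tealeaf-windows-x64.zip",
    ("windows", "arm64"): "tealeaf-windows-arm64.zip",
    ("linux", "x64"): "tealeaf-linux-x64.tar.gz",
    ("linux", "arm64"): "tealeaf-linux-arm64.tar.gz",
    ("linux-musl", "x64"): "tealeaf-linux-musl-x64.tar.gz",
    ("macos", "x64"): "tealeaf-macos-x64.tar.gz",
    ("macos", "arm64"): "tealeaf-macos-arm64.tar.gz",
}

# Assumed mappings from Python's platform strings to the table's labels.
OS_ALIASES = {"windows": "windows", "linux": "linux", "darwin": "macos"}
ARCH_ALIASES = {"x86_64": "x64", "amd64": "x64", "arm64": "arm64", "aarch64": "arm64"}


def asset_name(system, machine, musl=False):
    """Return the archive name for a platform; raises KeyError if not listed."""
    os_key = OS_ALIASES[system.lower()]
    if musl and os_key == "linux":
        os_key = "linux-musl"
    return ASSETS[(os_key, ARCH_ALIASES[machine.lower()])]


def download_url(system=None, machine=None, musl=False):
    """Build the release URL, defaulting to the machine running the script."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return f"{BASE_URL}/{asset_name(system, machine, musl)}"


print(download_url("Linux", "x86_64"))
```

Calling `download_url()` with no arguments uses the current machine. A combination missing from the table, such as musl on ARM64, raises `KeyError` rather than guessing a file name.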

Quick Install

Windows (PowerShell):

```powershell
# Download and extract to the current directory
Invoke-WebRequest -Uri "https://github.com/krishjag/tealeaf/releases/latest/download/tealeaf-windows-x64.zip" -OutFile tealeaf.zip
Expand-Archive tealeaf.zip -DestinationPath .

# Optional: add the current directory to PATH for this session only
$env:PATH += ";$PWD"
```

Linux/macOS:

```bash
# Download and extract (replace the archive name with the one for your platform)
curl -LO https://github.com/krishjag/tealeaf/releases/latest/download/tealeaf-linux-x64.tar.gz
tar -xzf tealeaf-linux-x64.tar.gz

# Optional: move the binary to a directory on your PATH
sudo mv tealeaf /usr/local/bin/
```
