TeaLeaf Data Format

A schema-aware data format with human-readable text and compact binary representation.

~51% fewer input tokens than JSON for LLM applications, with zero accuracy loss.

- Overview
- Installation
- CLI
- Language Bindings
- Design Rationale
  - Size Comparison
- Use Cases
  - Context Engineering (LLM/AI)
  - Other Use Cases
- Specification

Overview

TeaLeaf is a data format that combines:

- Human-readable text (.tl) for editing and version control
- Compact binary (.tlbx) for storage and transmission
- Inline schemas for validation and compression
- JSON interoperability for easy integration

Motivation

The existing data format landscape presents trade-offs that TeaLeaf attempts to bridge. TeaLeaf does not attempt to replace any of the formats listed below, but rather presents a different perspective that users can objectively compare to identify if it fits their specific use cases.

| Format | Observation |
|--------|-------------|
| JSON | Verbose, no comments, no schema |
| YAML | Indentation-sensitive, error-prone at scale |
| Protobuf | Schema external, binary-only, requires codegen |
| Avro | Schema embedded but not human-readable |
| CSV/TSV/TOON | Too simple for nested or typed data |
| MessagePack/CBOR | Compact but schemaless |

Converting some formats to binary yielded marginal benefits. Schema information was almost always external, requiring coordination between files.

TeaLeaf was designed to unify these concerns: a single file that humans can read and edit, that compiles to an efficient binary, with schemas inline rather than external. Though the format is general-purpose, LLM context engineering use cases can take advantage of its significant token efficiency compared to JSON.

Quick Compare: JSON vs TeaLeaf

The same data — TeaLeaf uses schemas so field names are defined once, not repeated per record:

<table> <tr> <th>TeaLeaf (schemas with nested structures)</th> <th>JSON (no schema, names repeated)</th> </tr> <tr> <td valign="top">

# Schema: define structure once
@struct Location (city: string, country: string)
@struct Department (name: string, location: Location)
@struct Employee (
  id: int,
  name: string,
  role: string,
  department: Department,
  skills: []string,
)

# Data: field names not repeated
employees: @table Employee [
  (1, "Alice", "Engineer",
    ("Platform", ("Seattle", "USA")),
    ["rust", "python"])
  (2, "Bob", "Designer",
    ("Product", ("Austin", "USA")),
    ["figma", "css"])
  (3, "Carol", "Manager",
    ("Platform", ("Seattle", "USA")),
    ["leadership", "agile"])
]

</td> <td valign="top">

{
  "employees": [
    {
      "id": 1,
      "name": "Alice",
      "role": "Engineer",
      "department": {
        "name": "Platform",
        "location": {
          "city": "Seattle",
          "country": "USA"
        }
      },
      "skills": ["rust", "python"]
    },
    {
      "id": 2,
      "name": "Bob",
      "role": "Designer",
      "department": {
        "name": "Product",
        "location": {
          "city": "Austin",
          "country": "USA"
        }
      },
      "skills": ["figma", "css"]
    },
    {
      "id": 3,
      "name": "Carol",
      "role": "Manager",
      "department": {
        "name": "Platform",
        "location": {
          "city": "Seattle",
          "country": "USA"
        }
      },
      "skills": ["leadership", "agile"]
    }
  ]
}

</td> </tr> </table>
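To make the comparison concrete, here is a minimal Python sketch of how one of the JSON records above maps onto the positional tuple shown in the TeaLeaf column. It is an illustration only, not the TeaLeaf implementation: the `SCHEMAS` table and the `to_positional` helper are hypothetical names that mirror the `@struct` definitions in the example.

```python
import json

# Hypothetical schema table mirroring the @struct definitions above:
# struct name -> ordered list of (field name, field type).
SCHEMAS = {
    "Location": [("city", "string"), ("country", "string")],
    "Department": [("name", "string"), ("location", "Location")],
    "Employee": [
        ("id", "int"),
        ("name", "string"),
        ("role", "string"),
        ("department", "Department"),
        ("skills", "[]string"),
    ],
}


def to_positional(record, struct_name):
    """Render a JSON-style dict as a positional tuple in schema field order."""
    parts = []
    for field, field_type in SCHEMAS[struct_name]:
        value = record[field]
        if field_type in SCHEMAS:
            # Nested struct: recurse, so the inner field names are dropped too.
            parts.append(to_positional(value, field_type))
        else:
            # Scalars and lists: JSON literal syntax matches the example above.
            parts.append(json.dumps(value))
    return "(" + ", ".join(parts) + ")"


alice = {
    "id": 1,
    "name": "Alice",
    "role": "Engineer",
    "department": {
        "name": "Platform",
        "location": {"city": "Seattle", "country": "USA"},
    },
    "skills": ["rust", "python"],
}

row = to_positional(alice, "Employee")
print(row)
# (1, "Alice", "Engineer", ("Platform", ("Seattle", "USA")), ["rust", "python"])
```

The field names appear only in the schema table, so each additional record costs only its values.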

Why This Matters:

| Aspect | JSON | TeaLeaf |
|--------|------|---------|
| Field names | Repeated for every record | Defined once in schema |
| Types | Implicit, inferred at runtime | Explicit in schema, structural checks at parse |
| Binary size | Large (names + values) | Compact (positional data only) |
| LLM tokens | 9,829 tokens (retail example shown below) | 5,632 tokens (43% fewer) |
| Validation | External tools needed | Field count validation via schema |
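The "field count validation" row can be pictured with a short Python sketch. It assumes a schema is simply an ordered list of field names; `EMPLOYEE_FIELDS` and `check_field_count` are hypothetical names for illustration and do not describe TeaLeaf's actual parser.

```python
# Field order taken from the Employee @struct in the comparison above.
EMPLOYEE_FIELDS = ["id", "name", "role", "department", "skills"]


def check_field_count(row, fields):
    """Reject a positional row whose length differs from the schema's."""
    if len(row) != len(fields):
        raise ValueError(
            f"expected {len(fields)} values ({', '.join(fields)}), got {len(row)}"
        )
    # Pair each value with its schema-defined name.
    return dict(zip(fields, row))


good_row = (1, "Alice", "Engineer", ("Platform", ("Seattle", "USA")), ["rust", "python"])
employee = check_field_count(good_row, EMPLOYEE_FIELDS)
print(employee["role"])  # Engineer
```

A row with a missing or extra value fails immediately, which is the kind of structural check a positional format needs because there are no per-record names to fall back on.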

The schema approach means:

- Text format is human-readable with explicit types
- Binary format stores only values (field names in schema table)
- String deduplication: "Seattle", "USA", "Platform" stored once, referenced by index
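String deduplication can be sketched in a few lines of Python. This is a simplified illustration under the assumption that every value in a row is a string; `dedupe_strings` is a hypothetical helper, and the actual .tlbx layout is not described in this section.

```python
def dedupe_strings(rows):
    """Store each distinct string once and replace it with a table index."""
    table = []   # unique strings, in first-seen order
    index = {}   # string -> position in `table`
    encoded = []
    for row in rows:
        refs = []
        for value in row:
            if value not in index:
                index[value] = len(table)
                table.append(value)
            refs.append(index[value])
        encoded.append(refs)
    return table, encoded


# Department name, city, and country for the three employees above.
rows = [
    ("Platform", "Seattle", "USA"),
    ("Product", "Austin", "USA"),
    ("Platform", "Seattle", "USA"),
]

table, encoded = dedupe_strings(rows)
print(table)    # ['Platform', 'Seattle', 'USA', 'Product', 'Austin']
print(encoded)  # [[0, 1, 2], [3, 4, 2], [0, 1, 2]]
```

Nine string occurrences collapse to five table entries plus small integer references, and the saving grows as more records repeat the same values.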

Workflow Real Example

A complete retail orders dataset demonstrating the full TeaLeaf workflow:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           RETAIL ORDERS WORKFLOW                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   retail_orders.json ──────► retail_orders.tl ───────► retail_orders.tlbx   │
│        36.8 KB       from-json     14.5 KB     compile       6.9 KB         │
│      9,829 tokens                5,632 tokens (43% fewer)                   │
│                                                                             │
│   • 10 orders            • 11 schemas defined      • 81% size reduction     │
│   • 4 products           • Human-readable          • 43% fewer LLM tokens   │
│   • 3 customers          • Comments & formatting   • Fast transmission      │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                              LLM ANALYSIS                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   test_retail_analysis.ps1                                                  │
│         │                                                                   │
│         ▼                                                                   │
│   Anthropic API (retail_orders.tl) ──────► responses/retail_analysis.tl     │
│                                                                             │
│   • Sends TeaLeaf-formatted order data  • Business intelligence insights    │
│   • Schema-first = fewer tokens         • Revenue analysis                  │
│   • Structured prompts                  • Customer segmentation             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Try it yourself:

| File | Description |
|------|-------------|
| examples/retail_orders.json | Original JSON (36.8 KB, 9,829 tokens) |
| examples/retail_orders.tl | TeaLeaf text format (14.5 KB, 5,632 tokens) |
| examples/retail_orders.tlbx | TeaLeaf binary (6.9 KB) |
| examples/test_retail_analysis.ps1 | Send to Anthropic API |
| examples/responses/retail_analysis.tl | Anthropic's analysis |
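The percentages quoted for the retail example follow directly from the file sizes and token counts in the table. A quick Python check, with `reduction_pct` as a hypothetical helper name:

```python
def reduction_pct(before, after):
    """Percentage saved going from `before` to `after`, rounded to an integer."""
    return round((before - after) / before * 100)


json_kb, tl_kb, tlbx_kb = 36.8, 14.5, 6.9
json_tokens, tl_tokens = 9829, 5632

print(reduction_pct(json_kb, tlbx_kb))        # 81 (JSON -> .tlbx size)
print(reduction_pct(json_tokens, tl_tokens))  # 43 (JSON -> .tl tokens)
print(reduction_pct(json_kb, tl_kb))          # 61 (JSON -> .tl size)
```

The first two results match the 81% size reduction and 43% token reduction shown in the workflow diagram.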

Installation

Pre-built Binaries

Download the latest release from GitHub Releases.

| Platform | Architecture | Download |
|----------|-------------|----------|
| Windows | x64 | tealeaf-windows-x64.zip |
| Windows | ARM64 | tealeaf-windows-arm64.zip |
| Linux | x64 | tealeaf-linux-x64.tar.gz |
| Linux | ARM64 | tealeaf-linux-arm64.tar.gz |
| Linux (musl) | x64 | tealeaf-linux-musl-x64.tar.gz |
| macOS | x64 (Intel) | tealeaf-macos-x64.tar.gz |
| macOS | ARM64 (Apple Silicon) | tealeaf-macos-arm64.tar.gz |

Quick Install

Windows (PowerShell):

# Download and extract to current directory
Invoke-WebRequest -Uri "https://github.com/krishjag/tealeaf/releases/latest/download/tealeaf-windows-x64.zip" -OutFile tealeaf.zip
Expand-Archive tealeaf.zip -DestinationPath .

# Optional: add to PATH
$env:PATH += ";$PWD"

Linux/macOS:

# Download and extract (replace with your platform)
curl -LO https://github.com/krishjag/tealeaf/releases/latest/download/tealeaf-linux-x64.tar.gz
tar -xzf tealeaf-linux-x64.tar.gz

# Optional: move to PATH
sudo mv tealeaf /usr/local/bin/
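For scripted installs, the "replace with your platform" step can be automated. The Python sketch below picks the archive name from the download table and builds the URL using the same pattern as the commands above. The `asset_url` helper and the alias tables are assumptions for illustration, and musl detection is left to the caller through a flag.

```python
import platform

BASE_URL = "https://github.com/krishjag/tealeaf/releases/latest/download/"

# (platform, architecture) -> archive name, copied from the download table.
ASSETS = {
    ("windows", "x64"): "tealeaf-windows-x64.zip",
    ("windows", "arm64"): "tealeaf-windows-arm64.zip",
    ("linux", "x64"): "tealeaf-linux-x64.tar.gz",
    ("linux", "arm64"): "tealeaf-linux-arm64.tar.gz",
    ("linux-musl", "x64"): "tealeaf-linux-musl-x64.tar.gz",
    ("macos", "x64"): "tealeaf-macos-x64.tar.gz",
    ("macos", "arm64"): "tealeaf-macos-arm64.tar.gz",
}

# Map the values reported by the `platform` module onto the table's labels.
OS_ALIASES = {"Windows": "windows", "Linux": "linux", "Darwin": "macos"}
ARCH_ALIASES = {"x86_64": "x64", "amd64": "x64", "arm64": "arm64", "aarch64": "arm64"}


def asset_url(system=None, machine=None, musl=False):
    """Return the release URL for a platform (defaults to the current one)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    os_name = OS_ALIASES.get(system)
    arch = ARCH_ALIASES.get(machine.lower())
    if musl and os_name == "linux":
        os_name = "linux-musl"
    asset = ASSETS.get((os_name, arch))
    if asset is None:
        raise ValueError(f"no pre-built binary listed for {system}/{machine}")
    return BASE_URL + asset


print(asset_url("Linux", "x86_64"))
# https://github.com/krishjag/tealeaf/releases/latest/download/tealeaf-linux-x64.tar.gz
```

Calling `asset_url()` with no arguments resolves the URL for the machine running the script. Combinations missing from the table, such as musl on ARM64, raise a `ValueError` rather than guessing a file name.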
