EDN.C

A fast, zero-copy EDN (Extensible Data Notation) reader written in C11 with SIMD acceleration.

License: MIT

TL;DR - What is EDN?

EDN (Extensible Data Notation) is a data format similar to JSON, but richer and more extensible. Think of it as "JSON with superpowers":

  • JSON-like foundation: Maps {:key value}, vectors [1 2 3], strings, numbers, booleans, null (nil)
  • Additional built-in types: Sets #{:a :b}, keywords :keyword, symbols my-symbol, characters \newline, lists (1 2 3)
  • Extensible via tagged literals: #inst "2024-01-01", #uuid "..."; custom readers transform the data at parse time
  • Human-friendly: Comments, flexible whitespace, designed to be readable and writable by both humans and programs
  • Language-agnostic: Originally from Clojure, but useful anywhere you need rich, extensible data interchange

Why EDN over JSON? More expressive types (keywords, symbols, sets), native extensibility through tags (no more {"__type": "Date", "value": "..."} hacks), and better support for configuration files and data interchange in functional programming environments.

Learn more: Official EDN specification

Features

  • 🚀 Fast: SIMD-accelerated parsing with NEON (ARM64), SSE4.2 (x86_64) and SIMD128 (WebAssembly) support
  • 🌐 WebAssembly: Full WASM SIMD128 support for high-performance parsing in browsers and Node.js
  • 💾 Zero-copy: Minimal allocations, references input data where possible
  • 🎯 Simple API: Easy-to-use interface with comprehensive type support
  • 🧹 Memory-safe: Arena allocator for efficient cleanup - single edn_free() call
  • 🔧 Zero Dependencies: Pure C11 with standard library only
  • ✅ Fully Tested: 340+ tests across 24 test suites
  • 📖 UTF-8 Native: All string inputs and outputs are UTF-8 encoded
  • 🏷️ Tagged Literals: Extensible data types with custom reader support
  • 🗺️ Map Namespace Syntax: Clojure-compatible #:ns{...} syntax (optional, disabled by default)
  • 🔤 Extended Characters: \formfeed, \backspace, and octal \oNNN literals (optional, disabled by default)
  • 📝 Metadata: Clojure-style metadata ^{...} syntax (optional, disabled by default)
  • 📄 Text Blocks: Java-style multi-line text blocks """\n...\n""" (experimental, disabled by default)
  • 🔢 Ratio Numbers: Clojure-compatible ratio literals 22/7 (optional, disabled by default)
  • 🔣 Extended Integers: Hex (0xFF), octal (0777), binary (2r1010), and arbitrary radix (36rZZ) formats (optional, disabled by default)
  • 🔢 Underscore in Numeric Literals: Visual grouping with underscores 1_000_000, 3.14_15_92, 0xDE_AD_BE_EF (optional, disabled by default)

Installation

Requirements

  • C11 compatible compiler (GCC 4.9+, Clang 3.1+, MSVC 2015+)
  • Make (Unix/macOS) or CMake (Windows/cross-platform)
  • Supported platforms:
    • macOS (Apple Silicon M1/M2/M3, Intel) - NEON/SSE4.2 SIMD
    • Linux (ARM64, x86_64) - NEON/SSE4.2 SIMD
    • Windows (x86_64, ARM64) - NEON/SSE4.2 SIMD via MSVC/MinGW/Clang
    • WebAssembly - SIMD128 support for browsers and Node.js

Build Library

Unix/macOS/Linux:

# Clone the repository
git clone https://github.com/DotFox/edn.c.git
cd edn.c

# Build static library (libedn.a)
make

# Run tests to verify build
make test

Windows:

# Clone the repository
git clone https://github.com/DotFox/edn.c.git
cd edn.c

# Build with CMake (works with MSVC, MinGW, Clang)
.\build.bat

# Or use PowerShell script
.\build.ps1 -Test

See docs/WINDOWS.md for detailed Windows build instructions.

Integrate Into Your Project

Option 1: Link static library

# Compile your code
gcc -o myapp myapp.c -I/path/to/edn.c/include -L/path/to/edn.c -ledn

# Or add to your Makefile (libraries go in LDLIBS so they follow the object files at link time)
CFLAGS += -I/path/to/edn.c/include
LDFLAGS += -L/path/to/edn.c
LDLIBS += -ledn

Option 2: Include source directly

Copy include/edn.h and all files from src/ into your project and compile them together.

Quick Start

#include "edn.h"
#include <stdio.h>

int main(void) {
    const char *input = "{:name \"Alice\" :age 30 :languages [:clojure :rust]}";
    
    // Read EDN string
    edn_result_t result = edn_read(input, 0);
    
    if (result.error != EDN_OK) {
        fprintf(stderr, "Parse error at line %zu, column %zu: %s\n",
                result.error_start.line, result.error_start.column, result.error_message);
        return 1;
    }
    
    // Access the parsed map
    edn_value_t *map = result.value;
    printf("Parsed map with %zu entries\n", edn_map_count(map));
    
    // Look up a value by key
    edn_result_t key_result = edn_read(":name", 0);
    edn_value_t *name_value = edn_map_lookup(map, key_result.value);
    
    if (name_value != NULL && edn_type(name_value) == EDN_TYPE_STRING) {
        size_t len;
        const char *name = edn_string_get(name_value, &len);
        printf("Name: %.*s\n", (int)len, name);
    }
    
    // Clean up - frees all allocated memory
    edn_free(key_result.value);
    edn_free(map);
    
    return 0;
}

Output:

Parsed map with 3 entries
Name: Alice

Whitespace and Control Characters

EDN.C follows Clojure's exact behavior for whitespace and control character handling:

Whitespace Characters

The following characters act as whitespace delimiters (separate tokens):

| Character | Hex  | Name                 | Common Use          |
|-----------|------|----------------------|---------------------|
| (space)   | 0x20 | Space                | Standard spacing    |
| \t        | 0x09 | Tab                  | Indentation         |
| \n        | 0x0A | Line Feed (LF)       | Unix line ending    |
| \r        | 0x0D | Carriage Return (CR) | Windows line ending |
| \f        | 0x0C | Form Feed            | Page break          |
| \v        | 0x0B | Vertical Tab         | Vertical spacing    |
| ,         | 0x2C | Comma                | Optional separator  |
| FS        | 0x1C | File Separator       | Data separation     |
| GS        | 0x1D | Group Separator      | Data separation     |
| RS        | 0x1E | Record Separator     | Data separation     |
| US        | 0x1F | Unit Separator       | Data separation     |

Examples:

// All of these parse as vectors with 3 elements:
edn_read("[1 2 3]", 0);          // spaces
edn_read("[1,2,3]", 0);          // commas
edn_read("[1\t2\n3]", 0);        // tabs and newlines
edn_read("[1\f2\x1C" "3]", 0);   // formfeed and file separator (literal split so \x1C does not absorb the 3)

Control Characters in Identifiers

Control characters 0x00-0x1F (except whitespace delimiters) are valid in identifiers (symbols and keywords):

Valid identifier characters:

  • 0x00 - 0x08: NUL, SOH, STX, ETX, EOT, ENQ, ACK, BEL, Backspace
  • 0x0E - 0x1B: Shift Out through Escape

Examples:

// Backspace in symbol - valid!
edn_result_t r = edn_read("[\bfoo]", 0);  // 1-element vector
edn_vector_count(r.value);  // Returns 1
edn_free(r.value);

// Control characters in middle of identifier
const char input[] = {'[', 'f', 'o', 'o', 0x08, 'b', 'a', 'r', ']', 0};
r = edn_read(input, sizeof(input) - 1);
edn_vector_count(r.value);  // Returns 1 (symbol: "foo\bbar")
edn_free(r.value);

// Versus whitespace - separates into 2 elements
edn_result_t r2 = edn_read("[foo\tbar]", 0);  // Tab is whitespace
edn_vector_count(r2.value);  // Returns 2 (symbols: "foo" and "bar")
edn_free(r2.value);
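
The delimiter rules above can be sketched without the library as a small byte classifier plus a token counter. This is only an illustration under stated assumptions: `is_edn_whitespace` and `count_tokens` are hypothetical names, not part of edn.h, and the counter ignores brackets, strings, and comments. It shows why a backspace keeps `foo\bbar` in one token while a tab splits `foo\tbar` in two.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of edn.h): true for the whitespace
 * delimiters listed in the table above. */
static bool is_edn_whitespace(unsigned char c) {
    switch (c) {
    case 0x20: /* space */
    case 0x09: /* tab */
    case 0x0A: /* line feed */
    case 0x0B: /* vertical tab */
    case 0x0C: /* form feed */
    case 0x0D: /* carriage return */
    case 0x2C: /* comma */
    case 0x1C: /* file separator */
    case 0x1D: /* group separator */
    case 0x1E: /* record separator */
    case 0x1F: /* unit separator */
        return true;
    default:
        /* Includes 0x00-0x08 and 0x0E-0x1B, which stay inside identifiers. */
        return false;
    }
}

/* Hypothetical helper: counts runs of non-whitespace bytes in data[0..len).
 * An explicit length is used so embedded NUL bytes are handled. */
static size_t count_tokens(const char *data, size_t len) {
    size_t count = 0;
    bool in_token = false;
    for (size_t i = 0; i < len; i++) {
        if (is_edn_whitespace((unsigned char)data[i])) {
            in_token = false;
        } else if (!in_token) {
            in_token = true; /* first byte of a new token */
            count++;
        }
    }
    return count;
}
```

Under these assumptions, `count_tokens("foo\bbar", 7)` yields 1 and `count_tokens("foo\tbar", 7)` yields 2, matching the element counts in the examples above.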

Note on null bytes (0x00): When length is 0, edn_read() measures the input with strlen(), which stops at the first null byte and truncates the data. Always pass an explicit length for data containing null bytes:

const char data[] = {'[', 'a', 0x00, 'b', ']', 0};
edn_result_t r = edn_read(data, 5);  // Pass exact length: 5 bytes (excluding terminator)
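
The effect of the length argument can be checked without the library. The sketch below is an assumption-laden illustration: `effective_length` and `nul_data` are hypothetical names, and the helper only mirrors the documented rule "length in bytes, or 0 to use strlen(input)". It is not the library's actual code.

```c
#include <stddef.h>
#include <string.h>

/* Same bytes as the example above: '[', 'a', NUL, 'b', ']' plus a terminator. */
static const char nul_data[] = {'[', 'a', 0x00, 'b', ']', 0};

/* Hypothetical helper (not part of edn.h): mirrors the documented rule
 * "length, or 0 to use strlen(input)". */
static size_t effective_length(const char *input, size_t length) {
    return length != 0 ? length : strlen(input);
}
```

With these definitions, `effective_length(nul_data, 0)` is 2 because strlen() stops at the embedded null byte, while `effective_length(nul_data, sizeof(nul_data) - 1)` is 5 and covers the whole payload.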

API Reference

Core Functions

edn_read()

Read EDN from a UTF-8 string.

edn_result_t edn_read(const char *input, size_t length);

Parameters:

  • input: UTF-8 encoded string containing EDN data (must remain valid for zero-copy strings)
  • length: Length of input in bytes, or 0 to use strlen(input)

Returns: edn_result_t containing:

  • value: Parsed EDN value (NULL on error)
  • error: Error code (EDN_OK on success)
  • error_start: Start of error range (edn_error_position_t with offset, line, column)
  • error_end: End of error range (edn_error_position_t with offset, line, column)
  • error_message: Human-readable error description

Important: The returned value must be freed with edn_free().

edn_free()

Free an EDN value and all associated memory.

void edn_free(edn_value_t *value);

Parameters:

  • value: Value to free (may be NULL)

Note: This frees the entire value tree. Do not call free() on indi
