
Rosettes

⌾⌾⌾ Rosettes — ReDoS-safe syntax highlighter for Python 3.14+ with free-threading.

A Python syntax highlighter and Pygments alternative for secure code highlighting and existing CSS themes.

from rosettes import highlight

html = highlight("def hello(): print('world')", "python")

What is Rosettes?

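Rosettes is a syntax highlighter for Python 3.14+, designed for the free-threaded build (3.14t). Every lexer is a hand-written state machine that runs in O(n) time with no regex backtracking, which rules out ReDoS and makes it safe for untrusted input in web apps and APIs.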

Why people pick it:

  • O(n) guaranteed — Hand-written state machines, no regex backtracking
  • Zero ReDoS — No exploitable patterns, safe for untrusted input
  • Free-threading native — All lexer state is local variables, keyword tables are frozenset, tokens are immutable. Highlight from any number of threads with zero contention.
  • Pygments compatible — Drop-in CSS class compatibility for existing themes
  • 55 languages — Python, JavaScript, Rust, Go, and 51 more
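To make the backtracking point concrete, here is a small sketch that is not taken from Rosettes. It compares a regex with nested quantifiers against a hand-written single pass that accepts the same strings. The pattern, the `all_a_linear` helper, and the sample inputs are illustrative assumptions.

```python
import re

# A classic backtracking-prone pattern: nested quantifiers over the same character.
RISKY = re.compile(r"(a+)+")


def all_a_linear(text: str) -> bool:
    """Accept one or more 'a' characters with a single left-to-right pass."""
    if not text:
        return False
    for ch in text:
        if ch != "a":
            return False  # reject at the first mismatch; there is nothing to retry
    return True


# Both approaches agree on small inputs.
for sample in ["a", "aaaa", "aaab", "", "b"]:
    assert (RISKY.fullmatch(sample) is not None) == all_a_linear(sample)

# On "a" * n + "b", a backtracking engine may try on the order of 2**n ways to
# split the run of a's before giving up; the loop does at most n + 1 comparisons.
print(all_a_linear("a" * 100_000 + "b"))
```

The regex is only run on short inputs here on purpose: feeding it a long run of `a` followed by `b` is exactly the kind of input that can stall a backtracking engine.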

Use Rosettes For

  • HTML code highlighting — Highlight source code for docs, blogs, and web apps
  • Pygments migration paths — Keep existing CSS themes with Pygments-compatible classes
  • Security-sensitive rendering — Highlight untrusted input without regex backtracking risk
  • Parallel highlighting — Process many code blocks across threads with highlight_many()
  • Python-native docs stacks — Use with Bengal, Patitas, or custom site generators

Installation

pip install rosettes

Requires Python 3.14+
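
If you are unsure whether your interpreter meets this requirement or is a free-threaded build, a quick standard-library check along these lines may help. The `runtime_report` helper is a hypothetical name, not part of Rosettes.

```python
import sys
import sysconfig


def runtime_report() -> dict[str, bool]:
    """Summarize whether this interpreter matches the stated requirements."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; assume the GIL is on if absent.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(gil_check()) if gil_check is not None else True
    return {
        "python_3_14_plus": sys.version_info >= (3, 14),
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    for name, value in runtime_report().items():
        print(f"{name}: {value}")
```

A free-threaded build can still run with the GIL re-enabled, which is why the sketch reports the build flag and the runtime state separately.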


Quick Start

| Function | Description |
|----------|-------------|
| highlight(code, lang) | Generate HTML with syntax highlighting |
| tokenize(code, lang) | Get raw tokens for custom processing |
| highlight_many(items) | Parallel highlighting for multiple blocks |
| list_languages() | List all 55 supported languages |


Features

| Feature | Description | Docs |
|---------|-------------|------|
| Choosing Rosettes | When it fits, migration from Pygments, and tradeoffs | Choosing Rosettes → |
| Basic Highlighting | highlight() and tokenize() functions | Highlighting → |
| Parallel Processing | highlight_many() for multi-core systems | Parallel → |
| Line Highlighting | Highlight specific lines, add line numbers | Lines → |
| CSS Styling | Semantic or Pygments-compatible classes | Styling → |
| Custom Formatters | Build terminal, LaTeX, or custom output | Extending → |

📚 Full documentation: lbliii.github.io/rosettes


Usage

<details> <summary><strong>Basic Highlighting</strong> — Generate HTML from code</summary>
from rosettes import highlight

# Basic highlighting
html = highlight("def foo(): pass", "python")
# <div class="rosettes" data-language="python">...</div>

# Multi-line sample used by the next two calls
code = "def foo():\n    x = 1\n    y = 2\n    return x + y\n"

# With line numbers
html = highlight(code, "python", show_linenos=True)

# Highlight specific lines (here lines 2 to 4)
html = highlight(code, "python", hl_lines={2, 3, 4})
</details> <details> <summary><strong>Parallel Processing</strong> — Speed up multiple blocks</summary>

For 8+ code blocks, use highlight_many() for parallel processing:

from rosettes import highlight_many

blocks = [
    ("def foo(): pass", "python"),
    ("const x = 1;", "javascript"),
    ("fn main() {}", "rust"),
]

# Highlight in parallel
results = highlight_many(blocks)

On free-threaded Python 3.14t, this gives a 1.5-2x speedup for 50 or more blocks.
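
For intuition, the general fan-out pattern can be sketched with the standard library alone. This is not how `highlight_many()` is implemented; `highlight_blocks` and `fake_highlight` are hypothetical names, and the stand-in highlighter exists only so the example runs without Rosettes installed.

```python
from collections.abc import Callable, Sequence
from concurrent.futures import ThreadPoolExecutor


def highlight_blocks(
    blocks: Sequence[tuple[str, str]],
    highlight_fn: Callable[[str, str], str],
    max_workers: int = 4,
) -> list[str]:
    """Apply highlight_fn(code, lang) to every block, preserving input order."""
    if not blocks:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in submission order, not completion order.
        return list(pool.map(lambda block: highlight_fn(*block), blocks))


def fake_highlight(code: str, lang: str) -> str:
    """Stand-in for a real highlighter so the sketch runs on its own."""
    return f'<pre data-language="{lang}">{code}</pre>'


results = highlight_blocks(
    [("def foo(): pass", "python"), ("const x = 1;", "javascript")],
    fake_highlight,
)
print(results)
```

Threads only speed up CPU-bound Python work when the GIL is disabled, which is why the speedup quoted above is tied to the free-threaded build.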

</details> <details> <summary><strong>Tokenization</strong> — Raw tokens for custom processing</summary>
from rosettes import tokenize

tokens = tokenize("x = 42", "python")
for token in tokens:
    print(f"{token.type.name}: {token.value!r}")
# NAME: 'x'
# WHITESPACE: ' '
# OPERATOR: '='
# WHITESPACE: ' '
# NUMBER_INTEGER: '42'
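
As one example of custom processing, the sketch below renders tokens with ANSI colors for a terminal. It relies only on the two attributes shown above (`token.type.name` and `token.value`). The `TokenType` and `Token` classes are stand-ins defined here so the code runs without Rosettes, and the color choices are arbitrary assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenType(Enum):  # stand-in; not the Rosettes enum
    NAME = auto()
    WHITESPACE = auto()
    OPERATOR = auto()
    NUMBER_INTEGER = auto()


@dataclass(frozen=True)
class Token:  # stand-in exposing the two attributes used in the example above
    type: TokenType
    value: str


# Hypothetical color choices keyed by token type name (ANSI SGR codes).
ANSI_COLORS = {"NAME": "36", "OPERATOR": "35", "NUMBER_INTEGER": "33"}
RESET = "\x1b[0m"


def to_ansi(tokens) -> str:
    """Render tokens as one string with ANSI color escapes."""
    parts = []
    for token in tokens:
        color = ANSI_COLORS.get(token.type.name)
        if color is None:
            parts.append(token.value)  # unstyled types pass through unchanged
        else:
            parts.append(f"\x1b[{color}m{token.value}{RESET}")
    return "".join(parts)


sample = [
    Token(TokenType.NAME, "x"),
    Token(TokenType.WHITESPACE, " "),
    Token(TokenType.OPERATOR, "="),
    Token(TokenType.WHITESPACE, " "),
    Token(TokenType.NUMBER_INTEGER, "42"),
]
print(to_ansi(sample))
```

Because `to_ansi` only reads `.type.name` and `.value`, it should also accept the objects returned by `tokenize()` if they expose those attributes as the loop above suggests.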
</details> <details> <summary><strong>CSS Class Styles</strong> — Semantic or Pygments</summary>

Semantic (default) — Readable, self-documenting:

html = highlight(code, "python")
# <span class="syntax-keyword">def</span>
# <span class="syntax-function">hello</span>

/* Matching CSS rules */
.syntax-keyword { color: #ff79c6; }
.syntax-function { color: #50fa7b; }
.syntax-string { color: #f1fa8c; }

Pygments-compatible — Use existing themes:

html = highlight(code, "python", css_class_style="pygments")
# <span class="k">def</span>
# <span class="nf">hello</span>
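
If you maintain one color palette and want rules for both naming schemes, a small generator like the following may help. It is a sketch, not a Rosettes feature: `build_css` and both dictionaries are hypothetical, the colors are copied from the semantic example above, and `k`, `nf`, and `s` are the short class names Pygments uses for keywords, function names, and strings.

```python
# Semantic role -> Pygments short class name.
SEMANTIC_TO_PYGMENTS = {"keyword": "k", "function": "nf", "string": "s"}

# Colors copied from the semantic CSS example above.
THEME = {"keyword": "#ff79c6", "function": "#50fa7b", "string": "#f1fa8c"}


def build_css(theme: dict[str, str], style: str = "semantic") -> str:
    """Emit one CSS rule per role in either class-naming scheme."""
    rules = []
    for role, color in theme.items():
        if style == "pygments":
            selector = f".{SEMANTIC_TO_PYGMENTS[role]}"
        elif style == "semantic":
            selector = f".syntax-{role}"
        else:
            raise ValueError(f"unknown style: {style!r}")
        rules.append(f"{selector} {{ color: {color}; }}")
    return "\n".join(rules)


print(build_css(THEME))
print(build_css(THEME, style="pygments"))
```

Only three roles are mapped here; a real theme would need an entry for every class the highlighter emits.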
</details>

Supported Languages

<details> <summary><strong>55 languages</strong> with full syntax support</summary>

| Category | Languages |
|----------|-----------|
| Core | Python, JavaScript, TypeScript, JSON, YAML, TOML, Bash, HTML, CSS, Diff |
| Systems | C, C++, Rust, Go, Zig |
| JVM | Java, Kotlin, Scala, Groovy, Clojure |
| Apple | Swift |
| Scripting | Ruby, Perl, PHP, Lua, R, PowerShell |
| Functional | Haskell, Elixir |
| Data/Query | SQL, CSV, GraphQL |
| Markup | Markdown, XML |
| Config | INI, Nginx, Dockerfile, Makefile, HCL |
| Schema | Protobuf |
| Modern | Dart, Julia, Nim, Gleam, V |
| AI/ML | Mojo, Triton, CUDA, Stan |
| Other | PKL, CUE, Tree, Kida, Jinja, Plaintext |

</details>

Architecture

<details> <summary><strong>State Machine Lexers</strong> — O(n) guaranteed</summary>

Every lexer is a hand-written finite state machine:

┌─────────────────────────────────────────────────────────────┐
│                    State Machine Lexer                       │
│                                                              │
│  ┌─────────┐   char    ┌─────────┐   char    ┌─────────┐   │
│  │ INITIAL │ ────────► │ STRING  │ ────────► │ ESCAPE  │   │
│  │ STATE   │           │ STATE   │           │ STATE   │   │
│  └─────────┘           └─────────┘           └─────────┘   │
│      │                      │                     │         │
│      │ emit                 │ emit                │ emit    │
│      ▼                      ▼                     ▼         │
│  [Token]               [Token]               [Token]        │
└─────────────────────────────────────────────────────────────┘

Key properties:

  • Single character lookahead (O(n) guaranteed)
  • No backtracking (no ReDoS possible)
  • Immutable state (thread-safe)
  • Local variables only (no shared mutable state)
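
The properties above can be illustrated with a toy lexer. This is a minimal sketch for a made-up mini-language, not one of the Rosettes lexers: `tokenize_toy`, the token kinds, and the `KEYWORDS` table are assumptions. The position index only moves forward, so each character is consumed once.

```python
# Immutable keyword table, in the spirit of the frozenset tables mentioned above.
KEYWORDS = frozenset({"def", "return", "if", "else"})


def tokenize_toy(source: str) -> list[tuple[str, str]]:
    """Split source into (kind, text) pairs in one left-to-right pass."""
    tokens: list[tuple[str, str]] = []
    pos, n = 0, len(source)  # all lexer state lives in local variables
    while pos < n:
        start = pos
        ch = source[pos]
        if ch.isspace():
            while pos < n and source[pos].isspace():
                pos += 1
            kind = "WHITESPACE"
        elif ch.isdigit():
            while pos < n and source[pos].isdigit():
                pos += 1
            kind = "NUMBER"
        elif ch.isalpha() or ch == "_":
            while pos < n and (source[pos].isalnum() or source[pos] == "_"):
                pos += 1
            kind = "KEYWORD" if source[start:pos] in KEYWORDS else "NAME"
        elif ch == '"':
            pos += 1  # STRING state: consume the opening quote
            while pos < n and source[pos] != '"':
                # ESCAPE state: a backslash also consumes the character after it
                pos += 2 if source[pos] == "\\" else 1
            pos = min(pos + 1, n)  # closing quote, or end of input if unterminated
            kind = "STRING"
        else:
            pos += 1
            kind = "OPERATOR"
        tokens.append((kind, source[start:pos]))
    return tokens


for kind, text in tokenize_toy('def f(): return "a\\"b" + 42'):
    print(f"{kind}: {text!r}")
```

Every branch advances `pos` by at least one character and never rewinds it, which is what keeps the total work linear in the input length.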
</details> <details> <summary><strong>Thread Safety</strong> — Free-threading ready</summary>

All public APIs are thread-safe:

  • Lexers use only local variables during tokenization
  • Formatter state is immutable
  • Registry uses functools.cache for memoization
  • Module declares itself safe for free-threading (PEP 703)
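
A compact sketch of how these patterns fit together, using only the standard library. It is not the Rosettes source: `ToyLexer`, `Token`, and `get_lexer` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from functools import cache

# Immutable lookup table: safe to read from any thread.
PYTHON_KEYWORDS = frozenset({"def", "return", "if", "else"})


@dataclass(frozen=True)
class Token:
    """Immutable token; attribute assignment raises FrozenInstanceError."""
    kind: str
    value: str


@dataclass(frozen=True)
class ToyLexer:
    """Stateless lexer object: per-call state lives in local variables only."""
    name: str
    keywords: frozenset[str]

    def classify(self, word: str) -> Token:
        kind = "KEYWORD" if word in self.keywords else "NAME"
        return Token(kind, word)


@cache
def get_lexer(name: str) -> ToyLexer:
    """Memoized registry lookup: build each lexer once, then reuse it."""
    tables = {"python": PYTHON_KEYWORDS}
    return ToyLexer(name, tables[name])


lexer = get_lexer("python")
print(lexer.classify("def"), lexer.classify("foo"))
```

With `functools.cache`, two threads that request an uncached lexer at the same moment may each build one before the cache is populated. That is harmless in this sketch because the lexer objects are immutable and interchangeable.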
</details>

Performance

On a 10,000-line Python file:

  • Tokenize — ~12ms

  • Highlight — ~18ms

  • Parallel highlighting — Run python benchmarks/benchmark_parallel.py to see scaling on your machine. Example with 100 code blocks on 8-core:

      Threads    Time      Speedup
      1          0.04s     1.00x
      2          0.02s     1.61x
      4          0.02s     2.53x
      8          0.02s     2.10x
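
A table like this can be produced with a short timing harness. The sketch below is not the repository's `benchmarks/benchmark_parallel.py`; it uses a CPU-bound stand-in (`busy_work`) instead of real highlighting, and all function names are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_work(text: str) -> int:
    """CPU-bound stand-in for highlighting one block."""
    return sum(ord(ch) for ch in text)


def time_with_threads(items: list[str], threads: int) -> float:
    """Return wall-clock seconds to process every item with the given pool size."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(busy_work, items))
    return time.perf_counter() - start


def scaling_table(
    items: list[str], thread_counts: tuple[int, ...] = (1, 2, 4, 8)
) -> list[tuple[int, float, float]]:
    """Rows of (threads, seconds, speedup relative to the first row)."""
    timings = [(n, time_with_threads(items, n)) for n in thread_counts]
    baseline = timings[0][1]
    return [(n, secs, baseline / secs) for n, secs in timings]


if __name__ == "__main__":
    blocks = ["def foo(): pass\n" * 200] * 100
    print(f"{'Threads':<10}{'Time':<10}Speedup")
    for n, secs, speedup in scaling_table(blocks):
        print(f"{n:<10}{secs:<10.3f}{speedup:.2f}x")
```

On a build with the GIL enabled, expect speedups near or below 1x for this kind of workload; the numbers only become meaningful on a free-threaded interpreter with enough work per block.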
    

Documentation

📚 lbliii.github.io/rosettes

| Section | Description |
|---------|-------------|
| Get Started | Installation and quickstart |
| Highlighting | Core highlighting APIs |
| Styling | CSS classes and themes |
| Reference | Complete API documentation |
| About | Architecture and design |


Development

git clone https://github.com/lbliii/rosettes.git
cd rosettes
uv sync --group dev
pytest

Run the parallel benchmark (a free-threading scaling demo):

python benchmarks/benchmark_parallel.py

The Bengal Ecosystem

A structured reactive stack — every layer written in pure Python for 3.14t free-threading.

| | | | |
|--:|---|---|---|
| ᓚᘏᗢ | Bengal | Static site generator | Docs |
| ∿∿ | Purr | Content runtime | - |
| ⌁⌁ | Chirp | Web framework | Docs |
| =^..^= | Pounce | ASGI server | Docs |
| )彡 | Kida | Template engine | Docs |
| ฅᨐฅ | Patitas | Markdown parser | Docs |
| ⌾⌾⌾ | Rosettes | Syntax highlighter ← You are here | Docs |

Python-native. Free-threading ready. No npm required.


License

MIT License — see LICENSE for details.
