KlongPy: A High-Performance Array Language with Autograd
KlongPy is a Python adaptation of the Klong array language, offering high-performance vectorized operations. It prioritizes compatibility with Python, allowing seamless access to Python's expansive ecosystem while retaining Klong's succinctness.
KlongPy backends include NumPy and optional PyTorch (CPU, CUDA, and Apple MPS). When PyTorch is enabled, automatic differentiation (autograd) is supported; otherwise, numeric differentiation is the default.
Full documentation: https://klongpy.org
New in v0.7.0, KlongPy brings gradient-based programming to an already succinct array language, so you can differentiate compact array expressions directly. It is also a batteries-included system with IPC, DuckDB-backed database tooling, web/websocket support, and other integrations exposed seamlessly from the language.
PyTorch gradient descent (10+ lines):
```python
import torch

x = torch.tensor(5.0, requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1)
for _ in range(100):
    loss = x ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(x)  # ~0
```
KlongPy gradient descent (2 lines):
```
f::{x^2}; s::5.0
{s::s-(0.1*f:>s)}'!100 :" s -> 0"
```
Array languages like APL, K, and Q revolutionized finance by treating operations as data transformations, not loops. KlongPy brings this philosophy to machine learning: gradients become expressions you compose, not boilerplate you maintain. The result is a succinct, math-like notation that extends automatically to machine learning.
Quick Install
```shell
# REPL + NumPy backend (pick one option below)
pip install "klongpy[repl]"
kgpy

# Enable torch backend (autograd + GPU)
pip install "klongpy[torch]"
kgpy --backend torch

# Everything (web, db, websockets, torch, repl)
pip install "klongpy[all]"
```
REPL
```
$ kgpy
Welcome to KlongPy REPL v0.7.0
Author: Brian Guarraci
Web: http://klongpy.org
Backend: torch (mps)
]h for help; Ctrl-D or ]q to quit
$>
```
Why KlongPy?
For Quants and Traders
Optimize portfolios with gradients in a language designed for arrays:
```
:" Portfolio optimization: gradient of Sharpe ratio"
returns::[0.05 0.08 0.03 0.10] :" Annual returns per asset"
vols::[0.15 0.20 0.10 0.25]    :" Volatilities per asset"
w::[0.25 0.25 0.25 0.25]       :" Portfolio weights"
sharpe::{(+/x*returns)%((+/((x^2)*(vols^2)))^0.5)}
sg::sharpe:>w :" Gradient of Sharpe ratio"
.d("sharpe gradient="); .p(sg)
```

```
sharpe gradient=[0.07257738709449768 0.032256484031677246 0.11693036556243896 -0.22176480293273926]
```
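The same gradient can be cross-checked in plain NumPy. The sketch below uses central differences (an assumption for illustration; KlongPy's `:>` uses autograd on the torch backend or its own numeric differentiation otherwise):

```python
import numpy as np

# Cross-check of the Sharpe-ratio gradient with central differences.
returns = np.array([0.05, 0.08, 0.03, 0.10])
vols = np.array([0.15, 0.20, 0.10, 0.25])
w = np.array([0.25, 0.25, 0.25, 0.25])

def sharpe(w):
    # (w . returns) / sqrt(sum(w^2 * vols^2))
    return (w @ returns) / np.sqrt(np.sum(w**2 * vols**2))

def num_grad(f, x, h=1e-6):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

sg = num_grad(sharpe, w)
print(sg)  # ≈ [0.0726  0.0323  0.1169 -0.2218]
```

The values agree with the gradient printed by KlongPy above to several decimal places.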
For ML Researchers
Neural networks in pure array notation:
```
:" Single-layer neural network with gradient descent"
.bkf(["exp"])
sigmoid::{1%(1+exp(0-x))}
forward::{sigmoid((w1*x)+b1)}
X::[0.5 1.0 1.5 2.0]; Y::[0.2 0.4 0.6 0.8]
w1::0.1; b1::0.1; lr::0.1
loss::{+/((forward'X)-Y)^2}

:" Train with multi-param gradients"
{grads::loss:>[w1 b1]; w1::w1-(lr*grads@0); b1::b1-(lr*grads@1)}'!1000
.d("w1="); .d(w1); .d(" b1="); .p(b1)
```

```
w1=1.74 b1=-2.17
```
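For comparison, here is the same single-neuron fit in plain NumPy with hand-derived gradients (a sketch; KlongPy gets both gradients at once from `loss:>[w1 b1]`):

```python
import numpy as np

# Single-neuron model: predict sigmoid(w1*x + b1), squared-error loss.
X = np.array([0.5, 1.0, 1.5, 2.0])
Y = np.array([0.2, 0.4, 0.6, 0.8])
w1, b1, lr = 0.1, 0.1, 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(1000):
    p = sigmoid(w1 * X + b1)     # forward pass
    e = p - Y                    # residuals
    dz = 2 * e * p * (1 - p)     # d(loss)/d(pre-activation)
    w1 -= lr * np.sum(dz * X)    # gradient step on w1
    b1 -= lr * np.sum(dz)        # gradient step on b1

print(round(w1, 2), round(b1, 2))
```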
For Scientists
Express mathematics directly:
```
:" Gradient of f(x,y,z) = x^2 + y^2 + z^2 at [1,2,3]"
f::{+/x^2}
f:>[1 2 3]
```

```
[2.0 4.0 6.0]
```
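A quick numeric check in Python: the gradient of f(v) = Σv² is 2v, so central differences at [1, 2, 3] should recover [2, 4, 6]:

```python
import numpy as np

# Numeric gradient of f(v) = sum(v^2) via central differences.
def f(v):
    return np.sum(v**2)

v = np.array([1.0, 2.0, 3.0])
h = 1e-6
grad = np.array([(f(v + h * e) - f(v - h * e)) / (2 * h)
                 for e in np.eye(3)])
print(grad)  # ≈ [2. 4. 6.]
```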
The Array Language Advantage
Array languages express what you want, not how to compute it. This enables automatic optimization:
| Operation | Python | KlongPy |
|-----------|--------|---------|
| Sum an array | sum(a) | +/a |
| Running sum | np.cumsum(a) | +\a |
| Dot product | np.dot(a,b) | +/a*b |
| Average | sum(a)/len(a) | (+/a)%#a |
| Gradient | 10+ lines | f:>x |
| Multi-param grad | 20+ lines | loss:>[w b] |
| Jacobian | 15+ lines | x∂f |
| Optimizer | 10+ lines | {w::w-(lr*f:>w)} |
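The Python column of the table above, spelled out in NumPy for the first four rows:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

total = np.sum(a)             # +/a      -> 6.0
running = np.cumsum(a)        # +\a      -> [1. 3. 6.]
dot = np.dot(a, b)            # +/a*b    -> 32.0
average = np.sum(a) / len(a)  # (+/a)%#a -> 2.0
```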
KlongPy inherits from the APL family tree (APL → J → K/Q → Klong), adding Python integration and automatic differentiation.
Performance
Run the included benchmark on any backend:
```shell
kgpy --backend torch --device cpu examples/bench_compiler.kg
kgpy --backend numpy examples/bench_compiler.kg
```
Expression Compiler Benchmark (Apple M1 Mac Studio)
Both backends include an expression compiler that converts Klong ASTs to a backend-neutral IR, then generates platform-specific Python functions. Expressions compile once and are cached — subsequent calls pay only execution cost.
Times are per-call averages over 1000 iterations.
| Operation | Elements | NumPy | Torch CPU |
|-----------|------:|------:|----------:|
| Arithmetic | | | |
| a+b | 100K | 0.066 ms | 0.155 ms |
| a*2+b | 100K | 0.091 ms | 0.276 ms |
| (a+b)*(a-b) | 100K | 0.124 ms | 0.390 ms |
| (a*2+b*3)%(a+1)-(b*c) | 100K | 0.218 ms | 0.870 ms |
| Lambdas | | | |
| {x+y}(a;b) | 100K | 0.074 ms | 0.156 ms |
| {(x+y)*(x-y)}(a;b) | 100K | 0.129 ms | 0.390 ms |
| {+/x*y}(a;b) | 100K | 0.098 ms | 0.285 ms |
| Reduce | | | |
| +/a | 100K | 0.065 ms | 0.150 ms |
| +/a*b | 100K | 0.095 ms | 0.293 ms |
| Scan | | | |
| +\ts cumsum | 10K | 0.083 ms | 0.054 ms |
| |\ts running max | 10K | 24.63 ms | 0.068 ms |
| (|\ts)-ts drawdown | 10K | 24.65 ms | 0.072 ms |
| |/(|\ts)-ts max drawdown | 10K | 24.68 ms | 0.081 ms |
| Real-world | | | |
| (+/p*s)%+/s VWAP | 100K | 0.151 ms | 0.512 ms |
| vwap::{(+/x*y)%+/y}; vwap(p;s) | 100K | 0.151 ms | 0.489 ms |
| a-(+/a)%#a de-mean | 100K | 0.088 ms | 0.292 ms |
NumPy is faster for element-wise and reduce operations on CPU. Torch excels at scan operations where it compiles to native tensor methods (cummax, cumsum) — running max is 362x faster than NumPy's interpreter fallback. Torch's full advantage appears on GPU (--device cuda or --device mps).
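The scan rows benchmarked above compute the standard drawdown quantities; in NumPy terms (a sketch of what the Klong expressions mean, not of either backend's code path):

```python
import numpy as np

# Running max, drawdown, and max drawdown over a price series.
ts = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

running_max = np.maximum.accumulate(ts)  # |\ts        -> [1. 3. 3. 5. 5.]
drawdown = running_max - ts              # (|\ts)-ts   -> [0. 0. 1. 0. 1.]
max_drawdown = drawdown.max()            # |/(|\ts)-ts -> 1.0
```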
Complete Feature Set
KlongPy is a batteries-included platform with kdb+/Q-inspired features:
Core Language
- Vectorized Operations: NumPy/PyTorch-powered bulk array operations
- Automatic Differentiation: Native `:>` operator for exact gradients
- GPU Acceleration: CUDA and Apple MPS support via PyTorch
- Python Integration: Import any Python library with `.py()` and `.pyf()`
Data Infrastructure (kdb+/Q-like)
- Fast Columnar Database: Zero-copy DuckDB integration for SQL on arrays
- Inter-Process Communication: Build ticker plants and distributed systems
- Table & Key-Value Store: Persistent storage for tables and data
- Web Server: Built-in HTTP server for APIs and dashboards
- WebSockets: Connect to WebSocket servers and handle messages in KlongPy
- Timers: Scheduled execution for periodic tasks
Documentation
- Quick Start Guide: Get running in 5 minutes
- PyTorch Backend & Autograd: Complete autograd reference
- Operator Reference: All language operators
- Performance Guide: Optimization tips
Full documentation: https://briangu.github.io/klongpy
Typing Special Characters
KlongPy uses Unicode operators for mathematical notation. Here's how to type them:
| Symbol | Name | Mac | Windows | Description |
|--------|------|-----|---------|-------------|
| ∇ | Nabla | Option + v then select, or Character Viewer | Alt + 8711 (numpad) | Numeric gradient |
| ∂ | Partial | Option + d | Alt + 8706 (numpad) | Jacobian operator |
Mac Tips:
- Option + d types `∂` directly
- For `∇`, open Character Viewer with Ctrl + Cmd + Space and search "nabla"
- Or simply copy-paste: `∇` `∂`
Alternative: Use the function equivalents that don't require special characters:

```
3∇f :" Using nabla"
.jacobian(f;x) :" Instead of x∂f"
```
Syntax Cheat Sheet
Functions take up to 3 parameters, always named x, y, z:
```
:" Operators (right to left evaluation)"
5+3*2      :" 11 (3*2 first, then +5)"
+/[1 2 3]  :" 6 (sum: + over /)"
*/[1 2 3]  :" 6 (product: * over /)"
#[1 2 3]   :" 3 (length)"
3|5        :" 5 (max)"
3&5        :" 3 (min)"

:" Functions"
avg::{(+/x)%#x}  :" Monad (1 arg)"
dot::{+/x*y}     :" Dyad (2 args)"
clip::{(x|y)&z}  :" Triad (3 args): min(max(x,y),z)"

:" Adverbs (modifiers)"
f::{x^2}
f'[1 2 3]  :" Each: apply f to each -> [1 4 9]"
+/[1 2 3]  :" Over: fold/reduce -> 6"
```
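The three function arities above map directly onto NumPy one-liners (a sketch for readers coming from Python):

```python
import numpy as np

# Monad, dyad, and triad from the cheat sheet, in NumPy.
avg = lambda x: np.sum(x) / len(x)                       # {(+/x)%#x}
dot = lambda x, y: np.sum(x * y)                         # {+/x*y}
clip = lambda x, y, z: np.minimum(np.maximum(x, y), z)   # {(x|y)&z}

a = np.array([1.0, 2.0, 3.0])
print(avg(a), dot(a, a), clip(5.0, 2.0, 4.0))  # 2.0 14.0 4.0
```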
