Teenygrad

teaching software 2.0 to programmers of software 1.0

Install / Use

/learn @j4orz/Teenygrad

`teenygrad`

The Structure and Interpretation of Tensor Programs' capstone project

Contents

Motivation
SITP Installation (Book)
teenygrad Installation (Codebase)

Motivation

The SITP and teenygrad project is trying to fill a pedagogical gap in the discipline of deep learning systems. In traditional software 1.0, the languages and runtimes that make up production-grade systems such as LLVM and Linux have far too much tail-end complexity (both fundamental and accidental), which makes them inappropriate as learning vehicles. Instead, there exist teaching compilers and operating systems. To name a few:

a mini Lisp-like interpreter, a metacircular evaluator
a mini C-like compiler chibicc (in turn inspired by tcc and lcc),
a mini LLVM-like SSA instruction set Bril
a mini Unix-like operating system xv6
a mini x86-like instruction set LC3

For deep learning systems, given that the discipline is relatively new (the 2020-2025? era of scaling has just passed), the pedagogical material is quite nascent. While there are some great resources such as Sasha Rush's minitorch course at Cornell and Tianqi Chen's needle course at Carnegie Mellon, there are a few gaps that I personally would like to see filled, which is what SITP and teenygrad are trying to do.

SITP Installation (Book)

Install mdbook
```
cd sitp/
mdbook serve
```

`teenygrad` Installation (Codebase)

Follow these instructions for a quick setup. To understand the physical layout of the project repo, refer to ARCHITECTURE.md.

Eager Mode

teenygrad eager mode (developed in parts 1 and 2 of the book) is written in a mix of Python, Rust, and CUDA Rust in order to support CPU and GPU acceleration. The Python to Rust interop is implemented using CPython Extension Modules via PyO3, with the shared object files compiled by driving cargo via PyO3's build tool maturin.

CPU kernels (RISC-V)

CPU kernels do not use the docker container (for now).

```
cd teeny/
uv pip install maturin                             # install maturin (which drives pyo3)
cd rust && cargo run                               # run cpu accelerated gemm kernel
maturin develop                                    # build shared object for cpython's extension modules
uv run examples/abstractions.py                    # run cpu accelerated gemm kernel from python
```
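
To sanity-check the output of an accelerated GEMM kernel, it can help to have a slow but obviously correct reference to compare against. The following is a minimal sketch in pure Python of the standard GEMM definition, C = alpha * (A @ B) + beta * C, on nested lists. The function name `gemm_reference` and its parameters are assumptions made for illustration; this is not teenygrad's API, and the project's own kernels may use a different layout or signature.

```python
# Reference GEMM on nested lists: out = alpha * (a @ b) + beta * c.
# `gemm_reference` is a hypothetical helper, not part of teenygrad.

def gemm_reference(a, b, c=None, alpha=1.0, beta=0.0):
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions must match")
    n = len(b[0])
    if c is None:
        # Treat a missing C as the zero matrix.
        c = [[0.0] * n for _ in range(m)]
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = alpha * acc + beta * c[i][j]
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(gemm_reference(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

The triple loop is deliberately naive: it makes no attempt at tiling or vectorization, which is exactly what makes it useful as a baseline for a tuned kernel.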

GPU kernels (PTX)

To enable GPU acceleration, teenygrad uses CUDA Rust, which in turn requires a specific version matrix (notably the LLVM subset NVVM, pinned to LLVM 7.x, because CUDA Rust targets NVVM rather than using LLVM's PTX codegen). The Docker containers and shell scripts provided by CUDA Rust are therefore reused for teenygrad development.

Install NVIDIA Container Toolkit on your machine

Then run the following in your shell:

```
cd teeny/
sudo nvidia-ctk runtime configure --runtime=docker # set nvidia's container runtime to docker
sudo systemctl restart docker                      # restart docker
./dcr.sh                                           # create container with old version of llvm for cuda rust
./dex.sh "cd eagkers && cargo run --features gpu"  # run gpu accelerated gemm kernel
./dex.sh "maturin develop"                         # build the shared object for cpython's extension modules
./dex.sh "uv run examples/abstractions.py"         # run gpu accelerated gemm kernel from python
```

Also note that ./dcr.sh creates the production container, so any command that runs the Rust with cargo, builds the Rust with maturin, or runs the Python with uv must be executed through ./dex.sh.

For VS Code development, opening the project prompts you with "Folder contains a Dev Container configuration file. Reopen folder to develop in a container". Press the Reopen in Container button, which restarts VS Code inside the development container specified at .devcontainer. This container uses the containers provided by CUDA Rust in order to enable rust-analyzer. The final step is to point rust-analyzer to the Rust and CUDA Rust source in settings.json:
```
{
  "rust-analyzer.linkedProjects": ["teeny/eagkers/Cargo.toml"],
  "rust-analyzer.cargo.features": ["gpu"]
}
```
Note that when VS Code has the project open in the development container, none of the ./dex.sh commands above will work there, since the development container doesn't have Docker. Run those commands from the shell of a second VS Code window or from a separate terminal application instead.

Graph Mode

teenygrad graph mode (developed in part 3 of the book) is a pure-Python tensor compiler.
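
To give a feel for what "graph mode" means in contrast to eager execution, here is a minimal sketch of the general idea: operations build an expression graph instead of computing immediately, and a separate step lowers the graph to executable code. Everything here (`Node`, `var`, `const`, `emit`, `compile_graph`) is hypothetical and works on scalars rather than tensors; it illustrates the technique only and makes no claim about how teenygrad's compiler is actually structured.

```python
# A toy lazy expression graph with a "compiler" that emits Python source.
# All names are hypothetical; this is not teenygrad's implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str                # "var", "const", "add", or "mul"
    args: tuple = ()       # child Nodes for "add" / "mul"
    value: object = None   # variable name for "var", number for "const"

    # Operators record the computation instead of performing it.
    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def var(name):
    return Node("var", value=name)


def const(x):
    return Node("const", value=x)


def emit(node):
    """Lower a graph to a Python expression string."""
    if node.op == "var":
        return node.value
    if node.op == "const":
        return repr(node.value)
    lhs, rhs = (emit(a) for a in node.args)
    symbol = {"add": "+", "mul": "*"}[node.op]
    return f"({lhs} {symbol} {rhs})"


def compile_graph(node, params):
    """Return the generated source and a callable built from it."""
    src = f"lambda {', '.join(params)}: {emit(node)}"
    return src, eval(src)


if __name__ == "__main__":
    x, y = var("x"), var("y")
    graph = x * y + const(2.0)      # nothing is computed yet
    src, fn = compile_graph(graph, ["x", "y"])
    print(src)                      # lambda x, y: ((x * y) + 2.0)
    print(fn(3.0, 4.0))             # 14.0
```

Because the whole computation is available as data before it runs, a compiler has the opportunity to rewrite or fuse operations ahead of execution, which an eager interpreter cannot do.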
