# PureML — a tiny, transparent deep-learning framework in NumPy

Transparent, NumPy-only deep learning framework for teaching, small-scale projects, prototyping, and reproducible experiments. No CUDA, no giant dependency tree. Batteries included: VJP autograd, layers, activations, losses, optimizers, Zarr checkpoints, and more!

PureML is a learning-friendly deep-learning framework built entirely on top of NumPy. It aims to be small, readable, and hackable while still being practical for real experiments and teaching.
- No hidden magic — a `Tensor` class plus an autodiff engine with a dynamic computation graph and efficient VJPs for backward passes
- Batteries included — core layers (`Affine`, `Dropout`, `BatchNorm1d`), common losses, common optimizers, and a `DataLoader`
- Self-contained dataset demo — a ready-to-use MNIST reader and an end-to-end “MNIST Beater” model
- Portable persistence — zarr-backed `ArrayStorage` with zip compression for saving/loading model state
If you like scikit-learn’s simplicity and wish deep learning felt the same way for small/medium projects, PureML is for you.
## Install

PureML targets Python 3.11+ and NumPy 2.x.

```shell
pip install ym-pure-ml
```

The only runtime dependencies are `numpy` and `zarr`.
## Updates

### v1.2.9 — What’s new
- New layers:
  - `Conv1D`: 1D convolution with stride, padding, dilation, optional bias, and configurable initialization (`xavier-glorot-normal`/`kaiming-normal`).
  - `Conv2D`: 2D convolution with stride, padding, dilation, optional bias, and configurable initialization (`xavier-glorot-normal`/`kaiming-normal`).
  - `MeanPool1D`: 1D average pooling with stride, padding, and dilation.
  - `MaxPool1D`: 1D max pooling with stride, padding, and dilation.
  - `MeanPool2D`: 2D average pooling with stride, padding, and dilation.
  - `MaxPool2D`: 2D max pooling with stride, padding, and dilation.
  - `Dropout2d`: spatial dropout for 4D tensors (`B, C, H, W`) that drops full feature maps per sample during training and acts as the identity in eval mode.
  - `BatchNorm2d`: batch normalization for 4D image tensors (`B, C, H, W`).
  - `LayerNorm1d`: layer normalization over the feature axis with optional bias.
- Clear development protocols:
  - `CONTRIBUTING.md`: a detailed contributor guide covering architecture contracts, autodiff/layer/state protocols, the branching model, and the release workflow.
  - New validator functions used during development to keep the codebase consistent.
- New weight initialization:
  - Added Kaiming normal initialization and enabled it in `Affine` via `method="kaiming-normal"`.
- New activations:
  - `leaky_relu`: elementwise LeakyReLU with a configurable `negative_slope` (default `0.01`).
- Additional Tensor operations:
  - `Tensor.dot`: a shape-validated dot operation for `1D·1D` and `2D·2D`.
  - `Tensor.general_transpose`: N-D axis permutation.
  - `Tensor.squeeze`: removes size-1 dimensions (like NumPy `squeeze`).
  - `Tensor.unsqueeze`: inserts size-1 dimensions (like NumPy `expand_dims`).
  - `unfold1d`: 1D unfolding utility for sliding-window workflows.
  - `unfold2d`: 2D unfolding utility for sliding-window workflows.
- New shape utilities for CNN design:
  - `output_len_1d(...)`: a public helper to compute the 1D output length for Conv1D/Pool1D settings.
  - `output_shape_2d(...)`: a public helper to compute `(H_out, W_out)` for Conv2D/Pool2D settings.
- Miscellaneous:
  - `calculate_gain`: recommended gain computation for `Affine`/`Conv1D`/`Conv2D`: `sigmoid`, `tanh`, `relu`, and `leaky_relu` (with configurable negative slope).
  - `BatchNorm1d` state fix: checkpoints now persist and restore `eps`, `momentum`, `training`, and `num_features` (in addition to running stats), with validation on restore.
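The shape helpers above follow standard convolution arithmetic. As a rough illustration of that formula (an independent sketch — the signatures are assumed, not PureML's actual `output_len_1d`/`output_shape_2d`):

```python
import math

def output_len_1d(length, kernel_size, stride=1, padding=0, dilation=1):
    """Standard convolution arithmetic: count the valid positions of a
    (possibly dilated) kernel sliding over a padded 1D axis."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return math.floor((length + 2 * padding - effective_kernel) / stride) + 1

def output_shape_2d(h, w, kernel_size, stride=1, padding=0, dilation=1):
    """Apply the 1D rule independently to the H and W axes."""
    return (output_len_1d(h, kernel_size, stride, padding, dilation),
            output_len_1d(w, kernel_size, stride, padding, dilation))
```

For example, a 3×3 kernel with `stride=2, padding=1` maps a 28×28 input to a 14×14 output.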
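The gain and initialization additions follow the usual He-et-al./PyTorch conventions. A minimal NumPy sketch of those conventions (not PureML's actual `calculate_gain`, `leaky_relu`, or Kaiming initializer):

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    # Elementwise LeakyReLU with the default slope mentioned above.
    return np.where(x >= 0, x, negative_slope * x)

def calculate_gain(nonlinearity, negative_slope=0.01):
    # Standard recommended gains: sigmoid -> 1, tanh -> 5/3, relu -> sqrt(2),
    # leaky_relu -> sqrt(2 / (1 + slope^2)).
    if nonlinearity == "leaky_relu":
        return float(np.sqrt(2.0 / (1.0 + negative_slope ** 2)))
    return {"sigmoid": 1.0, "tanh": 5.0 / 3.0, "relu": float(np.sqrt(2.0))}[nonlinearity]

def kaiming_normal(rng, fan_in, fan_out, nonlinearity="relu"):
    # Kaiming normal: weights drawn with std = gain / sqrt(fan_in).
    std = calculate_gain(nonlinearity) / np.sqrt(fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```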
### v1.2.8 — What’s new

- Modernization of packaging and CI/CD, documentation and attribution fixes, and the addition of community guidelines.
### v1.2.7 — What’s new

- BUG FIX: `DataLoader` seed
  - `DataLoader` now respects the provided `seed` (instead of always generating a new one), so shuffling is reproducible across runs when a seed is set.
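The fixed behavior can be sketched with NumPy's `Generator` (an illustrative stand-in, not the `DataLoader` source):

```python
import numpy as np

def shuffled_indices(n, seed=None):
    # Honor a caller-supplied seed instead of always drawing a fresh one,
    # so the shuffle order is reproducible across runs when a seed is set.
    if seed is None:
        seed = np.random.SeedSequence().entropy  # fresh seed only when none given
    rng = np.random.default_rng(seed)
    return rng.permutation(n)
```

Two calls with the same seed now yield the same permutation.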
### v1.2.6 — What’s new

- Small update: added the logo to the docs, the PyPI page, and the README.
### v1.2.5 — What’s new

- `Affine` and `Embedding` layers now ensure grads are ALWAYS tracked for supplied `W` and `b` tensors.
- Dedicated docs site is live: https://ymishchyriak.com/docs/PUREML-DOCS (mirrors this repo and stays current).
### v1.2.4 — What’s new

- `Affine`: optional bias
  - `Affine(fan_in, fan_out, bias=False, ...)` is now supported.
  - When `bias=False`, the layer keeps a zero bias tensor with `requires_grad=False` and excludes it from `.parameters`.
  - Checkpointing: `use_bias` is persisted in `named_buffers()` and honored by `apply_state()`.
    - Turning bias off during `apply_state()` zeroes the stored bias and freezes it.
    - Turning bias on reuses the same tensor and re-enables `requires_grad`.
  - The loader accepts weight arrays shaped either `(fan_in, fan_out)` or `(fan_out, fan_in)` and auto-transposes to the internal `(n, m)`.
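The optional-bias contract described above can be sketched in a few lines of NumPy (an illustrative stand-in, not PureML's actual `Affine`):

```python
import numpy as np

class AffineSketch:
    """Sketch of the optional-bias behavior: a frozen zero bias is kept
    even when bias=False, but excluded from the trainable parameters."""
    def __init__(self, fan_in, fan_out, bias=True, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))
        self.b = np.zeros(fan_out)   # kept around even when bias is off
        self.use_bias = bias         # persisted alongside checkpoint buffers

    @property
    def parameters(self):
        # The frozen zero bias is excluded when bias=False.
        return [self.W, self.b] if self.use_bias else [self.W]

    def forward(self, x):
        return x @ self.W + self.b   # adding the zero bias is a no-op when off
```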
### v1.2.3 — What’s new

- Added utility functions `rng_from_seed(seed: int | None = None)` and `get_random_seed()` for reproducible RNG creation. `rng_from_seed` returns both the RNG and the resolved seed; if no seed is provided, `get_random_seed()` generates one using cryptographically secure OS randomness.
- All core layers (`Affine`, `Embedding`, `Dropout`, etc.) now support explicit seeds for deterministic initialization.
- Seeds and initialization methods are now persisted in checkpoints and automatically restored via each layer’s `apply_state()` method.
- Fixed the "Fit MNIST in a few lines" README example — it now correctly calls the `.numpy()` method (parentheses were missing).
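The seed-resolution pattern described above might look roughly like this (a sketch of the idea, not the shipped utilities):

```python
import secrets
import numpy as np

def get_random_seed():
    # Cryptographically secure OS randomness for the fallback seed.
    return secrets.randbits(64)

def rng_from_seed(seed=None):
    # Return both the RNG and the resolved seed, so the seed can be
    # persisted in checkpoints and restored for deterministic init.
    if seed is None:
        seed = get_random_seed()
    return np.random.default_rng(seed), seed
```

Because the resolved seed is returned, a layer can stash it and later rebuild an identical RNG.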
### v1.2.2 — What’s new

- Autograd: correct `detach` semantics (+ in-place variant)
  - `Tensor.detach()` now returns a new leaf tensor that shares storage, has no creator, and has `requires_grad=False`.
  - New in-place `Tensor.detach_()` for stopping tracking on the current object.
  - New `Tensor.requires_grad_(bool)` toggler (in-place), PyTorch-style.
  - Migration note: if you relied on the old in-place behavior of `detach()`, switch to `detach_()` or reassign: `x = x.detach()`.
- Safe array export API
  - New `Tensor.numpy(copy=True, readonly=False)` helper: `copy=True` returns a defensive copy (default); `readonly=True` marks the returned array non-writable (works with views or copies).
  - Rationale: keep `.data` as the mutable param buffer for optimizers, while providing a safe way to make read-only exports.
  - Bottom line: DO NOT ACCESS the `.data` attribute directly unless you REALLY need it! Call the `.numpy()` API instead. In future updates, `.data` may be hidden completely so users cannot accidentally mutate tensors.
- Graph utilities: iterative and memory-safe
  - `_collect_graph()` rewritten as an iterative ancestor walk (no recursion limits).
  - `zero_grad_graph()` and `detach_graph()` now use a single traversal and:
    - free each node’s cached forward context via `fn._free_fwd_ctx()` before unlinking,
    - set `t._creator = None` and `t.grad = None`,
    - and (for `detach_graph`) set `t.requires_grad = False` to prevent future history building.
  - Net effect: lower peak memory and safer teardown of large graphs.
- Docs/logging polish
  - Clearer docstrings for graph collection (it collects upstream/ancestor nodes).
  - More informative debug logs for backward/graph utilities.
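The `detach`/`numpy()` semantics above map directly onto NumPy view/copy mechanics. A minimal stand-in class (not PureML's `Tensor`; the `_creator` attribute is assumed for illustration):

```python
import numpy as np

class TensorSketch:
    """Sketch of the detach/export semantics described above."""
    def __init__(self, data, requires_grad=False, creator=None):
        self.data = np.asarray(data, dtype=float)
        self.requires_grad = requires_grad
        self._creator = creator  # the op node that produced this tensor, if any

    def detach(self):
        # New leaf: shares storage, no creator, gradients off.
        return TensorSketch(self.data, requires_grad=False, creator=None)

    def detach_(self):
        # In-place variant: stop tracking on *this* object.
        self._creator = None
        self.requires_grad = False
        return self

    def numpy(self, copy=True, readonly=False):
        # copy=True: defensive copy; readonly=True: non-writable export.
        out = self.data.copy() if copy else self.data.view()
        if readonly:
            out.flags.writeable = False
        return out
```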
### v1.2.1 — What’s new

- BUG FIX: NN base-class API
  - `self(x, y, ...)` no longer errors within a class inheriting from `NN`. Previously, the `__call__` function expected a single tensor; the signature is now `(*args, **kwargs)`, so you can define the `.predict` method with any signature and still use the `self(...)` interface.
- training_utils: `TensorDataset` now ALWAYS returns `Tensor` instances
  - Previously, a `TensorDataset` initialized from NumPy arrays would return NumPy array instances via `__getitem__`. Now `Tensor` output is enforced, which protects against downstream errors. If you want the NumPy data, just call the `.numpy()` method on your `Tensor`.
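The base-class fix amounts to forwarding arbitrary arguments from `__call__` to `predict`, roughly like this (a sketch, with a hypothetical two-input subclass):

```python
class NN:
    """Sketch of the fixed base class: __call__ forwards any signature."""
    def __call__(self, *args, **kwargs):
        # Previously __call__ accepted a single tensor only; forwarding
        # *args/**kwargs lets subclasses define predict() however they like.
        return self.predict(*args, **kwargs)

class TwoInputModel(NN):
    # Hypothetical subclass whose predict takes two inputs and a keyword.
    def predict(self, x, y, scale=1.0):
        return (x + y) * scale
```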
### v1.2.0 — What’s new

- Autodiff-aware slicing (NumPy semantics) for `Tensor`
  - Supports ints, slices, ellipsis (`...`), `None` (newaxis), boolean masks, and advanced integer arrays.
  - The backward pass scatter-adds into a zeros-like array of the input’s shape (handles repeated/overlapping indices correctly).
- `Embedding` layer
  - A learned lookup table for integer indices: input `(...,)` of ints → output `(..., D)` embeddings.
  - API: `Embedding(V, D, pad_idx=None, W=None)`.
  - If `pad_idx` is set, that row is initialized to zeros and receives no gradient (useful for `<PAD>` tokens).
  - Correctly accumulates gradients for repeated indices.
- BUG FIX: `TensorValuedFunction` context merging
  - User-supplied forward contexts are now merged into the node’s internal context and persist through backward.
  - Previously, the node could overwrite the provided context in some advanced cases, leading to missing cached values (e.g., `padding_idx`, flattened indices) during gradient computation.
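The scatter-add backward rule used by both slicing and `Embedding` can be sketched with NumPy's unbuffered `np.add.at` (an illustration of the rule, not PureML's backward code):

```python
import numpy as np

def index_backward(input_shape, index, upstream_grad):
    """Backward rule for y = x[index]: scatter-add the upstream gradient
    into a zeros-like array of x's shape. np.add.at is unbuffered, so
    repeated/overlapping indices accumulate instead of overwriting."""
    grad = np.zeros(input_shape)
    np.add.at(grad, index, upstream_grad)
    return grad
```

For an embedding lookup with indices `[0, 2, 0]`, row 0 of the weight gradient correctly receives two contributions.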
