# PureML — a tiny, transparent deep-learning framework in NumPy

Transparent, NumPy-only deep learning framework for teaching, small-scale projects, prototyping, and reproducible experiments. No CUDA, no giant dependency tree. Batteries included: VJP autograd, layers, activations, losses, optimizers, Zarr checkpoints, and more!

PureML is a learning-friendly deep-learning framework built entirely on top of NumPy. It aims to be small, readable, and hackable while still being practical for real experiments and teaching.
- No hidden magic — a `Tensor` class plus an autodiff engine with a dynamic computation graph and efficient VJPs for backward passes
- Batteries included — core layers (`Affine`, `Dropout`, `BatchNorm1d`), common losses, common optimizers, and a `DataLoader`
- Self-contained dataset demo — a ready-to-use MNIST reader and an end-to-end “MNIST Beater” model
- Portable persistence — zarr-backed `ArrayStorage` with zip compression for saving/loading model state
If you like scikit-learn’s simplicity and wish deep learning felt the same way for small/medium projects, PureML is for you.
## Install

PureML targets Python 3.11+ and NumPy 2.x.

```shell
pip install ym-pure-ml
```

The only runtime dependencies are `numpy` and `zarr`.
## Updates

### v1.2.9 — What’s new
- New layers:
  - `Conv1D`: 1D convolution with stride, padding, dilation, optional bias, and configurable initialization (`xavier-glorot-normal`/`kaiming-normal`).
  - `Conv2D`: 2D convolution with stride, padding, dilation, optional bias, and configurable initialization (`xavier-glorot-normal`/`kaiming-normal`).
  - `MeanPool1D`: 1D average pooling with stride, padding, and dilation.
  - `MaxPool1D`: 1D max pooling with stride, padding, and dilation.
  - `MeanPool2D`: 2D average pooling with stride, padding, and dilation.
  - `MaxPool2D`: 2D max pooling with stride, padding, and dilation.
  - `Dropout2d`: spatial dropout for 4D tensors (`B, C, H, W`) that drops full feature maps per sample during training and acts as the identity in eval mode.
  - `BatchNorm2d`: batch normalization for 4D image tensors (`B, C, H, W`).
  - `LayerNorm1d`: layer normalization over the feature axis with optional bias.
- Clear development protocols:
  - `CONTRIBUTING.md`: a detailed contributor guide covering architecture contracts, autodiff/layer/state protocols, the branching model, and the release workflow.
  - New validator functions used during development to keep the codebase consistent.
- New weight initialization:
  - Added Kaiming normal initialization and enabled it in `Affine` via `method="kaiming-normal"`.
- New activations:
  - `leaky_relu`: elementwise LeakyReLU with a configurable `negative_slope` (default `0.01`).
- Additional Tensor operations:
  - `Tensor.dot`: a shape-validated dot operation for `1D·1D` and `2D·2D`.
  - `Tensor.general_transpose`: N-D axis permutation.
  - `Tensor.squeeze`: removes size-1 dimensions (like NumPy `squeeze`).
  - `Tensor.unsqueeze`: inserts size-1 dimensions (like NumPy `expand_dims`).
  - `unfold1d`: 1D unfolding utility for sliding-window workflows.
  - `unfold2d`: 2D unfolding utility for sliding-window workflows.
- New shape utilities for CNN design:
  - `output_len_1d(...)`: a public helper to compute the 1D output length for Conv1D/Pool1D settings.
  - `output_shape_2d(...)`: a public helper to compute `(H_out, W_out)` for Conv2D/Pool2D settings.
- Miscellaneous:
  - `calculate_gain`: recommended gain computation for `Affine`/`Conv1D`/`Conv2D`: `sigmoid`, `tanh`, `relu`, and `leaky_relu` (with configurable negative slope).
  - `BatchNorm1d` state fix: checkpoints now persist and restore `eps`, `momentum`, `training`, and `num_features` (in addition to running stats), with validation on restore.
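The shape helpers above follow standard convolution arithmetic. As a rough illustration of that formula (an independent sketch — the signatures are assumed, not PureML's actual `output_len_1d`/`output_shape_2d`):

```python
import math

def output_len_1d(length, kernel_size, stride=1, padding=0, dilation=1):
    """Standard convolution arithmetic: count the valid positions of a
    (possibly dilated) kernel sliding over a padded 1D axis."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return math.floor((length + 2 * padding - effective_kernel) / stride) + 1

def output_shape_2d(h, w, kernel_size, stride=1, padding=0, dilation=1):
    """Apply the 1D rule independently to the H and W axes."""
    return (output_len_1d(h, kernel_size, stride, padding, dilation),
            output_len_1d(w, kernel_size, stride, padding, dilation))
```

For example, a 3×3 kernel with `stride=2, padding=1` maps a 28×28 input to a 14×14 output.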
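The gain and initialization additions follow the usual He-et-al./PyTorch conventions. A minimal NumPy sketch of those conventions (not PureML's actual `calculate_gain`, `leaky_relu`, or Kaiming initializer):

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    # Elementwise LeakyReLU with the default slope mentioned above.
    return np.where(x >= 0, x, negative_slope * x)

def calculate_gain(nonlinearity, negative_slope=0.01):
    # Standard recommended gains: sigmoid -> 1, tanh -> 5/3, relu -> sqrt(2),
    # leaky_relu -> sqrt(2 / (1 + slope^2)).
    if nonlinearity == "leaky_relu":
        return float(np.sqrt(2.0 / (1.0 + negative_slope ** 2)))
    return {"sigmoid": 1.0, "tanh": 5.0 / 3.0, "relu": float(np.sqrt(2.0))}[nonlinearity]

def kaiming_normal(rng, fan_in, fan_out, nonlinearity="relu"):
    # Kaiming normal: weights drawn with std = gain / sqrt(fan_in).
    std = calculate_gain(nonlinearity) / np.sqrt(fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```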
### v1.2.8 — What’s new

- Modernization of packaging and CI/CD, documentation and attribution fixes, and the addition of community guidelines.
### v1.2.7 — What’s new

- BUG FIX: `DataLoader` seed
  - `DataLoader` now respects the provided `seed` (instead of always generating a new one), so shuffling is reproducible across runs when a seed is set.
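The fixed behavior can be sketched with NumPy's `Generator` (an illustrative stand-in, not the `DataLoader` source):

```python
import numpy as np

def shuffled_indices(n, seed=None):
    # Honor a caller-supplied seed instead of always drawing a fresh one,
    # so the shuffle order is reproducible across runs when a seed is set.
    if seed is None:
        seed = np.random.SeedSequence().entropy  # fresh seed only when none given
    rng = np.random.default_rng(seed)
    return rng.permutation(n)
```

Two calls with the same seed now yield the same permutation.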
### v1.2.6 — What’s new

- Small update: added the logo to the docs, the PyPI page, and the README.
### v1.2.5 — What’s new

- `Affine` and `Embedding` layers now ensure grads are ALWAYS tracked for supplied `W` and `b` tensors.
- Dedicated docs site is live: https://ymishchyriak.com/docs/PUREML-DOCS (mirrors this repo and stays current).
### v1.2.4 — What’s new

- `Affine`: optional bias
  - `Affine(fan_in, fan_out, bias=False, ...)` is now supported.
  - When `bias=False`, the layer keeps a zero bias tensor with `requires_grad=False` and excludes it from `.parameters`.
  - Checkpointing: `use_bias` is persisted in `named_buffers()` and honored by `apply_state()`.
    - Turning bias off during `apply_state()` zeroes the stored bias and freezes it.
    - Turning bias on reuses the same tensor and re-enables `requires_grad`.
  - The loader accepts weight arrays shaped either `(fan_in, fan_out)` or `(fan_out, fan_in)` and auto-transposes to the internal `(n, m)`.
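The optional-bias contract described above can be sketched in a few lines of NumPy (an illustrative stand-in, not PureML's actual `Affine`):

```python
import numpy as np

class AffineSketch:
    """Sketch of the optional-bias behavior: a frozen zero bias is kept
    even when bias=False, but excluded from the trainable parameters."""
    def __init__(self, fan_in, fan_out, bias=True, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))
        self.b = np.zeros(fan_out)   # kept around even when bias is off
        self.use_bias = bias         # persisted alongside checkpoint buffers

    @property
    def parameters(self):
        # The frozen zero bias is excluded when bias=False.
        return [self.W, self.b] if self.use_bias else [self.W]

    def forward(self, x):
        return x @ self.W + self.b   # adding the zero bias is a no-op when off
```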
### v1.2.3 — What’s new

- Added utility functions `rng_from_seed(seed: int | None = None)` and `get_random_seed()` for reproducible RNG creation. `rng_from_seed` returns both the RNG and the resolved seed; if no seed is provided, `get_random_seed()` generates one using cryptographically secure OS randomness.
- All core layers (`Affine`, `Embedding`, `Dropout`, etc.) now support explicit seeds for deterministic initialization.
- Seeds and initialization methods are now persisted in checkpoints and automatically restored via each layer’s `apply_state()` method.
- Fixed the "Fit MNIST in a few lines" README example — it now correctly calls the `.numpy()` method (parentheses were missing).
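The seed-resolution pattern described above might look roughly like this (a sketch of the idea, not the shipped utilities):

```python
import secrets
import numpy as np

def get_random_seed():
    # Cryptographically secure OS randomness for the fallback seed.
    return secrets.randbits(64)

def rng_from_seed(seed=None):
    # Return both the RNG and the resolved seed, so the seed can be
    # persisted in checkpoints and restored for deterministic init.
    if seed is None:
        seed = get_random_seed()
    return np.random.default_rng(seed), seed
```

Because the resolved seed is returned, a layer can stash it and later rebuild an identical RNG.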
### v1.2.2 — What’s new

- Autograd: correct `detach` semantics (+ in-place variant)
  - `Tensor.detach()` now returns a new leaf tensor that shares storage, has no creator, and has `requires_grad=False`.
  - New in-place `Tensor.detach_()` for stopping tracking on the current object.
  - New `Tensor.requires_grad_(bool)` toggler (in-place), PyTorch-style.
  - Migration note: if you relied on the old in-place behavior of `detach()`, switch to `detach_()` or reassign: `x = x.detach()`.
- Safe array export API
  - New `Tensor.numpy(copy=True, readonly=False)` helper: `copy=True` returns a defensive copy (default); `readonly=True` marks the returned array non-writable (works with views or copies).
  - Rationale: keep `.data` as the mutable param buffer for optimizers, while providing a safe way to make read-only exports.
  - Bottom line: DO NOT ACCESS the `.data` attribute directly unless you REALLY need it! Call the `.numpy()` API instead. In future updates, `.data` may be hidden completely so users cannot accidentally mutate tensors.
- Graph utilities: iterative and memory-safe
  - `_collect_graph()` rewritten as an iterative ancestor walk (no recursion limits).
  - `zero_grad_graph()` and `detach_graph()` now use a single traversal and:
    - free each node’s cached forward context via `fn._free_fwd_ctx()` before unlinking,
    - set `t._creator = None` and `t.grad = None`,
    - and (for `detach_graph`) set `t.requires_grad = False` to prevent future history building.
  - Net effect: lower peak memory and safer teardown of large graphs.
- Docs/logging polish
  - Clearer docstrings for graph collection (it collects upstream/ancestor nodes).
  - More informative debug logs for backward/graph utilities.
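The `detach`/`numpy()` semantics above map directly onto NumPy view/copy mechanics. A minimal stand-in class (not PureML's `Tensor`; the `_creator` attribute is assumed for illustration):

```python
import numpy as np

class TensorSketch:
    """Sketch of the detach/export semantics described above."""
    def __init__(self, data, requires_grad=False, creator=None):
        self.data = np.asarray(data, dtype=float)
        self.requires_grad = requires_grad
        self._creator = creator  # the op node that produced this tensor, if any

    def detach(self):
        # New leaf: shares storage, no creator, gradients off.
        return TensorSketch(self.data, requires_grad=False, creator=None)

    def detach_(self):
        # In-place variant: stop tracking on *this* object.
        self._creator = None
        self.requires_grad = False
        return self

    def numpy(self, copy=True, readonly=False):
        # copy=True: defensive copy; readonly=True: non-writable export.
        out = self.data.copy() if copy else self.data.view()
        if readonly:
            out.flags.writeable = False
        return out
```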
### v1.2.1 — What’s new

- BUG FIX: NN base-class API
  - `self(x, y, ...)` no longer errors within a class inheriting from `NN`. Previously, the `__call__` function expected a single tensor; the signature is now `(*args, **kwargs)`, so you can define the `.predict` method with any signature and still use the `self(...)` interface.
- training_utils: `TensorDataset` now ALWAYS returns `Tensor` instances
  - Previously, a `TensorDataset` initialized from NumPy arrays would return NumPy array instances via `__getitem__`. Now `Tensor` output is enforced, which protects against downstream errors. If you want the NumPy data, just call the `.numpy()` method on your `Tensor`.
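The base-class fix amounts to forwarding arbitrary arguments from `__call__` to `predict`, roughly like this (a sketch, with a hypothetical two-input subclass):

```python
class NN:
    """Sketch of the fixed base class: __call__ forwards any signature."""
    def __call__(self, *args, **kwargs):
        # Previously __call__ accepted a single tensor only; forwarding
        # *args/**kwargs lets subclasses define predict() however they like.
        return self.predict(*args, **kwargs)

class TwoInputModel(NN):
    # Hypothetical subclass whose predict takes two inputs and a keyword.
    def predict(self, x, y, scale=1.0):
        return (x + y) * scale
```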
### v1.2.0 — What’s new

- Autodiff-aware slicing (NumPy semantics) for `Tensor`
  - Supports ints, slices, ellipsis (`...`), `None` (newaxis), boolean masks, and advanced integer arrays.
  - The backward pass scatter-adds into a zeros-like array of the input’s shape (handles repeated/overlapping indices correctly).
- `Embedding` layer
  - A learned lookup table for integer indices: input `(...,)` of ints → output `(..., D)` embeddings.
  - API: `Embedding(V, D, pad_idx=None, W=None)`.
  - If `pad_idx` is set, that row is initialized to zeros and receives no gradient (useful for `<PAD>` tokens).
  - Correctly accumulates gradients for repeated indices.
- BUG FIX: `TensorValuedFunction` context merging
  - User-supplied forward contexts are now merged into the node’s internal context and persist through backward.
  - Previously, the node could overwrite the provided context in some advanced cases, leading to missing cached values (e.g., `padding_idx`, flattened indices) during gradient computation.
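The scatter-add backward rule used by both slicing and `Embedding` can be sketched with NumPy's unbuffered `np.add.at` (an illustration of the rule, not PureML's backward code):

```python
import numpy as np

def index_backward(input_shape, index, upstream_grad):
    """Backward rule for y = x[index]: scatter-add the upstream gradient
    into a zeros-like array of x's shape. np.add.at is unbuffered, so
    repeated/overlapping indices accumulate instead of overwriting."""
    grad = np.zeros(input_shape)
    np.add.at(grad, index, upstream_grad)
    return grad
```

For an embedding lookup with indices `[0, 2, 0]`, row 0 of the weight gradient correctly receives two contributions.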
