
Mojograd

Implementation of Karpathy's micrograd in Mojo :fire:


🔥grad

<br /> <img src="https://github.com/automata/mojograd/assets/49062/ca242b1e-e7d7-485d-8bb7-7e27ff0d69a8" width="512px" /> <br />

mojograd is a Mojo implementation of micrograd, a reverse-mode autodiff library with a PyTorch-like API.

The goal is to be as close as possible to micrograd, keeping a pretty clean syntax to define computational graphs. Like micrograd, it only supports scalar values for now, but we plan to extend it to support Tensors in the near future.

Note that mojograd is a work in progress and relies on static register-passable structures, so the backward pass copies values and can be really slow (Mojo traits support should improve that, so please stay tuned!). However, even now with zero optimizations, the forward pass is already ~40x faster than the original Python implementation (see benchmarks below).

Using

mojograd dynamically builds a computational graph by overloading operators on the Value type, performing the forward pass as the expression is evaluated. Just write your expression like a normal (non-diff) equation and call backward() to perform the backward pass:

```mojo
from mojograd import Value

var a = Value(2.0)
var b = Value(3.0)
var c: Float32 = 2.0
var d = b**c
var e = a + c
e.backward()

a.print() # => <Value data: 2.0 grad: 1.0 op:  >
b.print() # => <Value data: 3.0 grad: 0.0 op:  >
d.print() # => <Value data: 9.0 grad: 0.0 op: ** >
e.print() # => <Value data: 4.0 grad: 1.0 op: + >
```
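To make the mechanics behind the example above a bit more concrete, here is a minimal sketch in Python, written in the style of micrograd. It is not mojograd's actual code: the internal names (`_prev`, `_backward`, `build`) are assumptions chosen for illustration, and only `+` and `**` are covered. Each operator records its inputs and a small closure that applies the chain rule; `backward()` then walks the graph in reverse topological order.

```python
class Value:
    """Scalar graph node (illustrative sketch, not mojograd's implementation)."""

    def __init__(self, data, children=(), op=""):
        self.data = float(data)
        self.grad = 0.0
        self.op = op
        self._prev = children          # nodes this value was computed from
        self._backward = lambda: None  # leaves have nothing to propagate

    def __add__(self, other):
        # Plain numbers are wrapped so they can take part in the graph.
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), "+")

        def _backward():
            # d(out)/d(self) = d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __pow__(self, exponent):
        # The exponent is a plain number, as in `b**c` above.
        out = Value(self.data ** exponent, (self,), "**")

        def _backward():
            # d(x^n)/dx = n * x^(n-1)
            self.grad += exponent * self.data ** (exponent - 1) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order nodes so every node comes after the nodes it depends on.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0  # seed: d(self)/d(self) = 1
        for v in reversed(topo):
            v._backward()

    def __repr__(self):
        return f"<Value data: {self.data} grad: {self.grad} op: {self.op} >"


a = Value(2.0)
b = Value(3.0)
c = 2.0
d = b ** c
e = a + c
e.backward()
```

As in the Mojo example, `b` and `d` keep a zero gradient here because `e` does not depend on them; only nodes reachable from the value you call `backward()` on receive gradients.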

For a more complete example (a simple Multi-Layer Perceptron), please check the tests.mojo file. You can run it with:

```
mojo tests.mojo
```

Benchmarks

MLP binary classifier

Compared to the original Python implementation, mojograd is up to ~40 times faster in the forward pass.

| # parameters | micrograd (Python) (sec) | mojograd (Mojo) (sec) | speed up |
|--------------|--------------------------|-----------------------|----------|
| 367          | 0.001                    | 0.00006               | x20      |
| 1185         | 0.004                    | 0.0001                | x40      |
| 4417         | 0.01                     | 0.0005                | x20      |
| 17025        | 0.06                     | 0.002                 | x30      |
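The speed-up column is simply the Python time divided by the Mojo time, rounded. As a quick sanity check, the ratios can be recomputed from the timings in the table; the snippet below is only a sketch of that arithmetic (the `rows` list and the `speedup` helper are names made up for this illustration, not part of mojograd or its benchmark script).

```python
# (parameters, micrograd seconds, mojograd seconds), copied from the table above
rows = [
    (367, 0.001, 0.00006),
    (1185, 0.004, 0.0001),
    (4417, 0.01, 0.0005),
    (17025, 0.06, 0.002),
]


def speedup(python_sec, mojo_sec):
    """How many times faster the Mojo forward pass is than the Python one."""
    return python_sec / mojo_sec


speedups = {n: speedup(py, mojo) for n, py, mojo in rows}

for n, ratio in speedups.items():
    print(f"{n:>6} parameters: x{ratio:.1f}")
```

The first row works out to roughly x16.7, which the table reports rounded to x20; the other rows come out at about x40, x20 and x30.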

Changelog

  • 2023.11.19
    • Benchmarking inference and comparing with micrograd
  • 2023.11.18
    • Optimization pass through the code
  • 2023.11.14
    • Rebuild the whole thing using pointer handling (dangerous) to register-passables
    • Got the full micrograd implementation working!
    • MLP example training and inference working!
  • 2023.09.05
    • Starting from scratch based on suggestions from Jack Clayton
    • Topological sort works, but I'm messing something up with memory handling: the gradients are not getting updated
  • 2023.07.04
    • Ported Neuron, Layer and MLP
    • Back to using yakupc55's List (need a register_passable data struct)
  • 2023.06.30
    • Finally got it working! Only missing pow ops and review it