# pyrtlnet

A hardware implementation of quantized neural network inference in the PyRTL hardware description language.
Train it. Quantize it. Synthesize and simulate it — in hardware. All in Python.
pyrtlnet is a self-contained example of a quantized neural network that runs
end-to-end in Python. From model training, to software inference, to hardware
generation, all the way to simulating that custom inference hardware at the
logic-gate level — you can do it all right from the Python REPL. We hope you
will find pyrtlnet (rhymes with turtle-net) a complete and understandable
walkthrough that goes from TensorFlow training
to hardware simulation, using the PyRTL
hardware description language. Main features include:
- Quantized neural network training with TensorFlow. The resulting inference network is fully quantized, so all inference calculations are done with integers.
- Four different quantized inference implementations, operating at different levels of abstraction. All four implementations produce the same output, in the same format, providing a useful framework to extend either from the top down or from the bottom up.
  - A reference quantized inference implementation, using the standard LiteRT Interpreter.
  - A software implementation of quantized inference, using NumPy and fxpmath, to verify the math performed by the reference implementation.
  - A PyRTL hardware implementation of quantized inference that is simulated right at the logic gate level (see the short PyRTL example after this list).
  - A deployment of the PyRTL hardware design to a Pynq Z2 FPGA.
- A new PyRTL linear algebra library, including a composable `WireMatrix2D` matrix abstraction and an output-stationary systolic array for matrix multiplication.
- An extensive suite of unit tests, and continuous integration testing.
- Understandable and documented code! pyrtlnet is designed to be, first and foremost, understandable and readable (even when that comes at the expense of performance). Reference documentation is extracted from docstrings with Sphinx.
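For a flavor of what gate-level simulation in PyRTL looks like, here is a minimal generic PyRTL sketch. It is not code from pyrtlnet; it just builds and simulates a tiny adder, entirely from Python:

```python
import pyrtl

# Build a tiny design: add two 8-bit inputs into a 9-bit output.
a = pyrtl.Input(bitwidth=8, name="a")
b = pyrtl.Input(bitwidth=8, name="b")
total = pyrtl.Output(bitwidth=9, name="total")
total <<= a + b

# Simulate the resulting logic one cycle at a time.
sim = pyrtl.Simulation()
sim.step({"a": 3, "b": 4})
print(sim.inspect("total"))  # 7
```

The pyrtlnet hardware implementation is built from these same primitives, with the `WireMatrix2D` abstraction and the systolic array layered on top.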
## Installation

- Install git.
- Clone this repository, and `cd` to the repository's root directory.

  ```shell
  $ git clone https://github.com/UCSBarchlab/pyrtlnet.git
  $ cd pyrtlnet
  ```

- Install uv.
- (optional) Install Verilator if you want to export the inference hardware to Verilog, and simulate the Verilog version of the hardware.
## Usage

- Run `$ uv run tensorflow_training.py` in this repository's root directory.

  `tensorflow_training.py` trains a quantized neural network with TensorFlow, on the MNIST data set, and produces a quantized tflite saved model file, named `quantized.tflite`.

  Sample output:

  ```
  Training unquantized model.
  Epoch 1/10
  1875/1875 [==============================] - 1s 350us/step - loss: 0.6532 - accuracy: 0.8202
  Epoch 2/10
  1875/1875 [==============================] - 1s 346us/step - loss: 0.3304 - accuracy: 0.9039
  Epoch 3/10
  1875/1875 [==============================] - 1s 347us/step - loss: 0.2944 - accuracy: 0.9145
  Epoch 4/10
  1875/1875 [==============================] - 1s 350us/step - loss: 0.2719 - accuracy: 0.9205
  Epoch 5/10
  1875/1875 [==============================] - 1s 352us/step - loss: 0.2551 - accuracy: 0.9245
  Epoch 6/10
  1875/1875 [==============================] - 1s 348us/step - loss: 0.2403 - accuracy: 0.9288
  Epoch 7/10
  1875/1875 [==============================] - 1s 350us/step - loss: 0.2280 - accuracy: 0.9330
  Epoch 8/10
  1875/1875 [==============================] - 1s 346us/step - loss: 0.2178 - accuracy: 0.9358
  Epoch 9/10
  1875/1875 [==============================] - 1s 348us/step - loss: 0.2092 - accuracy: 0.9378
  Epoch 10/10
  1875/1875 [==============================] - 1s 350us/step - loss: 0.2023 - accuracy: 0.9403
  Evaluating unquantized model.
  313/313 [==============================] - 0s 235us/step - loss: 0.1994 - accuracy: 0.9414
  Training quantized model and writing quantized.tflite and quantized.npz.
  Epoch 1/2
  1875/1875 [==============================] - 1s 410us/step - loss: 0.1963 - accuracy: 0.9426
  Epoch 2/2
  1875/1875 [==============================] - 1s 408us/step - loss: 0.1936 - accuracy: 0.9423
  ...
  Evaluating quantized model.
  313/313 [==============================] - 0s 286us/step - loss: 0.1996 - accuracy: 0.9413
  Writing mnist_test_data.npz.
  ```
  The script's output shows that the unquantized model achieved `0.9414` accuracy on the test data set, while the quantized model achieved `0.9413` accuracy on the test data set.

  This script produces `quantized.tflite` and `quantized.npz` files, which include all the model's weights, biases, and quantization parameters. `quantized.tflite` is a standard `.tflite` saved model file that can be read by tools like the Model Explorer. `quantized.npz` stores the weights, biases, and quantization parameters as NumPy saved arrays. `quantized.npz` is read by all the provided inference implementations.
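  To peek inside `quantized.npz`, plain NumPy is enough. A minimal sketch, assuming only that the file is a standard NumPy `.npz` archive (the array names are whatever `tensorflow_training.py` chose when saving, so this just enumerates them):

  ```python
  import numpy as np

  # List every array saved in quantized.npz, with its shape and dtype.
  with np.load("quantized.npz") as data:
      for name in data.files:
          print(name, data[name].shape, data[name].dtype)
  ```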
- Run `$ uv run litert_inference.py` in this repository's root directory.

  `litert_inference.py` runs one test image through the reference LiteRT inference implementation; a code sketch of this flow appears after the output description below.

  Sample output: *(terminal screenshot omitted)*
  The script outputs many useful pieces of information:

  - A display of the input image, in this case a picture of the digit `7`. This display requires a terminal that supports 24-bit color, like gnome-terminal or iTerm2.
  - The displayed image is the first image in the test data set (`image_index 0`). The image is in the first batch of inputs processed (`batch 0`), and the image is the first in its batch (`batch_index 0`).
  - The input's `shape (12, 12)`, and the input's data type, `dtype float32`.
  - The output from the first layer of the network (`layer0`), with `shape (18,)` and `dtype int8`.
  - The output from the second layer of the network (`layer1`), with `shape (10,)` and `dtype int8`.
  - A bar chart displaying the network's final output, which is the data from the `layer1 output` above. The bar chart shows the un-normalized probability that the image contains each digit. In this case, the digit `7` is the most likely, with a score of `93`, followed by the digit `3` with a score of `58`. The digit `7` is labeled as `actual` because it is the actual prediction generated by the neural network. It is also labeled as `expected` because the labeled test data confirms that the image actually depicts the digit `7`.
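  A minimal sketch of this reference flow, using the TensorFlow-bundled TFLite interpreter API (`litert_inference.py` itself may load the interpreter and images differently; the zero image below is just a placeholder for a real test image):

  ```python
  import numpy as np
  import tensorflow as tf

  # Load the quantized model produced by tensorflow_training.py.
  interpreter = tf.lite.Interpreter(model_path="quantized.tflite")
  interpreter.allocate_tensors()
  input_detail = interpreter.get_input_details()[0]
  output_detail = interpreter.get_output_details()[0]

  # Placeholder float32 input; the real script reads an MNIST test image.
  image = np.zeros(input_detail["shape"], dtype=np.float32)

  interpreter.set_tensor(input_detail["index"], image)
  interpreter.invoke()

  # Ten int8 scores, one per digit; the prediction is the highest score.
  scores = interpreter.get_tensor(output_detail["index"])
  print("predicted digit:", int(np.argmax(scores)))
  ```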
  The `litert_inference.py` script has many command line flags:

  - `--start_image` selects other images to run from the test data set.
  - `--num_images` determines how many consecutive images to run from the test data set. When `--num_images` is greater than one, the script processes several images from the test data set, and prints an overall accuracy score.
  - `--batch_size` determines how many images are processed at once. When `--num_images` is not evenly divisible by `--batch_size`, the last batch will be a partial batch (illustrated after the example below).
  All of the provided inference scripts accept these command line flags. For example:

  ```
  $ uv run litert_inference.py --start_image=7 --num_images=10 --batch_size=3
  LiteRT Inference image_index 7 batch 0 batch_index 0
  Expected: 9 | Actual: 9
  LiteRT Inference image_index 8 batch 0 batch_index 1
  Expected: 5 | Actual: 6
  LiteRT Inference image_index 9 batch 0 batch_index 2
  Expected: 9 | Actual: 9
  LiteRT Inference image_index 10 batch 1 batch_index 0
  Expected: 0 | Actual: 0
  LiteRT Inference image_index 11 batch 1 batch_index 1
  Expected: 6 | Actual: 6
  LiteRT Inference image_index 12 batch 1 batch_index 2
  Expected: 9 | Actual: 9
  LiteRT Inference image_index 13 batch 2 batch_index 0
  Expected: 0 | Actual: 0
  LiteRT Inference image_index 14 batch 2 batch_index 1
  Expected: 1 | Actual: 1
  LiteRT Inference image_index 15 batch 2 batch_index 2
  Expected: 5 | Actual: 5
  LiteRT Inference image_index 16 batch 3 batch_index 0
  Expected: 9 | Actual: 9
  9/10 correct predictions, 90.0% accuracy
  ```

  In this case, the model mispredicts `image_index 8`: the image is predicted to be a `6`, when it actually depicts a `5`.
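  A quick sketch of the partial-batch arithmetic (illustrative Python, not pyrtlnet's actual batching code): 10 images with `--batch_size=3` split into three full batches of 3 and one partial batch of 1, matching the lone `image_index 16` in `batch 3` above.

  ```python
  num_images, batch_size = 10, 3

  # One entry per batch: full batches first, then whatever remains.
  batches = [min(batch_size, num_images - start)
             for start in range(0, num_images, batch_size)]
  print(batches)  # [3, 3, 3, 1]
  ```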