
pyrtlnet


Train it. Quantize it. Synthesize and simulate it — in hardware. All in Python.

pyrtlnet is a self-contained example of a quantized neural network that runs end-to-end in Python. From model training to software inference, to hardware generation, all the way to simulating that custom inference hardware at the logic-gate level, you can do it all right from the Python REPL. We hope you will find pyrtlnet (rhymes with turtle-net) a complete and understandable walkthrough from TensorFlow training to hardware simulation with the PyRTL hardware description language. Main features include:

  • Quantized neural network training with TensorFlow. The resulting inference network is fully quantized, so all inference calculations are done with integers.

  • Four different quantized inference implementations, operating at different levels of abstraction. All four implementations produce the same output, in the same format, providing a useful framework to extend either from the top down or from the bottom up.

    1. A reference quantized inference implementation, using the standard LiteRT Interpreter.

    2. A software implementation of quantized inference, using NumPy and fxpmath, to verify the math performed by the reference implementation.

    3. A PyRTL hardware implementation of quantized inference that is simulated right at the logic gate level.

    4. A deployment of the PyRTL hardware design to a Pynq Z2 FPGA.

  • A new PyRTL linear algebra library, including a composable WireMatrix2D matrix abstraction and an output-stationary systolic array for matrix multiplication.

  • An extensive suite of unit tests and continuous integration testing.

  • Understandable and documented code! pyrtlnet is designed to be, first and foremost, understandable and readable (even when that comes at the expense of performance). Reference documentation is extracted from docstrings with Sphinx.
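The integer-only math behind the NumPy inference implementation (item 2 above) can be sketched with plain NumPy. The sketch below follows the standard LiteRT per-tensor quantization scheme: accumulate in int32, then rescale to int8. All names and values here are illustrative, not pyrtlnet's actual parameters, and a real implementation would use a fixed-point multiplier rather than the float multiplier used for clarity.

```python
import numpy as np

def quantized_dense(x_q, w_q, bias_q, x_zp, w_zp, out_zp, multiplier):
    # Accumulate in int32: subtract zero points, then matrix-multiply.
    acc = (x_q.astype(np.int32) - x_zp) @ (w_q.astype(np.int32) - w_zp)
    acc += bias_q  # bias is pre-quantized to int32
    # Rescale the int32 accumulator back to the int8 output range.
    out = np.round(acc * multiplier) + out_zp
    return np.clip(out, -128, 127).astype(np.int8)

# Illustrative values, not the actual pyrtlnet parameters.
x_q = np.array([[10, -3]], dtype=np.int8)
w_q = np.array([[2, 1], [0, 4]], dtype=np.int8)
bias_q = np.array([7, -2], dtype=np.int32)
y = quantized_dense(x_q, w_q, bias_q, x_zp=0, w_zp=0, out_zp=0, multiplier=0.1)
print(y)  # [[3 0]]
```

Every intermediate value is an integer except the final rescale, which is the one place quantization parameters (scales and zero points) enter the computation.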

Installation

  1. Install git.

  2. Clone this repository, and cd to the repository's root directory.

    $ git clone https://github.com/UCSBarchlab/pyrtlnet.git
    $ cd pyrtlnet
    
  3. Install uv.

  4. (optional) Install Verilator if you want to export the inference hardware to Verilog, and simulate the Verilog version of the hardware.

Usage

  1. Run:

    $ uv run tensorflow_training.py
    

    in this repository's root directory. tensorflow_training.py trains a quantized neural network on the MNIST data set with TensorFlow, and produces a quantized .tflite saved model file named quantized.tflite.

    Sample output:

    Training unquantized model.
    Epoch 1/10
    1875/1875 [==============================] - 1s 350us/step - loss: 0.6532 - accuracy: 0.8202
    Epoch 2/10
    1875/1875 [==============================] - 1s 346us/step - loss: 0.3304 - accuracy: 0.9039
    Epoch 3/10
    1875/1875 [==============================] - 1s 347us/step - loss: 0.2944 - accuracy: 0.9145
    Epoch 4/10
    1875/1875 [==============================] - 1s 350us/step - loss: 0.2719 - accuracy: 0.9205
    Epoch 5/10
    1875/1875 [==============================] - 1s 352us/step - loss: 0.2551 - accuracy: 0.9245
    Epoch 6/10
    1875/1875 [==============================] - 1s 348us/step - loss: 0.2403 - accuracy: 0.9288
    Epoch 7/10
    1875/1875 [==============================] - 1s 350us/step - loss: 0.2280 - accuracy: 0.9330
    Epoch 8/10
    1875/1875 [==============================] - 1s 346us/step - loss: 0.2178 - accuracy: 0.9358
    Epoch 9/10
    1875/1875 [==============================] - 1s 348us/step - loss: 0.2092 - accuracy: 0.9378
    Epoch 10/10
    1875/1875 [==============================] - 1s 350us/step - loss: 0.2023 - accuracy: 0.9403
    Evaluating unquantized model.
    313/313 [==============================] - 0s 235us/step - loss: 0.1994 - accuracy: 0.9414
    Training quantized model and writing quantized.tflite and quantized.npz.
    Epoch 1/2
    1875/1875 [==============================] - 1s 410us/step - loss: 0.1963 - accuracy: 0.9426
    Epoch 2/2
    1875/1875 [==============================] - 1s 408us/step - loss: 0.1936 - accuracy: 0.9423
    ...
    Evaluating quantized model.
    313/313 [==============================] - 0s 286us/step - loss: 0.1996 - accuracy: 0.9413
    Writing mnist_test_data.npz.
    

    The script's output shows that the unquantized model achieved 0.9414 accuracy on the test data set, while the quantized model achieved 0.9413 accuracy on the test data set.

    This script produces the files quantized.tflite and quantized.npz, which contain all of the model's weights, biases, and quantization parameters. quantized.tflite is a standard .tflite saved model file that can be read by tools like the Model Explorer. quantized.npz stores the same weights, biases, and quantization parameters as NumPy saved arrays, and is read by all of the provided inference implementations.
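A .npz file like quantized.npz can be inspected with plain NumPy. The actual array names inside pyrtlnet's file are not listed in this README, so the sketch below writes and reads a stand-in file with hypothetical names and shapes; with the real file you would call np.load("quantized.npz") and check data.files first.

```python
import os
import tempfile

import numpy as np

# Stand-in for quantized.npz; the array names and shapes below are
# hypothetical, chosen to match the layer shapes described later.
path = os.path.join(tempfile.gettempdir(), "example_quantized.npz")
np.savez(
    path,
    layer0_weights=np.zeros((144, 18), dtype=np.int8),  # hypothetical name
    layer0_bias=np.zeros(18, dtype=np.int32),           # hypothetical name
)

data = np.load(path)
print(sorted(data.files))            # array names stored in the file
print(data["layer0_weights"].shape)  # (144, 18)
```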

  2. Run:

    $ uv run litert_inference.py
    

    in this repository's root directory. litert_inference.py runs one test image through the reference LiteRT inference implementation.

    Sample output:

    [Screenshot: litert_inference.py terminal output]

    The script outputs many useful pieces of information:

    1. A display of the input image, in this case a picture of the digit 7. This display requires a terminal that supports 24-bit color, like gnome-terminal or iTerm2.

    2. The displayed image is the first image in the test data set (image_index 0). The image is in the first batch of inputs processed (batch 0), and the image is the first in its batch (batch_index 0).

    3. The input's shape, (12, 12), and its data type, dtype float32.

    4. The output from the first layer of the network (layer0), with shape (18,) and dtype int8.

    5. The output from the second layer of the network (layer1), with shape (10,) and dtype int8.

    6. A bar chart displaying the network's final output, which is the data from the layer1 output above. The bar chart shows the un-normalized probability that the image contains each digit.

      In this case, the digit 7 is the most likely, with a score of 93, followed by the digit 3 with a score of 58. The digit 7 is labeled as actual because it is the actual prediction generated by the neural network. It is also labeled as expected because the labeled test data confirms that the image actually depicts the digit 7.
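The "actual" prediction is simply the index of the largest layer1 score. A quick sketch, using made-up scores shaped like the (10,) int8 layer1 output described above (with the two values 93 and 58 mentioned in the example):

```python
import numpy as np

# Hypothetical layer1 output: one un-normalized score per digit 0-9.
scores = np.array([-40, -12, 5, 58, -3, 0, -25, 93, 10, 30], dtype=np.int8)

predicted_digit = int(np.argmax(scores))
print(predicted_digit)  # 7: the digit with the highest score
```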

    The litert_inference.py script has many command line flags:

    • --start_image selects other images to run from the test data set.

    • --num_images determines how many consecutive images to run from the test data set. When --num_images is greater than one, the script processes several images from the test data set, and prints an overall accuracy score.

    • --batch_size determines how many images are processed at once. When --num_images is not evenly divisible by --batch_size, the last batch will be a partial batch.

    All of the provided inference scripts accept these command line flags. For example:

    $ uv run litert_inference.py --start_image=7 --num_images=10 --batch_size=3
    LiteRT Inference image_index 7 batch 0 batch_index 0
    Expected: 9 | Actual: 9
    LiteRT Inference image_index 8 batch 0 batch_index 1
    Expected: 5 | Actual: 6
    LiteRT Inference image_index 9 batch 0 batch_index 2
    Expected: 9 | Actual: 9
    
    LiteRT Inference image_index 10 batch 1 batch_index 0
    Expected: 0 | Actual: 0
    LiteRT Inference image_index 11 batch 1 batch_index 1
    Expected: 6 | Actual: 6
    LiteRT Inference image_index 12 batch 1 batch_index 2
    Expected: 9 | Actual: 9
    
    LiteRT Inference image_index 13 batch 2 batch_index 0
    Expected: 0 | Actual: 0
    LiteRT Inference image_index 14 batch 2 batch_index 1
    Expected: 1 | Actual: 1
    LiteRT Inference image_index 15 batch 2 batch_index 2
    Expected: 5 | Actual: 5
    
    LiteRT Inference image_index 16 batch 3 batch_index 0
    Expected: 9 | Actual: 9
    
    9/10 correct predictions, 90.0% accuracy
    

    In this case, the model mispredicts image_index 8, which is predicted to be a 6 when the labeled test data says it is a 5.
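    The batch bookkeeping in the example above is ordinary ceiling division: 10 images in batches of 3 yield four batches (3 + 3 + 3 + 1), with the last batch partial. A quick sketch of that arithmetic (the helper name is my own, not a pyrtlnet function):

```python
def num_batches(num_images: int, batch_size: int) -> int:
    # Ceiling division: a final partial batch still counts as a batch.
    return -(-num_images // batch_size)

def last_batch_size(num_images: int, batch_size: int) -> int:
    remainder = num_images % batch_size
    return remainder if remainder else batch_size

print(num_batches(10, 3))      # 4
print(last_batch_size(10, 3))  # 1
```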
