🔢🥶 TensorFrost
A static optimizing tensor compiler with a Python frontend, autodifferentiation, and a more "shader-like" syntax.
Currently working platforms:

| Backend/OS | CodeGen Only | C++/OpenMP | GLSL/OpenGL | CUDA | GLSL/Vulkan | WGSL/WebGPU |
|------------|--------------|------------|-------------|------|-------------|-------------|
| Windows    | ✅           | 🚧         | 🚧          | ⛔   | ⛔          | ⛔          |
| Linux      | ✅           | 🚧         | 🚧          | ⛔   | ⛔          | ⛔          |
| MacOS      | ✅           | ⛔         | ⛔          | ⛔   | ⛔          | ⛔          |
For more detail about this project, please read my blog post: *Writing an optimizing tensor compiler from scratch*.
The current version of the library is still in early beta, and at this point I strongly recommend against using it for any serious projects. Breaking changes are also very likely in future updates, as a lot of the code is not finalized.
Examples
<a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/wave_simulation.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/sin_gordon.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/fluid_simulation.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/fluid_sim.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/GUI/buddhabrot.py"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/buddhabrot.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/GUI/interactive_path_tracer.py"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/path_tracer.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/n-body.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/n_body.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Rendering/neural_embed.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/neural_embed.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/ML/NCA/"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/nca.gif?raw=true" height="192px"></a>
Installation
From PyPI
You can install the latest version of the library from PyPI:
```shell
pip install tensorfrost
```
From source
You need to have CMake installed to build the library.
First clone the repository:
```shell
git clone --recurse-submodules https://github.com/MichaelMoroz/TensorFrost.git
cd TensorFrost
```
Then you can install the library in development (editable) mode:

```shell
py -$YOUR_PYTHON_VERSION$ -m pip install --upgrade pip setuptools wheel
py -$YOUR_PYTHON_VERSION$ -m pip install -e Python/ -v # install the library for development
```
This links the build folder to Python, so any changes you make to the Python source code are reflected in the installed library. If you change the C++ code, you need to rebuild the library files with `cmake --build . --config Release`, or from any IDE.
You can also build a wheel file for distribution. This creates a wheel file in the `dist` folder:

```shell
py -$YOUR_PYTHON_VERSION$ -m pip wheel ./Python -w dist -v # build a wheel file
```
> [!TIP]
> If you are using a Linux distribution that doesn't support installing packages through pip (e.g. Arch Linux), read Using a Virtual Environment below.
Using a Virtual Environment
Certain Linux distributions (e.g. Arch Linux) want you to use their package manager to manage system-wide Python packages instead of pip. TensorFrost uses pip to install itself once built, so before running CMake you will need to activate a Virtual Environment.
1. From the TensorFrost directory, create a venv:

   ```shell
   python -m venv ./venv
   ```

2. Activate the venv:

   ```shell
   source venv/bin/activate
   ```

3. Now you can use pip to install the library:

   ```shell
   python -m pip install --upgrade pip setuptools wheel
   python -m pip install -e Python/ -v # install the library for development
   python -m pip wheel ./Python -w dist -v # build a wheel file
   ```
> [!TIP]
> The newly-created venv is treated like a fresh Python installation, so you may need to reinstall any needed packages such as `numpy`, `matplotlib`, and `tqdm` if you are trying out the examples. `pip` works fine once the venv is active (e.g. `pip install numpy`).
Usage
Setup
For the library to work you need a C++ compiler that supports C++17 (currently only the Microsoft Visual Studio compiler on Windows, and GCC on Linux).
First you need to import the library:
```python
import TensorFrost as tf
```
Then you need to initialize the library with the device you want to use and the kernel compiler flags (different for each platform):
```python
tf.initialize(tf.cpu) # or tf.opengl
```
TensorFrost will find any available MSVC (Windows) or GCC (Linux) compiler and use it to compile the main code and the kernels. In OpenGL mode the driver compiles the kernels. (TODO: compile the main code into Python for faster compile times; MSVC is very slow, taking about 1.5 seconds for a single function.)
> [!TIP]
> If you are compiling a large program, it is useful to set the compilation flags to an empty string `""` to avoid long compile times. This is especially a problem on Windows.

```python
tf.initialize(tf.opengl, "")
```
You can also run TensorFrost in code generation mode (you can't run tensor programs in this mode). It is much faster, but you need to use the generated code manually afterwards:

```python
tf.initialize(tf.codegen, kernel_lang = tf.hlsl_lang) # or tf.glsl_lang for OpenGL, or tf.cpp_lang for C++
```
After you have compiled all the tensor programs you need, you can get all the generated code and save it to a file:
```python
# Save all the compiled functions
cpp_header = tf.get_cpp_header()
all_main_functions = tf.get_all_generated_main_functions() # always in C++
with open('tensorfrost_main.cpp', 'w') as f:
    f.write(cpp_header)
    for func in all_main_functions:
        f.write(func)

# Save all the compiled kernels
all_kernels = tf.get_all_generated_kernels() # depends on the kernel_lang
for i, kernel in enumerate(all_kernels):
    with open('generated_kernels/kernel_{}.hlsl'.format(i), 'w') as f:
        f.write(kernel)
```
Right now you can't just compile the code and run it, since running also requires a kernel compiler and executor, as well as a memory manager for tensors. In the future I plan to add all the required functions for that too, for better portability.
Basic usage
Now you can create and compile functions. For example, here is a very simple function that does a wave simulation:
```python
dt = 0.1 # time step (not defined in the original snippet; a constant is assumed here)

def WaveEq():
    # shape is not specified -> shape is inferred from the input tensor (can result in slower execution)
    u = tf.input([-1, -1], tf.float32)
    # shape must match
    v = tf.input(u.shape, tf.float32)

    i, j = u.indices
    laplacian = u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1] - u * 4.0
    v_new = v + dt * laplacian
    u_new = u + dt * v_new

    return v_new, u_new

wave_eq = tf.compile(WaveEq)
```
As you can see, inputs are not arguments to the function, but are created inside it. This is because some inputs can be constrained by the shape of other inputs, and the shape of the input tensor is not known at compile time. You can give shape arguments to the input function: constants for exactly matching shapes, or -1 for any shape. If you want to constrain the shape of the input tensor, you need to get the shape of the other tensor and use it as an argument to the input function (as done for `v` above).
The tensor programs take and output tensor memory buffers, which can be created from numpy arrays:
```python
import numpy as np

A = tf.tensor(np.zeros([100, 100], dtype=np.float32))
B = tf.tensor(np.zeros([100, 100], dtype=np.float32))
```
Then you can run the program:
```python
A, B = wave_eq(A, B)
```
As you can see, the inputs are given to the compiled function in the same order as they are created in the function.
To get the result back into a numpy array, you can use the numpy property:
```python
Anp = A.numpy
```
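For reference, the update the compiled program performs can be sketched in plain numpy (a hypothetical host-side equivalent, not TensorFrost's actual implementation; the `dt` value is assumed, and `np.roll` wraps at the edges, which may differ from how the compiled kernel handles out-of-bounds indices):

```python
import numpy as np

def wave_step(u, v, dt=0.1):
    # 5-point Laplacian via array shifts; np.roll gives periodic boundaries,
    # which is an assumption of this sketch, not guaranteed TensorFrost behavior
    laplacian = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
               + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
    v_new = v + dt * laplacian  # semi-implicit Euler: update velocity first...
    u_new = u + dt * v_new      # ...then position, using the new velocity
    return v_new, u_new

u = np.zeros((100, 100), dtype=np.float32)
v = np.zeros((100, 100), dtype=np.float32)
v, u = wave_step(u, v)
```

This mirrors the `WaveEq` function above step for step, which can be handy for checking the compiled program's output against a reference.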
Operations
TensorFrost supports most of the basic numpy operations, including indexing, arithmetic, and broadcasting.
The core operation is the indexing operation, which is used to specify indices for accessing the tensor data. Depending on the dimensionality of the tensor there can be N indices. This operation is similar to numpy's `np.ogrid` and `np.mgrid` functions.
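As a point of comparison, here is what numpy's `np.ogrid` produces: open (broadcastable) index grids, analogous to the `i, j = u.indices` pattern used in the wave example above.

```python
import numpy as np

# Open index grids for a 3x4 array: i varies along rows, j along columns
i, j = np.ogrid[0:3, 0:4]
print(i.shape, j.shape)  # (3, 1) (1, 4)

# Broadcasting i and j together enumerates every (row, col) pair
a = np.arange(12, dtype=np.float32).reshape(3, 4)
row_plus_col = i + j  # shape (3, 4) after broadcasting
shifted = a[(i + 1) % 3, j]  # index arithmetic, similar to u[i-1, j] above
```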