🔢🥶 TensorFrost
A static optimizing tensor compiler with a Python frontend, autodifferentiation, and a more "shader-like" syntax.
Currently working platforms:

| Backend/OS | CodeGen Only | C++/OpenMP | GLSL/OpenGL | CUDA | GLSL/Vulkan | WGSL/WebGPU |
|------------|--------------|------------|-------------|------|-------------|-------------|
| Windows    | ✅           | 🚧         | 🚧          | ⛔   | ⛔          | ⛔          |
| Linux      | ✅           | 🚧         | 🚧          | ⛔   | ⛔          | ⛔          |
| MacOS      | ✅           | ⛔         | ⛔          | ⛔   | ⛔          | ⛔          |
For more detail about this project, please read my blog post: *Writing an optimizing tensor compiler from scratch*.
The current version of the library is still in early beta, and at this point I strongly recommend against using it for any serious projects. Breaking changes are also very likely in future updates, as a lot of the code is not finalized.
Examples
<a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/wave_simulation.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/sin_gordon.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/fluid_simulation.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/fluid_sim.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/GUI/buddhabrot.py"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/buddhabrot.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/GUI/interactive_path_tracer.py"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/path_tracer.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Simulation/n-body.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/n_body.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Rendering/neural_embed.ipynb"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/neural_embed.gif?raw=true" height="192px"></a> <a href="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/ML/NCA/"><img src="https://github.com/MichaelMoroz/TensorFrost/blob/main/examples/Demos/nca.gif?raw=true" height="192px"></a>
Installation
From PyPI
You can install the latest version of the library from PyPI:
```shell
pip install tensorfrost
```
From source
You need to have CMake installed to build the library.
First clone the repository:
```shell
git clone --recurse-submodules https://github.com/MichaelMoroz/TensorFrost.git
cd TensorFrost
```
Then you can install the library in development (editable) mode:

```shell
py -$YOUR_PYTHON_VERSION$ -m pip install --upgrade pip setuptools wheel
py -$YOUR_PYTHON_VERSION$ -m pip install -e Python/ -v # install the library for development
```
This links the build folder to Python, so any changes you make to the Python source code are reflected in the installed library. If you change the C++ code, you need to rebuild the library files with `cmake --build . --config Release`, or from any IDE.
You can also build a wheel file for distribution. This creates a wheel file in the `dist` folder:

```shell
py -$YOUR_PYTHON_VERSION$ -m pip wheel ./Python -w dist -v # build a wheel file
```
> [!TIP]
> If you are using a Linux distribution that doesn't support installing packages through pip (e.g. Arch Linux), read Using a Virtual Environment below.
Using a Virtual Environment
Certain Linux distributions (e.g. Arch Linux) want you to use their package manager to manage system-wide Python packages instead of pip. TensorFrost uses pip to install itself once built, so before running CMake you will need to activate a Virtual Environment.
1. From the TensorFrost directory, create a venv:

   ```shell
   python -m venv ./venv
   ```

2. Activate the venv:

   ```shell
   source venv/bin/activate
   ```

3. Now you can use pip to install the library:

   ```shell
   python -m pip install --upgrade pip setuptools wheel
   python -m pip install -e Python/ -v # install the library for development
   python -m pip wheel ./Python -w dist -v # build a wheel file
   ```
> [!TIP]
> The newly-created venv is treated like a fresh Python installation, so you may need to reinstall any needed packages such as `numpy`, `matplotlib`, and `tqdm` if you are trying out the examples. `pip` works fine once the venv is active (e.g. `pip install numpy`).
Usage
Setup
For the library to work you need a C++ compiler that supports C++17 (currently only the Microsoft Visual Studio compiler on Windows, and GCC on Linux).
First you need to import the library:
```python
import TensorFrost as tf
```
Then you need to initialize the library with the device you want to use and the kernel compiler flags (different for each platform):
```python
tf.initialize(tf.cpu) # or tf.opengl
```
TensorFrost will find any available MSVC (Windows) or GCC (Linux) compiler and use it to compile the main code and the kernels. In OpenGL mode the driver compiles the kernels. (TODO: compile the main code into Python for faster compile times; MSVC is very slow, taking about 1.5 seconds for a single function.)
> [!TIP]
> If you are compiling a large program, it is useful to set the compilation flags to an empty string `""` to avoid long compile times. This is especially a problem on Windows.

```python
tf.initialize(tf.opengl, "")
```
You can also run TensorFrost in code generation mode (you can't run tensor programs in this mode). It is much faster, but you need to use the generated code manually afterwards:

```python
tf.initialize(tf.codegen, kernel_lang = tf.hlsl_lang) # or tf.glsl_lang for OpenGL, or tf.cpp_lang for C++
```
After you have compiled all the tensor programs you need, you can get all the generated code and save it to a file:
```python
# Save all the compiled functions
cpp_header = tf.get_cpp_header()
all_main_functions = tf.get_all_generated_main_functions() # always in C++
with open('tensorfrost_main.cpp', 'w') as f:
    f.write(cpp_header)
    for func in all_main_functions:
        f.write(func)

# Save all the compiled kernels
all_kernels = tf.get_all_generated_kernels() # depends on the kernel_lang
for i, kernel in enumerate(all_kernels):
    with open('generated_kernels/kernel_{}.hlsl'.format(i), 'w') as f:
        f.write(kernel)
```
Right now you can't just compile the code and run it, since running also requires a kernel compiler and executor, as well as a memory manager for tensors. In the future I plan to add all the required functions for that too, for better portability.
Basic usage
Now you can create and compile functions. For example, here is a very simple function that does a wave simulation:
```python
dt = 0.1 # time step (not defined in the original snippet; a constant is assumed here)

def WaveEq():
    # shape is not specified -> shape is inferred from the input tensor (can result in slower execution)
    u = tf.input([-1, -1], tf.float32)
    # shape must match
    v = tf.input(u.shape, tf.float32)

    i, j = u.indices
    laplacian = u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1] - u * 4.0
    v_new = v + dt * laplacian
    u_new = u + dt * v_new

    return v_new, u_new

wave_eq = tf.compile(WaveEq)
```
As you can see, inputs are not arguments to the function, but are created inside it. This is because some inputs can be constrained by the shape of other inputs, and the shape of the input tensor is not known at compile time. You can give shape arguments to the input function: constants for exactly matching shapes, or -1 for any shape. If you want to constrain the shape of the input tensor, you need to get the shape of the other tensor and use it as an argument to the input function (as done for `v` above).
The tensor programs take and output tensor memory buffers, which can be created from numpy arrays:
```python
import numpy as np

A = tf.tensor(np.zeros([100, 100], dtype=np.float32))
B = tf.tensor(np.zeros([100, 100], dtype=np.float32))
```
Then you can run the program:
```python
A, B = wave_eq(A, B)
```
As you can see, the inputs are given to the compiled function in the same order as they are created in the function.
To get the result back into a numpy array, you can use the numpy property:
```python
Anp = A.numpy
```
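For reference, the update the compiled program performs can be sketched in plain numpy (a hypothetical host-side equivalent, not TensorFrost's actual implementation; the `dt` value is assumed, and `np.roll` wraps at the edges, which may differ from how the compiled kernel handles out-of-bounds indices):

```python
import numpy as np

def wave_step(u, v, dt=0.1):
    # 5-point Laplacian via array shifts; np.roll gives periodic boundaries,
    # which is an assumption of this sketch, not guaranteed TensorFrost behavior
    laplacian = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
               + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
    v_new = v + dt * laplacian  # semi-implicit Euler: update velocity first...
    u_new = u + dt * v_new      # ...then position, using the new velocity
    return v_new, u_new

u = np.zeros((100, 100), dtype=np.float32)
v = np.zeros((100, 100), dtype=np.float32)
v, u = wave_step(u, v)
```

This mirrors the `WaveEq` function above step for step, which can be handy for checking the compiled program's output against a reference.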
Operations
TensorFrost supports most of the basic numpy operations, including indexing, arithmetic, and broadcasting.
The core operation is the indexing operation, which is used to specify indices for accessing the tensor data. Depending on the dimensionality of the tensor there can be N indices. This operation is similar to numpy's `np.ogrid` and `np.mgrid` functions.
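As a point of comparison, here is what numpy's `np.ogrid` produces: open (broadcastable) index grids, analogous to the `i, j = u.indices` pattern used in the wave example above.

```python
import numpy as np

# Open index grids for a 3x4 array: i varies along rows, j along columns
i, j = np.ogrid[0:3, 0:4]
print(i.shape, j.shape)  # (3, 1) (1, 4)

# Broadcasting i and j together enumerates every (row, col) pair
a = np.arange(12, dtype=np.float32).reshape(3, 4)
row_plus_col = i + j  # shape (3, 4) after broadcasting
shifted = a[(i + 1) % 3, j]  # index arithmetic, similar to u[i-1, j] above
```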