Burn
Burn is a next-generation tensor library and deep learning framework that doesn't compromise on flexibility, efficiency, or portability.
Burn is both a tensor library and a deep learning framework, optimized for numerical computing, model inference, and model training. Burn leverages Rust to perform optimizations normally only available in static-graph frameworks, offering optimal speed without sacrificing flexibility.
Backends
<div align="left"> <img align="right" src="https://raw.githubusercontent.com/tracel-ai/burn/main/assets/backend-chip.png" height="96px"/>Burn strives to be as fast as possible on as much hardware as possible, with robust implementations. We believe this flexibility is crucial for modern needs, where you may train your models in the cloud and then deploy them on customer hardware, which varies from user to user.
</div>

Supported Backends
Most backends support all operating systems, so we don't mention them in the tables below.
GPU Backends:
|          | CUDA | ROCm | Metal | Vulkan | WebGPU | LibTorch |
| -------- | ---- | ---- | ----- | ------ | ------ | -------- |
| Nvidia   | ☑️   | -    | -     | ☑️     | ☑️     | ☑️       |
| AMD      | -    | ☑️   | -     | ☑️     | ☑️     | ☑️       |
| Apple    | -    | -    | ☑️    | -      | ☑️     | ☑️       |
| Intel    | -    | -    | -     | ☑️     | ☑️     | -        |
| Qualcomm | -    | -    | -     | ☑️     | ☑️     | -        |
| Wasm     | -    | -    | -     | -      | ☑️     | -        |
CPU Backends:
|        | CPU (CubeCL) | NdArray | LibTorch |
| ------ | ------------ | ------- | -------- |
| X86    | ☑️           | ☑️      | ☑️       |
| Arm    | ☑️           | ☑️      | ☑️       |
| Wasm   | -            | ☑️      | -        |
| no-std | -            | ☑️      | -        |
<br />Compared to other frameworks, Burn has a very different approach to supporting many backends. By design, most code is generic over the Backend trait, which allows us to build Burn with swappable backends. This makes it possible to compose backends, augmenting them with additional functionality such as autodifferentiation and automatic kernel fusion.
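To make the idea concrete, here is a std-only sketch of the swappable-backend pattern: user code is written once against a trait, and a decorator wraps any inner backend while remaining a backend itself. This is a simplified illustration, not Burn's actual Backend trait; the `Logging` decorator stands in for something like Autodiff or Fusion.

```rust
// Toy illustration of Burn's design (not the real API): code is generic
// over a `Backend` trait, so backends can be swapped and composed.
trait Backend {
    fn name(&self) -> String;
    fn add(&self, a: f32, b: f32) -> f32;
}

// A plain base backend.
struct NdArrayLike;

impl Backend for NdArrayLike {
    fn name(&self) -> String {
        "ndarray".into()
    }
    fn add(&self, a: f32, b: f32) -> f32 {
        a + b
    }
}

// A decorator backend: it wraps any inner backend and adds behavior on top,
// while still implementing `Backend` itself, so it stays swappable.
struct Logging<B: Backend> {
    inner: B,
}

impl<B: Backend> Backend for Logging<B> {
    fn name(&self) -> String {
        format!("logging<{}>", self.inner.name())
    }
    fn add(&self, a: f32, b: f32) -> f32 {
        let out = self.inner.add(a, b);
        println!("add({a}, {b}) = {out}");
        out
    }
}

// User code is written once, generic over the backend.
fn sum_generic<B: Backend>(backend: &B, values: &[f32]) -> f32 {
    values.iter().fold(0.0, |acc, &v| backend.add(acc, v))
}

fn main() {
    let base = NdArrayLike;
    let decorated = Logging { inner: NdArrayLike };
    // Same generic code runs on the base backend and on the decorated one.
    assert_eq!(sum_generic(&base, &[1.0, 2.0, 3.0]), 6.0);
    assert_eq!(sum_generic(&decorated, &[1.0, 2.0, 3.0]), 6.0);
    assert_eq!(decorated.name(), "logging<ndarray>");
}
```

In Burn, the same shape appears at the type level: `Autodiff<Wgpu>` is itself a backend, so it can be passed anywhere a backend is expected.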
<details> <summary> Autodiff: Backend decorator that brings backpropagation to any backend 🔄 </summary> <br />Contrary to the aforementioned backends, Autodiff is actually a backend decorator. This means that it cannot exist by itself; it must encapsulate another backend.
The simple act of wrapping a base backend with Autodiff transparently equips it with autodifferentiation support, making it possible to call backward on your model.
```rust
use burn::backend::{Autodiff, Wgpu};
use burn::tensor::{Distribution, Tensor};

fn main() {
    type Backend = Autodiff<Wgpu>;

    let device = Default::default();
    let x: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default, &device);
    let y: Tensor<Backend, 2> = Tensor::random([32, 32], Distribution::Default, &device).require_grad();

    let tmp = x.clone() + y.clone();
    let tmp = tmp.matmul(x);
    let tmp = tmp.exp();

    let grads = tmp.backward();
    let y_grad = y.grad(&grads).unwrap();
    println!("{y_grad}");
}
```
Of note, it is impossible to make the mistake of calling backward on a model that runs on a backend that does not support autodiff (for inference), as this method is only offered by an Autodiff backend.
See the Autodiff Backend README for more details.
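To give an intuition for what an autodiff decorator does under the hood, here is a std-only toy of tape-based reverse-mode differentiation: each operation records its inputs, and a backward pass walks the tape in reverse to accumulate gradients. This is a simplified illustration of the general technique, not Burn's actual implementation.

```rust
// Toy tape-based reverse-mode autodiff on scalars (illustration only).
#[derive(Clone, Copy)]
enum Op {
    Leaf,             // input value with no parents
    Add(usize, usize), // sum of two earlier nodes
    Mul(usize, usize), // product of two earlier nodes
}

struct Tape {
    values: Vec<f32>,
    ops: Vec<Op>,
}

impl Tape {
    fn new() -> Self {
        Tape { values: Vec::new(), ops: Vec::new() }
    }
    fn leaf(&mut self, v: f32) -> usize {
        self.values.push(v);
        self.ops.push(Op::Leaf);
        self.values.len() - 1
    }
    fn add(&mut self, a: usize, b: usize) -> usize {
        self.values.push(self.values[a] + self.values[b]);
        self.ops.push(Op::Add(a, b));
        self.values.len() - 1
    }
    fn mul(&mut self, a: usize, b: usize) -> usize {
        self.values.push(self.values[a] * self.values[b]);
        self.ops.push(Op::Mul(a, b));
        self.values.len() - 1
    }
    // Walk the tape backwards, accumulating d(output)/d(node) for every node.
    fn backward(&self, output: usize) -> Vec<f32> {
        let mut grads = vec![0.0; self.values.len()];
        grads[output] = 1.0;
        for i in (0..=output).rev() {
            match self.ops[i] {
                Op::Leaf => {}
                Op::Add(a, b) => {
                    grads[a] += grads[i];
                    grads[b] += grads[i];
                }
                Op::Mul(a, b) => {
                    grads[a] += grads[i] * self.values[b];
                    grads[b] += grads[i] * self.values[a];
                }
            }
        }
        grads
    }
}

fn main() {
    // f(x, y) = (x + y) * x at x = 3, y = 2, so df/dx = 2x + y = 8, df/dy = x = 3.
    let mut tape = Tape::new();
    let x = tape.leaf(3.0);
    let y = tape.leaf(2.0);
    let s = tape.add(x, y);
    let f = tape.mul(s, x);
    let grads = tape.backward(f);
    assert_eq!(grads[x], 8.0);
    assert_eq!(grads[y], 3.0);
}
```

Burn's Autodiff decorator applies this recording idea at the backend level, which is why wrapping a backend is all it takes to enable backward.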
</details> <details> <summary> Fusion: Backend decorator that brings kernel fusion to all first-party backends </summary> <br />This backend decorator enhances a backend with kernel fusion, provided that the inner backend
supports it. Note that you can compose this backend with other backend decorators such as Autodiff.
All first-party accelerated backends (like WGPU and CUDA) use Fusion by default (burn/fusion
feature flag), so you typically don't need to apply it manually.
```rust
#[cfg(not(feature = "fusion"))]
pub type Cuda<F = f32, I = i32> = CubeBackend<CudaRuntime, F, I, u8>;

#[cfg(feature = "fusion")]
pub type Cuda<F = f32, I = i32> = burn_fusion::Fusion<CubeBackend<CudaRuntime, F, I, u8>>;
```
Of note, we plan to implement automatic gradient checkpointing based on compute-bound and memory-bound operations, which will work gracefully with the fusion backend to make your code run even faster during training; see this issue.
See the Fusion Backend README for more details.
</details> <details> <summary> Router (Beta): Backend decorator that composes multiple backends into a single one </summary> <br />This backend simplifies hardware interoperability: for instance, you can execute some operations on the CPU and others on the GPU.
```rust
use burn::backend::{
    NdArray, Router, Wgpu, ndarray::NdArrayDevice, router::duo::MultiDevice, wgpu::WgpuDevice,
};
use burn::tensor::{Distribution, Tensor};

fn main() {
    type Backend = Router<(Wgpu, NdArray)>;

    let device_0 = MultiDevice::B1(WgpuDevice::DiscreteGpu(0));
    let device_1 = MultiDevice::B2(NdArrayDevice::Cpu);

    let tensor_gpu =
        Tensor::<Backend, 2>::random([3, 3], Distribution::Default, &device_0);
    let tensor_cpu =
        Tensor::<Backend, 2>::random([3, 3], Distribution::Default, &device_1);
}
```
</details>
<details>
<summary>
Remote (Beta): Backend decorator for remote backend execution, useful for distributed computations
</summary>
<br />
This backend has two parts: a client and a server. The client sends tensor operations over the network to a remote compute backend. You can use any first-party backend as the server in a single line of code:
```rust
fn main_server() {
    // Start a server on port 3000.
    burn::server::start::<burn::backend::Cuda>(Default::default(), 3000);
}

fn main_client() {
    // Create a client that communicates with the server on port 3000.
    use burn::backend::{Autodiff, RemoteBackend};

    type Backend = Autodiff<RemoteBackend>;

    let device = RemoteDevice::new("ws://localhost:3000");
    let tensor_gpu =
        Tensor::<Backend, 2>::random([3, 3], Distribution::Default, &device);
}
```
</details>
<br />
Training & Inference
<div align="left"> <img align="right" src="https://raw.githubusercontent.com/tracel-ai/burn/main/assets/ember-wall.png" height="96px"/>The whole deep learning workflow is made easy with Burn, as you can monitor your training progress with an ergonomic dashboard, and run inference everywhere from embedded devices to large GPU clusters.
Burn was built from the ground up with training and inference in mind. It's also worth noting how Burn, in comparison to frameworks like PyTorch, simplifies the transition from training to deployment, eliminating the need for code changes.
</div> <div align="center"> <br /> <a href="https://www.youtube.com/watch?v=N9RM5CQbNQc" target="_blank"> <img src="https://raw.githubusercontent.com/tracel-ai/burn/main/assets/burn-train-tui.png" alt="Burn Train TUI" width="75%"> </a> </div> <br />Click on the following sections to expand 👇
<details> <summary> Training Dashboard 📈 </summary> <br />As you can see in the previous video (click on the picture!), a new terminal UI dashboard based on the Ratatui crate allows users to follow their training with ease without having to connect to any external application.
You can visualize your training and validation metrics updating in real time and analyze the lifelong progression or recent history of any registered metric using only the arrow keys. You can also break from the training loop without crashing, allowing potential checkpoints to be fully written and important pieces of code to complete without interruption 🛡
</details> <details> <summary> ONNX Support 🐫 </summary> <br />Burn supports importing ONNX (Open Neural Network Exchange) models through the burn-onnx crate, allowing you to easily port models from TensorFlow or PyTorch to Burn. The ONNX model is converted into Rust code that uses Burn's native APIs, enabling the imported model to run on any Burn backend (CPU, GPU, WebAssembly) and benefit from all of Burn's optimizations like automatic kernel fusion.

Note: this crate is in active development and currently supports a limited set of ONNX operators.

Our ONNX support is further described in this section of the Burn Book 🔥.

</details> <details> <summary> Importing PyTorch or Safetensors Models 🚚 </summary> <br />You can load weights from PyTorch or Safetensors formats directly into your Burn-defined models. This makes it easy to reuse existing models while benefiting from Burn's performance and deployment features.

</details>