Candle

Minimalist ML framework for Rust



Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use. Try our online demos: whisper, LLaMA2, T5, yolo, Segment Anything.

Get started

Make sure that you have candle-core correctly installed as described in Installation.

Let's see how to run a simple matrix multiplication. Write the following to your myapp/src/main.rs file:

use candle_core::{Device, Tensor};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let device = Device::Cpu;

    let a = Tensor::randn(0f32, 1., (2, 3), &device)?;
    let b = Tensor::randn(0f32, 1., (3, 4), &device)?;

    let c = a.matmul(&b)?;
    println!("{c}");
    Ok(())
}

cargo run should display a tensor of shape Tensor[[2, 4], f32].
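The shape in that output follows the usual matrix multiplication rule: a (2, 3) matrix times a (3, 4) matrix gives a (2, 4) result. As a rough illustration that does not depend on candle, the sketch below implements the same rule in plain Rust. The `matmul` and `shape` helpers are hypothetical names introduced for this example, not part of the candle API, and the naive loop is only meant to show the shape arithmetic.

```rust
/// Multiply an (m x k) matrix by a (k x n) matrix, both stored as rows.
/// Hypothetical helper for illustration only; this is not candle's `matmul`.
/// Returns `None` when the inner dimensions disagree.
fn matmul(a: &[Vec<f32>], b: &[Vec<f32>]) -> Option<Vec<Vec<f32>>> {
    let k = b.len();
    let n = b.first().map_or(0, |row| row.len());
    // Every row of `a` needs exactly `k` entries, and `b` must be rectangular.
    if a.iter().any(|row| row.len() != k) || b.iter().any(|row| row.len() != n) {
        return None;
    }
    let out: Vec<Vec<f32>> = a
        .iter()
        .map(|row| {
            (0..n)
                // Dot product of one row of `a` with column `j` of `b`.
                .map(|j| row.iter().zip(b).map(|(x, b_row)| x * b_row[j]).sum::<f32>())
                .collect::<Vec<f32>>()
        })
        .collect();
    Some(out)
}

/// Shape of a row-major matrix as (rows, columns).
fn shape(m: &[Vec<f32>]) -> (usize, usize) {
    (m.len(), m.first().map_or(0, |row| row.len()))
}
```

Calling `matmul` with a 2x3 `a` and a 3x4 `b` returns a matrix whose `shape` is `(2, 4)`, while mismatched inner dimensions return `None` rather than panicking.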

If you have installed candle with CUDA support, simply define the device to be on the GPU:

- let device = Device::Cpu;
+ let device = Device::new_cuda(0)?;

For more advanced examples, please have a look at the following section.

Check out our examples

The online demos listed above run entirely in your browser.

We also provide some command line based examples using state of the art models:

  • LLaMA v1, v2, and v3: general LLM, includes the SOLAR-10.7B variant.
  • Falcon: general LLM.
  • Codegeex4: Code completion, code interpreter, web search, function calling, repository-level
  • GLM4: Open Multilingual Multimodal Chat LMs by THUDM
  • Gemma v1 and v2: 2b and 7b+/9b general LLMs from Google DeepMind.
  • RecurrentGemma: 2b and 7b Griffin-based models from Google that mix attention with an RNN-like state.
  • Phi-1, Phi-1.5, Phi-2, and Phi-3: 1.3b, 2.7b, and 3.8b general LLMs with performance on par with 7b models.
  • StableLM-3B-4E1T: a 3b general LLM pre-trained on 1T tokens of English and code datasets. Also supports StableLM-2, a 1.6b LLM trained on 2T tokens, as well as the code variants.
  • Mamba: an inference-only implementation of the Mamba state space model.
  • Mistral7b-v0.1: a 7b general LLM with better performance than all publicly available 13b models as of 2023-09-28.
  • Mixtral8x7b-v0.1: a sparse mixture-of-experts 8x7b general LLM with better performance than a Llama 2 70B model and much faster inference.
  • StarCoder and StarCoder2: LLM specialized to code generation.
  • Qwen1.5: Bilingual (English/Chinese) LLMs.
  • RWKV v5 and v6: an RNN with transformer-level LLM performance.
  • Replit-code-v1.5: a 3.3b LLM specialized for code completion.
  • Yi-6B / Yi-34B: two bilingual (English/Chinese) general LLMs with 6b and 34b parameters.
  • Quantized LLaMA: quantized version of the LLaMA model using the same quantization techniques as llama.cpp.
  • Quantized Qwen3 MoE: supports GGUF-quantized Qwen3 MoE models.
<img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/quantized/assets/aoc.gif" width="600">
  • Stable Diffusion: text to image generative model, support for the 1.5, 2.1, SDXL 1.0 and Turbo versions.
<img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/stable-diffusion/assets/stable-diffusion-xl.jpg" width="200">
  • Wuerstchen: another text to image generative model.
<img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/wuerstchen/assets/cat.jpg" width="200">
  • yolo-v8: object detection and pose estimation models.
<img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/yolo-v8/assets/bike.od.jpg" width="200"><img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/yolo-v8/assets/bike.pose.jpg" width="200">
  • segment-anything: image segmentation model.
<img src="https://github.com/huggingface/candle/raw/main/candle-examples/examples/segment-anything/assets/sam_merged.jpg" width="200">
  • SegFormer: transformer based semantic segmentation model.
  • Whisper: speech recognition model.
  • EnCodec: high-quality audio compression model using residual vector quantization.
  • MetaVoice: foundational model for text-to-speech.
  • Parler-TTS: large text-to-speech model.
  • T5, Bert, JinaBert: useful for sentence embeddings.
  • DINOv2: computer vision model trained using self-supervision (can be used for imagenet classification, depth evaluation, segmentation).
  • VGG, RepVGG: computer vision models.
  • BLIP: image to text model, can be used to generate captions for an image.
  • CLIP: multi-modal vision and language model.
  • TrOCR: a transformer OCR model, with dedicated submodels for handwriting and printed-text recognition.
  • Marian-MT: neural machine translation model, generates the translated text from the input text.
  • Moondream: tiny computer-vision model that can answer real-world questions about images.

Run them using commands like:

cargo run --example quantized --release

To use CUDA, add --features cuda to the example command line. If you have cuDNN installed, use --features cudnn for further speedups.

There are also some WASM examples for whisper and llama2.c. You can either build them with trunk or try them online: whisper, llama2, T5, Phi-1.5 and Phi-2, and Segment Anything Model.

For LLaMA2, run the following command to retrieve the weight files and start a test server:

cd candle-wasm-examples/llama2-c
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/model.bin
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/tokenizer.json
trunk serve --release --port 8081

And then head over to http://localhost:8081/.

<!--- ANCHOR: useful_libraries --->

Useful External Resources

  • candle-tutorial: A very detailed tutorial showing how to convert a PyTorch model to Candle.
  • candle-lora: Efficient and ergonomic LoRA implementation for Candle. candle-lora has
    out-of-the-box LoRA support for many models from Candle, which can be found here.
  • candle-video: Rust library for text-to-video generation (LTX-Video and related models) built on Candle, focused on fast, Python-free inferen