llm4s


<p align="center"> <img src="logo.svg" width="128" alt="llm4s logo"> </p>

Experimental Scala 3 bindings for llama.cpp using Slinc.

Setup

Add llm4s to your build.sbt:

libraryDependencies += "com.donderom" %% "llm4s" % "0.14.0-b7672"

For JDK 17, add a .jvmopts file in the project root:

--add-modules=jdk.incubator.foreign
--enable-native-access=ALL-UNNAMED

Compatibility

  • Scala: 3.3.0
  • JDK: 17 or 19
  • llama.cpp: the version suffix is the latest supported llama.cpp release (e.g. version 0.14.0-b7672 supports the b7672 release). Newer releases usually work as well, provided there are no API changes.
<details> <summary>Older versions</summary>

| llm4s | Scala | JDK | llama.cpp (commit hash) |
|------:|----------:|-------:|------------------------:|
| 0.11+ | 3.3.0 | 17, 19 | 229ffff (May 8, 2024) |
| 0.10+ | 3.3.0 | 17, 19 | 49e7cb5 (Jul 31, 2023) |
| 0.6+ | 3.3.0-RC3 | --- | 49e7cb5 (Jul 31, 2023) |
| 0.4+ | 3.3.0-RC3 | --- | 70d26ac (Jul 23, 2023) |
| 0.3+ | 3.3.0-RC3 | --- | a6803ca (Jul 14, 2023) |
| 0.1+ | 3.3.0-RC3 | 17, 19 | 447ccbe (Jun 25, 2023) |

</details>
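The version-suffix convention above can also be read programmatically. A minimal sketch (the split on the first `-` is an illustration of the naming scheme, not an llm4s API):

```scala
// Illustrative only: split an llm4s version string into the library
// version and the llama.cpp build tag it supports.
val version = "0.14.0-b7672"
val Array(libVersion, llamaTag) = version.split("-", 2)
// libVersion is the llm4s release, llamaTag the latest supported
// llama.cpp release.
```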

Usage

import java.nio.file.Paths
import com.donderom.llm4s.*

// Path to the llama.cpp shared library
System.load("./build/bin/libllama.dylib")

// Path to the model supported by llama.cpp
val model = Paths.get("Llama-3.2-3B-Instruct-Q6_K.gguf")
val prompt = "What is LLM?"

Completion

val llm = Llm(model)

// To print generation as it goes
llm(prompt).foreach: stream =>
  stream.foreach: token =>
    print(token)

// Or build a string
llm(prompt).foreach(stream => println(stream.mkString))

llm.close()
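If you want both streaming output and the final string, the lazy token stream can be tapped as it is consumed. A sketch with a stand-in stream (with llm4s the LazyList[String] would come from `llm(prompt)`):

```scala
// Stand-in for the token stream produced by llm(prompt).
val stream = LazyList("What", " is", " LLM", "?")

val full = stream
  .tapEach(print) // print each token as it arrives
  .mkString       // and still collect the complete response
// full == "What is LLM?"
```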

Embeddings

val llm = Llm(model)
llm.embeddings(prompt).foreach: embeddings =>
  embeddings.foreach: embd =>
    print(embd)
    print(' ')
llm.close()
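Embedding vectors are typically compared with cosine similarity. A minimal sketch, assuming the embeddings come back as plain numeric arrays (the helper below is hypothetical, not part of llm4s):

```scala
// Hypothetical helper, not an llm4s API: cosine similarity of two vectors.
def cosine(a: Array[Float], b: Array[Float]): Double =
  require(a.length == b.length, "embedding dimensions must match")
  val dot   = a.zip(b).map((x, y) => x.toDouble * y.toDouble).sum
  val normA = math.sqrt(a.map(x => x.toDouble * x.toDouble).sum)
  val normB = math.sqrt(b.map(x => x.toDouble * x.toDouble).sum)
  dot / (normA * normB)

// Identical vectors score 1.0; orthogonal vectors score 0.0.
```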

Self-contained Scala CLI example (with a basic Llama 3 model):

Run.scala:

//> using scala 3.3.0
//> using jvm adoptium:17
//> using java-opt --add-modules=jdk.incubator.foreign
//> using java-opt --enable-native-access=ALL-UNNAMED
//> using dep com.donderom::llm4s:0.14.0-b7672

import com.donderom.llm4s.Llm
import java.nio.file.Paths
import scala.util.Using

object Main extends App:
  System.load("./build/bin/libllama.dylib")
  val model = Paths.get("Llama-3.2-3B-Instruct-Q6_K.gguf")
  val prompt = "What is LLM?"
  Using(Llm(model)): llm =>         // llm : com.donderom.llm4s.Llm
    llm(prompt).foreach: stream =>  // stream : LazyList[String]
      stream.foreach: token =>      // token : String
        print(token)
Run it with:

scala-cli Run.scala

Self-contained Scala CLI example (with a configured gpt-oss model):

Run.scala:

//> using scala 3.3.0
//> using jvm adoptium:17
//> using java-opt --add-modules=jdk.incubator.foreign
//> using java-opt --enable-native-access=ALL-UNNAMED
//> using dep com.donderom::llm4s:0.14.0-b7672

import com.donderom.llm4s.{ContextParams, FlashAttention, Llm, LlmParams}
import java.nio.file.Paths
import scala.util.Using

object Main extends App:
  System.load("./build/bin/libllama.dylib")
  val model = Paths.get("gpt-oss-20b-mxfp4.gguf")
  val prompt = "What is LLM?"
  // Use Flash attention and context size provided by the model
  val params = LlmParams(context = ContextParams(flashAttention = FlashAttention.On))
  Using(Llm(model)): llm =>                // llm : com.donderom.llm4s.Llm
    llm(prompt, params).foreach: stream =>  // stream : LazyList[String]
      stream.foreach: token =>             // token : String
        print(token)
Run it with:

scala-cli Run.scala
