llama.node
Another Node binding of llama.cpp, aiming to keep the API as close to llama.rn as possible.
Platform Support
- macOS
  - arm64: CPU and Metal GPU acceleration
  - x86_64: CPU only
- Windows (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA (x86_64 only)
- Linux (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA
Installation
npm install @fugood/llama.node
Usage
import { loadModel } from '@fugood/llama.node'
// Initialize a Llama context with the model (may take a while)
const context = await loadModel({
model: 'path/to/gguf/model',
n_ctx: 2048,
n_gpu_layers: 99, // > 0: enable GPU
// lib_variant: 'vulkan', // Change backend
})
// Do completion
const { text } = await context.completion(
{
prompt: 'This is a conversation between user and llama, a friendly chatbot. Respond in simple markdown.\n\nUser: Hello!\nLlama:',
n_predict: 100,
stop: ['</s>', 'Llama:', 'User:'],
// n_threads: 4,
},
(data) => {
// This is a partial completion callback
const { token } = data
},
)
console.log('Result:', text)
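
The second argument to completion is a partial-result callback, which makes token streaming straightforward. A minimal streaming sketch follows; the context.release() call at the end is an assumption, inferred from the binding's goal of matching the llama.rn API:

import { loadModel } from '@fugood/llama.node'

const context = await loadModel({ model: 'path/to/gguf/model', n_ctx: 2048 })

const { text } = await context.completion(
  {
    prompt: 'User: Write a haiku about the sea.\nLlama:',
    n_predict: 64,
    stop: ['</s>', 'User:'],
  },
  // Write each partial token to stdout as it arrives
  ({ token }) => process.stdout.write(token),
)
console.log('\nFull result:', text)

// await context.release() // assumed to exist, mirroring llama.rn's release()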
Lib Variants
- [x] default: General usage, no GPU support except macOS (Metal)
- [x] vulkan: GPU support via Vulkan (Windows/Linux), but may be unstable in some scenarios
- [x] cuda: GPU support via CUDA (Windows/Linux), limited to specific compute capabilities (Linux: x86_64 - 8.9, arm64 - 8.7; Windows: x86_64 - 12.0)
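
The backend is selected per context via the lib_variant option shown in the Usage example above. A minimal sketch, assuming the variant names listed here are passed as-is:

import { loadModel } from '@fugood/llama.node'

// Load the model on the Vulkan backend; omit lib_variant (or pass 'default')
// to use the CPU/Metal build instead.
const context = await loadModel({
  model: 'path/to/gguf/model',
  n_ctx: 2048,
  n_gpu_layers: 99, // > 0: offload layers to the GPU
  lib_variant: 'vulkan', // or 'cuda' on supported hardware
})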
License
MIT
<p align="center"> <a href="https://bricks.tools"> <img width="90px" src="https://avatars.githubusercontent.com/u/17320237?s=200&v=4"> </a> <p align="center"> Built and maintained by <a href="https://bricks.tools">BRICKS</a>. </p> </p>
