# LLM

> Note: This crate name previously belonged to another project. The current implementation is a new and different library. The previous crate is now archived and will not receive any updates. Ref: https://github.com/rustformers/llm
LLM is a Rust library that lets you use multiple LLM backends in a single project: OpenAI, Anthropic (Claude), Ollama, DeepSeek, xAI, Phind, Groq, Google, Cohere, Mistral, Hugging Face and ElevenLabs. With a unified API and builder pattern (similar in spirit to the Stripe SDKs), you can easily create chat, text completion, and speech-to-text requests without juggling multiple structures and crates.
## Key Features
- Multi-backend: Manage OpenAI, Anthropic, Ollama, DeepSeek, xAI, Phind, Groq, OpenRouter, Cohere, ElevenLabs and Google through a single entry point.
- Multi-step chains: Create multi-step chains with different backends at each step.
- Templates: Use templates to create complex prompts with variables.
- Builder pattern: Configure your LLM (model, temperature, max_tokens, timeouts...) with a few simple calls.
- Chat & Completions: Two unified traits (`ChatProvider` and `CompletionProvider`) cover most use cases.
- Extensible: Easily add new backends.
- Rust-friendly: Designed with clear traits, unified error handling, and conditional compilation via features.
- Validation: Add validation to your requests to ensure the output is what you expect.
- Resilience (retry/backoff): Enable resilient calls with exponential backoff and jitter.
- Evaluation: Add evaluation to your requests to score the output of LLMs.
- Parallel Evaluation: Evaluate multiple LLM providers in parallel and select the best response based on scoring functions.
- Function calling: Add function calling to your requests to use tools in your LLMs.
- REST API: Serve any LLM backend as a REST API in the OpenAI-standard format.
- Vision: Add vision to your requests to use images in your LLMs.
- Reasoning: Add reasoning to your requests to use reasoning in your LLMs.
- Structured Output: Request structured output from certain LLM providers based on a provided JSON schema.
- Speech-to-text: Transcribe audio to text.
- Text-to-speech: Synthesize audio from text.
- Memory: Store and retrieve conversation history with a sliding-window strategy (more to come) and shared-memory support.
- Agentic: Build reactive agents that can cooperate via shared memory, with configurable triggers, roles and validation.
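The resilience feature above (exponential backoff with jitter) can be sketched in plain Rust. This is a conceptual illustration, not the crate's actual retry implementation; the function name, constants, and jitter scheme here are all hypothetical.

```rust
use std::time::Duration;

// Illustrative sketch of exponential backoff with jitter.
// Names and constants are hypothetical, not the crate's API.
fn backoff_delay(attempt: u32, seed: u64) -> Duration {
    let base_ms: u64 = 100; // first retry waits ~100 ms
    let cap_ms: u64 = 5_000; // never wait more than 5 s
    // Exponential growth: 100, 200, 400, ... capped at cap_ms.
    let exp = base_ms
        .saturating_mul(1u64 << attempt.min(16))
        .min(cap_ms);
    // Cheap deterministic jitter; a real implementation would use
    // a proper RNG (e.g. the `rand` crate).
    let jitter = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
        >> 33;
    // Jitter adds up to 50% on top of the exponential delay.
    Duration::from_millis(exp + jitter % (exp / 2 + 1))
}

fn main() {
    for attempt in 0..5 {
        println!("retry {attempt}: wait {:?}", backoff_delay(attempt, attempt as u64));
    }
}
```

Spreading retries out this way keeps a temporarily failing provider from being hammered by many clients retrying in lockstep.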
## Use any LLM backend in your project
Simply add LLM to your Cargo.toml:
```toml
[dependencies]
llm = { version = "1.2.4", features = ["openai", "anthropic", "ollama", "deepseek", "xai", "phind", "google", "groq", "mistral", "elevenlabs"] }
```
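With the dependency in place, a chat request follows the builder pattern described above. This is a minimal sketch along the lines of the repository's `openai_example`; the exact module paths and method names may differ between versions, and it assumes the `tokio` runtime and an `OPENAI_API_KEY` environment variable.

```rust
use llm::{
    builder::{LLMBackend, LLMBuilder},
    chat::ChatMessage,
};

#[tokio::main]
async fn main() {
    // Configure a backend once; swapping providers means changing
    // only the backend/model lines.
    let llm = LLMBuilder::new()
        .backend(LLMBackend::OpenAI)
        .api_key(std::env::var("OPENAI_API_KEY").unwrap_or_default())
        .model("gpt-4o")
        .max_tokens(512)
        .temperature(0.7)
        .build()
        .expect("failed to build LLM client");

    let messages = vec![ChatMessage::user().content("Hello!").build()];

    match llm.chat(&messages).await {
        Ok(response) => println!("{response}"),
        Err(e) => eprintln!("chat error: {e}"),
    }
}
```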
## Use any LLM from the CLI
LLM includes a command-line tool for easily interacting with different LLM models. Install it with `cargo install llm`.
- Use `llm` to start an interactive chat session
- Use `llm openai:gpt-4o` to start an interactive chat session with a specific provider:model
- Use `llm set OPENAI_API_KEY your_key` to configure your API key
- Use `llm default openai:gpt-4` to set a default provider
- Use `echo "Hello World" | llm` to pipe input
- Use `llm --provider openai --model gpt-4 --temperature 0.7` for advanced options
## Serving any LLM backend as a REST API
- Use the standard messages format
- Use step chains to chain multiple LLM backends together
- Expose the chain through a REST API in the OpenAI-standard format
```toml
[dependencies]
llm = { version = "1.2.4", features = ["openai", "anthropic", "ollama", "deepseek", "xai", "phind", "google", "groq", "api", "mistral", "elevenlabs"] }
```
More details in the api_example
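Because the server speaks the OpenAI-standard wire format, any OpenAI-compatible client can talk to it. A sketch with `curl`, assuming a server started from the `api_example`; the host, port, and route here are illustrative and depend on how you configure the server.

```shell
# Call the served chain with an OpenAI-format chat completions request.
# Host/port and path are assumptions; adjust to your server config.
curl -s http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```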
## More examples
| Name | Description |
|------|-------------|
| anthropic_example | Demonstrates integration with Anthropic's Claude model for chat completion |
| anthropic_streaming_example | Anthropic streaming chat example demonstrating real-time token generation |
| chain_example | Shows how to create multi-step prompt chains for exploring programming language features |
| deepseek_example | Basic DeepSeek chat completion example with deepseek-chat models |
| embedding_example | Basic embedding example with OpenAI's API |
| multi_backend_example | Illustrates chaining multiple LLM backends (OpenAI, Anthropic, DeepSeek) together in a single workflow |
| ollama_example | Example of using local LLMs through Ollama integration |
| openai_example | Basic OpenAI chat completion example with GPT models |
| resilient_example | Simple retry/backoff wrapper usage |
| openai_streaming_example | OpenAI streaming chat example demonstrating real-time token generation |
| phind_example | Basic Phind chat completion example with Phind-70B model |
| validator_example | Basic validator example with Anthropic's Claude model |
| xai_example | Basic xAI chat completion example with Grok models |
| xai_streaming_example | X.AI streaming chat example demonstrating real-time token generation |
| evaluation_example | Basic evaluation example with Anthropic, Phind and DeepSeek |
| evaluator_parallel_example | Evaluate multiple LLM providers in parallel |
| google_example | Basic Google Gemini chat completion example with Gemini models |
| google_streaming_example | Google streaming chat example demonstrating real-time token generation |
| google_pdf | Google Gemini chat with PDF attachment |
| google_image | Google Gemini chat with image attachment |
| google_embedding_example | Basic Google Gemini embedding example with Gemini models |
| tool_calling_example | Basic tool calling example with OpenAI |
| google_tool_calling_example | Google Gemini function calling example with complex JSON schema for meeting scheduling |
| json_schema_nested_example | Advanced example demonstrating deeply nested JSON schemas with arrays of objects and complex data structures |
| tool_json_schema_cycle_example | Complete tool calling cycle with JSON schema validation and structured responses |
| unified_tool_calling_example | Unified tool calling with selectable provider - demonstrates multi-turn tool use and tool choice |
| deepclaude_pipeline_example | Basic deepclaude pipeline example with DeepSeek and Claude |
| api_example | Basic API (openai standard format) example with OpenAI, Anthropic, DeepSeek and Groq |
| api_deepclaude_example | Basic API (openai standard format) example with DeepSeek and Claude |
| anthropic_vision_example | Basic anthropic vision example with Anthropic |
| openai_vision_example | Basic openai vision example with OpenAI |
| openai_reasoning_example | Basic openai reasoning example with OpenAI |
| anthropic_thinking_example | Anthropic reasoning example |
| elevenlabs_stt_example | Speech-to-text transcription example using ElevenLabs |
| elevenlabs_tts_example | Text-to-speech example using ElevenLabs |
| openai_stt_example | Speech-to-text transcription example using OpenAI |
| openai_tts_example | Text-to-speech example using OpenAI |
| tts_rodio_example | Text-to-speech with rodio example using OpenAI |
| chain_audio_text_example | Example demonstrating a multi-step chain combining speech-to-text and text processing |
| xai_search_chain_tts_example | Example demonstrating a multi-step chain combining XAI search, OpenAI summarization, and ElevenLabs text-to-speech with Rodio playback |
| xai_search_example | Example demonstrating X.AI search functionality with search modes, date ranges, and source filtering |
| memory_example | Automatic memory integration - LLM remembers conversation context across calls |
| memory_share_example | Example demonstrating shared memory between multiple LLM providers |
| trim_strategy_example | Example demonstrating memory trimming strategies with automatic summarization |
| [agent_builder_example](exa