Dream Server
Own your AI. One person, one dream, one machine at a time.
A handful of companies control the vast majority of global AI traffic — and with it, your data, your costs, and your uptime. Every query you send to a centralized provider is business intelligence you don’t own, running on infrastructure you don’t control, priced on terms you can’t negotiate.
If AI is becoming critical infrastructure, it shouldn’t be rented. Self-hosting local AI should be a sovereign human right, not a career choice.
Dream Server is the exit. A fully local AI stack — LLM inference, chat, voice, agents, workflows, RAG, image generation, and privacy tools — deployed on your hardware with a single command. No cloud. No subscriptions. No one watching.

New here? Read the Friendly Guide or listen to the audio version — a complete walkthrough of what Dream Server is, how it works, and how to make it your own. No technical background needed.
Platform Support — March 2026
| Platform | Status |
|----------|--------|
| Linux (NVIDIA + AMD) | Supported — install and run today |
| Windows (NVIDIA + AMD) | Supported — install and run today |
| macOS (Apple Silicon) | Supported — install and run today |
Tested Linux distros: Ubuntu 24.04/22.04, Debian 12, Fedora 41+, Arch Linux, CachyOS, openSUSE Tumbleweed. Other distros using apt, dnf, pacman, or zypper should also work — open an issue if yours doesn't.
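Under the hood, supporting a distro mostly reduces to finding a supported package manager on `$PATH`. A minimal sketch of that check — illustrative only, not the installer's actual code:

```bash
# Return the first supported package manager found on $PATH.
detect_pkg_manager() {
  for pm in apt dnf pacman zypper; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  return 1
}

detect_pkg_manager || echo "no supported package manager found"
```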
Windows: Requires Docker Desktop with WSL2 backend. NVIDIA GPUs use Docker GPU passthrough; AMD Strix Halo runs natively with Lemonade (NPU + ROCm + Vulkan acceleration).
macOS: Requires Apple Silicon (M1+) and Docker Desktop. llama-server runs natively with Metal GPU acceleration; all other services run in Docker.
See the Support Matrix for details.
Why Dream Server?
Because running your own AI shouldn't require a CS degree and a weekend of debugging CUDA drivers. Right now, setting up local AI means stitching together a dozen projects, writing Docker configs from scratch, and praying everything talks to each other. Most people give up and go back to paying OpenAI.
We built Dream Server so you don't have to.
- One command — detects your GPU, picks the right model, generates credentials, launches everything
- Chatting in under 2 minutes — bootstrap mode gives you a working model instantly while your full model downloads in the background
- 13 services, pre-wired — chat, agents, voice, workflows, search, RAG, image generation, privacy tools. All talking to each other out of the box
- Fully moddable — every service is an extension. Drop in a folder, run `dream enable`, done
```bash
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/v2.3.2/dream-server/get-dream-server.sh | bash
```
Open http://localhost:3000 and start chatting.
No GPU? Dream Server also runs in cloud mode — same full stack, powered by OpenAI/Anthropic/Together APIs instead of local inference:
```bash
./install.sh --cloud
```
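Cloud mode needs provider API keys. A hedged sketch of the relevant `.env` entries — the exact variable names Dream Server reads are an assumption here, so confirm them against `.env.example`:

```bash
# Hypothetical .env fragment for cloud mode — verify names in .env.example.
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
TOGETHER_API_KEY=...
```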
Port conflicts? Every port is configurable via environment variables. See `.env.example` for the full list, or override at install time: `WEBUI_PORT=9090 ./install.sh`

The DREAMGATE installer handles everything — GPU detection, model selection, service orchestration.

<details>
<summary><b>Manual install (Linux)</b></summary>

```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

</details>
<details>
<summary><b>Windows (PowerShell)</b></summary>
Requires Docker Desktop with WSL2 backend enabled. Install Docker Desktop first and make sure it is running before you start.
Open PowerShell as Administrator and run:
```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer
.\install.ps1
```

The `Set-ExecutionPolicy` command allows the installer script to run in the current session. It does not change your system-wide policy.

The installer detects your GPU, picks the right model, generates credentials, starts all services, and creates a Desktop shortcut to the Dashboard. Manage with `.\dream-server\installers\windows\dream.ps1 status`.
</details>

<details>
<summary><b>macOS (Apple Silicon)</b></summary>
Requires Apple Silicon (M1+) and Docker Desktop. Install Docker Desktop first and make sure it is running before you start.
```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

The installer detects your chip, picks the right model for your unified memory, launches llama-server natively with Metal acceleration, and starts all other services in Docker. Manage with `./dream-macos.sh status`.
See the macOS Quickstart for details.
</details>

What's In The Box
Chat & Inference
- Open WebUI — full-featured chat interface with conversation history, web search, document upload, and 30+ languages
- llama-server — high-performance LLM inference with continuous batching, auto-selected for your GPU
- LiteLLM — API gateway supporting local/cloud/hybrid modes
Voice
- Whisper — speech-to-text
- Kokoro — text-to-speech
Agents & Automation
- OpenClaw — autonomous AI agent framework
- n8n — workflow automation with 400+ integrations (Slack, email, databases, APIs)
Knowledge & Search
- Qdrant — vector database for retrieval-augmented generation (RAG)
- SearXNG — self-hosted web search (no tracking)
- Perplexica — deep research engine
Creative
- ComfyUI — node-based image generation
Privacy & Ops
- Privacy Shield — PII scrubbing proxy for API calls
- Dashboard — real-time GPU metrics, service health, model management
Hardware Auto-Detection
The installer detects your GPU and picks the optimal model automatically. No manual configuration.
NVIDIA
| VRAM | Model | Example GPUs |
|------|-------|--------------|
| < 8 GB | Qwen3.5 2B (Q4_K_M) | Any GPU or CPU-only |
| 8–11 GB | Qwen3.5 9B (Q4_K_M) | RTX 4060 Ti, RTX 3060 12GB |
| 12–20 GB | Qwen3.5 9B (Q4_K_M) | RTX 3090, RTX 4080 |
| 20–40 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | RTX 4090, A6000 |
| 40+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | A100, multi-GPU |
| 90+ GB | Qwen3 Coder Next (80B MoE, Q4_K_M) | Multi-GPU A100/H100 |
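The NVIDIA tiers above amount to a simple VRAM threshold ladder. A sketch of that selection logic — the function name and structure are illustrative, not the installer's actual code:

```bash
# Map detected VRAM (in GB) to a model, mirroring the table above.
pick_model() {
  local vram_gb=$1
  if   [ "$vram_gb" -ge 90 ]; then echo "Qwen3 Coder Next (80B MoE)"
  elif [ "$vram_gb" -ge 20 ]; then echo "Qwen3 30B-A3B (MoE)"
  elif [ "$vram_gb" -ge 8 ];  then echo "Qwen3.5 9B"
  else                             echo "Qwen3.5 2B"
  fi
}

pick_model 24   # e.g. an RTX 4090 → Qwen3 30B-A3B (MoE)
```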
AMD Strix Halo (Unified Memory)
| Unified RAM | Model | Hardware |
|-------------|-------|----------|
| 64–89 GB | Qwen3 30B-A3B (30B MoE) | Ryzen AI MAX+ 395 (64GB) |
| 90+ GB | Qwen3 Coder Next (80B MoE) | Ryzen AI MAX+ 395 (96GB) |
Apple Silicon (Unified Memory, Metal)
| Unified RAM | Model | Example Hardware |
|-------------|-------|------------------|
| < 16 GB | Qwen3.5 2B (Q4_K_M) | M1/M2 base (8GB) |
| 16–24 GB | Qwen3.5 4B (Q4_K_M) | M4 Mac Mini (16GB) |
| 32 GB | Qwen3.5 9B (Q4_K_M) | M4 Pro Mac Mini, M3 Max MacBook Pro |
| 48 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M4 Pro (48GB), M2 Max (48GB) |
| 64+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M2 Ultra Mac Studio, M4 Max (64GB+) |
Override tier selection: `./install.sh --tier 3`
Bootstrap Mode
No waiting for large downloads. Dream Server uses bootstrap mode by default:
- Downloads a tiny 1.5B model in under a minute
- You start chatting immediately
- The full model downloads in the background
- Hot-swap to the full model when it's ready — zero downtime

The installer pulls all services in parallel. Downloads are resume-capable — interrupted downloads pick up where they left off.
Skip bootstrap: `./install.sh --no-bootstrap`
Switching Models
The installer picks a model for your hardware, but you can switch anytime:
```bash
dream model current   # What's running now?
dream model list      # Show all available tiers
dream model swap T3   # Switch to a different tier
```
If the new model isn't downloaded yet, pre-fetch it first:
```bash
./scripts/pre-download.sh --tier 3   # Download before switching
dream model swap T3                  # Then swap (restarts llama-server)
```
Already have a GGUF you want to use? Drop it in `data/models/`, update `GGUF_FILE` and `LLM_MODEL` in `.env`, and restart:

```bash
docker compose restart llama-server
```
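The `.env` edit above can be scripted. A minimal sketch using `sed` on a throwaway copy — in practice you would edit `.env` itself, then run the restart command:

```bash
# Demo on a scratch file; point GGUF_FILE/LLM_MODEL at your custom model.
cat > /tmp/env.demo <<'EOF'
GGUF_FILE=old-model.gguf
LLM_MODEL=old-model
EOF

sed -i \
  -e 's|^GGUF_FILE=.*|GGUF_FILE=my-model.Q4_K_M.gguf|' \
  -e 's|^LLM_MODEL=.*|LLM_MODEL=my-model|' \
  /tmp/env.demo

cat /tmp/env.demo
```

After editing the real `.env` this way, `docker compose restart llama-server` loads the new model.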
Rollback is automatic — if a new model fails to load, Dream Server reverts to your previous model.
Extensibility
Dream Server is designed to be modded. Every service is an extension — a folder with a manifest.yaml and a compose.yaml. The dashboard, CLI, health checks, and compose stack all discover extensions automatically.
```
extensions/services/
  my-service/
    manifest.yaml   # Metadata: name, port, health endpoint
    compose.yaml    # Docker Compose service definition
```
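For instance, a minimal extension skeleton might look like the following — the field names below are assumptions for illustration, not a documented schema, so copy an existing extension's `manifest.yaml` as your starting point:

```yaml
# extensions/services/my-service/manifest.yaml (illustrative field names)
name: my-service
description: Example extension
port: 8099
health: /healthz
```

With the folder in place, run `dream enable` and the dashboard, CLI, health checks, and compose stack discover the new service automatically.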