Dream Server
Own your AI. One person, one dream, one machine at a time.
A handful of companies control the vast majority of global AI traffic — and with it, your data, your costs, and your uptime. Every query you send to a centralized provider is business intelligence you don’t own, running on infrastructure you don’t control, priced on terms you can’t negotiate.
If AI is becoming critical infrastructure, it shouldn’t be rented. Self-hosting local AI should be a sovereign human right, not a career choice.
Dream Server is the exit. A fully local AI stack — LLM inference, chat, voice, agents, workflows, RAG, image generation, and privacy tools — deployed on your hardware with a single command. No cloud. No subscriptions. No one watching.

New here? Read the Friendly Guide or listen to the audio version — a complete walkthrough of what Dream Server is, how it works, and how to make it your own. No technical background needed.
Platform Support — March 2026
| Platform | Status |
|----------|--------|
| Linux (NVIDIA + AMD) | Supported — install and run today |
| Windows (NVIDIA + AMD) | Supported — install and run today |
| macOS (Apple Silicon) | Supported — install and run today |
Tested Linux distros: Ubuntu 24.04/22.04, Debian 12, Fedora 41+, Arch Linux, CachyOS, openSUSE Tumbleweed. Other distros using apt, dnf, pacman, or zypper should also work — open an issue if yours doesn't.
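Under the hood, supporting a distro mostly reduces to finding a supported package manager on `$PATH`. A minimal sketch of that check — illustrative only, not the installer's actual code:

```bash
# Return the first supported package manager found on $PATH.
detect_pkg_manager() {
  for pm in apt dnf pacman zypper; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  return 1
}

detect_pkg_manager || echo "no supported package manager found"
```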
Windows: Requires Docker Desktop with WSL2 backend. NVIDIA GPUs use Docker GPU passthrough; AMD Strix Halo runs natively with Lemonade (NPU + ROCm + Vulkan acceleration).
macOS: Requires Apple Silicon (M1+) and Docker Desktop. llama-server runs natively with Metal GPU acceleration; all other services run in Docker.
See the Support Matrix for details.
Why Dream Server?
Because running your own AI shouldn't require a CS degree and a weekend of debugging CUDA drivers. Right now, setting up local AI means stitching together a dozen projects, writing Docker configs from scratch, and praying everything talks to each other. Most people give up and go back to paying OpenAI.
We built Dream Server so you don't have to.
- One command — detects your GPU, picks the right model, generates credentials, launches everything
- Chatting in under 2 minutes — bootstrap mode gives you a working model instantly while your full model downloads in the background
- 13 services, pre-wired — chat, agents, voice, workflows, search, RAG, image generation, privacy tools. All talking to each other out of the box
- Fully moddable — every service is an extension. Drop in a folder, run `dream enable`, done
```bash
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/v2.3.2/dream-server/get-dream-server.sh | bash
```
Open http://localhost:3000 and start chatting.
No GPU? Dream Server also runs in cloud mode — same full stack, powered by OpenAI/Anthropic/Together APIs instead of local inference:
```bash
./install.sh --cloud
```
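Cloud mode needs provider API keys. A hedged sketch of the relevant `.env` entries — the exact variable names Dream Server reads are an assumption here, so confirm them against `.env.example`:

```bash
# Hypothetical .env fragment for cloud mode — verify names in .env.example.
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
TOGETHER_API_KEY=...
```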
Port conflicts? Every port is configurable via environment variables. See `.env.example` for the full list, or override at install time: `WEBUI_PORT=9090 ./install.sh`

The DREAMGATE installer handles everything — GPU detection, model selection, service orchestration.

<details>
<summary><b>Manual install (Linux)</b></summary>

```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

</details>
<details>
<summary><b>Windows (PowerShell)</b></summary>
Requires Docker Desktop with WSL2 backend enabled. Install Docker Desktop first and make sure it is running before you start.
Open PowerShell as Administrator and run:
```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer
.\install.ps1
```

The `Set-ExecutionPolicy` command allows the installer script to run in the current session. It does not change your system-wide policy.

The installer detects your GPU, picks the right model, generates credentials, starts all services, and creates a Desktop shortcut to the Dashboard. Manage with `.\dream-server\installers\windows\dream.ps1 status`.
</details>

<details>
<summary><b>macOS (Apple Silicon)</b></summary>
Requires Apple Silicon (M1+) and Docker Desktop. Install Docker Desktop first and make sure it is running before you start.
```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

The installer detects your chip, picks the right model for your unified memory, launches llama-server natively with Metal acceleration, and starts all other services in Docker. Manage with `./dream-macos.sh status`.
See the macOS Quickstart for details.
</details>

What's In The Box
Chat & Inference
- Open WebUI — full-featured chat interface with conversation history, web search, document upload, and 30+ languages
- llama-server — high-performance LLM inference with continuous batching, auto-selected for your GPU
- LiteLLM — API gateway supporting local/cloud/hybrid modes
Voice
- Whisper — speech-to-text
- Kokoro — text-to-speech
Agents & Automation
- OpenClaw — autonomous AI agent framework
- n8n — workflow automation with 400+ integrations (Slack, email, databases, APIs)
Knowledge & Search
- Qdrant — vector database for retrieval-augmented generation (RAG)
- SearXNG — self-hosted web search (no tracking)
- Perplexica — deep research engine
Creative
- ComfyUI — node-based image generation
Privacy & Ops
- Privacy Shield — PII scrubbing proxy for API calls
- Dashboard — real-time GPU metrics, service health, model management
Hardware Auto-Detection
The installer detects your GPU and picks the optimal model automatically. No manual configuration.
NVIDIA
| VRAM | Model | Example GPUs |
|------|-------|--------------|
| < 8 GB | Qwen3.5 2B (Q4_K_M) | Any GPU or CPU-only |
| 8–11 GB | Qwen3.5 9B (Q4_K_M) | RTX 4060 Ti, RTX 3060 12GB |
| 12–20 GB | Qwen3.5 9B (Q4_K_M) | RTX 3090, RTX 4080 |
| 20–40 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | RTX 4090, A6000 |
| 40+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | A100, multi-GPU |
| 90+ GB | Qwen3 Coder Next (80B MoE, Q4_K_M) | Multi-GPU A100/H100 |
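The NVIDIA tiers above amount to a simple VRAM threshold ladder. A sketch of that selection logic — the function name and structure are illustrative, not the installer's actual code:

```bash
# Map detected VRAM (in GB) to a model, mirroring the table above.
pick_model() {
  local vram_gb=$1
  if   [ "$vram_gb" -ge 90 ]; then echo "Qwen3 Coder Next (80B MoE)"
  elif [ "$vram_gb" -ge 20 ]; then echo "Qwen3 30B-A3B (MoE)"
  elif [ "$vram_gb" -ge 8 ];  then echo "Qwen3.5 9B"
  else                             echo "Qwen3.5 2B"
  fi
}

pick_model 24   # e.g. an RTX 4090 → Qwen3 30B-A3B (MoE)
```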
AMD Strix Halo (Unified Memory)
| Unified RAM | Model | Hardware |
|-------------|-------|----------|
| 64–89 GB | Qwen3 30B-A3B (30B MoE) | Ryzen AI MAX+ 395 (64GB) |
| 90+ GB | Qwen3 Coder Next (80B MoE) | Ryzen AI MAX+ 395 (96GB) |
Apple Silicon (Unified Memory, Metal)
| Unified RAM | Model | Example Hardware |
|-------------|-------|------------------|
| < 16 GB | Qwen3.5 2B (Q4_K_M) | M1/M2 base (8GB) |
| 16–24 GB | Qwen3.5 4B (Q4_K_M) | M4 Mac Mini (16GB) |
| 32 GB | Qwen3.5 9B (Q4_K_M) | M4 Pro Mac Mini, M3 Max MacBook Pro |
| 48 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M4 Pro (48GB), M2 Max (48GB) |
| 64+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M2 Ultra Mac Studio, M4 Max (64GB+) |
Override tier selection: `./install.sh --tier 3`
Bootstrap Mode
No waiting for large downloads. Dream Server uses bootstrap mode by default:
- Downloads a tiny 1.5B model in under a minute
- You start chatting immediately
- The full model downloads in the background
- Hot-swap to the full model when it's ready — zero downtime

The installer pulls all services in parallel. Downloads are resume-capable — interrupted downloads pick up where they left off.
Skip bootstrap: `./install.sh --no-bootstrap`
Switching Models
The installer picks a model for your hardware, but you can switch anytime:
```bash
dream model current   # What's running now?
dream model list      # Show all available tiers
dream model swap T3   # Switch to a different tier
```
If the new model isn't downloaded yet, pre-fetch it first:
```bash
./scripts/pre-download.sh --tier 3   # Download before switching
dream model swap T3                  # Then swap (restarts llama-server)
```
Already have a GGUF you want to use? Drop it in `data/models/`, update `GGUF_FILE` and `LLM_MODEL` in `.env`, and restart:

```bash
docker compose restart llama-server
```
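The `.env` edit above can be scripted. A minimal sketch using `sed` on a throwaway copy — in practice you would edit `.env` itself, then run the restart command:

```bash
# Demo on a scratch file; point GGUF_FILE/LLM_MODEL at your custom model.
cat > /tmp/env.demo <<'EOF'
GGUF_FILE=old-model.gguf
LLM_MODEL=old-model
EOF

sed -i \
  -e 's|^GGUF_FILE=.*|GGUF_FILE=my-model.Q4_K_M.gguf|' \
  -e 's|^LLM_MODEL=.*|LLM_MODEL=my-model|' \
  /tmp/env.demo

cat /tmp/env.demo
```

After editing the real `.env` this way, `docker compose restart llama-server` loads the new model.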
Rollback is automatic — if a new model fails to load, Dream Server reverts to your previous model.
Extensibility
Dream Server is designed to be modded. Every service is an extension — a folder with a manifest.yaml and a compose.yaml. The dashboard, CLI, health checks, and compose stack all discover extensions automatically.
```
extensions/services/
  my-service/
    manifest.yaml   # Metadata: name, port, health endpoint
    compose.yaml    # Docker Compose service definition
```
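For instance, a minimal extension skeleton might look like the following — the field names below are assumptions for illustration, not a documented schema, so copy an existing extension's `manifest.yaml` as your starting point:

```yaml
# extensions/services/my-service/manifest.yaml (illustrative field names)
name: my-service
description: Example extension
port: 8099
health: /healthz
```

With the folder in place, run `dream enable` and the dashboard, CLI, health checks, and compose stack discover the new service automatically.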