
<div align="center">

Dream Server

Own your AI. One person, one dream, one machine at a time.

A handful of companies control the vast majority of global AI traffic — and with it, your data, your costs, and your uptime. Every query you send to a centralized provider is business intelligence you don’t own, running on infrastructure you don’t control, priced on terms you can’t negotiate.

If AI is becoming critical infrastructure, it shouldn’t be rented. Self-hosting local AI should be a sovereign human right, not a career choice.

Dream Server is the exit. A fully local AI stack — LLM inference, chat, voice, agents, workflows, RAG, image generation, and privacy tools — deployed on your hardware with a single command. No cloud. No subscriptions. No one watching.

License: Apache 2.0

Dream Server Dashboard

Watch the demo

New here? Read the Friendly Guide or listen to the audio version — a complete walkthrough of what Dream Server is, how it works, and how to make it your own. No technical background needed.

</div>

Platform Support — March 2026

| Platform | Status |
|----------|--------|
| Linux (NVIDIA + AMD) | Supported — install and run today |
| Windows (NVIDIA + AMD) | Supported — install and run today |
| macOS (Apple Silicon) | Supported — install and run today |

Tested Linux distros: Ubuntu 24.04/22.04, Debian 12, Fedora 41+, Arch Linux, CachyOS, openSUSE Tumbleweed. Other distros using apt, dnf, pacman, or zypper should also work — open an issue if yours doesn't.

Windows: Requires Docker Desktop with WSL2 backend. NVIDIA GPUs use Docker GPU passthrough; AMD Strix Halo runs natively with Lemonade (NPU + ROCm + Vulkan acceleration).

macOS: Requires Apple Silicon (M1+) and Docker Desktop. llama-server runs natively with Metal GPU acceleration; all other services run in Docker.

See the Support Matrix for details.


Why Dream Server?

Because running your own AI shouldn't require a CS degree and a weekend of debugging CUDA drivers. Right now, setting up local AI means stitching together a dozen projects, writing Docker configs from scratch, and praying everything talks to each other. Most people give up and go back to paying OpenAI.

We built Dream Server so you don't have to.

  • One command — detects your GPU, picks the right model, generates credentials, launches everything
  • Chatting in under 2 minutes — bootstrap mode gives you a working model instantly while your full model downloads in the background
  • 13 services, pre-wired — chat, agents, voice, workflows, search, RAG, image generation, privacy tools. All talking to each other out of the box
  • Fully moddable — every service is an extension. Drop in a folder, run dream enable, done
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/v2.3.2/dream-server/get-dream-server.sh | bash

Open http://localhost:3000 and start chatting.

No GPU? Dream Server also runs in cloud mode — same full stack, powered by OpenAI/Anthropic/Together APIs instead of local inference:

./install.sh --cloud

Port conflicts? Every port is configurable via environment variables. See .env.example for the full list, or override at install time:

WEBUI_PORT=9090 ./install.sh
<div align="center">

Dream Server Installer

The DREAMGATE installer handles everything — GPU detection, model selection, service orchestration.

</div> <details> <summary><b>Manual install (Linux)</b></summary>
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
</details> <details> <summary><b>Windows (PowerShell)</b></summary>

Requires Docker Desktop with WSL2 backend enabled. Install Docker Desktop first and make sure it is running before you start.

Open PowerShell as Administrator and run:

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer
.\install.ps1

The Set-ExecutionPolicy command allows the installer script to run in the current session. It does not change your system-wide policy.

The installer detects your GPU, picks the right model, generates credentials, starts all services, and creates a Desktop shortcut to the Dashboard. Manage with .\dream-server\installers\windows\dream.ps1 status.

</details> <details> <summary><b>macOS (Apple Silicon)</b></summary>

Requires Apple Silicon (M1+) and Docker Desktop. Install Docker Desktop first and make sure it is running before you start.

git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh

The installer detects your chip, picks the right model for your unified memory, launches llama-server natively with Metal acceleration, and starts all other services in Docker. Manage with ./dream-macos.sh status.

See the macOS Quickstart for details.

</details>

What's In The Box

Chat & Inference

  • Open WebUI — full-featured chat interface with conversation history, web search, document upload, and 30+ languages
  • llama-server — high-performance LLM inference with continuous batching, auto-selected for your GPU
  • LiteLLM — API gateway supporting local/cloud/hybrid modes

Voice

  • Whisper — speech-to-text
  • Kokoro — text-to-speech

Agents & Automation

  • OpenClaw — autonomous AI agent framework
  • n8n — workflow automation with 400+ integrations (Slack, email, databases, APIs)

Knowledge & Search

  • Qdrant — vector database for retrieval-augmented generation (RAG)
  • SearXNG — self-hosted web search (no tracking)
  • Perplexica — deep research engine

Creative

  • ComfyUI — node-based image generation

Privacy & Ops

  • Privacy Shield — PII scrubbing proxy for API calls
  • Dashboard — real-time GPU metrics, service health, model management

Hardware Auto-Detection

The installer detects your GPU and picks the optimal model automatically. No manual configuration.

NVIDIA

| VRAM | Model | Example GPUs |
|------|-------|--------------|
| < 8 GB | Qwen3.5 2B (Q4_K_M) | Any GPU or CPU-only |
| 8–11 GB | Qwen3.5 9B (Q4_K_M) | RTX 4060 Ti, RTX 3060 12GB |
| 12–20 GB | Qwen3.5 9B (Q4_K_M) | RTX 3090, RTX 4080 |
| 20–40 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | RTX 4090, A6000 |
| 40+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | A100, multi-GPU |
| 90+ GB | Qwen3 Coder Next (80B MoE, Q4_K_M) | Multi-GPU A100/H100 |

AMD Strix Halo (Unified Memory)

| Unified RAM | Model | Hardware |
|-------------|-------|----------|
| 64–89 GB | Qwen3 30B-A3B (30B MoE) | Ryzen AI MAX+ 395 (64GB) |
| 90+ GB | Qwen3 Coder Next (80B MoE) | Ryzen AI MAX+ 395 (96GB) |

Apple Silicon (Unified Memory, Metal)

| Unified RAM | Model | Example Hardware |
|-------------|-------|------------------|
| < 16 GB | Qwen3.5 2B (Q4_K_M) | M1/M2 base (8GB) |
| 16–24 GB | Qwen3.5 4B (Q4_K_M) | M4 Mac Mini (16GB) |
| 32 GB | Qwen3.5 9B (Q4_K_M) | M4 Pro Mac Mini, M3 Max MacBook Pro |
| 48 GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M4 Pro (48GB), M2 Max (48GB) |
| 64+ GB | Qwen3 30B-A3B (MoE, Q4_K_M) | M2 Ultra Mac Studio, M4 Max (64GB+) |

Override tier selection: ./install.sh --tier 3
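
The tier tables above boil down to a VRAM-to-model mapping. Here is a simplified sketch of that logic — an illustration, not the installer's actual code; thresholds follow the NVIDIA table, and the real detection command may differ:

```shell
#!/bin/sh
# Simplified sketch of tier selection from detected VRAM (illustrative;
# the installer's real thresholds and detection logic may differ).
pick_tier() {
  vram_gb=$1
  if   [ "$vram_gb" -lt 8 ];  then echo "Qwen3.5 2B"
  elif [ "$vram_gb" -lt 20 ]; then echo "Qwen3.5 9B"
  elif [ "$vram_gb" -lt 90 ]; then echo "Qwen3 30B-A3B"
  else                             echo "Qwen3 Coder Next 80B"
  fi
}

# On NVIDIA, detected VRAM would come from something like:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
pick_tier 24    # a 24 GB RTX 4090 lands in the 30B MoE tier
```

The `--tier` flag lets you skip this mapping entirely when you know which model you want.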


Bootstrap Mode

No waiting for large downloads. Dream Server uses bootstrap mode by default:

  1. Downloads a tiny 1.5B model in under a minute
  2. You start chatting immediately
  3. The full model downloads in the background
  4. Hot-swap to the full model when it's ready — zero downtime
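
The flow above can be simulated in a few lines of shell. This is a toy sketch only — the real installer downloads GGUF files and restarts llama-server, and every file name here is illustrative:

```shell
#!/bin/sh
# Toy simulation of bootstrap mode: make a small model available
# immediately, then swap to the full model once its longer download
# completes. fetch() stands in for a real (resumable) download.
mkdir -p models
fetch() { sleep "$1"; : > "models/$2"; }

fetch 0 bootstrap.gguf & small=$!
fetch 1 full.gguf      & full=$!

wait "$small"
ln -sf bootstrap.gguf models/current.gguf  # chat is live on the tiny model
wait "$full"
ln -sf full.gguf models/current.gguf       # hot-swap to the full model
```

The key property is that `models/current.gguf` always points at a usable model, so there is no window where chat is down.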
<div align="center">

Installer downloading modules

The installer pulls all services in parallel. Downloads are resume-capable — interrupted downloads pick up where they left off.

</div>

Skip bootstrap: ./install.sh --no-bootstrap


Switching Models

The installer picks a model for your hardware, but you can switch anytime:

dream model current              # What's running now?
dream model list                 # Show all available tiers
dream model swap T3              # Switch to a different tier

If the new model isn't downloaded yet, pre-fetch it first:

./scripts/pre-download.sh --tier 3    # Download before switching
dream model swap T3                    # Then swap (restarts llama-server)

Already have a GGUF you want to use? Drop it in data/models/, update GGUF_FILE and LLM_MODEL in .env, and restart:

docker compose restart llama-server
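
If you script this step, the `.env` edit can be done with `sed`. A sketch, operating on a demo file — `GGUF_FILE` is the key named above, and the model file name is an example:

```shell
#!/bin/sh
# Sketch: rewrite GGUF_FILE in an env file (demo stand-in for .env;
# the model file name is illustrative).
printf 'GGUF_FILE=old.gguf\nLLM_MODEL=old\n' > .env.demo

set_model() {
  # -i.bak works with both GNU and BSD sed and keeps a backup copy
  sed -i.bak "s|^GGUF_FILE=.*|GGUF_FILE=$2|" "$1"
}

set_model .env.demo my-model.Q4_K_M.gguf
grep '^GGUF_FILE=' .env.demo   # prints: GGUF_FILE=my-model.Q4_K_M.gguf
# real setup: set_model .env my-model.Q4_K_M.gguf
#             docker compose restart llama-server
```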

Rollback is automatic — if a new model fails to load, Dream Server reverts to your previous model.


Extensibility

Dream Server is designed to be modded. Every service is an extension — a folder with a manifest.yaml and a compose.yaml. The dashboard, CLI, health checks, and compose stack all discover extensions automatically.

extensions/services/
  my-service/
    manifest.yaml      # Metadata: name, port, health endpoint
    compose.yaml       # Docker Compose service definition
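
Scaffolding a new extension might look like this. The manifest fields below are illustrative assumptions (the schema isn't documented here), while `compose.yaml` uses the standard Docker Compose format; the service name, image, and port are all examples:

```shell
#!/bin/sh
# Sketch: scaffold a hypothetical extension. Manifest fields are
# assumptions; compose.yaml follows the standard Docker Compose schema.
mkdir -p extensions/services/my-service

cat > extensions/services/my-service/manifest.yaml <<'EOF'
name: my-service
port: 8099
health: /healthz
EOF

cat > extensions/services/my-service/compose.yaml <<'EOF'
services:
  my-service:
    image: ghcr.io/example/my-service:latest
    ports:
      - "8099:8099"
EOF

# then register it with the CLI:
# dream enable my-service
```

Once enabled, the dashboard, health checks, and compose stack pick the new service up automatically.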
