NestOS
NestOS is a full-stack GUI automation agent. It's not just "chat-driven". Think of it as a digital teammate with a real screen.
It runs on a server with a full Linux desktop (GUI), so it can click, type, and handle browser tasks like a human, while still doing system-level ops work in the background.
Read this in other languages: English | 简体中文
How It Works
- You send a task in chat.
- The engine executes it inside a remote desktop environment.
- If it hits blockers (captcha, 2FA, etc.), it pauses and asks for human help.
- After you step in, it resumes automatically and reports results in real time.
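The four steps above can be read as a small state machine. The sketch below is illustrative only: the state names, event names, and the `advance` helper are hypothetical and are not taken from the NestOS or OpenClaw code.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"              # task received from chat
    RUNNING = "running"            # executing in the remote desktop
    WAITING_FOR_HUMAN = "waiting"  # blocked on captcha, 2FA, etc.
    DONE = "done"                  # result reported back to chat


# Allowed transitions (hypothetical, mirroring the four steps above).
TRANSITIONS = {
    (TaskState.QUEUED, "start"): TaskState.RUNNING,
    (TaskState.RUNNING, "blocked"): TaskState.WAITING_FOR_HUMAN,
    (TaskState.WAITING_FOR_HUMAN, "human_done"): TaskState.RUNNING,
    (TaskState.RUNNING, "finished"): TaskState.DONE,
}


def advance(state: TaskState, event: str) -> TaskState:
    """Return the next state, or raise ValueError if the event is not valid here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state.name}") from None


def run_events(events) -> TaskState:
    """Replay a sequence of events starting from QUEUED."""
    state = TaskState.QUEUED
    for event in events:
        state = advance(state, event)
    return state
```

Keeping the transitions in one table makes the "pause, then resume" path explicit: the only way out of `WAITING_FOR_HUMAN` is the `human_done` event.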
Core Capabilities (Already Working)
1. Human-in-the-loop verification (Captcha Wall)
Most automation breaks on sliders, Cloudflare checks, or 2FA. NestOS handles this flow:
- OpenClaw tries auto-login first.
- If verification is too hard, it pauses and sends a screenshot.
- You jump in via XRDP and solve it once.
- OpenClaw detects success and continues the task.
With human takeover available when needed, login and interaction success rates go way up.
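As a rough sketch of that flow, the function below tries an automatic login, alerts a human once if that fails, and then polls until the login succeeds or a timeout expires. The three callbacks, the timings, and the return values are all assumptions made for illustration; they do not describe OpenClaw's actual interface.

```python
import time
from typing import Callable


def login_with_human_fallback(
    try_auto_login: Callable[[], bool],
    is_logged_in: Callable[[], bool],
    notify_human: Callable[[], None],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "auto", "human", or "timeout" depending on how the login ended."""
    if try_auto_login():
        return "auto"

    # Automatic login failed: alert the human once (e.g. with a screenshot).
    notify_human()

    # Poll until the human has solved the verification or we give up.
    waited = 0.0
    while waited < timeout_seconds:
        if is_logged_in():
            return "human"
        sleep(poll_seconds)
        waited += poll_seconds
    return "timeout"
```

Passing `sleep` in as a parameter keeps the loop easy to test without real waiting.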
2. End-to-end project deployment from the web
NestOS is not limited to editing local files. It can build and ship projects directly:
- Browse GitHub and read project docs.
- Pull code with git clone.
- Detect stack and install deps (npm install, pip install, docker compose up, etc.).
- Configure ports and reverse proxy, then return a live URL.
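One simple way to "detect stack" is to look for well-known marker files in the cloned repo. The sketch below assumes that approach; the marker-to-command mapping and the function names are hypothetical, and NestOS may decide differently.

```python
import os

# Marker file -> command to run. The mapping is an assumption for
# illustration; the commands are the ones named in the list above
# (with -d added so docker compose runs detached).
MARKERS = [
    ("package.json", ["npm", "install"]),
    ("requirements.txt", ["pip", "install", "-r", "requirements.txt"]),
    ("docker-compose.yml", ["docker", "compose", "up", "-d"]),
]


def detect_commands(filenames):
    """Return the commands whose marker file appears in the given file names."""
    present = set(filenames)
    return [cmd for marker, cmd in MARKERS if marker in present]


def detect_for_repo(repo_dir: str):
    """Convenience wrapper: inspect the top level of a cloned repository."""
    return detect_commands(os.listdir(repo_dir))
```

For example, `detect_for_repo("./my-app")` on a repo that contains only `package.json` would return `[["npm", "install"]]`.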
3. Visual operations instead of blind CLI guessing
You can watch the whole process through XRDP and see exactly what it's doing on desktop Chrome.
4. Rich content reading (Powerful but token-heavy)
It can read content across the web, including sites without RSS, but token usage is higher.
Tech Stack (Full-Stack Ready)
- Infrastructure: Linux (Ubuntu) + 1Panel + Docker/Compose
- Multi-language runtime: PHP / Python / Node.js / Go / Java
- Browser execution layer: OpenClaw Browser (Chrome)
- GUI layer: XRDP + XFCE4
- Control channels: Web / Discord / Telegram / Feishu / ... (supports all software that can integrate with OpenClaw, such as WhatsApp)
- AI memory layer: qmd index + semantic retrieval + local logs
Problems It Solves
- Captcha wall: combines AI execution with human assist when needed.
- Environment silos: not tied to one language or framework.
- Blind ops: XRDP gives you a clear visual execution path.
- Not just a Mac mini story anymore: Linux can deliver equally strong (or stronger) GUI automation, and in practice you do not need extra skills beyond NestOS for core workflows.
Installation
Prerequisites
- OS: Ubuntu 22.04 x64 (currently the only supported version; compatibility on other systems is unknown)
- Recommended spec: 4 vCPU / 8 GB RAM or higher, with decent bandwidth
- Recommendation: install this skill on a fresh environment when possible; if not, make a full backup first to avoid environment conflicts or crashes
- Note: current version still has some rough edges and bugs
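If you want to check a server against these prerequisites before installing, something like the sketch below may help. It is not part of NestOS: the function names are hypothetical, and the 7.5 GiB memory threshold is an assumption that leaves slack for the memory an "8 GB" VM typically reports as usable.

```python
import os

MIN_CPUS = 4
MIN_MEM_GIB = 7.5  # assumption: slack below 8 GB for kernel-reserved memory


def parse_os_release(text: str) -> dict:
    """Parse the KEY=value format used by /etc/os-release."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def preflight(os_release_text: str, cpu_count: int, mem_gib: float) -> list:
    """Return a list of warnings; an empty list means all checks passed."""
    warnings = []
    info = parse_os_release(os_release_text)
    if info.get("ID") != "ubuntu" or info.get("VERSION_ID") != "22.04":
        warnings.append("OS is not Ubuntu 22.04 (the only version listed as supported)")
    if cpu_count < MIN_CPUS:
        warnings.append(f"only {cpu_count} CPU(s); {MIN_CPUS} or more recommended")
    if mem_gib < MIN_MEM_GIB:
        warnings.append(f"only {mem_gib:.1f} GiB RAM; about 8 GB or more recommended")
    return warnings


def local_facts():
    """Collect the inputs on a Linux host (these os.sysconf names are Unix-only)."""
    with open("/etc/os-release", encoding="utf-8") as fh:
        text = fh.read()
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return text, os.cpu_count() or 0, mem_bytes / 2**30
```

On the target server, `preflight(*local_facts())` returns the warnings for that machine.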
Step 1: Install OpenClaw
curl -fsSL https://openclaw.ai/install.sh | bash
Step 2: Install the NestOS skill via chat
Once your OpenClaw API is configured, send the message below in chat. You can set your local timezone before installing; bootstrap_NestOS.sh defaults to Asia/Shanghai.
Go to github.com/bigchx/nestos, download nestos-bootstrap.skill, and install the skill by following the prompts. Before running installation, confirm timezone: if I do not specify one, keep default Asia/Shanghai; if I provide one (for example, America/Los_Angeles), install with that timezone.
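The timezone rule in that message (use the provided value, otherwise keep the default) can be written down as a small helper. This is only a sketch of the rule, not code from bootstrap_NestOS.sh; the function name is hypothetical, and the check only looks at the shape of the name, so it does not prove the zone exists and it rejects bare names such as UTC.

```python
import re

DEFAULT_TZ = "Asia/Shanghai"  # default stated in the text above

# Shape check only: Area/Location, e.g. America/Los_Angeles.
_TZ_SHAPE = re.compile(r"^[A-Za-z_]+(?:/[A-Za-z0-9_+\-]+)+$")


def resolve_timezone(requested=None) -> str:
    """Return the requested timezone, or the default if none was given."""
    if requested is None or not requested.strip():
        return DEFAULT_TZ
    name = requested.strip()
    if not _TZ_SHAPE.match(name):
        raise ValueError(f"not an Area/Location timezone name: {name!r}")
    return name
```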
Step 3: Track installation progress
- Progress callbacks are still incomplete (cron reporting is not fully polished).
- During install, you may need to ask for progress updates manually; this is normal.
- Reference timing: on 4C8G + 6 Mbps, install usually takes more than 6 minutes.
Step 4: Confirm installation
- The server will reboot once after install; a temporary disconnect is expected.
- Visit http://<your-ip>:9000. If the page loads normally (for example Hello, nestos!), core setup is done.
- If the page does not open, first confirm the corresponding ports in the returned config are allowed.
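The same check can be scripted. The sketch below uses only the Python standard library; `probe` and `describe` are hypothetical helper names, and because the greeting text above is given only as an example, the check just looks for an HTTP 200 response with a non-empty body.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0):
    """Return (status, body) for a GET, or (None, error message) if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code, ""
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, timeout, DNS failure, etc.
        return None, str(exc)


def describe(status, body: str) -> str:
    """Turn a probe result into a short human-readable verdict."""
    if status is None:
        return "unreachable: check that the port is allowed"
    if status == 200 and body.strip():
        return "ok: page loaded"
    return f"reachable but unexpected response (HTTP {status})"
```

Usage: `print(describe(*probe("http://<your-ip>:9000")))`, with `<your-ip>` replaced by your server address.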
Step 5: First-time initialization (important)
- In OpenClaw chat, ask it to return generated config details, then back them up and rotate sensitive defaults.
- Open/allow the corresponding ports listed in that returned config.
- Log in to XRDP once.
- Open desktop Chrome and make sure it launches correctly.
- Import the preloaded plugin from Downloads and verify it works.
- Exit remote desktop.
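To confirm that the ports from the returned config are actually reachable from your machine, a plain TCP connect test is usually enough. The helpers below are a sketch with hypothetical names; take the port numbers from your own returned config rather than from this example.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable.
        return False


def check_ports(host: str, ports: dict) -> dict:
    """Map each label in `ports` (label -> port number) to its reachability."""
    return {label: is_port_open(host, port) for label, port in ports.items()}
```

For example, `check_ports("<your-ip>", {"web": 9000})` checks the web port mentioned in Step 4; add the other ports listed in your config under labels of your choice.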
Daily Use
After first-time initialization, you can mostly just send chat instructions to update websites or run browser tasks.
Known Use Cases
- Social media ops: log in, schedule posts, monitor comments, and triage inboxes (with manual takeover when verification appears).
- Prompt-to-site updates: use natural language to edit pages, replace content, tweak components, and run pre-publish checks.
- RSS and web aggregation: collect RSS and non-RSS sources, then generate unified digests.
- GitHub direct deployment: read repo docs, clone code, install deps, start services, and return a live URL.
- Competitive and market monitoring: track keyword pages on a schedule and report content diffs.
- Routine ops automation: batch back-office logins, config checks, screenshot archiving, and alerting.
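For the monitoring use case, "report content diffs" can be as simple as comparing the text captured on two runs. The sketch below uses Python's standard difflib; the function name and labels are hypothetical, and how NestOS actually captures and stores page text is not covered here.

```python
import difflib


def content_diff(old_text: str, new_text: str, label: str = "page") -> str:
    """Return a unified diff between two snapshots, or "" if nothing changed."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{label} (previous)",
        tofile=f"{label} (current)",
        lineterm="",
    )
    return "\n".join(diff)
```

An empty string means the page is unchanged, so a scheduled job can skip the report in that case.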
TODO / Known Issues
- [ ] Better install progress visibility and active status callbacks
- [ ] Resource usage hints for heavy multi-tab sessions
- [ ] More advanced cross-platform automation scenarios
Acknowledgements
This project stands on top of great open-source work. Big thanks to the maintainers and contributors.
If NestOS helps you, please consider giving it a Star. If you hit issues, open an Issue or send a PR.
If this project genuinely helped you, feel free to visit bigchx.com and support our original 3D model designs (mostly published on MakerWorld).
