LMAgent
LMAgent is a locally-hosted AI agent that connects to any OpenAI-compatible LLM and autonomously completes real tasks: reading and writing files, running shell commands, managing git, and more. Pure Python, no cloud dependency, runs entirely on your hardware. Available as a terminal REPL, one-shot CLI, or web UI. Works on Windows, macOS, and Linux.
<p align="center"> <img src="LMAgentLogo.png" alt="LMAgent Logo" width="200"> </p> <p align="center"> A locally-hosted AI agent that connects to any OpenAI-compatible LLM and autonomously completes real tasks.<br> Reads and writes files. Runs shell commands. Manages git. Coordinates complex multi-step work through a hierarchical sub-agent system.<br> <strong>Everything runs on your machine. No cloud. No subscriptions.</strong> </p> <p align="center"> <a href="https://www.youtube.com/shorts/-_jKwfssAvA"> <img src="https://img.youtube.com/vi/-_jKwfssAvA/hqdefault.jpg" alt="Watch the demo"> </a> </p> <p align="center"> <em>✦ UI updated — new demo coming soon ✦</em> </p><p align="center"> <img src="discordlmagent.png" alt="LMAgent on Discord" width="200"> </p>
Messaging Integrations
LMAgent doesn't have to live in a browser tab. Once `agent_web.py` is running, you can wire up Discord, Telegram, WhatsApp, or SMS and talk to your agent from your phone, your server, or wherever you already spend time. Each platform gets its own persistent session — your conversation survives restarts and picks up where you left off. Send `/new` on any platform to wipe the session and start fresh.
- Discord — run it as a bot in your own server or via DMs. Mention it or message it directly and it responds in-thread.
- Telegram — long-polling bot, works on mobile out of the box. A good option if you want to fire off tasks on the go.
- WhatsApp — connects via the Green API. Same idea — message it like a contact, get a reply when the task is done.
- SMS — Twilio webhook. If you want to send a task from a basic phone with no app, this is the option.
Only one platform is active at a time. You switch between them from the web UI's messaging panel, which also shows connection status and a live feed of recent messages. Setup is just API keys in your `.env` and `agent_messaging.py` alongside the other files.
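Concretely, platform setup is a couple of keys in the same `.env` file used for the LLM settings. The variable names below are illustrative assumptions, not the real ones — check `agent_messaging.py` for the keys it actually reads:

```
# Hypothetical key names — verify against agent_messaging.py
DISCORD_BOT_TOKEN="..."
TELEGRAM_BOT_TOKEN="..."
TWILIO_ACCOUNT_SID="..."
TWILIO_AUTH_TOKEN="..."
```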
Thank You
Started this for me. Somewhere along the way, it became something people actually gave a damn about.
66 stars is a number, but what it really represents is people choosing to spend attention on something I poured effort into. That means more than I can say.
Thank you. I don't say that as a formality.
Install dependencies:

```bash
pip install requests flask colorama psutil docker
```

Requires Python 3.10 or later.

Optional extras — only install what you need:

```bash
pip install Pillow               # image uploads in the web UI
pip install discord.py           # Discord messaging integration
pip install python-telegram-bot  # Telegram messaging integration
pip install twilio               # SMS integration
```

WhatsApp (Green API) and QR code sign-in require no extra packages. If a messaging dependency is missing, LMAgent will print the exact install command and disable that platform — it won't crash.
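The "disable instead of crash" behaviour described above is the standard optional-import pattern. A simplified sketch of the idea (illustrative only, not LMAgent's actual code):

```python
import importlib

# Map each messaging platform to the module it needs and the fix to suggest.
OPTIONAL_DEPS = {
    "discord": ("discord", "pip install discord.py"),
    "telegram": ("telegram", "pip install python-telegram-bot"),
    "sms": ("twilio", "pip install twilio"),
}

def available_platforms():
    """Return platforms whose dependency imports cleanly; print the exact
    install command for the rest instead of raising ImportError."""
    ready = []
    for platform, (module, install_cmd) in OPTIONAL_DEPS.items():
        try:
            importlib.import_module(module)
            ready.append(platform)
        except ImportError:
            # Disable the platform and tell the user how to enable it.
            print(f"{platform} disabled — run: {install_cmd}")
    return ready
```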
What Is This?
You give LMAgent a task in plain English. It figures out the steps, uses real tools to execute them, checks its own work, and tells you when it's done.
Good at:
- Generating or refactoring code across multiple files
- Processing batches of files — renaming, converting, summarising
- Building web projects with HTML, CSS, and JavaScript
- Answering questions about your own codebase using grep and read
- Running shell commands and reacting to their output
- Any multi-step task that would take you several tool switches to do manually
Not magic: It will get stuck occasionally — especially on models smaller than 7B. The loop detector catches most infinite loops automatically, but it isn't perfect. Keep git handy as a rollback.
Quick Start
1. Install dependencies
```bash
pip install requests flask colorama psutil docker
```
Requires Python 3.10 or later.
2. Start Docker Desktop
Install Docker Desktop. The sandbox container is created automatically on first use. On macOS/Linux, Docker is optional — a process-group fallback is used, but it does not isolate the filesystem.
3. Set up your LLM
LMAgent works with any OpenAI-compatible API. The easiest option is LM Studio:
- Download and install LM Studio
- Load a model (7B+ instruct or coder models work best)
- Start the local server — it runs at `http://localhost:1234` by default
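Any OpenAI-compatible endpoint accepts the same chat-completions payload, which is all "compatible" means here. A minimal sketch of the kind of request involved (illustrative only — in LMAgent the URL, key, and model come from your `.env`):

```python
import requests

LLM_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default

def build_payload(task, model=""):
    """Assemble a chat-completions request body. An empty model string
    lets LM Studio answer with whatever model is currently loaded."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": task}],
        "stream": False,
    }

if __name__ == "__main__":
    resp = requests.post(
        LLM_URL,
        headers={"Authorization": "Bearer lm-studio"},
        json=build_payload("List the files in my workspace."),
        timeout=120,
    )
    print(resp.json()["choices"][0]["message"]["content"])
```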
4. Create a workspace
```bash
mkdir ~/lmagent_workspace
```
This is the only directory the agent can read and write. It cannot touch anything outside it.
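That guarantee boils down to a resolve-then-check on every path the agent touches. A minimal sketch of the idea (not LMAgent's actual implementation):

```python
from pathlib import Path

WORKSPACE = Path("/home/you/lmagent_workspace").resolve()

def safe_path(user_path: str) -> Path:
    """Resolve a path and refuse anything that escapes the workspace.
    Catches ../ traversal; absolute inputs replace the base when joined,
    so they are rejected too unless they point inside the workspace."""
    candidate = (WORKSPACE / user_path).resolve()
    if not candidate.is_relative_to(WORKSPACE):
        raise PermissionError(f"{user_path!r} escapes the workspace")
    return candidate
```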
5. Create a .env file
```
WORKSPACE="/home/you/lmagent_workspace"
LLM_URL="http://localhost:1234/v1/chat/completions"
LLM_API_KEY="lm-studio"
LLM_MODEL=""
PERMISSION_MODE="normal"
```
6. Run it
Terminal REPL:

```bash
python agent_main.py
```

Web UI:

```bash
python agent_web.py
```
Open http://localhost:7860 — a PIN is printed to the console on startup.
How It Works — The 9 Files
LMAgent is nine files. Each one has a distinct job and they stack cleanly on top of each other.
agent_core.py — The Engine Room
Pure infrastructure. No LLM prompting logic lives here — just the plumbing everything else depends on.
- Config — all settings from env vars / `.env` (model URL, token limits, workspace path, feature flags)
- Safety — validates file paths stay inside the workspace, blocks dangerous shell commands, prevents path traversal
- ShellSession — a persistent bash or PowerShell process for running commands
- Session/State management — JSON-based persistence for conversations, todos, plans, and agent state across runs
- Message compaction — when the conversation exceeds the token budget, old messages are summarised so the agent doesn't run out of context mid-task
- Loop detection — notices when the agent is spinning (repeated tools, no progress, empty replies) and raises a warning before it wastes your time
- MCP integration — manages connections to external tool servers via the Model Context Protocol (JSON-RPC)
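The compaction step can be pictured as: keep the newest messages verbatim, collapse everything older into one summary message. A simplified sketch — the token estimate and the summary here are crude stand-ins for whatever `agent_core.py` actually does:

```python
def estimate_tokens(message):
    # Crude stand-in: roughly 4 characters per token.
    return len(message["content"]) // 4

def compact(messages, budget, keep_recent=6):
    """If the conversation exceeds `budget` tokens, replace everything but
    the last `keep_recent` messages with a single summary message."""
    if sum(estimate_tokens(m) for m in messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = "Summary of earlier conversation: " + " | ".join(
        m["content"][:40] for m in old  # placeholder for an LLM-written summary
    )
    return [{"role": "system", "content": summary}] + recent
```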
agent_tools.py — The Toolbox
Every concrete action the agent can take lives here. Each tool is a Python function that validates its inputs, executes safely, and returns structured JSON. The LLM never runs commands directly — it calls tools, and the tools run commands.
- File tools — `read`, `write`, `edit`, `glob`, `grep`, `ls`, `mkdir` — all sandboxed to the workspace with path safety checks
- Shell tool — runs commands inside the Docker sandbox with configurable timeout and memory limits
- Git tools — `git_status`, `git_diff`, `git_add`, `git_commit`, `git_branch` with ref-name validation so the agent can't inject arbitrary git refs
- Todo & Plan tools — bookkeeping helpers; `todo_complete` automatically tells the agent to stop when all work is done
- Vision tool — sends an image to a loaded VLM (LLaVA, Qwen-VL, Pixtral, etc.) with a prompt; auto-detects whether a vision-capable model is loaded and hides itself if not
- Delegation tools — `delegate` (one sub-agent, one deliverable), `decompose` (up to 8 sequential sub-tasks with dependency ordering), and a backward-compat `task` wrapper — all route through BCA
- `TOOL_SCHEMAS` + `TOOL_HANDLERS` — the master registry mapping every tool name to its JSON schema (sent to the LLM) and its Python handler function
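The registry pattern at the bottom of the file is simple: one dict of JSON schemas for the LLM, one dict of handlers for dispatch, keyed by the same tool names. A minimal sketch with a single stub tool (illustrative, not LMAgent's actual registry):

```python
import json

def ls_tool(path="."):
    # Real handlers validate inputs and sandbox paths; this stub just echoes.
    return {"ok": True, "path": path}

TOOL_SCHEMAS = {
    "ls": {
        "name": "ls",
        "description": "List files in a workspace directory",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
        },
    },
}

TOOL_HANDLERS = {"ls": ls_tool}

def execute_tool(name, args_json):
    """Dispatch one tool call from the LLM; always return structured JSON."""
    if name not in TOOL_HANDLERS:
        return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
    return json.dumps(TOOL_HANDLERS[name](**json.loads(args_json)))
```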
agent_llm.py — The LLM Interface
Handles all communication with the model and interprets what comes back.
- `SYSTEM_PROMPT` — the agent's personality and rules: the prime directive ("stop the instant the job is done"), tool usage guidelines, delegation examples, and `BLOCKED`/`WAIT` formats. Carefully reworded so models default to stopping rather than doing more
- `LLMClient` — sends messages to the LLM endpoint with retry logic, streaming support, automatic recovery if the server drops, and JSON auto-repair for truncated tool-call arguments
- `detect_completion()` — decides whether the agent is done by scanning for `TASK_COMPLETE`, short "done"-style replies, or question-asking patterns. Bug-fixed so short completions like "Done." or "All files written." no longer fall through and cause unnecessary extra loops
- `_process_tool_calls()` — the post-response dispatcher: checks permissions, calls `_execute_tool()` for each tool, handles MCP responses, detects todo-loop situations, and injects hard-stop messages when the agent is spinning
- `run_plan_mode()` — a separate lightweight loop just for generating a JSON execution plan, no file writes
- `run_sub_agent()` — a backward-compat isolated agent runner with a restricted tool allowlist; new code uses BCA instead
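The completion check amounts to a few pattern tests over the model's last reply. A simplified sketch covering the explicit marker and the short "done"-style case (question detection and the other patterns omitted; not LMAgent's actual function):

```python
DONE_PHRASES = ("done", "all files written", "task complete", "finished")

def detect_completion(reply: str) -> bool:
    """Heuristic: an explicit TASK_COMPLETE marker, or a short reply
    that amounts to 'done', both stop the agent loop."""
    text = reply.strip()
    if "TASK_COMPLETE" in text:
        return True
    # Short replies like "Done." or "All files written." should also stop.
    short = text.lower().rstrip(".!").strip()
    return len(text) < 40 and any(short.startswith(p) for p in DONE_PHRASES)
```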
agent_bca.py — The Sub-Agent Architecture
Implements the Brief-Contract Architecture — a system for spawning focused child agents without drowning them in context.
The problem it solves: naive sub-agent systems pass the parent's full conversation history to every child. On a model with an 8k context window, a sub-agent spawned at iteration 80 gets ~40k tokens of noise and immediately fails.
The fix — four principles:
- Structured Briefs — each child gets a minimal `brief.json` (objective + deliverable spec + extracted relevant data). The child reads only this, never the parent's conversation history
- Result Contracts — every child writes a structured `result.json` when done. The parent reads JSON — no fragile string parsing
- Depth-Scoped Recursion — every agent carries a depth integer. At max depth, `delegate` and `decompose` are removed from the tool list entirely — the model literally cannot attempt further recursion
- Scope Isolation — each agent gets a private scratch directory for temp work; deliverables always go to workspace-root-relative paths
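The brief/contract handshake is just two small JSON files per child, plus a depth check when building the child's tool list. A sketch of the shapes involved — field names here are illustrative, not LMAgent's exact schema:

```python
import json
from pathlib import Path

def write_brief(scratch: Path, objective: str, deliverable: str,
                data: dict, depth: int) -> dict:
    """Parent side: the child reads only this file, never the parent's history."""
    brief = {"objective": objective, "deliverable": deliverable,
             "data": data, "depth": depth}
    (scratch / "brief.json").write_text(json.dumps(brief))
    return brief

def read_result(scratch: Path) -> dict:
    """Parent side: structured JSON back — no fragile string parsing."""
    return json.loads((scratch / "result.json").read_text())

def tools_for_depth(depth: int, max_depth: int, tools: list) -> list:
    """At max depth, the delegation tools are removed entirely, so the
    model cannot even attempt further recursion."""
    if depth >= max_depth:
        return [t for t in tools if t not in ("delegate", "decompose")]
    return tools
```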
The three delegation tools:
- `delegate` — spawn one focused sub-agent for one atomic objective with one clear deliverable
- `decompose` — split into up to 8 sequential sub-tasks with dependency ordering; each task's artifacts are automatically injected into dependent tasks' briefs
- `task` — the backward-compat wrapper kept so older code paths still work; it routes through the two tools above
