MobileAgent
Mobile-Agent: The Powerful GUI Agent Family
👏 You are welcome to try Mobile-Agent-v3.5 via our <img src="./assets/tongyi.png" width="14px" style="display:inline;"> ModelScope online demo or <img src="./assets/aliyun.png" width="14px" style="display:inline;"> Bailian online demo!
❗️We provide the Mobile-Agent-v3.5 API on <img src="./assets/aliyun.png" width="14px" style="display:inline;"> Bailian for a quick trial. View the documentation.
<p align="center"> 🤗 <a href="https://huggingface.co/collections/mPLUG/gui-owl-15" target="_blank">GUI-Owl-1.5 Collection</a> | <img src="./assets/tongyi.png" width="14px" style="display:inline;"> <a href="https://modelscope.cn/collections/iic/GUI-Owl-15" target="_blank">GUI-Owl-1.5 Collection</a> </p>
<p align="center"> 🤗 <a href="https://huggingface.co/mPLUG/GUI-Owl-32B" target="_blank">GUI-Owl-32B</a> | <img src="./assets/tongyi.png" width="14px" style="display:inline;"> <a href="https://modelscope.cn/models/iic/GUI-Owl-32B" target="_blank">GUI-Owl-32B</a> | 🤗 <a href="https://huggingface.co/mPLUG/GUI-Owl-7B" target="_blank">GUI-Owl-7B</a> | <img src="./assets/tongyi.png" width="14px" style="display:inline;"> <a href="https://modelscope.cn/models/iic/GUI-Owl-7B" target="_blank">GUI-Owl-7B</a> </p>
</div>
<div align="center"> <a href="README.md">English</a> | <a href="README_zh.md">简体中文</a> <hr> </div>
📢News
- [2026.3.31] 🔥🔥 Mobile-Agent-v3.5 is now available on Alibaba Cloud Wuying Cloud Phone (无影云手机), a cloud-based Android environment for a seamless Mobile Use experience. Learn more: Alibaba Cloud Wuying Cloud Phone | Documentation.
- [2026.3.19] 🔥🔥 The GUI-Owl-1.5 series models are now available for online inference. Please refer to <img src="./assets/aliyun.png" width="14px" style="display:inline;"> Alibaba Cloud Bailian and <img src="./assets/tongyi.png" width="14px" style="display:inline;"> ModelScope API-Inference.
- [2026.2.14] 🔥 GUI-Owl-1.5 is released: a new family of native multi-platform GUI agent foundation models (2B/4B/8B/32B/235B; Instruct & Thinking). This next-generation native GUI agent model family is built on Qwen3-VL, supports desktop/mobile/browser automation, and achieves SOTA results on 20+ GUI benchmarks, with strong performance on end-to-end tasks, grounding, tool/MCP calling, and long-horizon memory. Model weights are available on HuggingFace. The technical report is available at Link. See the GUI-Owl 1.5 README for details.
- [2025.11.25] The GUI-Owl series models are now available for online inference, thanks to Alibaba Cloud Bailian for providing computing power support. Please refer to the Link.
- [2025.10.30] We released OSWorld-MCP, a benchmark for evaluating Model Context Protocol (MCP) tool invocation capabilities in real-world scenarios. See the Link.
- [2025.9.24] We've released a demo on ModelScope based on Wuying Cloud Desktop and Phone. There is no need to deploy models locally or prepare devices; just input your instruction to experience Mobile-Agent-v3! <img src="./assets/tongyi.png" width="14px" style="display:inline;"> ModelScope Demo Link and <img src="./assets/aliyun.png" width="14px" style="display:inline;"> Bailian Demo Link. For a limited-time free Mobile-Agent-v3 API, please check the documentation. The new version based on Qwen3-VL is coming soon.
- [2025.9.19] GUI-Critic-R1 has been accepted by the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025).
- [2025.9.16] We have released our latest work, UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning. The paper, code, dataset, and model are now open-sourced.
- [2025.9.16] We've open-sourced the code of GUI-Owl and Mobile-Agent-v3 on OSWorld, AndroidWorld, and real-world mobile scenarios. See the OSWorld Code. The OSWorld RL-tuned checkpoint of GUI-Owl is also released. See the AndroidWorld Code and Real-world Scenarios Code.
- [2025.8.20] The all-new GUI-Owl and Mobile-Agent-v3 are released! The technical report can be found here, and model checkpoints will be released on GUI-Owl-7B and GUI-Owl-32B.
  - GUI-Owl is a multi-modal cross-platform GUI VLM with GUI perception, grounding, and end-to-end operation capabilities.
  - Mobile-Agent-v3 is a cross-platform multi-agent framework based on GUI-Owl. It provides capabilities such as planning, progress management, reflection, and memory.
- [2025.8.14] Mobile-Agent-v3 won the best demo award at the 24th China National Conference on Computational Linguistics (CCL 2025).
- [2025.3.17] PC-Agent has been accepted by the ICLR 2025 Workshop.
- [2024.9.26] Mobile-Agent-v2 has been accepted by the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
- [2024.7.29] Mobile-Agent won the best demo award at the 23rd China National Conference on Computational Linguistics (CCL 2024).
- [2024.3.10] Mobile-Agent has been accepted by the ICLR 2024 Workshop.
📊Results
<div align="center"> <p align="center"> <img src="assets/result.png"/> </p> </div>
👀Features
<div align="center"> <p align="center"> <img src="assets/framework.png"/> </p> </div>
📝Series of Work
- Mobile-Agent-v3.5 (Preprint): Multi-platform Fundamental GUI Agents. [Paper] [Code]
- Mobile-Agent-v3 (Preprint): Multi-modal and multi-platform GUI agent. [Paper] [Code]
- UI-S1 (Preprint): Advancing GUI Automation via Semi-online Reinforcement Learning. [Paper] [Code] [Dataset]
- GUI-Critic-R1 (NeurIPS 2025): A GUI-Critic method for pre-operative error diagnosis. [Paper] [Code]
- PC-Agent (ICLR 2025 Workshop): Multi-agent for multimodal PC operation. [Paper] [Code]
- Mobile-Agent-E (Preprint): Multi-agent for self-evolving mobile phone operation. [Paper] [Code]
- Mobile-Agent-v2 (NeurIPS 2024): Multi-agent for multimodal mobile phone operation. [Paper] [Code]
- Mobile-Agent-v1 (ICLR 2024 Workshop): Single-agent for multimodal mobile phone
