# Offline AI PPE CLI (WIP)

【English|中文】

The AI agent script CLI for the Programmable Prompt Engine.

Enjoying this project? Please star it! 🌟

## Features
- The Programmable Prompt Engineering (PPE) language is a simple, natural scripting language designed for handling prompt information. It is used to develop agents that can be reused, inherited, combined, or called.
- Achieve or approximate ChatGPT-4-level performance with open-source LLMs of small to medium scale (4B–35B parameters).
- User-friendly for AI development and the creation of intelligent applications...
- Low-code or even no-code solutions for rapid AI development...
- Flexible: add custom instructions within scripts and make inter-script calls...
- Open data: input, output, and even internal data are fully accessible from within the script.
- Powerful: events are transmitted seamlessly between client and server, with numerous utility functions...
- Secure: supports encrypted execution and usage limits for scripts (TODO)...
- Local deployment and execution of large language models (LLMs) such as LLaMA, Qwen, Gemma, Phi, GLM, Mistral, and more.
- The AI agent script follows the Programmable Prompt Engine Specification.
  - Visit the site for detailed AI agent script usage.
  - Unit test fixture demo: https://github.com/offline-ai/cli/tree/main/examples/split-text-paragraphs
- Smart caching of LLM and intelligent-agent invocation results to accelerate execution and reduce token expenses.
- Support for multiple LLM service providers:
  - (Recommended) Builtin local LLM provider (llama.cpp) as the default, to protect the security and privacy of your knowledge.
    - Download a GGUF model file first:

      ```bash
      ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0
      ```

    - Run with the default brain model file:

      ```bash
      ai run example.ai.yaml
      ```

    - Run with a specified model file:

      ```bash
      ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf
      ```

  - OpenAI-compatible service providers:
    - OpenAI: `ai run example.ai.yaml -P openai://chatgpt-4o-latest --apiKey "sk-XXX"`
    - DeepSeek: `ai run example.ai.yaml -P openai://deepseek-chat -u https://api.deepseek.com/ --apiKey "sk-XXX"`
    - Siliconflow: `ai run example.ai.yaml -P openai://Qwen/Qwen2.5-Coder-7B-Instruct -u https://api.siliconflow.cn/ --apiKey "sk-XXX"`
    - Anthropic (Claude): `ai run example.ai.yaml -P openai://claude-3-7-sonnet-latest -u https://api.anthropic.com/v1/ --apiKey "sk-XXX"`
  - llama-cpp server (llama-server) provider: `ai run example.ai.yaml -P llamacpp`. The llama-cpp server does not support specifying a model name; the model is set with the `model` parameter when llama-server is started.
  - You can specify or switch the LLM model or provider arbitrarily in a PPE script:

    ```yaml
    ---
    parameters:
      model: openai://deepseek-chat
      apiUrl: https://api.deepseek.com/
      apiKey: "sk-XXX"
    ---
    system: You are a helpful assistant.
    user: "tell me a joke"
    ---
    assistant: "[[AI]]"
    ---
    assistant: "[[AI:model='local://bartowski-qwq-32b.Q4_0.gguf']]"
    ```
- Builtin local LLM provider (llama.cpp) features:
  - By default it automatically detects memory and GPU and picks the best computing backend. It automatically allocates GPU layers and the context window size (adopting the largest possible values) to get the best performance from the hardware without any manual configuration.
    - It is still recommended to configure the context window yourself.
  - System security: supports system-template anti-injection (to prevent jailbreaking).
  - Supports general tool invocation (Tool Funcs) for any LLM model (builtin local LLM provider only):
    - Works without model-specific training, as long as the LLM can accurately follow instructions.
    - Minimum adaptation for 3B models; 7B and above recommended.
    - Dual permission control:
      - Scripts set the list of tools the AI can use.
      - Users set the list of tools scripts can use.
  - Supports a general thinking mode (`shouldThink`) for large models (builtin local LLM provider only):
    - Works without model-specific training, as long as the LLM can accurately follow instructions.
    - Answer first, then think (`last`).
    - Think first, then answer (`first`).
    - Think deeply, then answer (`deep`): 7B and above.
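The thinking mode above might be enabled in a script like this minimal sketch; the exact front-matter placement of `shouldThink` is an assumption to verify against the PPE documentation:

```yaml
---
# Sketch: enable deep thinking mode (placement of shouldThink is assumed)
parameters:
  shouldThink: "deep"   # one of: last | first | deep
---
user: "Plan a three-day itinerary for Kyoto."
assistant: "[[AI]]"
```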
- Package support.
- PPE supports direct invocation of WASM.
- Supports multiple structured response output format types (`response_format.type`):
  - JSON format.
  - YAML format.
  - Natural Language Object (NOBJ) format.
  - Set `output` with a JSON Schema; PPE will automatically parse the AI-generated content in the corresponding format into an `Object` for code use.
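A sketch of how these options fit together; the placement of `response_format` under `parameters` is an assumption, and the `output` schema follows the JSON Schema convention named above:

```yaml
---
# Sketch: structured output (field placement is assumed, not authoritative)
parameters:
  response_format:
    type: "json"        # or "yaml", or "nobj" (Natural Language Object)
output:                 # JSON Schema; the AI reply is parsed into an Object
  type: "object"
  properties:
    answer:
      type: "string"
    confidence:
      type: "number"
---
user: "Is the sky blue? Answer with a confidence score."
assistant: "[[AI]]"
```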
Developing an intelligent application with the AI Agent Script Engine involves just three steps:
- Choose an appropriate brain 🧠 (LLM, Large Language Model):
  - Select a parameter size based on your application's requirements; larger sizes offer better quality but consume more resources and increase response time...
  - Choose the model's expertise: different models are trained with distinct methods and datasets, resulting in unique capabilities...
  - Optimize quantization: higher levels of quantization (compression) result in faster speed and smaller size, but potentially lower accuracy...
  - Decide on the optimal context window size (`max_tokens`): typically 2048 is sufficient; this parameter also influences model performance...
  - Use the client (`@offline-ai/cli`) directly to download the AI brain: `ai brain download`
- Create the AI application's agent script file and debug the prompts using the client (`@offline-ai/cli`): `ai run your_script.ai.yaml --interactive --loglevel info`.
- Integrate the script into your AI application.
  - One-click packaging into standalone intelligent applications (TODO)
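Concretely, the three steps might look like the following sketch; the model, file name, and prompt are illustrative, and the commands come from the examples in this README:

```yaml
# Step 1 — download a brain:
#   ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0
# Step 2 — write the agent script, e.g. translator.ai.yaml:
---
input:
  - content
---
system: "You are a professional translator."
user: "Translate into English: {{content}}"
assistant: "[[AI]]"
# Step 3 — debug it interactively, then integrate it:
#   ai run translator.ai.yaml --interactive --loglevel info
```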
- Offline AI PPE CLI (WIP)
  - Quick Start
    - API mode: translate the TODO file to English
    - Interactive mode
  - Usage
  - Commands
  - Credit
## Quick Start
- Quick Start Programming Guide
- More examples
- AI applications written in the PPE language:
  - AI Guide App for the PPE Guide (WIP): run `ai run guide` in the project root folder to start the guide.
  - AI Terminal Shell
- LLM inference providers:
  - `llamacpp`: llama.cpp server as the default local LLM provider. If no `provider` is specified, `llamacpp` is used.
  - `openai`: also supports OpenAI-compatible service API providers, e.g. `--provider openai://chatgpt-4o-latest --apiKey "sk-XXX"`
Note: limitations of OpenAI-compatible service API providers:

- OpenAI: only large models (`gpt-4o`) released after `2024-07-18` support `json-schema`. Before this date, only `json` is guaranteed, not `json-schema`.
- All `siliconflow` models only guarantee `json` support, not `json-schema`; the `[[Fruit:|Apple|Banana]]` syntax for forcing the AI to choose either Apple or Banana will be invalid.
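For reference, the forced-choice syntax mentioned above is used in a script like this minimal sketch, so you can check whether a given provider honors it:

```yaml
user: "Pick a fruit for me."
assistant: "[[Fruit:|Apple|Banana]]"  # the AI must answer Apple or Banana; the choice is stored in Fruit
```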
## PPE CLI Command

`ai` is the shell CLI command, mainly used to manage brain (LLM) files and run PPE agent scripts.
- Run a script file with `ai run`, e.g. `ai run -f calculator.ai.yaml "{content: '32+12*53'}"`.
  - `-f` specifies the script file.
  - `{content: '32+12*53'}` is optional JSON input to the script.
  - Scripts display intermediate echo outputs during streaming. This can be controlled with `--streamEcho true|line|false`. To keep the displayed echo outputs, use `--no-consoleClear`.
  - A script can be a single YAML file (`.ai.yaml`) or a directory.
    - A directory must contain an entry-point script file with the same name as the directory; the other scripts in the directory can call each other.
- Manage brain files with `ai brain`, including `ai brain download` and `ai brain list/search`.
- Run `ai help` or `ai help [command]` for more.
## Programmable Prompt Engine Language
Programmable Prompt Engine (PPE) Language is a message-processing language, similar to the YAML format.
PPE is designed to define AI prompt messages and their input/output configurations. It allows for the creation of a reusable and programmable prompt system akin to software engineering practices.
### I. Core Structure
- Message-Based Dialogue: defines interactions as a series of messages with roles (`system`, `user`, `assistant`).
- YAML-Like: the syntax is similar to YAML, making it readable and easy to understand.
- Dialogue Separation: uses triple dashes (`---`) or asterisks (`***`) to clearly mark dialogue turns.
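The three points above can be seen in a minimal sketch of a script; the prompt content is illustrative:

```yaml
system: "You are a helpful assistant."
user: "What is the capital of France?"
assistant: "[[AI]]"   # the AI's reply is generated here
---                   # triple dashes separate dialogue turns
user: "And what is its population?"
assistant: "[[AI]]"
```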
### II. Reusability & Configuration
- Input/Output Configuration (front-matter): defines input requirements (using the `input` keyword) and the expected output format (using the `output` keyword with a JSON Schema).
- Prompt Template: embeds variables from the input configuration or prompt settings into messages using Jinja2 templates (`{{variable_name}}`).
- Custom Script Types: allows defining reusable script types (`type: type`) for code and configuration inheritance.
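A sketch of how these mechanisms combine; the `char` type name and field names are illustrative assumptions, so check the PPE specification for the exact keywords:

```yaml
---
# Front-matter: a reusable script type other scripts can inherit from
type: char             # custom script type (name assumed for illustration)
input:
  - name               # required input variable
output:
  type: "object"
  properties:
    reply:
      type: "string"
---
system: "You are {{name}}, a helpful character."   # Jinja2 template variable
```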
### III. AI Capabilities
- **Advanced AI Replacement**