# Offline AI PPE CLI (WIP)

【English|中文】

The AI agent script CLI for the Programmable Prompt Engine.

Enjoying this project? Please star it! 🌟

## Features
- The Programmable Prompt Engineering (PPE) language is a simple, natural scripting language designed for handling prompt information. It is used to develop agents that can be reused, inherited, combined, or called.
- Achieve or approximate ChatGPT-4-level performance with open-source LLMs of small to medium scale (4B–35B parameters).
- User-friendly for AI development and the creation of intelligent applications...
- Low-code or even no-code solutions for rapid AI development...
- Flexible: add custom instructions within scripts and make inter-script calls...
- Open data: input, output, and even internal data are fully accessible from within the script.
- Powerful: events are transmitted seamlessly between client and server, with numerous utility functions...
- Secure: supports encrypted execution and usage limits for scripts (TODO)...
- Local deployment and execution of large language models (LLMs) such as LLaMA, Qwen, Gemma, Phi, GLM, Mistral, and more.
- The AI agent script follows the Programmable Prompt Engine Specification.
  - Visit the site for detailed AI agent script usage.
  - Unit test fixture demo: https://github.com/offline-ai/cli/tree/main/examples/split-text-paragraphs
- Smart caching of LLM and intelligent-agent invocation results to accelerate execution and reduce token expenses.
- Support for multiple LLM service providers:
  - (Recommended) Builtin local LLM provider (llama.cpp) as the default, to protect the security and privacy of your knowledge.
    - Download a GGUF model file first:

      ```bash
      ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0
      ```

    - Run with the default brain model file:

      ```bash
      ai run example.ai.yaml
      ```

    - Run with a specified model file:

      ```bash
      ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf
      ```

  - OpenAI-compatible service providers:
    - OpenAI: `ai run example.ai.yaml -P openai://chatgpt-4o-latest --apiKey "sk-XXX"`
    - DeepSeek: `ai run example.ai.yaml -P openai://deepseek-chat -u https://api.deepseek.com/ --apiKey "sk-XXX"`
    - Siliconflow: `ai run example.ai.yaml -P openai://Qwen/Qwen2.5-Coder-7B-Instruct -u https://api.siliconflow.cn/ --apiKey "sk-XXX"`
    - Anthropic (Claude): `ai run example.ai.yaml -P openai://claude-3-7-sonnet-latest -u https://api.anthropic.com/v1/ --apiKey "sk-XXX"`
  - llama-cpp server (llama-server) provider: `ai run example.ai.yaml -P llamacpp`. The llama-cpp server does not support specifying a model name; the model is set with the `model` parameter when llama-server is started.
  - You can specify or switch the LLM model or provider arbitrarily in a PPE script:

    ```yaml
    ---
    parameters:
      model: openai://deepseek-chat
      apiUrl: https://api.deepseek.com/
      apiKey: "sk-XXX"
    ---
    system: You are a helpful assistant.
    user: "tell me a joke"
    ---
    assistant: "[[AI]]"
    ---
    assistant: "[[AI:model='local://bartowski-qwq-32b.Q4_0.gguf']]"
    ```
- Builtin local LLM provider (llama.cpp) features:
  - By default it automatically detects memory and GPU and picks the best computing backend. It automatically allocates GPU layers and the context window size (adopting the largest possible values) to get the best performance from the hardware without any manual configuration.
    - It is still recommended to configure the context window yourself.
  - System security: supports system-template anti-injection (to prevent jailbreaking).
  - Supports general tool invocation (Tool Funcs) for any LLM model (builtin local LLM provider only):
    - Works without model-specific training, as long as the LLM can accurately follow instructions.
    - Minimum adaptation for 3B models; 7B and above recommended.
    - Dual permission control:
      - Scripts set the list of tools the AI can use.
      - Users set the list of tools scripts can use.
  - Supports a general thinking mode (`shouldThink`) for large models (builtin local LLM provider only):
    - Works without model-specific training, as long as the LLM can accurately follow instructions.
    - Answer first, then think (`last`).
    - Think first, then answer (`first`).
    - Think deeply, then answer (`deep`): 7B and above.
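The thinking mode above might be enabled in a script like this minimal sketch; the exact front-matter placement of `shouldThink` is an assumption to verify against the PPE documentation:

```yaml
---
# Sketch: enable deep thinking mode (placement of shouldThink is assumed)
parameters:
  shouldThink: "deep"   # one of: last | first | deep
---
user: "Plan a three-day itinerary for Kyoto."
assistant: "[[AI]]"
```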
- Package support.
- PPE supports direct invocation of WASM.
- Supports multiple structured response output format types (`response_format.type`):
  - JSON format.
  - YAML format.
  - Natural Language Object (NOBJ) format.
  - Set `output` with a JSON Schema; PPE will automatically parse the AI-generated content in the corresponding format into an `Object` for code use.
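A sketch of how these options fit together; the placement of `response_format` under `parameters` is an assumption, and the `output` schema follows the JSON Schema convention named above:

```yaml
---
# Sketch: structured output (field placement is assumed, not authoritative)
parameters:
  response_format:
    type: "json"        # or "yaml", or "nobj" (Natural Language Object)
output:                 # JSON Schema; the AI reply is parsed into an Object
  type: "object"
  properties:
    answer:
      type: "string"
    confidence:
      type: "number"
---
user: "Is the sky blue? Answer with a confidence score."
assistant: "[[AI]]"
```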
Developing an intelligent application with the AI Agent Script Engine involves just three steps:
- Choose an appropriate brain 🧠 (LLM, Large Language Model):
  - Select a parameter size based on your application's requirements; larger sizes offer better quality but consume more resources and increase response time...
  - Choose the model's expertise: different models are trained with distinct methods and datasets, resulting in unique capabilities...
  - Optimize quantization: higher levels of quantization (compression) result in faster speed and smaller size, but potentially lower accuracy...
  - Decide on the optimal context window size (`max_tokens`): typically 2048 is sufficient; this parameter also influences model performance...
  - Use the client (`@offline-ai/cli`) directly to download the AI brain: `ai brain download`
- Create the AI application's agent script file and debug the prompts using the client (`@offline-ai/cli`): `ai run your_script.ai.yaml --interactive --loglevel info`.
- Integrate the script into your AI application.
  - One-click packaging into standalone intelligent applications (TODO)
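Concretely, the three steps might look like the following sketch; the model, file name, and prompt are illustrative, and the commands come from the examples in this README:

```yaml
# Step 1 — download a brain:
#   ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0
# Step 2 — write the agent script, e.g. translator.ai.yaml:
---
input:
  - content
---
system: "You are a professional translator."
user: "Translate into English: {{content}}"
assistant: "[[AI]]"
# Step 3 — debug it interactively, then integrate it:
#   ai run translator.ai.yaml --interactive --loglevel info
```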
- Offline AI PPE CLI (WIP)
  - Quick Start
    - API mode: translate the TODO file to English
    - Interactive mode
  - Usage
  - Commands
  - Credit
## Quick Start
- Quick Start Programming Guide
- More examples
- AI applications written in the PPE language:
  - AI Guide App for the PPE Guide (WIP): run `ai run guide` in the project root folder to start the guide.
  - AI Terminal Shell
- LLM inference providers:
  - `llamacpp`: llama.cpp server as the default local LLM provider. If no `provider` is specified, `llamacpp` is used.
  - `openai`: also supports OpenAI-compatible service API providers, e.g. `--provider openai://chatgpt-4o-latest --apiKey "sk-XXX"`
Note: limitations of OpenAI-compatible service API providers:

- OpenAI: only large models (`gpt-4o`) released after `2024-07-18` support `json-schema`. Before this date, only `json` is guaranteed, not `json-schema`.
- All `siliconflow` models only guarantee `json` support, not `json-schema`; the `[[Fruit:|Apple|Banana]]` syntax for forcing the AI to choose either Apple or Banana will be invalid.
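For reference, the forced-choice syntax mentioned above is used in a script like this minimal sketch, so you can check whether a given provider honors it:

```yaml
user: "Pick a fruit for me."
assistant: "[[Fruit:|Apple|Banana]]"  # the AI must answer Apple or Banana; the choice is stored in Fruit
```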
## PPE CLI Command

`ai` is the shell CLI command, mainly used to manage brain (LLM) files and run PPE agent scripts.
- Run a script file with `ai run`, e.g. `ai run -f calculator.ai.yaml "{content: '32+12*53'}"`.
  - `-f` specifies the script file.
  - `{content: '32+12*53'}` is optional JSON input to the script.
  - Scripts display intermediate echo outputs during streaming. This can be controlled with `--streamEcho true|line|false`. To keep the displayed echo outputs, use `--no-consoleClear`.
  - A script can be a single YAML file (`.ai.yaml`) or a directory.
    - A directory must contain an entry-point script file with the same name as the directory; the other scripts in the directory can call each other.
- Manage brain files with `ai brain`, including `ai brain download` and `ai brain list/search`.
- Run `ai help` or `ai help [command]` for more.
## Programmable Prompt Engine Language
Programmable Prompt Engine (PPE) Language is a message-processing language, similar to the YAML format.
PPE is designed to define AI prompt messages and their input/output configurations. It allows for the creation of a reusable and programmable prompt system akin to software engineering practices.
### I. Core Structure
- Message-Based Dialogue: defines interactions as a series of messages with roles (`system`, `user`, `assistant`).
- YAML-Like: the syntax is similar to YAML, making it readable and easy to understand.
- Dialogue Separation: uses triple dashes (`---`) or asterisks (`***`) to clearly mark dialogue turns.
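The three points above can be seen in a minimal sketch of a script; the prompt content is illustrative:

```yaml
system: "You are a helpful assistant."
user: "What is the capital of France?"
assistant: "[[AI]]"   # the AI's reply is generated here
---                   # triple dashes separate dialogue turns
user: "And what is its population?"
assistant: "[[AI]]"
```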
### II. Reusability & Configuration
- Input/Output Configuration (front-matter): defines input requirements (using the `input` keyword) and the expected output format (using the `output` keyword with a JSON Schema).
- Prompt Template: embeds variables from the input configuration or prompt settings into messages using Jinja2 templates (`{{variable_name}}`).
- Custom Script Types: allows defining reusable script types (`type: type`) for code and configuration inheritance.
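A sketch of how these mechanisms combine; the `char` type name and field names are illustrative assumptions, so check the PPE specification for the exact keywords:

```yaml
---
# Front-matter: a reusable script type other scripts can inherit from
type: char             # custom script type (name assumed for illustration)
input:
  - name               # required input variable
output:
  type: "object"
  properties:
    reply:
      type: "string"
---
system: "You are {{name}}, a helpful character."   # Jinja2 template variable
```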
### III. AI Capabilities
- **Advanced AI Replacement**