AgentRun
The easiest and fastest way to run AI-generated Python code safely
AgentRun: Run AI Generated Code Safely
AgentRun is a Python library that makes it easy to run Python code safely from large language models (LLMs) with a single line of code. Built on top of the Docker Python SDK and RestrictedPython, it provides a simple, transparent, and user-friendly API to manage isolated code execution.
AgentRun automatically installs and uninstalls dependencies with optional caching, limits resource consumption, checks code safety, and sets execution timeouts. It has 97% test coverage with full static typing and only two dependencies.
[!NOTE] Looking for a state-of-the-art RAG API? Check out ColiVara, also from us.
Why?
Giving LLMs the ability to execute code is a massive upgrade. Consider a user query such as "what is 12345 * 54321?" or something more ambitious like "what is the average daily move of Apple stock during the last week?". With code execution, an LLM can answer both accurately by computing the result instead of guessing it.
However, executing untrusted code is dangerous and full of potential footguns. For instance, without proper safeguards, an LLM might generate harmful code like this:
import os
# deletes all files and directories
os.system('rm -rf /')
This package gives code execution ability to any LLM in a single line of code, while preventing and guarding against dangerous code.
Key Features
- Safe code execution: AgentRun checks the generated code for dangerous elements before execution.
- Isolated Environment: Code is executed in a fully isolated Docker container.
- Configurable Resource Management: You can set how much compute the code can consume, with sane defaults.
- Timeouts: Set time limits on how long a script can take to run.
- Dependency Management: Complete control over which dependencies are allowed to be installed.
- Dependency Caching: AgentRun lets you cache any dependency in the Docker container in advance to optimize performance.
- Automatic Cleanups: AgentRun cleans up any artifacts created by the generated code.
- Comes with a REST API: Hate setting up Docker? AgentRun comes with a preconfigured Docker setup for self-hosting.
- Transparent Exception Handling: AgentRun returns exactly the same output as running Python on your system, exceptions and tracebacks included. No cryptic Docker messages.
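To make the first feature concrete, here is a minimal sketch of what a pre-execution safety check can look like. It is not AgentRun's implementation (AgentRun builds on RestrictedPython); this illustration uses only the standard-library `ast` module, and the deny-lists and the function name `find_unsafe_constructs` are hypothetical.

```python
import ast

# Hypothetical deny-lists for illustration only; AgentRun's real rules may differ.
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}


def find_unsafe_constructs(code: str) -> list[str]:
    """Return human-readable reasons the snippet looks unsafe (empty if none)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os` or `import os.path`
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in BLOCKED_MODULES:
                    problems.append(f"import of blocked module '{root}'")
        elif isinstance(node, ast.ImportFrom):
            # `from subprocess import run`
            root = (node.module or "").split(".")[0]
            if root in BLOCKED_MODULES:
                problems.append(f"import of blocked module '{root}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls such as eval(...) or open(...)
            if node.func.id in BLOCKED_CALLS:
                problems.append(f"call to blocked builtin '{node.func.id}'")
    return problems


if __name__ == "__main__":
    print(find_unsafe_constructs("import os\nos.system('rm -rf /')"))
    print(find_unsafe_constructs("print(12345 * 54321)"))
```

A static check like this is easy to bypass on its own, which is why it is paired with container isolation, resource limits, and timeouts rather than used as the only safeguard.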
If you want to use your own Docker configuration, install this package with pip and simply initialize AgentRun with a running Docker container. Additionally, you can use an already configured Docker Compose setup and API that is ready for self-hosting by cloning this repo.
Unless you are comfortable with Docker, we highly recommend using the REST API with the preconfigured Docker setup as a standalone service.
Getting Started
There are two ways to use AgentRun, depending on your needs: with pip for your own Docker setup, or directly as a REST API as a standalone service (recommended).
REST API
Clone the GitHub repository and start immediately with a standalone REST API.
git clone https://github.com/Jonathan-Adly/agentrun
cd agentrun/agentrun-api
cp .env.example .env.dev
docker-compose up -d --build
You now have a fully running code execution API: code in, output out.
fetch('http://localhost:8000/v1/run/', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    code: "print('hello, world!')"
  })
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
Or, if you prefer the terminal:
curl -X POST http://localhost:8000/v1/run/ -H "Content-Type: application/json" -d '{"code": "print(\"hello, world!\")"}'
pip install
Install AgentRun with a single command via pip (you will need to configure your own Docker setup):
pip install agentrun
Here is a simple example:
from agentrun import AgentRun

runner = AgentRun(container_name="my_container")  # container should be running
# get_code_from_llm is a placeholder for your own LLM call
code_from_llm = get_code_from_llm(prompt)  # "print('hello, world!')"
result = runner.execute_code_in_container(code_from_llm)
print(result)
#> "hello, world!"
| Difference | Python Package | REST API |
| ---------- | -------------- | -------- |
| Docker setup | You set it up | Already set up for you |
| Installation | pip | Git clone |
| Ease of use | Easy | Super easy |
| Requirements | A running Docker container | Docker installed |
| Customization | Fully | Partially |
Usage
Now, let's see AgentRun in action with something more complicated. We will combine function calling with AgentRun to have LLMs write and execute code on the fly to solve arbitrary tasks. You can find the full code under docs/examples/.
First, we will install the needed packages. We are using Mixtral via Groq here to keep things fast and with minimal dependencies, but AgentRun works with any LLM out of the box. All that's required is for the LLM to return a code snippet.
FYI: The OpenAI assistant tool code_interpreter can execute code. AgentRun is a transparent, open-source version that can work with any LLM.
!pip install groq
!pip install requests
Next, we will set up a function that executes the code and returns its output. We are using the API here, so make sure it is running before trying this.
Here are the steps to run the API:
git clone https://github.com/Jonathan-Adly/agentrun
cd agentrun/agentrun-api
cp .env.example .env.dev
docker-compose up -d --build
import requests

def execute_python_code(code: str) -> str:
    # Send the snippet to the locally running AgentRun API and return its output
    response = requests.post("http://localhost:8000/v1/run/", json={"code": code})
    output = response.json()["output"]
    return output
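The `execute_python_code` helper assumes the API is reachable and always returns JSON with an `output` key. Below is a slightly more defensive sketch using only the standard library, so it has no extra dependencies. The function name `execute_python_code_stdlib`, the `timeout` default, and the injectable `urlopen` parameter are assumptions made for illustration; the endpoint and payload shape are the ones shown above.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/v1/run/"  # endpoint from the REST API section


def execute_python_code_stdlib(
    code: str,
    timeout: float = 30.0,
    urlopen=urllib.request.urlopen,  # injectable so the function can be tested offline
) -> str:
    """POST the snippet to the API and return its output, or an error message."""
    payload = json.dumps({"code": code}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
        return body["output"]
    except OSError as exc:
        # Covers connection errors, HTTP errors, and timeouts
        return f"Could not reach the code execution API: {exc}"
    except (KeyError, TypeError, ValueError) as exc:
        # Covers invalid JSON and responses without an "output" key
        return f"Unexpected response from the code execution API: {exc!r}"
```

Returning an error string instead of raising keeps the tool usable in a function-calling loop, since the message can be passed back to the LLM as the tool result.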
Next, we will set up our LLM function-calling skeleton code. We need:
- An LLM client such as Groq, OpenAI, or Anthropic (alternatively, you can use litellm as a wrapper)
- The model you will use
- Our code execution tool definition, which encourages the model to reliably send us Python code to execute
from groq import Groq
import json

client = Groq(api_key="Your API Key")
MODEL = 'mixtral-8x7b-32768'

tools = [
    {
        "type": "function",
        "function": {
            "name": "execute_python_code",
            "description": "Sends a python code snippet to the code execution environment and returns the output. The code execution environment can automatically import any library or package by importing.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "The code snippet to execute. Must be a valid python code. Must use print() to output the result.",
                    },
                },
                "required": ["code"],
            },
        },
    },
]
Next, we will set up a function to call our LLM of choice.
def chat_completion_request(messages, tools=None, tool_choice=None, model=MODEL):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            tool_choice=tool_choice,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e
Finally, we will set up a function that takes the user query and returns an answer, using AgentRun to execute code when the LLM determines that code execution is necessary to answer the question.
def get_answer(query):
    messages = []
    messages.append(
        {
            "role": "system",
            "content": """Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.\n
            Use the execute_python_code tool to run code if a question is better solved with code. You can use any package in the code snippet by simply importing. Like `import requests` would work fine.\n
            """,
        }
    )
    messages.append({"role": "user", "content": query})
    chat_response = chat_completion_request(messages, tools=tools)
    message = chat_response.choices[0].message
    # tool call versus content
    if message.tool_calls:
        tool_call = message.tool_calls[0]
        arg = json.loads(tool_call.function.arguments)["code"]
        print(f"Executing code: {arg}")
        answer = execute_python_code(arg)
        # Optional: call an LLM again to turn
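One fragile spot in `get_answer` is `json.loads(tool_call.function.arguments)["code"]`, which raises if the model emits malformed JSON or omits the `code` key. The following is a minimal sketch of a more forgiving parser; the helper name `extract_code_argument` is hypothetical and not part of AgentRun or the Groq client.

```python
import json
from typing import Optional


def extract_code_argument(raw_arguments: str) -> Optional[str]:
    """Return the "code" string from a tool call's JSON arguments, or None."""
    try:
        arguments = json.loads(raw_arguments)
    except (TypeError, ValueError):
        # Not valid JSON (or not a string at all)
        return None
    if not isinstance(arguments, dict):
        return None
    code = arguments.get("code")
    # Accept only non-empty strings, since the tool schema requires a code snippet
    if isinstance(code, str) and code.strip():
        return code
    return None
```

With this helper, the caller can check for `None` and fall back to the model's plain-text reply, or ask the model to retry, instead of crashing on a bad tool call.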
