Monty
A minimal, secure Python interpreter written in Rust for use by AI
Experimental - This project is still in development and not ready for prime time.
Monty avoids the cost, latency, complexity and general faff of using a full container-based sandbox for running LLM-generated code.
Instead, it lets you safely run Python code written by an LLM embedded in your agent, with startup times measured in single-digit microseconds, not hundreds of milliseconds.
What Monty can do:
- Run a reasonable subset of Python code - enough for your agent to express what it wants to do
- Completely block access to the host environment: filesystem, env variables and network access are all implemented via external function calls the developer can control
- Call functions on the host - only functions you give it access to (see the sketch after this list)
- Run type checking - Monty supports full modern Python type hints and comes with ty included in a single binary to run type checking
- Be snapshotted to bytes at external function calls, meaning you can store the interpreter state in a file or database, and resume later
- Start extremely fast (<1μs to go from code to execution result), with runtime performance similar to CPython (generally between 5x faster and 5x slower)
- Be called from Rust, Python, or JavaScript - because Monty has no dependencies on CPython, you can use it anywhere you can run Rust
- Control resource usage - Monty can track memory usage, allocations, stack depth, and execution time and cancel execution if it exceeds preset limits
- Collect stdout and stderr and return them to the caller
- Run async or sync code, which in turn can call async or sync functions on the host
- Use a small subset of the standard library:
  sys, os, typing, asyncio, re, datetime (soon), dataclasses (soon), json (soon)
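Sandboxed code cannot reach the filesystem or network on its own - the only way out is an external function call that pauses execution and hands control to the host. A minimal sketch using the Python API shown below (read_file is a hypothetical host function, and the empty inputs are an assumption, not a documented default):

import pydantic_monty

code = """
text = read_file('notes.txt')
len(text)
"""
# read_file is not defined inside the sandbox, so execution pauses at the
# call and surfaces it to the host instead of touching the real filesystem.
m = pydantic_monty.Monty(code, inputs=[])
snapshot = m.start(inputs={})
print(snapshot.function_name)
#> read_file
# The host decides what, if anything, the sandboxed code gets back.
result = snapshot.resume(return_value='hello')
print(result.output)
#> 5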
What Monty cannot do:
- Use the rest of the standard library
- Use third-party libraries (like Pydantic) - support for external Python libraries is not a goal
- Define classes (support should come soon)
- Use match statements (again, support should come soon)
In short, Monty is extremely limited and designed for one use case:
To run code written by agents.
For motivation on why you might want to do this, see:
- Codemode from Cloudflare
- Programmatic Tool Calling from Anthropic
- Code Execution with MCP from Anthropic
- Smol Agents from Hugging Face
In very simple terms, the idea of all the above is that LLMs can work faster, cheaper and more reliably if they're asked to write Python (or JavaScript) code instead of relying on traditional tool calling. Monty makes that possible without the complexity of a sandbox or the risk of running code directly on the host.
Note: Monty will (soon) be used to implement codemode in Pydantic AI
Usage
Monty can be called from Python, JavaScript/TypeScript or Rust.
Python
To install:
uv add pydantic-monty
(Or pip install pydantic-monty for the boomers)
Usage:
from typing import Any

import pydantic_monty

code = """
async def agent(prompt: str, messages: Messages):
    while True:
        print(f'messages so far: {messages}')
        output = await call_llm(prompt, messages)
        if isinstance(output, str):
            return output
        messages.extend(output)

await agent(prompt, [])
"""
type_definitions = """
from typing import Any

Messages = list[dict[str, Any]]

async def call_llm(prompt: str, messages: Messages) -> str | Messages:
    raise NotImplementedError()

prompt: str = ''
"""

m = pydantic_monty.Monty(
    code,
    inputs=['prompt'],
    script_name='agent.py',
    type_check=True,
    type_check_stubs=type_definitions,
)

Messages = list[dict[str, Any]]

async def call_llm(prompt: str, messages: Messages) -> str | Messages:
    if len(messages) < 2:
        return [{'role': 'system', 'content': 'example response'}]
    else:
        return f'example output, message count {len(messages)}'

async def main():
    output = await pydantic_monty.run_monty_async(
        m,
        inputs={'prompt': 'testing'},
        external_functions={'call_llm': call_llm},
    )
    print(output)
    #> example output, message count 2

if __name__ == '__main__':
    import asyncio

    asyncio.run(main())
Iterative Execution with External Functions
Use start() and resume() to handle external function calls iteratively,
giving you control over each call:
import pydantic_monty
code = """
data = fetch(url)
len(data)
"""
m = pydantic_monty.Monty(code, inputs=['url'])
# Start execution - pauses when fetch() is called
result = m.start(inputs={'url': 'https://example.com'})
print(type(result))
#> <class 'pydantic_monty.FunctionSnapshot'>
print(result.function_name)
#> fetch
print(result.args)
#> ('https://example.com',)
# Perform the actual fetch, then resume with the result
result = result.resume(return_value='hello world')
print(type(result))
#> <class 'pydantic_monty.MontyComplete'>
print(result.output)
#> 11
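When the code makes several external calls, the same pattern extends to a simple dispatch loop. A sketch, assuming (as above) each pause surfaces as a FunctionSnapshot and completion as a MontyComplete:

import pydantic_monty

code = """
a = fetch('https://example.com/a')
b = fetch('https://example.com/b')
len(a) + len(b)
"""
m = pydantic_monty.Monty(code, inputs=[])

result = m.start(inputs={})
# Each FunctionSnapshot is one paused external call for the host to answer;
# keep resuming until execution completes.
while isinstance(result, pydantic_monty.FunctionSnapshot):
    print(result.function_name, result.args)
    result = result.resume(return_value='stub response')
print(result.output)
#> 26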
Serialization
Both Monty and snapshot types like FunctionSnapshot can be serialized to bytes and restored later.
This allows caching parsed code or suspending execution across process boundaries:
import pydantic_monty
# Serialize parsed code to avoid re-parsing
m = pydantic_monty.Monty('x + 1', inputs=['x'])
data = m.dump()
# Later, restore and run
m2 = pydantic_monty.Monty.load(data)
print(m2.run(inputs={'x': 41}))
#> 42
# Serialize execution state mid-flight
m = pydantic_monty.Monty('fetch(url)', inputs=['url'])
progress = m.start(inputs={'url': 'https://example.com'})
state = progress.dump()
# Later, restore and resume (e.g., in a different process)
progress2 = pydantic_monty.load_snapshot(state)
result = progress2.resume(return_value='response data')
print(result.output)
#> response data
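Because the dumped state is plain bytes, it can be stored anywhere bytes fit - a minimal sketch of round-tripping the paused state through a file on disk:

from pathlib import Path

import pydantic_monty

m = pydantic_monty.Monty('fetch(url)', inputs=['url'])
progress = m.start(inputs={'url': 'https://example.com'})
# Persist the paused interpreter state; a database blob would work the same way.
Path('state.bin').write_bytes(progress.dump())

# Later, possibly in a different process:
progress2 = pydantic_monty.load_snapshot(Path('state.bin').read_bytes())
print(progress2.resume(return_value='response data').output)
#> response data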
Rust
use monty::{MontyRun, MontyObject, NoLimitTracker, PrintWriter};
let code = r#"
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

fib(x)
"#;
let runner = MontyRun::new(code.to_owned(), "fib.py", vec!["x".to_owned()]).unwrap();
let result = runner.run(vec![MontyObject::Int(10)], NoLimitTracker, PrintWriter::Stdout).unwrap();
assert_eq!(result, MontyObject::Int(55));
Serialization
MontyRun and RunProgress can be serialized using the dump() and load() methods:
use monty::{MontyRun, MontyObject, NoLimitTracker, PrintWriter};
// Serialize parsed code
let runner = MontyRun::new("x + 1".to_owned(), "main.py", vec!["x".to_owned()]).unwrap();
let bytes = runner.dump().unwrap();
// Later, restore and run
let runner2 = MontyRun::load(&bytes).unwrap();
let result = runner2.run(vec![MontyObject::Int(41)], NoLimitTracker, PrintWriter::Stdout).unwrap();
assert_eq!(result, MontyObject::Int(42));
PydanticAI Integration
Monty will power code-mode in Pydantic AI. Instead of making sequential tool calls, the LLM writes Python code that calls your tools as functions and Monty executes it safely.
import asyncio
import json
import logfire
from httpx import AsyncClient
from pydantic_ai import Agent, RunContext
from pydantic_ai.toolsets.code_mode import CodeModeToolset
from pydantic_ai.toolsets.function import FunctionToolset
from typing_extensions import TypedDict
logfire.configure()
logfire.instrument_pydantic_ai()
class LatLng(TypedDict):
    lat: float
    lng: float

weather_toolset: FunctionToolset[AsyncClient] = FunctionToolset()

@weather_toolset.tool
async def get_lat_lng(
    ctx: RunContext[AsyncClient], location_description: str
) -> LatLng:
    """Get the latitude and longitude of a location."""
    # NOTE: the response here will be random, and is not related to the location description.
    r = await ctx.deps.get(
        'https://demo-endpoints.pydantic.workers.dev/latlng',
        params={'location': location_description},
    )
    r.raise_for_status()
    return json.loads(r.content)
@weather_toolset.tool
async def get_temp(ctx: RunContext[AsyncClient], lat: float, lng: float) -> float:
    """Get the temp at a location."""
    # NOTE: the response here will be random, and is not related to the lat and lng.
    r = await ctx.deps.get(
        'https://demo-endpoints.pydantic.workers.dev/number',
        params={'min': 10, 'max': 30},
    )
    r.raise_for_status()
    return float(r.content)