Mandala

A simple & elegant experiment tracking framework that integrates persistence logic & best practices directly into Python


<div align="center"> <br> <img src="assets/logo-no-background.png" height=128 alt="logo" align="center"> <br> <a href="#install">Install</a> | <a href="https://colab.research.google.com/github/amakelov/mandala/blob/master/docs_source/tutorials/01_hello.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> | <a href="#tutorials">Tutorials</a> | <a href="https://amakelov.github.io/mandala/">Docs</a> | <a href="#blogs--papers">Blogs</a> | <a href="#faqs">FAQs</a> </div>

Automatically save, query & version Python computations

mandala eliminates the effort and code overhead of ML experiment tracking (and beyond) with two generic tools:

The @op decorator:
- captures inputs, outputs and code (+dependencies) of Python function calls
- automatically reuses past results & never computes the same call twice
- is designed to be composed into end-to-end persisted programs, enabling efficient iterative development in plain Python without thinking about the storage backend.
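To make the memoization idea concrete, here is a minimal standard-library sketch. It is not mandala's implementation or API: `toy_op`, `_STORE` and `content_hash` are hypothetical names, the store is a plain dict, and the code fingerprint is deliberately crude.

```python
import functools
import hashlib
import inspect
import pickle

# Hypothetical in-memory store for this sketch: call key -> saved result.
_STORE = {}


def content_hash(obj) -> str:
    """Hash an object via its pickled bytes (a crude stand-in for a content hash)."""
    return hashlib.sha256(pickle.dumps(obj)).hexdigest()


def toy_op(func):
    """Memoize `func` on its bytecode and on the content of its bound arguments."""
    code = func.__code__
    # Crude code fingerprint: bytecode plus the repr of its constants.
    code_hash = content_hash((code.co_code, repr(code.co_consts)))
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # so add(2) and add(2, y=1) share one key
        key = (func.__name__, code_hash, content_hash(sorted(bound.arguments.items())))
        if key not in _STORE:
            _STORE[key] = func(*args, **kwargs)  # first time: compute and save
        return _STORE[key]  # otherwise: reuse the saved result

    return wrapper


executed = []  # records which calls actually ran


@toy_op
def add(x, y=1):
    executed.append((x, y))
    return x + y


first = add(2)
second = add(2, y=1)  # same inputs after applying defaults, so nothing reruns
third = add(3)
```

Only two of the three calls execute the function body. The real `@op` additionally tracks dependencies and persists results, which this sketch leaves out.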

<table style="border-collapse: collapse; border: none;"> <tr> <td style="border: none;"> <ol start="2"> <li> The <a href="https://amakelov.github.io/mandala/blog/01_cf/">ComputationFrame</a> data structure: <ul> <li> <strong>automatically organizes executions of imperative code</strong> into a high-level computation graph of variables and operations. Detects patterns like feedback loops, branching/merging and aggregation/indexing </li> <li> <strong>queries relationships between variables</strong> by extracting a dataframe where columns are variables and operations in the graph, and each row contains values/calls of a (possibly partial) execution of the graph </li> <li> <strong>automates exploration and high-level operations</strong> over heterogeneous "webs" of <code>@op</code> calls </li> </ul> </li> </ol> </td> <td><img src="output.svg" alt="Description" width="2700"/></td> </tr> </table>
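As a rough illustration of "one row per (possibly partial) execution", the sketch below joins hand-written call records into rows. It is a toy under stated assumptions, not the ComputationFrame API: the records, the `join_executions` helper and the column names are hypothetical, and the operation columns are omitted for brevity.

```python
# Hypothetical saved calls of two chained operations:
#   train(n_estimators) -> model,  evaluate(model) -> accuracy
train_calls = [
    {"n_estimators": 10, "model": "model_a"},
    {"n_estimators": 20, "model": "model_b"},
]
eval_calls = [
    {"model": "model_a", "accuracy": 0.8},  # model_b was never evaluated
]


def join_executions(upstream, downstream, on):
    """Left-join upstream calls with the downstream calls that consumed their output."""
    extra_columns = sorted({key for call in downstream for key in call} - {on})
    rows = []
    for up in upstream:
        matches = [down for down in downstream if down[on] == up[on]]
        if not matches:
            # Partial execution: pad the missing downstream columns with None.
            rows.append({**up, **{column: None for column in extra_columns}})
        for match in matches:
            rows.append({**up, **match})
    return rows


rows = join_executions(train_calls, eval_calls, on="model")
```

Each row follows one path through the graph, and the row for `model_b` has an empty `accuracy` cell because that execution stopped after training.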

Video demo

A quick demo of running computations in mandala and simultaneously updating a view of the corresponding ComputationFrame and the dataframe extracted from it (code can be found here):

https://github.com/amakelov/mandala/assets/1467702/85185599-10fb-479e-bf02-442873732906

Install

pip install pymandala

pip install git+https://github.com/amakelov/mandala

Tutorials

Quickstart: <a href="https://colab.research.google.com/github/amakelov/mandala/blob/master/docs_source/tutorials/01_hello.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> | read in docs
ComputationFrames: <a href="https://colab.research.google.com/github/amakelov/mandala/blob/master/docs_source/blog/01_cf.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> | read in docs
Toy ML project: <a href="https://colab.research.google.com/github/amakelov/mandala/blob/master/docs_source/tutorials/02_ml.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> | read in docs

Blogs & papers

Tidy Computations: introduces the ComputationFrame data structure and its applications
Practical Dependency Tracking for Python Function Calls: describes the motivations and designs behind mandala's dependency tracking system
The paper, which is to appear in the SciPy 2024 proceedings.
A discussion on Hacker News

FAQs

How is this different from other experiment tracking frameworks?

Compared to popular tools like W&B, MLFlow or Comet, mandala:

- is integrated with the actual Python code execution on a more granular level
    - the function call is the synchronized unit of persistence, versioning and querying, as opposed to an entire script or notebook, leading to more efficient reuse and incremental development.
    - going even further, Python collections (e.g. list, dict) can be made transparent to the storage system, so that individual elements are stored and tracked separately and can be reused across collections and calls.
    - since it's memoization-based as opposed to logging-based, you don't have to think about how to name any of the things you log.
- provides the ComputationFrame data structure, a powerful & simple way to represent, query and manipulate complex saved computations.
- automatically resolves the version of every @op call from the current state of the codebase and the inputs to the call.
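The point about collections being transparent to storage can be sketched with a small content-addressed store. This is an assumption-laden toy, not how mandala stores data: `ToyObjectStore` and its methods are hypothetical, and only lists are treated as collections.

```python
import hashlib
import pickle


class ToyObjectStore:
    """Content-addressed store in which a list is saved as a list of element keys."""

    def __init__(self):
        self.blobs = {}  # content key -> ("atom", value) or ("list", [element keys])

    def put(self, obj) -> str:
        if isinstance(obj, list):
            # Store every element on its own, then store only the element keys.
            payload = ("list", [self.put(item) for item in obj])
        else:
            payload = ("atom", obj)
        key = hashlib.sha256(pickle.dumps(payload)).hexdigest()
        self.blobs.setdefault(key, payload)  # identical content is stored once
        return key

    def get(self, key):
        kind, value = self.blobs[key]
        if kind == "list":
            return [self.get(element_key) for element_key in value]
        return value


store = ToyObjectStore()
big = "x" * 1000
first_key = store.put([big, 1, 2])
second_key = store.put([big, 3])  # `big` is reused, not stored a second time
```

After both calls the store holds six entries (four atoms and two lists) rather than seven, because the shared element is saved once and referenced from both lists.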

How is the `@op` cache invalidated?

- given inputs for a call to an @op, e.g. f, mandala searches for a past call to f on inputs with the same contents (as determined by a hash function) where the dependencies accessed by that call (including f itself) have versions compatible with their current state.
- compatibility between versions of a function is decided by the user: you have the freedom to mark certain changes as compatible with past results, though see the limitations about marking changes as compatible.
- internally, mandala uses slightly modified joblib hashing to compute a content hash for Python objects. This is practical for many use cases, but not perfect, as discussed in the limitations section.
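The lookup rule above can be sketched as follows. This is a simplified model rather than mandala's versioning system: `ToyVersionedCache`, its methods and the version labels are hypothetical, and `pickle` plus SHA-256 stands in for the joblib-based content hash.

```python
import hashlib
import pickle


def content_hash(obj) -> str:
    """Stand-in content hash; the source says mandala uses modified joblib hashing."""
    return hashlib.sha256(pickle.dumps(obj)).hexdigest()


class ToyVersionedCache:
    def __init__(self):
        self.current = {}     # dependency name -> current version label
        self.acceptable = {}  # dependency name -> versions compatible with the current one
        self.calls = []       # (function name, input hash, {dependency: version}, result)

    def update(self, dep, version, compatible=False):
        """Register a new version; a compatible change keeps older versions valid."""
        previous = self.acceptable.get(dep, set()) if compatible else set()
        self.acceptable[dep] = previous | {version}
        self.current[dep] = version

    def save(self, func_name, inputs, deps, result):
        versions = {dep: self.current[dep] for dep in deps}
        self.calls.append((func_name, content_hash(inputs), versions, result))

    def lookup(self, func_name, inputs):
        """Return (hit, result) for a past call with equal inputs and compatible deps."""
        wanted = content_hash(inputs)
        for name, input_hash, versions, result in self.calls:
            if name != func_name or input_hash != wanted:
                continue
            if all(v in self.acceptable.get(d, set()) for d, v in versions.items()):
                return True, result
        return False, None


cache = ToyVersionedCache()
cache.update("f", "v1")
cache.save("f", {"x": 1}, deps=["f"], result=2)
hit_same = cache.lookup("f", {"x": 1})
cache.update("f", "v2", compatible=True)  # e.g. a refactor marked as compatible
hit_after_refactor = cache.lookup("f", {"x": 1})
cache.update("f", "v3")                   # a change not marked as compatible
hit_after_break = cache.lookup("f", {"x": 1})
```

The saved call stays reusable across the compatible change and becomes a miss once an incompatible version is registered.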

Can I change the code of `@op`s, and what happens if I do?

- a frequent use case: you have some @op you've been using, then want to extend its functionality in a way that doesn't invalidate the past results. The recommended way is to add a new argument a, and provide a default value for it wrapped with NewArgDefault(x). When a value equal to x is passed for this argument, the storage falls back on calls made before the argument was added.
- beyond changes like this, you probably want to use the versioning system to detect dependencies of @ops and changes to them. See the documentation.
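One way to picture the fallback for a newly added argument: if the argument is left at its declared default, it is dropped from the call key, so the key matches calls saved before the argument existed. The sketch below assumes this mechanism for illustration only; `call_key` and `new_arg_defaults` are hypothetical and it does not use mandala's `NewArgDefault`.

```python
import hashlib
import pickle


def call_key(func_name, arguments, new_arg_defaults):
    """Build a cache key, ignoring new arguments that are left at their default."""
    effective = {
        name: value
        for name, value in arguments.items()
        if not (name in new_arg_defaults and value == new_arg_defaults[name])
    }
    payload = (func_name, sorted(effective.items()))
    return hashlib.sha256(pickle.dumps(payload)).hexdigest()


# Key of a call saved before the `lr` argument was introduced.
old_key = call_key("train", {"data": "d1"}, new_arg_defaults={})
# After adding `lr` with default 0.1: passing the default maps to the old key.
same_key = call_key("train", {"data": "d1", "lr": 0.1}, new_arg_defaults={"lr": 0.1})
# Passing any other value yields a new key, so the call is computed afresh.
new_key = call_key("train", {"data": "d1", "lr": 0.5}, new_arg_defaults={"lr": 0.1})
```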

Is it production-ready?

- mandala is in alpha, and the API is subject to change.
- moreover, there are known performance bottlenecks that may make working with storages of 10k+ calls slow.

How self-contained is it?

- mandala's core is a few kLoCs and only depends on pandas and joblib.
- for visualization of ComputationFrames, you should have dot installed on the system level, and/or the Python graphviz library installed.

Limitations

- The versioning system is not yet feature-rich or well documented enough for realistic use cases. For example, it doesn't support removing old versions in a consistent way, or restricting ComputationFrames by function versions. Moreover, many of the error messages are not informative enough and/or don't suggest solutions.
- If you use versioning and mark a change as compatible with past results, be careful if the change introduced new dependencies that are not tracked by mandala. Changes to such "invisible" dependencies may go unnoticed by the storage system, leading you to believe that certain results are up to date when they are not.
- See the "gotchas" notebook for mistakes to avoid: <a href="https://colab.research.google.com/github/amakelov/mandala/blob/master/docs_source/tutorials/gotchas.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
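The "invisible dependency" problem is easy to reproduce with any memoization scheme. The toy below is not mandala code; `CONFIG`, `_cache` and `scaled` are hypothetical names chosen to show a cache that keys on the argument alone.

```python
# A setting the toy cache below knows nothing about.
CONFIG = {"scale": 2}

_cache = {}  # keyed on the argument only


def scaled(x):
    """Memoized on `x`; the dependency on CONFIG is invisible to the cache."""
    if x not in _cache:
        _cache[x] = x * CONFIG["scale"]
    return _cache[x]


before = scaled(10)           # computed while scale == 2
CONFIG["scale"] = 3           # the untracked dependency changes
after = scaled(10)            # served from the cache, still reflects scale == 2
fresh = 10 * CONFIG["scale"]  # what a recomputation would give
```

The cached value no longer matches a fresh computation, and nothing in the cache signals that it is stale.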

Roadmap for future features

Overall

[x] support for named outputs in @ops
[ ] support for renaming @ops and their inputs/outputs

Memoization

[ ] add custom serialization for chosen objects
[ ] figure out a solution that ignores small numerical error in content hashing
[ ] improve the documentation on collections
[ ] support parallelization of @op execution via e.g. dask or ray
[ ] support for inputs/outputs to exclude from the storage

Computation frames

[x] add support for cycles in the computation graph
[ ] improve heuristics for the expand_... methods
[ ] add tools for restricting a CF to specific subsets of variable values via predicates
[ ] improve support & examples for using collections
[ ] add support for merging or splitting nodes in the CF and similar simplifications

Versioning

[ ] support ways to remove old versions in a consistent way
[ ] improve documentation and error messages
[ ] test this system more thoroughly
[ ] support restricting CFs by function versions
[ ] support ways to manually add dependencies to versions in order to avoid the "invisible dependency" problem

Performance

[ ] improve performance of the in-memory cache
[ ] improve performance of ComputationFrame operations

Galaxybrained vision

Aspirationally, mandala is about much more than ML experiment tracking. The main goal is to make persistence logic & best practices a natural extension of Python. Once this is achieved, the purely "computational" code you must write anyway doubles as a storage interface. It's hard to think of a simpler and more reliable way to manage computational artifacts.

A first-principles approach to managing computational artifacts

What we want from our storage are ways to

refer to artifacts with short, unambiguous descriptions: "here's [big messy Python object] I computed, which to me means [human-readable description]"
save artifacts: "save [big messy Python object]"
refer to artifacts and load them at a later time
