UltraDict
Synchronized, streaming Python dictionary that uses shared memory as a backend
Warning: This is an early hack. There are only a few unit tests and so on. Maybe not stable!
Features:
- Fast (compared to other sharing solutions)
- No running manager processes
- Works in spawn and fork context
- Safe locking between independent processes
- Tested with Python >= v3.8 on Linux, Windows and Mac
- Convenient, no setters or getters necessary
- Optional recursion for nested dicts
General Concept
UltraDict uses multiprocessing.shared_memory to synchronize a dict between multiple processes.
It does so by using a stream of updates in a shared memory buffer. This is efficient because only changes have to be serialized and transferred.
If the buffer is full, UltraDict will automatically do a full dump to a new shared
memory space, reset the streaming buffer and continue to stream further updates. All users
of the UltraDict will automatically load full dumps and continue using
streaming updates afterwards.
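The idea of streaming serialized updates through a shared buffer can be sketched with the standard library alone. This is only an illustration of the concept, not UltraDict's actual wire format; the length-prefix framing and the helper names are assumptions made for the example:

```python
from multiprocessing import shared_memory
import pickle

# Create a small shared buffer; one process appends serialized updates,
# others replay them to keep their local dict copies in sync.
buf = shared_memory.SharedMemory(create=True, size=1024)

def write_update(mem, key, value, offset=0):
    # Frame each update as: 2-byte little-endian length + pickled (key, value)
    data = pickle.dumps((key, value))
    mem.buf[offset:offset + 2] = len(data).to_bytes(2, "little")
    mem.buf[offset + 2:offset + 2 + len(data)] = data
    return offset + 2 + len(data)

def read_update(mem, offset=0):
    # Read the length prefix, then unpickle the update that follows it
    length = int.from_bytes(mem.buf[offset:offset + 2], "little")
    key, value = pickle.loads(bytes(mem.buf[offset + 2:offset + 2 + length]))
    return key, value, offset + 2 + length

# "Writer" side
end = write_update(buf, "counter", 1)

# "Reader" side: attach to the same memory by name and replay the update
reader = shared_memory.SharedMemory(name=buf.name)
local = {}
key, value, _ = read_update(reader)
local[key] = value
print(local)  # {'counter': 1}

reader.close()
buf.close()
buf.unlink()
```

Only the changed key/value pair crosses the process boundary, which is why streaming updates is cheaper than re-serializing the whole dict on every write.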
Issues
On Windows, if no process holds a handle on the shared memory, the OS will garbage collect it, making it inaccessible to
future processes. To work around this issue you can currently set full_dump_size, which causes the creator
of the dict to allocate a static full dump memory of the requested size. This full dump memory lives as long as the creator lives.
The downside of this approach is that you need to plan ahead for your data size: if the data does not fit into the full dump memory, it will break.
Alternatives
There are many alternatives:
- multiprocessing.Manager
- shared-memory-dict
- mpdict
- Redis
- Memcached
How to use?
Simple
In one Python REPL:
Python 3.9.2 on linux
>>>
>>> from UltraDict import UltraDict
>>> ultra = UltraDict({ 1:1 }, some_key='some_value')
>>> ultra
{1: 1, 'some_key': 'some_value'}
>>>
>>> # We need the shared memory name in the other process.
>>> ultra.name
'psm_ad73da69'
In another Python REPL:
Python 3.9.2 on linux
>>>
>>> from UltraDict import UltraDict
>>> # Connect to the shared memory with the name above
>>> other = UltraDict(name='psm_ad73da69')
>>> other
{1: 1, 'some_key': 'some_value'}
>>> other[2] = 2
Back in the first Python REPL:
>>> ultra[2]
2
Nested
In one Python REPL:
Python 3.9.2 on linux
>>>
>>> from UltraDict import UltraDict
>>> ultra = UltraDict(recurse=True)
>>> ultra['nested'] = { 'counter': 0 }
>>> type(ultra['nested'])
<class 'UltraDict.UltraDict'>
>>> ultra.name
'psm_0a2713e4'
In another Python REPL:
Python 3.9.2 on linux
>>>
>>> from UltraDict import UltraDict
>>> other = UltraDict(name='psm_0a2713e4')
>>> other['nested']['counter'] += 1
Back in the first Python REPL:
>>> ultra['nested']['counter']
1
Performance comparison
Let's compare a classical Python dict, UltraDict, multiprocessing.Manager and Redis.
Note that this comparison is not a real life workload. It was executed on Debian Linux 11 with Redis installed from the Debian package and with the default configuration of Redis.
Python 3.9.2 on linux
>>>
>>> from UltraDict import UltraDict
>>> ultra = UltraDict()
>>> for i in range(10_000): ultra[i] = i
...
>>> len(ultra)
10000
>>> ultra[500]
500
>>> # Now let's do some performance testing
>>> import multiprocessing, redis, timeit
>>> orig = dict(ultra)
>>> len(orig)
10000
>>> orig[500]
500
>>> managed = multiprocessing.Manager().dict(orig)
>>> len(managed)
10000
>>> r = redis.Redis()
>>> r.flushall()
>>> r.mset(orig)
Read performance
>>> timeit.timeit('orig[1]', globals=globals()) # original
0.03832335816696286
>>> timeit.timeit('ultra[1]', globals=globals()) # UltraDict
0.5248982920311391
>>> timeit.timeit('managed[1]', globals=globals()) # Manager
40.85506196087226
>>> timeit.timeit('r.get(1)', globals=globals()) # Redis
49.3497632863
>>> timeit.timeit('ultra.data[1]', globals=globals()) # UltraDict data cache
0.04309639008715749
We are a factor of 15 slower than a real, local dict, but way faster than using a Manager. If you need full read performance, you can access the underlying cache ultra.data directly and get almost original dict performance, of course at the cost of no longer having real-time updates.
Write performance
>>> min(timeit.repeat('orig[1] = 1', globals=globals())) # original
0.028232071083039045
>>> min(timeit.repeat('ultra[1] = 1', globals=globals())) # UltraDict
2.911152713932097
>>> min(timeit.repeat('managed[1] = 1', globals=globals())) # Manager
31.641707635018975
>>> min(timeit.repeat('r.set(1, 1)', globals=globals())) # Redis
124.3432381930761
We are a factor of 100 slower than a real, local Python dict, but still a factor of 10 faster than using a Manager and much faster than Redis.
Testing performance
There is an automated performance test in tests/performance/performance.py. If you run it, you get something like this:
python ./tests/performance/performance.py
Testing Performance with 1000000 operations each
Redis (writes) = 24,351 ops per second
Redis (reads) = 30,466 ops per second
Python MPM dict (writes) = 19,371 ops per second
Python MPM dict (reads) = 22,290 ops per second
Python dict (writes) = 16,413,569 ops per second
Python dict (reads) = 16,479,191 ops per second
UltraDict (writes) = 479,860 ops per second
UltraDict (reads) = 2,337,944 ops per second
UltraDict (shared_lock=True) (writes) = 41,176 ops per second
UltraDict (shared_lock=True) (reads) = 1,518,652 ops per second
Ranking:
writes:
Python dict = 16,413,569 (factor 1.0)
UltraDict = 479,860 (factor 34.2)
UltraDict (shared_lock=True) = 41,176 (factor 398.62)
Redis = 24,351 (factor 674.04)
Python MPM dict = 19,371 (factor 847.33)
reads:
Python dict = 16,479,191 (factor 1.0)
UltraDict = 2,337,944 (factor 7.05)
UltraDict (shared_lock=True) = 1,518,652 (factor 10.85)
Redis = 30,466 (factor 540.9)
Python MPM dict = 22,290 (factor 739.31)
I am interested in extending the performance testing to other solutions (like sqlite, memcached, etc.) and to more complex use cases with multiple processes working in parallel.
Parameters
UltraDict(*args, name=None, create=None, buffer_size=10000, serializer=pickle, shared_lock=False, full_dump_size=None, auto_unlink=None, recurse=False, recurse_register=None, **kwargs)
name: Name of the shared memory. A random name will be chosen if not set. By default, if a name is given
a new shared memory space is created if it does not exist yet. Otherwise the existing shared
memory space is attached.
create: Can be either True, False or None. If set to True, a new UltraDict will be created
and an exception is thrown if one already exists with the given name. If kept at the default value None,
either a new UltraDict will be created if the name is not taken, or an existing UltraDict will be attached.
Setting create=True ensures you do not accidentally attach to an existing UltraDict that might be left over.
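These semantics mirror those of multiprocessing.shared_memory, which UltraDict builds on. A minimal sketch of the underlying create behavior:

```python
from multiprocessing import shared_memory

first = shared_memory.SharedMemory(create=True, size=16)

# create=True raises instead of silently attaching to the existing block
try:
    shared_memory.SharedMemory(name=first.name, create=True, size=16)
    clash = False
except FileExistsError:
    clash = True

print(clash)  # True

first.close()
first.unlink()
```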
buffer_size: Size of the shared memory buffer used for streaming changes of the dict.
The buffer size limits the biggest change that can be streamed, so when you use large values or
deeply nested dicts you might need a bigger buffer. Otherwise, if the buffer is too small,
it will fall back to a full dump. Creating full dumps can be slow, depending on the size of your dict.
Whenever the buffer is full, a full dump will be created. A new shared memory is allocated just big enough for the full dump. Afterwards the streaming buffer is reset. All other users of the dict will automatically load the full dump and continue streaming updates.
(Also see the section Memory management below!)
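One practical way to choose buffer_size is to measure the serialized size of the largest single change you expect to stream. This is only a rough estimate, since UltraDict adds its own framing on top of the serializer payload:

```python
import pickle

# The biggest single update you expect to write in one assignment
big_value = {"rows": list(range(1_000))}

# Size of the serialized (key, value) payload; buffer_size should
# comfortably exceed this, or writes will trigger full dumps
update_size = len(pickle.dumps(("my_key", big_value)))
print(update_size)
```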
serializer: Use a different serializer than the default pickle, e.g. marshal, dill or jsons.
The module or object provided must support the methods loads() and dumps().
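For example, a hypothetical JSON-based serializer only needs to expose those two methods (assuming all keys and values are JSON-serializable):

```python
import json

class JsonSerializer:
    """Duck-typed serializer: anything exposing loads()/dumps() will do."""

    @staticmethod
    def dumps(obj):
        return json.dumps(obj).encode("utf-8")

    @staticmethod
    def loads(data):
        return json.loads(data)

# Would be passed as: UltraDict(serializer=JsonSerializer)
payload = JsonSerializer.dumps({"counter": 1})
print(JsonSerializer.loads(payload))  # {'counter': 1}
```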
shared_lock: When writing to the same dict at the same time from multiple, independent processes,
they need a shared lock to synchronize and not overwrite each other's changes. Shared locks are slow.
They rely on the atomics package for atomic locks. By default,
UltraDict will use a multiprocessing.RLock() instead which works well in fork context and is much faster.
(Also see the section Locking below!)
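Why writers need a shared lock at all can be demonstrated with a plain multiprocessing.RLock and a shared counter. This is a stdlib sketch, not UltraDict's internal locking; the fork context is used so the example also runs when executed interactively:

```python
import multiprocessing

def worker(counter, lock, n):
    for _ in range(n):
        with lock:  # without the lock, concurrent increments could be lost
            counter.value += 1

ctx = multiprocessing.get_context("fork")
lock = ctx.RLock()
counter = ctx.Value("i", 0, lock=False)

procs = [ctx.Process(target=worker, args=(counter, lock, 1000))
         for _ in range(4)]
for p in procs:
    p.start()
for p in procs:
    p.join()

print(counter.value)  # 4000
```

Drop the `with lock:` line and the four workers will intermittently overwrite each other's increments, ending below 4000.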
full_dump_size: If set, uses a static full dump memory instead of dynamically creating it. This
might be necessary on Windows depending on your write behaviour. On Windows, the full dump memory goes
away when the process that created the full dump exits. Thus you must plan ahead which processes might
be writing to the dict and therefore creating full dumps.
auto_unlink: If True, the creator of the shared memory will automatically unlink the handle at exit so
it is not visible or accessible to new processes. All existing, still connected processes can continue to use the
dict.
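This mirrors the POSIX unlink semantics of the underlying shared memory: after unlinking, the name is gone for new processes, but existing handles keep working (a stdlib sketch; behavior differs on Windows, as described in the Issues section above):

```python
from multiprocessing import shared_memory

mem = shared_memory.SharedMemory(create=True, size=16)
name = mem.name
mem.buf[0] = 42

mem.unlink()  # remove the name; memory survives until all handles close

still_readable = mem.buf[0]  # the existing handle keeps working
print(still_readable)  # 42

# ...but new processes can no longer attach by name
try:
    shared_memory.SharedMemory(name=name)
    attachable = True
except FileNotFoundError:
    attachable = False

print(attachable)  # False

mem.close()
```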
recurse: If True, any nested dict objects will be automatically wrapped in an UltraDict, allowing transparent nested updates.