InterProcessPyObjects
High-performance and seamless sharing and modification of Python objects between processes, without the periodic overhead of serialization and deserialization. Provides fast inter-process communication (IPC) via shared memory. Supports NumPy arrays, Torch tensors, custom classes (including dataclass), classes with methods, and asyncio.
InterProcessPyObjects package
InterProcessPyObjects is a part of the Cengal library. If you have any questions or would like to participate in discussions, feel free to join the Cengal Discord. Your support and involvement are greatly appreciated as Cengal evolves.
This high-performance package delivers blazing-fast inter-process communication through shared memory, enabling Python objects to be shared across processes with exceptional efficiency. By minimizing the need for frequent serialization-deserialization, it enhances overall speed and responsiveness. The package offers a comprehensive suite of functionalities designed to support a diverse array of Python types and facilitate asynchronous IPC, optimizing performance for demanding applications.


API State
Stable. Guaranteed not to have breaking changes in the future (see below for details).
Any hypothetical API-breaking change will lead to the creation of a new module within the package. The old version will continue to exist and remain importable by an explicit address (see Details below).
<details> <summary title="Details"><kbd> Details </kbd></summary>The current (currently latest) version can be imported either by:
from ipc_py_objects import *
or by
from ipc_py_objects.versions.v_1 import *
If further breaking changes are made to the API, a new (v_2) version will be created. As a result:
The current (v_1) version will continue to be accessible by an explicit address:
from ipc_py_objects.versions.v_1 import *
The latest (v_2) version will be accessible by either:
from ipc_py_objects import *
or by
from ipc_py_objects.versions.v_2 import *
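For code that has to stay on one API generation, the import rules above can be wrapped in a small helper. This is only a sketch: the helper names `api_module_name` and `load_api` are hypothetical and not part of the package. Only the module paths `ipc_py_objects` and `ipc_py_objects.versions.v_N` are taken from the text above.

```python
import importlib


def api_module_name(version=None):
    """Build the import path: the latest API if version is None, else a pinned one."""
    base = "ipc_py_objects"
    return base if version is None else f"{base}.versions.v_{version}"


def load_api(version=None):
    """Import the requested API module, or return None if it is not installed."""
    try:
        return importlib.import_module(api_module_name(version))
    except ImportError:
        return None


# Pin to v_1 explicitly so that a future v_2 release does not change behavior.
api = load_api(version=1)
```

Pinning the explicit path trades automatic upgrades for stability; importing from the package root follows whatever version is currently the latest.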
This is the general approach across the entire Cengal library. It lets me work effectively on its huge codebase, even by myself.
By the way, I'm finishing the implementation of CengalPolyBuild, my package creation system, which offers the same approach to users. It is a comprehensive and hackable build system for multilingual Python packages: Cython (including automatic conversion from Python to Cython), C/C++, Objective-C, Go, and Nim, with ongoing expansions to include additional languages. Basically, it will provide easy access to the same features I already use in the Cengal library's package creation and management processes.
</details>

Key Features
- Shared Memory Communication:
    - Enables sharing of Python objects directly between processes using shared memory.
    - Utilizes a linked list of global messages to inform connected processes about new shared objects.
- Lock-Free Synchronization:
    - Uses memory barriers for efficient communication, avoiding slow syscalls.
    - Ensures each process can access and modify shared memory without contention.
- Supported Python Types:
    - Handles various Python data structures including:
        - Basic types: None, bool, 64-bit int, large int (arbitrary precision integers), float, complex, bytes, bytearray, str.
        - Standard types: Decimal, slice, datetime, timedelta, timezone, date, time.
        - Containers: tuple, list, classes inherited from: AbstractSet (frozenset), MutableSet (set), Mapping and MutableMapping (dict).
        - Picklable class instances: custom classes including dataclass.
    - Allows mutable containers (lists, sets, mappings) to save basic types (None, bool, 64-bit int, float) internally, optimizing memory use and speed.
- NumPy and Torch Support:
    - Supports numpy arrays by creating shared bytes objects coupled with independent arrays.
    - Supports torch tensors by coupling them with shared numpy arrays.
- Custom Class Support:
    - Projects picklable custom class instances (including dataclasses) onto shared dictionaries in shared memory.
    - Modifies the class instance to override attribute access methods, managing data fields within the shared dictionary.
    - Supports classes with or without the __dict__ attribute.
    - Supports classes with or without the __slots__ attribute.
- Asyncio Compatibility:
    - Provides a wrapper module for async-await functionality, integrating seamlessly with asyncio.
    - Ensures asynchronous operations work smoothly with the package's lock-free approach.
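The "Custom Class Support" item describes projecting an instance's fields onto a dictionary and rerouting attribute access to it. The sketch below illustrates that general idea only; it is not the package's implementation. A plain `dict` stands in for a shared dictionary, the helper name `project_onto_mapping` is hypothetical, and only classes with a `__dict__` are handled (no `__slots__`).

```python
from dataclasses import dataclass


def project_onto_mapping(obj, storage):
    """Move obj's fields into `storage` and route attribute access through it."""
    base = type(obj)
    storage.update(vars(obj))  # copy the current field values into the mapping
    vars(obj).clear()          # the instance no longer holds its own data

    class Projected(base):
        def __getattribute__(self, name):
            store = object.__getattribute__(self, "_storage")
            if name in store:
                return store[name]  # data fields come from the mapping
            return object.__getattribute__(self, name)  # methods and the rest

        def __setattr__(self, name, value):
            # Every attribute write lands in the mapping, not in the instance.
            object.__getattribute__(self, "_storage")[name] = value

    object.__setattr__(obj, "_storage", storage)
    obj.__class__ = Projected  # swap the class so the overrides take effect
    return obj


@dataclass
class Employee:
    name: str
    salary: int = 0


shared_fields = {}  # stand-in for a dictionary living in shared memory
employee = project_onto_mapping(Employee("Ann", 100), shared_fields)
employee.salary += 50
print(shared_fields)  # {'name': 'Ann', 'salary': 150}
```

Because reads and writes go through the mapping, any other holder of the same mapping sees the updated field values without the object being copied.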
Import
To use this package, simply install it via pip:
pip install InterProcessPyObjects
Then import it into your project:
from ipc_py_objects import *
Main principles
- only one process has access to the shared memory at a time
- working cycle:
    - work on your tasks
    - acquire access to shared memory
    - work with shared memory as fast as possible (read and/or update data structures in shared memory)
    - release access to shared memory
    - continue your work on other tasks
- do not forget to manually destroy your shared objects when they are no longer needed
- feel free not to destroy a shared object if you need it for the whole run and/or do not care about wasted shared memory
- data will not be preserved between the Creator's sessions. Shared memory will be wiped just before the Creator finishes its work with a shared memory instance (the Consumer's session will already be finished at this point)
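The working cycle above can be illustrated with the standard library alone. The sketch below does not use this package's API: it relies on `multiprocessing.shared_memory` and a `multiprocessing.Lock` (the package itself is described as lock-free), and a second handle opened by name in the same process stands in for a Consumer process. The names `COUNTER`, `consumer_step` and `run_cycle` are made up for the example.

```python
import struct
from multiprocessing import Lock, shared_memory

COUNTER = struct.Struct("<q")  # one signed 64-bit integer stored at offset 0


def consumer_step(block_name, lock):
    """Attach to an existing block by name and update it inside a short critical section."""
    block = shared_memory.SharedMemory(name=block_name)
    try:
        with lock:  # acquire access to shared memory
            (value,) = COUNTER.unpack_from(block.buf, 0)
            COUNTER.pack_into(block.buf, 0, value + 1)  # work as fast as possible
        # access is released here; continue with other tasks
    finally:
        block.close()  # detach this handle only


def run_cycle():
    lock = Lock()
    creator = shared_memory.SharedMemory(create=True, size=COUNTER.size)
    try:
        COUNTER.pack_into(creator.buf, 0, 41)
        consumer_step(creator.name, lock)
        with lock:
            (result,) = COUNTER.unpack_from(creator.buf, 0)
        return result
    finally:
        creator.close()
        creator.unlink()  # manual destruction: the data does not outlive the Creator


if __name__ == "__main__":
    print(run_cycle())  # 42
```

Keeping the locked section down to a single read and write mirrors the advice to work with shared memory as fast as possible, and the `unlink()` call corresponds to destroying shared data manually.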
! Important about hashmaps
The package currently uses Python's hash() call, which is reliable within a single interpreter session but unreliable across different interpreter sessions because of random seeding.
To use the same seed across different interpreter instances (and, as a result, be able to use hashmaps), set the `PYTHONHASHSEED` env var to a fixed integer value.
<details>
<summary title=".bashrc"><kbd> .bashrc </kbd></summary>
export PYTHONHASHSEED=0
</details>
<details>
<summary title="Your bash script"><kbd> Your bash script </kbd></summary>
export PYTHONHASHSEED=0
python YOURSCRIPT.py
</details>
<details>
<summary title="Terminal"><kbd> Terminal </kbd></summary>
$ PYTHONHASHSEED=0 python YOURSCRIPT.py
</details>
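The effect of the seed can be checked by asking fresh interpreters for the hash of the same string. The following is a small sketch using only the standard library; the helper name `hash_in_fresh_interpreter` is made up for this example.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new interpreter started with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(completed.stdout)


# Two separate interpreter sessions with the same fixed seed agree on the hash.
first = hash_in_fresh_interpreter("some_key", seed=0)
second = hash_in_fresh_interpreter("some_key", seed=0)
print(first == second)  # True
```

With a fixed seed every process computes the same hash for str and bytes keys, which is what hash-based containers shared between processes depend on.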
The behavior of the built-in hash() call does not affect the following data types:
- None, bool, int, float, complex, str, bytes, bytearray
- Decimal, slice, datetime, timedelta, timezone, date, time
- tuple, list
- set wrapped by a FastLimitedSet class instance: for example, by using the .put_message(FastLimitedSet(my_set_obj)) call
- dict wrapped by a FastLimitedDict class instance: for example, by using the .put_message(FastLimitedDict(my_dict_obj)) call
- instances of custom classes, including dataclass, by default: for example, by using the .put_message(my_obj) call
- instances of custom classes, including dataclass, wrapped by ForceStaticObjectCopy or ForceStaticObjectInplace class instances: for example, by using the .put_message(ForceStaticObjectInplace(my_obj)) call
It affects only the following data types:
- AbstractSet (frozenset)
- MutableSet (set)
- Mapping
- MutableMapping (dict)
- instances of custom classes, including dataclass, wrapped by ForceGeneralObjectCopy or ForceGeneralObjectInplace class instances: for example, by using the .put_message(ForceGeneralObjectInplace(my_obj)) call
Examples
- Async examples (with asyncio):
Receiver.py performance measurements
- CPU: i5-3570@3.40GHz (Ivy Bridge)
- RAM: 32 GBytes, DDR3, dual channel, 655 MHz
- OS: Ubuntu 20.04.6 LTS under WSL2. Windows 10
async with ashared_memory_context_manager.if_has_messages() as shared_memory:
    # Taking a message with an object from the queue.
    sso: SomeSharedObject = shared_memory.value.take_message()  # 5_833 iterations/seconds
    # We create local variables once in order to access them many times in the future, ensuring high performance.
    # Applying a principle that is widely recommended for improving Python code.
    company_metrics: List = sso.company_info.company_metrics  # 12_479 iterations/seconds
    some_employee: Employee = sso.company_info.some_employee  # 10_568 iterations/seconds
    data_dict: Dict = sso.data_dict  # 16_362 iterations/seconds
    numpy_ndarray: np.ndarray = data_dict['key3']  # 26_223 iterations/seconds
    # Optimal work with shared data (through local va
