py13x
A high-performance Python package built specifically for Python 3.13+ and true free-threaded (no-GIL) execution. py13x provides utilities for parallel processing, memory management, profiling, and thread-safe operations optimized for GIL-free Python.
🚀 Features
- True Parallelism: Take full advantage of Python 3.13's free-threaded runtime
- High-Performance Thread Pool: Lightweight pool optimized for CPU-bound parallel work
- Parallel Patterns: Built-in map, fanout, and pipeline patterns
- Memory Management: Thread-local arenas, reusable buffers, and Rust-inspired ownership
- Profiling Tools: Lock contention detection, performance counters, and execution tracing
- Thread-Safe Primitives: Atomic integers, barriers, and synchronization utilities
- Runtime Detection: Verify free-threaded mode and system capabilities
- Graceful Shutdown: Signal handling for clean application termination
📋 Requirements
- Python 3.13+ with free-threaded runtime enabled
- Use the `python3.13t` executable or build Python with `--disable-gil`
📦 Installation
pip install py13x
Or install from source:
git clone https://github.com/yourusername/py13x.git
cd py13x
pip install -e .
🔍 Quick Start
Verify Free-Threaded Runtime
from py13x.runtime import assert_free_threaded
# Ensure you're running in free-threaded mode
assert_free_threaded()
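For context, the same check can be approximated with the standard library alone (a sketch, not py13x's actual implementation): the `Py_GIL_DISABLED` build config variable is 1 on CPython builds configured with `--disable-gil`.

```python
import sysconfig

def is_free_threaded() -> bool:
    # Py_GIL_DISABLED is 1 on CPython builds configured with --disable-gil;
    # it is 0 or absent on standard (GIL-enabled) builds.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

print(is_free_threaded())
```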
Parallel Map
from py13x.parallel import pmap
from py13x.runtime import cpu_count
def compute_heavy(x):
    return x ** 2 + x ** 0.5
data = range(1000)
results = pmap(compute_heavy, data, workers=cpu_count())
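If you want to compare against the standard library, a rough `pmap` equivalent can be built on `concurrent.futures` (`pmap_sketch` is a hypothetical name for illustration, not py13x code):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def pmap_sketch(fn, data, workers=None):
    # Map fn over data in parallel, preserving input order.
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        return list(pool.map(fn, data))

results = pmap_sketch(lambda x: x * x, range(10), workers=4)
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

On a standard build, threads in this pool still contend for the GIL; the point of py13x is that on `python3.13t` the same pattern scales across cores.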
Thread Pool
from py13x.threading import ParallelThreadPool
from py13x.runtime import cpu_count
pool = ParallelThreadPool(workers=cpu_count())
# Submit tasks
futures = [pool.submit(expensive_function, arg) for arg in args]
# Collect results
results = [f.result() for f in futures]
pool.shutdown()
Atomic Operations
from py13x.threading import AtomicInt
from threading import Thread
counter = AtomicInt(0)
def worker():
    for _ in range(1000):
        counter.inc()
threads = [Thread(target=worker) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(f"Final count: {counter.get()}") # 10000
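For readers curious what the atomic counter buys you, here is a lock-based stdlib sketch with the same observable behavior (`AtomicIntSketch` is illustrative; py13x's `AtomicInt` may avoid the per-operation lock entirely):

```python
import threading

class AtomicIntSketch:
    # A mutex-guarded counter; a true atomic would skip the lock.
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def inc(self, delta=1):
        with self._lock:
            self._value += delta
            return self._value

    def get(self):
        with self._lock:
            return self._value

counter = AtomicIntSketch()

def worker():
    for _ in range(1000):
        counter.inc()

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.get())  # 10000
```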
Performance Tracing
from py13x.profiling import trace
with trace("data processing"):
    result = process_large_dataset()
# Output: data processing: 0.123456s
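The `trace` helper can be approximated with `contextlib` and `time.perf_counter` (a sketch of the idea, not py13x's actual code):

```python
import time
from contextlib import contextmanager

@contextmanager
def trace_sketch(label):
    # Print wall-clock time spent inside the with-block.
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label}: {time.perf_counter() - start:.6f}s")

with trace_sketch("data processing"):
    sum(range(100_000))
```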
Graceful Shutdown
from py13x.utils import shutdown
# Install signal handlers
shutdown.install()
# In worker loops
while not shutdown.should_exit():
    process_batch()
cleanup()
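The same pattern can be sketched with the stdlib alone (`ShutdownFlag` is a hypothetical stand-in for `py13x.utils.shutdown`, not its implementation):

```python
import signal
import threading

class ShutdownFlag:
    # Converts SIGINT/SIGTERM into a flag that worker loops can poll.
    def __init__(self):
        self._stop = threading.Event()

    def install(self):
        signal.signal(signal.SIGINT, self._handler)
        signal.signal(signal.SIGTERM, self._handler)

    def _handler(self, signum, frame):
        self._stop.set()

    def should_exit(self) -> bool:
        return self._stop.is_set()

flag = ShutdownFlag()
flag.install()  # signal handlers can only be installed from the main thread
```

Workers then loop on `while not flag.should_exit(): ...`, so Ctrl-C drains cleanly instead of killing threads mid-batch.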
📚 Module Reference
py13x.runtime
Runtime detection and system information:
- `assert_free_threaded()`: Verify Python is running in free-threaded mode
- `cpu_count()`: Get available CPU cores
- `set_thread_affinity(cpu)`: Pin a thread to a specific CPU core (Linux only)
py13x.threading
Thread-safe primitives and pools:
- `ParallelThreadPool`: High-performance thread pool
- `AtomicInt`: Thread-safe integer with atomic operations
- `barrier(parties)`: Synchronization barrier for coordinating threads
py13x.parallel
High-level parallel patterns:
- `pmap(fn, data, workers)`: Parallel map operation
- `fanout(fn, items, workers)`: Fan-out pattern for fire-and-forget tasks
- `PipelineStage`: Multi-stage pipeline processing
py13x.memory
Memory management utilities:
- `ThreadArena`: Thread-local storage arenas
- `ReusableBuffer`: Pre-allocated reusable buffers
- `Owned[T]`: Rust-inspired move semantics for ownership tracking
py13x.profiling
Performance profiling tools:
- `trace(label)`: Context manager for execution time tracing
- `Counter`: Throughput counter with rate calculation
- `ContentionProbe`: Lock contention measurement
py13x.utils
Utility functions:
- `shutdown.install()`: Install graceful shutdown signal handlers
- `shutdown.should_exit()`: Check if shutdown was requested
- `BoundedQueue`: Queue with automatic backpressure
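`BoundedQueue`'s backpressure mirrors what the stdlib's `queue.Queue` does with a `maxsize`: producers block when the queue is full, which throttles them to the consumer's pace. A quick demonstration of that behavior:

```python
import queue
import threading

# A maxsize gives built-in backpressure: put() blocks when the queue is full.
q = queue.Queue(maxsize=2)

def producer():
    for i in range(5):
        q.put(i)   # blocks whenever two items are already waiting
    q.put(None)    # sentinel: stream finished

t = threading.Thread(target=producer)
t.start()

result = []
while (item := q.get()) is not None:
    result.append(item)
t.join()
print(result)  # [0, 1, 2, 3, 4]
```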
🎯 Use Cases
CPU-Bound Parallel Processing
from py13x.parallel import pmap
from py13x.runtime import cpu_count, assert_free_threaded
assert_free_threaded()
def cpu_intensive(data):
    # Heavy computation without GIL interference
    return complex_calculation(data)
results = pmap(cpu_intensive, large_dataset, workers=cpu_count())
Pipeline Processing
from py13x.parallel import PipelineStage
from queue import Queue
# Create pipeline stages
input_queue = Queue()
stage1_output = Queue()
stage2_output = Queue()
stage1 = PipelineStage(parse_data, input_queue, stage1_output)
stage2 = PipelineStage(transform_data, stage1_output, stage2_output)
stage1.start()
stage2.start()
# Feed data into pipeline
for item in data_source:
    input_queue.put(item)
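A `PipelineStage` could plausibly be a thread that drains its input queue, applies a function, and forwards results downstream. A stdlib sketch of that shape (the `stage` helper and `_DONE` sentinel are illustrative, not py13x's API):

```python
import queue
import threading

_DONE = object()  # end-of-stream sentinel

def stage(fn, inp, out):
    # Drain inp, apply fn, forward results; propagate the sentinel downstream.
    def run():
        while (item := inp.get()) is not _DONE:
            out.put(fn(item))
        out.put(_DONE)
    t = threading.Thread(target=run)
    t.start()
    return t

a, b, c = queue.Queue(), queue.Queue(), queue.Queue()
t1 = stage(lambda x: x + 1, a, b)  # stand-in for parse_data
t2 = stage(lambda x: x * 2, b, c)  # stand-in for transform_data

for item in range(3):
    a.put(item)
a.put(_DONE)

results = []
while (item := c.get()) is not _DONE:
    results.append(item)
t1.join()
t2.join()
print(results)  # [2, 4, 6]
```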
Lock Contention Analysis
from py13x.profiling import ContentionProbe
probe = ContentionProbe()
# Use instead of regular Lock
probe.acquire()
try:
    # Critical section
    shared_data.update()
finally:
    probe.release()
print(f"Lock wait time: {probe.wait_time:.3f}s")
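The measurement a `ContentionProbe` reports can be approximated by timing `Lock.acquire` (a sketch only; note the `wait_time` accumulation here is itself unsynchronized, which is tolerable for rough profiling):

```python
import threading
import time

class ContentionProbeSketch:
    # A Lock wrapper that accumulates time spent waiting to acquire it.
    def __init__(self):
        self._lock = threading.Lock()
        self.wait_time = 0.0

    def acquire(self):
        start = time.perf_counter()
        self._lock.acquire()
        self.wait_time += time.perf_counter() - start

    def release(self):
        self._lock.release()

probe = ContentionProbeSketch()
probe.acquire()
try:
    pass  # critical section
finally:
    probe.release()
print(f"Lock wait time: {probe.wait_time:.3f}s")
```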
Thread-Local Caching
from py13x.memory import ThreadArena
def worker():
    arena = ThreadArena.get()
    # Each thread gets its own cache
    if 'cache' not in arena:
        arena['cache'] = {}
    cache = arena['cache']
    # Use thread-local cache without locks
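`ThreadArena` resembles the stdlib's `threading.local`, which also gives each thread private storage that needs no locks; a minimal comparison:

```python
import threading

_local = threading.local()

def get_cache() -> dict:
    # Return this thread's private cache dict, creating it on first use.
    if not hasattr(_local, "cache"):
        _local.cache = {}
    return _local.cache

caches = []

def worker():
    cache = get_cache()
    cache["id"] = threading.get_ident()
    caches.append(cache)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len({id(c) for c in caches}) == 3  # each thread got its own dict
```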
⚡ Performance Tips
- Worker Count: For CPU-bound work, use `cpu_count()`. For I/O-bound work, you can exceed the core count.
- Batch Size: Process data in batches to reduce task submission overhead
- Memory Reuse: Use `ReusableBuffer` and `ThreadArena` to minimize allocations
- Profiling: Use `ContentionProbe` to identify locking bottlenecks
- Affinity: Consider pinning threads to cores for cache-sensitive workloads
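The batch-size tip can be applied with a small chunking helper before submitting work (on Python 3.12+, `itertools.batched` does the same job; `batched_sketch` here is illustrative):

```python
from itertools import islice

def batched_sketch(iterable, size):
    # Yield lists of up to `size` consecutive items.
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

batches = list(batched_sketch(range(10), 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Submitting one task per batch, rather than one per item, amortizes scheduling overhead across many elements.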
🔬 Benchmarking
Compare performance with and without the GIL:
import time
from py13x.parallel import pmap
from py13x.runtime import assert_free_threaded, cpu_count
assert_free_threaded()
def benchmark():
    start = time.perf_counter()
    results = pmap(compute_intensive, data, workers=cpu_count())
    elapsed = time.perf_counter() - start
    print(f"Processed {len(results)} items in {elapsed:.3f}s")
    print(f"Throughput: {len(results)/elapsed:.1f} items/sec")
benchmark()
🐛 Debugging
Enable detailed logging:
import logging
logging.basicConfig(level=logging.DEBUG)
# Use tracing for performance debugging
from py13x.profiling import trace
with trace("suspicious operation"):
    potentially_slow_function()
🤝 Contributing
Contributions are welcome! Please ensure:
- Code works with Python 3.13+ free-threaded runtime
- All functions have type hints and docstrings
- Tests pass with `python3.13t -m pytest`
📄 License
MIT License - see LICENSE file for details.
⚠️ Notes
- This package requires Python 3.13+ with free-threaded runtime enabled
- Use `python3.13t` or build Python with the `--disable-gil` flag
- Performance benefits are most significant for CPU-bound workloads
- Some operations may still require synchronization (locks, atomics)
Built for the future of parallel Python 🐍⚡
