EMS
Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
Install / Use
OSX | Linux | Node.js 4.1-14.x | Python 2/3
API Documentation | EMS Website
Extended Memory Semantics (EMS)
EMS makes persistent shared-memory parallelism possible between Node.js, Python, and C/C++.
Extended Memory Semantics (EMS) unifies synchronization and storage primitives to address several challenges of parallel programming:
- Allows any number or kind of processes to share objects
- Manages synchronization and object coherency
- Implements persistence to non-volatile memory and secondary storage
- Provides dynamic load-balancing between processes
- May substitute or complement other forms of parallelism
Examples: Parallel web servers, word counting
Table of Contents
- Parallel Execution Models Supported Fork Join, Bulk Synchronous Parallel, User defined
- Atomic Operations Atomic Read-Modify-Write operations
- Examples Parallel web servers, word counting
- Benchmarks Bandwidth, Transaction processing
- Synchronization as a Property of the Data, Not a Duty for Tasks Full/Empty tags
- Installation Downloading from Git or NPM
- Roadmap The Future™! It's all already happened
EMS is targeted at tasks too large for one core or one process but too small for a scalable cluster
A modern multi-core server has 16-32 cores and nearly 1TB of memory, equivalent to an entire rack of systems from a few years ago. As a consequence, jobs formerly requiring a Map-Reduce cluster can now be performed entirely in shared memory on a single server without using distributed programming.
Sharing Persistent Objects Between Python and Javascript
<img src="Docs/ems_js_py.gif" />
Inter-language example in interlanguage.{js,py}. The animated GIF demonstrates the following steps:
- Start Node.js REPL, create an EMS memory
- Store "Hello"
- Open a second session, begin the Python REPL
- Connect Python to the EMS shared memory
- Show the object created by JS is present in Python
- Modify the object, and show the modification can be seen in JS
- Exit both REPLs so no programs are running to "own" the EMS memory
- Restart Python, show the memory is still present
- Initialize a counter from Python
- Demonstrate atomic Fetch and Add in JS
- Start a loop in Python incrementing the counter
- Simultaneously print and modify the value from JS
- Try to read "empty" data from Python, the process blocks
- Write the empty memory, marking it full, Python resumes execution
Types of Concurrency
<table> <tr> <td width="50%"> EMS extends application capabilities to include transactional memory and other fine-grained synchronization capabilities. <br><br> EMS implements several different parallel execution models: <ul> <li> <B>Fork-Join Multiprocess</B>: execution begins with a single process that creates new processes when needed; those processes then wait for each other to complete. <li> <B>Bulk Synchronous Parallel</B>: execution begins with each process starting the program at the <code>main</code> entry point and executing all the statements. <li> <B>User Defined</B>: parallelism may include ad-hoc processes and mixed-language applications. </ul> </td> <td width="50%"> <center> <img height="350px" style="margin: 10px;" src="Docs/typesOfParallelism.svg" type="image/svg+xml" /> </center> </td> </tr> <tr> <td width="50%"> <center> <img height="350px" style="margin: 10px;" src="Docs/ParallelContextsBSP.svg" type="image/svg+xml" /> </center> </td> <td> <center> <img height="350px" style="margin: 10px;" src="Docs/ParallelContextsFJ.svg" type="image/svg+xml" /> </center> </td> </tr> </table>
Built-in Atomic Operations
EMS operations may be performed on any JSON data type, and read-modify-write operations may combine different JSON data types, just like operations on ordinary data.
Atomic read-modify-write operations are available in all concurrency modes; collective operations, however, are not available in user-defined mode.
- **Atomic Operations**: Read; write; readers-writer lock; read when full, atomically marking empty; write when empty, atomically marking full
- **Primitives**: Stacks, queues, transactions
- **Read-Modify-Write**: Fetch-and-Add, Compare-and-Swap
- **Collective Operations**: All basic OpenMP collective operations are implemented in EMS: dynamic, block, guided, and static loop scheduling, as well as barriers, and master and single execution regions
Examples and Benchmarks
API Documentation | EMS Website
Word Counting Using Atomic Operations
Map-Reduce is often demonstrated using word counting because each document can be processed in parallel, and the results of each document's dictionary reduced into a single dictionary. This EMS implementation also iterates over documents in parallel, but it maintains a single shared dictionary across processes, atomically incrementing the count of each word found. The final word counts are sorted and the most frequently appearing words are printed with their counts.
<img height="300px" src="Docs/wordcount.svg" />
The performance of this program was measured using an Amazon EC2 instance:<br>
<code>c4.8xlarge (132 ECUs, 36 vCPUs, 2.9 GHz, Intel Xeon E5-2666v3, 60 GiB memory)</code>
Scaling levels off around 16 cores despite the presence of ample work, which may be related to the use of non-dedicated hardware: half of the 36 vCPUs are presumably HyperThreads or otherwise shared resources, and AWS instances are also bandwidth-limited to EBS storage, where our Gutenberg corpus is stored.
Bandwidth Benchmarking
A benchmark similar to STREAMS gives the maximum rate at which EMS double-precision floating point operations can be performed on a <code>c4.8xlarge (132 ECUs, 36 vCPUs, 2.9 GHz, Intel Xeon E5-2666v3, 60 GiB memory)</code>.
Benchmarking of Transactions and Work Queues
Transactions and Work Queues Example
Transactional performance is measured alone, and again with a separate process appending new work to the queue as items are removed. The experiments were run using an Amazon EC2 instance:<br> <code>c4.8xlarge (132 ECUs, 36 vCPUs, 2.9 GHz, Intel Xeon E5-2666v3, 60 GiB memory)</code>
Experiment Design
Six EMS arrays are created, each holding 1,000,000 numbers. During the benchmark, 1,000,000 transactions are performed, each transaction involves 1-5 randomly selected elements of randomly selected EMS arrays. The transaction reads all the elements and performs a read-modify-write operation involving at least 80% of the elements. After all the transactions are complete, the array elements are checked to confirm all the operations have occurred.
The parallel process scheduling model used is block dynamic (the default),
where each process is responsible for successively smaller blocks
of iterations. The execution model is bulk synchronous parallel: each
process enters the program at the same main entry point
and executes all the statements in the program.
forEach loops have their normal semantics of performing all iterations in every process;
parForEach loops are distributed across processes, each process executing
only a portion of the total iteration space.
Synchronization as a Property of the Data, Not a Duty for Tasks
