Flow-IPC: Modern C++ toolkit for fast inter-process communication (IPC)

In this context, IPC means the sharing or transmission of a data structure from one process to another. In C++ systems programming, this is a common activity with significant impact on system performance. E.g., it is used heavily in microservices.

In serious C++ applications, high-performance IPC code tends to be difficult to develop and reuse, and the most obvious and effective technique to combat latency -- avoiding copying -- further increases the difficulty and decreases reusability by an order of magnitude.

This project -- Flow-IPC -- enables C++ code for IPC that is both performant and easy to develop/reuse, with no trade-off between the two.

Flow-IPC is for C++17 (or higher) programs built for Linux that run on x86-64 processors. (Support for macOS/BSD and ARM64 is planned as an incremental task. Adding networked IPC is also a natural next step, depending on demand.)

Documentation

The guided Manual explains how to use Flow-IPC. A comprehensive Reference is inter-linked with that Manual.

The project web site contains links to documentation for each individual release as well.

Please see below, in this README, for a Primer as to the specifics of Flow-IPC.

Obtaining the source code

  • As a tarball/zip: The project web site links to individual releases with notes, docs, download links.
  • Via Git: git clone --recurse-submodules git@github.com:Flow-IPC/ipc.git
    • Note: Don't forget --recurse-submodules.

Installation

See INSTALL guide.

Contributing

See CONTRIBUTING guide.


Flow-IPC Primer

Background

Flow-IPC focuses on IPC of data structures (and native sockets a/k/a FDs). I.e., the central scenario is: Process P1 has a data structure X, and it wants process P2 to access it (or a copy thereof) ASAP.

The OS and third parties already provide C++ developers with many tools for/around IPC. Highlights:

  • Pipes, Unix domain socket streams, message queues (MQs), and more such IPC transports allow transmitting data (binary blobs and sometimes FDs). Data are copied into the kernel by P1, then out of the kernel by P2.
  • P1 can put X into shared memory (SHM) and signal P2 to access it directly there, eliminating both copy-X operations.
  • Zero-copy schema-based serialization tools, the best of which is Cap'n Proto, hugely help in representing structured data within binary blobs.
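To make the first bullet concrete, here is a minimal sketch of the two copies involved in "classic" socket-based IPC. It uses only POSIX calls, not Flow-IPC. The helper name round_trip_through_socket() and the 4096-byte cap are assumptions made for illustration; for brevity both ends of the socket pair live in one process, whereas in real IPC they would belong to P1 and P2.

```cpp
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Flow-IPC): pushes `payload` through a Unix domain
// stream socket pair and returns what comes out of the other end.
std::string round_trip_through_socket(const std::string& payload)
{
  // Keep the payload small, so that one process can write all of it before reading
  // any of it without filling the kernel's socket buffer.
  if (payload.empty() || (payload.size() > 4096))
  {
    throw std::invalid_argument("payload must be 1..4096 bytes in this sketch");
  }

  int fds[2];
  if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
  {
    throw std::runtime_error("socketpair() failed");
  }

  // Copy #1: user memory (the "P1" side) -> kernel socket buffer.
  std::size_t n_sent = 0;
  while (n_sent < payload.size())
  {
    const ssize_t n = ::write(fds[0], payload.data() + n_sent, payload.size() - n_sent);
    if (n <= 0)
    {
      ::close(fds[0]);
      ::close(fds[1]);
      throw std::runtime_error("write() failed");
    }
    n_sent += static_cast<std::size_t>(n);
  }

  // Copy #2: kernel socket buffer -> a different user buffer (the "P2" side).
  std::string received(payload.size(), '\0');
  std::size_t n_rcvd = 0;
  while (n_rcvd < received.size())
  {
    const ssize_t n = ::read(fds[1], &received[n_rcvd], received.size() - n_rcvd);
    if (n <= 0)
    {
      ::close(fds[0]);
      ::close(fds[1]);
      throw std::runtime_error("read() failed");
    }
    n_rcvd += static_cast<std::size_t>(n);
  }

  ::close(fds[0]);
  ::close(fds[1]);
  return received;
}
```

Calling round_trip_through_socket("hello") returns an equal string, but only after the bytes were copied twice. That per-byte cost is what the SHM approach in the second bullet avoids.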

Conceptually, the IPC op above is not so different from triggering a function call with argument X in a different thread -- but across process boundaries. Unfortunately, in comparison to triggering F(X) in another thread in-process:

  • The resulting machine code is much slower, using more processor cycles and memory.
  • The source code to achieve it is much more difficult to develop and reuse, even with the help of powerful APIs including Boost.Interprocess and Boost.Asio.
    • If one wishes to avoid copying X -- the basic cause of the slowness -- one must use SHM to store X. This increases the difficulty 10-fold, and the resulting code is rarely reusable.
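For a sense of what the hand-written SHM route looks like, below is a deliberately tiny sketch using plain POSIX, not Flow-IPC. Everything in it is an assumption chosen for illustration: the name sum_in_child_via_shm(), the Shm_header layout, and the use of fork() with an anonymous shared mapping. P1 builds X (an array of integers) directly inside the shared region; P2 (the forked child) reads it in place and writes a result back.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical layout: a fixed header, followed immediately by m_count 32-bit values.
struct Shm_header
{
  std::uint64_t m_sum;   // Written by the child ("P2").
  std::uint64_t m_count; // Written by the parent ("P1").
};

// P1 fills the values 1..count in SHM; P2 sums them in place; P1 returns that sum.
std::uint64_t sum_in_child_via_shm(std::uint32_t count)
{
  const std::size_t n_bytes
    = sizeof(Shm_header) + static_cast<std::size_t>(count) * sizeof(std::uint32_t);
  void* const region = ::mmap(nullptr, n_bytes, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (region == MAP_FAILED)
  {
    throw std::runtime_error("mmap() failed");
  }

  auto* const header = static_cast<Shm_header*>(region);
  auto* const values = reinterpret_cast<std::uint32_t*>(header + 1);

  // P1 constructs X directly in the shared region: nothing to copy later.
  header->m_sum = 0;
  header->m_count = count;
  for (std::uint32_t idx = 0; idx != count; ++idx)
  {
    values[idx] = idx + 1;
  }

  const pid_t child = ::fork();
  if (child < 0)
  {
    ::munmap(region, n_bytes);
    throw std::runtime_error("fork() failed");
  }
  if (child == 0)
  {
    // P2: reads X in place (no copy) and stores its answer in the same region.
    std::uint64_t sum = 0;
    for (std::uint64_t idx = 0; idx != header->m_count; ++idx)
    {
      sum += values[idx];
    }
    header->m_sum = sum;
    ::_exit(0);
  }

  // P1: waiting for the child is the only "signaling" in this sketch.
  int status = 0;
  const pid_t waited = ::waitpid(child, &status, 0);
  const std::uint64_t result = header->m_sum;
  ::munmap(region, n_bytes);
  if ((waited != child) || !WIFEXITED(status) || (WEXITSTATUS(status) != 0))
  {
    throw std::runtime_error("child process failed");
  }
  return result;
}
```

Even this toy sidesteps the hard parts: unrelated processes need a named region, a signaling mechanism, cleanup after crashes, and a way to store containers and pointers so that they remain valid in both address spaces. Those are the sources of the difficulty described above.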

How does Flow-IPC help?

With Flow-IPC, the above IPC op is easy to code, for any form of "X," whether: blobs, FDs, nested STL-compliant containers, C-style structs with pointers, or Cap'n Proto schema-based structured data.

Moreover, it eliminates all copying of X -- which results in the best possible performance. This is called end-to-end zero-copy.

Example: End-to-end zero-copy performance, Cap'n Proto payload

graph: perf_demo capnp-classic versus capnp-Flow-IPC

The graph above is an example of the performance gains you can expect when using Flow-IPC zero-copy transmission, from the included perf_demo tool. (Here we use Cap'n Proto-described data. Native C++ structures have a similar performance profile.) In the graph, we compare the RTTs (latencies) of two techniques, for transmitted payloads of various sizes.

  • The blue line shows the latency (RTT) when using "classic" IPC over a Unix-domain stream socket. The server ::write()s the capnp-generated serialization, in order, into the socket FD; the client ::read()s it out of there.
  • The orange line shows the RTT when using Flow-IPC with zero-copy enabled.

In this example, app 1 is a memory-caching server that has pre-loaded into RAM a few files ranging in size from 100 KB to 1 GB. App 2 (client) requests a file of some size. App 1 (server) responds with a single message containing the file's data structured as a sequence of chunks, each accompanied by that chunk's hash:

# Cap'n Proto schema (.capnp file, generates .h and .c++ source code using capnp compiler tool):

$Cxx.namespace("perf_demo::schema");
struct Body
{
  union
  {
    getCacheReq @0 :GetCacheReq;
    getCacheRsp @1 :GetCacheRsp;
  }
}

struct GetCacheReq
{
  fileName @0 :Text;
}
struct GetCacheRsp
{
  # We simulate the server returning files in multiple equally-sized chunks, each sized at its discretion.
  struct FilePart
  {
    data @0 :Data;
    dataSizeToVerify @1 :UInt64; # Recipient can verify that `data` blob's size is indeed this.
    dataHashToVerify @2 :Hash; # Recipient can hash `data` and verify it is indeed this.
  }
  fileParts @0 :List(FilePart);
}
# ...
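The two "ToVerify" fields imply a check on the receiving side. The sketch below shows that check in plain C++, without Cap'n Proto, so that it stays self-contained. The File_part struct is a hand-written stand-in for the generated FilePart reader, and 64-bit FNV-1a is an assumed placeholder: the schema excerpt does not show how Hash is defined or which hash function perf_demo uses.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hand-written stand-in for the capnp-generated FilePart (hypothetical, for illustration).
struct File_part
{
  std::vector<std::uint8_t> m_data;
  std::uint64_t m_data_size_to_verify;
  std::uint64_t m_data_hash_to_verify;
};

// Placeholder hash: 64-bit FNV-1a.  The hash actually used by perf_demo may differ.
std::uint64_t fnv1a_64(const std::vector<std::uint8_t>& data)
{
  std::uint64_t hash = 14695981039346656037ULL; // FNV-1a 64-bit offset basis.
  for (const std::uint8_t byte : data)
  {
    hash ^= byte;
    hash *= 1099511628211ULL; // FNV 64-bit prime.
  }
  return hash;
}

// Sender side: fill in the verification fields to match `data`.
File_part make_part(std::vector<std::uint8_t> data)
{
  File_part part;
  part.m_data_size_to_verify = data.size();
  part.m_data_hash_to_verify = fnv1a_64(data);
  part.m_data = std::move(data);
  return part;
}

// Recipient side: check the blob's size first (cheap), then its hash.
bool verify_part(const File_part& part)
{
  return (part.m_data.size() == part.m_data_size_to_verify)
         && (fnv1a_64(part.m_data) == part.m_data_hash_to_verify);
}

// Recipient side: a response is acceptable only if every chunk checks out.
bool verify_all(const std::vector<File_part>& parts)
{
  for (const File_part& part : parts)
  {
    if (!verify_part(part))
    {
      return false;
    }
  }
  return true;
}
```

With the real generated classes, the same checks would read the fields through capnp accessors instead of struct members.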

App 2 receives the GetCacheRsp message and prints the round-trip time (RTT): from just before sending GetCacheReq to just after accessing some of the file data (e.g. rsp_root.getFileParts()[0].getDataHashToVerify() to check the first hash). This RTT is the IPC-induced latency: roughly speaking, the time penalty compared to having a monolithic (1-process) application (instead of the split into app 1 and app 2).
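The measurement described here can be sketched with std::chrono. The helper below is an assumption for illustration, not perf_demo's actual timing code: measure_rtt_usec() times any callable that sends the request and then touches the response, and demo_rtt_usec() substitutes a 2-millisecond sleep for the real exchange.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <utility>

// Hypothetical helper: runs `exchange` (send request, receive response, touch some of
// its data) and returns the elapsed wall-clock time in microseconds.
template<typename Exchange_fn>
std::int64_t measure_rtt_usec(Exchange_fn&& exchange)
{
  // steady_clock is monotonic, so system clock adjustments cannot skew the result.
  const auto start = std::chrono::steady_clock::now();
  std::forward<Exchange_fn>(exchange)();
  const auto finish = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count();
}

// Stand-in for a real request/response exchange: sleeps for at least 2 milliseconds.
std::int64_t demo_rtt_usec()
{
  return measure_rtt_usec([]
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(2));
  });
}
```

The important detail is where the clock stops: after the client has actually read part of the payload, not merely after the message has arrived. Otherwise a lazily-accessed payload would look faster than it is.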

Observations (tested using decent server-grade hardware):

  • With Flow-IPC: the round-trip latency is ~100 microseconds regardless of the size of the payload.
  • Without Flow-IPC: the latency is about 1 millisecond for a 1-megabyte payload and approaching a full second for a 1-gigabyte file.
    • Also significantly more RAM might be used at points.
  • For very small messages the two techniques perform similarly: ~100 microseconds.
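These figures are consistent with a simple back-of-the-envelope model: a fixed cost of roughly 100 microseconds per round trip, plus, for the copying technique only, a cost proportional to payload size. The sketch below encodes that model. The copy rate of about 1 GB per second is an assumption inferred from the "1 megabyte in about 1 millisecond" observation, not a measured constant, and the function names are hypothetical.

```cpp
#include <cassert>

// Assumed constants, read off the observations above (order-of-magnitude only).
constexpr double FIXED_OVERHEAD_USEC = 100.0;          // Per round trip, either technique.
constexpr double ASSUMED_COPY_BYTES_PER_USEC = 1000.0; // Roughly 1 GB/s through the socket.

// Copying transport: latency grows linearly with the payload size.
double estimated_copy_rtt_usec(double payload_bytes)
{
  return FIXED_OVERHEAD_USEC + (payload_bytes / ASSUMED_COPY_BYTES_PER_USEC);
}

// Zero-copy transport: only a small handle crosses the process boundary, so in this
// model the payload size does not matter.
double estimated_zero_copy_rtt_usec(double /* payload_bytes */)
{
  return FIXED_OVERHEAD_USEC;
}
```

Under these assumptions a 1-megabyte payload costs about 1.1 milliseconds when copied and a 1-gigabyte payload about 1 second, while the zero-copy estimate stays at 100 microseconds for both. For payloads of a few kilobytes the proportional term is negligible, which matches the last observation.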

The code for this, when using Flow-IPC, is straightforward. Here's how it might look on the client side:

// Specify that we *do* want zero-copy behavior, by merely choosing our backing-session type.
// In other words, setting this alias says, “be fast about Cap’n Proto things.”
// (Different (subsequent) capnp-serialization-backing and SHM-related behaviors are available;
// just change this alias’s value. E.g., omit `::shm::classic` to disable SHM entirely; or
// specify `::shm::arena_lend::jemalloc` to employ jemalloc-based SHM. Subsequent code remains
// the same! This demonstrates a key design tenet of Flow-IPC.)
using Session = ipc::session::shm::classic::Client_session<...>;

// IPC app universe: simple structs naming and describing the 2 apps involved.
//   - Name the apps, so client knows where to find server, and server knows who can connect to it.
//   - Specify certain items -- binary location, user/group -- that will be cross-checked with the OS for safety.
//   - Specify a safety/permissions policy, so that internally permissions are set as restrictively as possible,
//     but not more.
// The applications should share this code (so the same statement should execute in the server app also).
const ipc::session::Client_app CLI_APP
  { "cacheCli",                                     // Name.
    "/usr/bin/cache_client.exec", CLI_UID, GID };   // Safety details.
const ipc::session::Server_app SRV_APP
  { { "cacheSrv", "/usr/bin/cache_server.exec", SRV_UID, GID },
    { CLI_APP.m_name },                             // Which apps may connect?  cacheCli may.
    "",                                             // (Optional path override; disregard.)
    ipc::util::Permissions_level::S_GROUP_ACCESS }; // Safety/permissions selector.
// ...

// Open session e.g. near start of program.  A session is the communication context between the processes
// engaging in IPC.  (You can create communication channels at will from the `session` object.  No more naming!)
Session session{ CLI_APP, SRV_APP, on_session_closed_func };
// Ask for 1 communication *channel* to be available on both sides from the very start of the session.
Session::Channels ipc_raw_channels(1);
session.sync_connect(session.mdt_builder(), &ipc_raw_channels); // Instantly open session -- and the 1 channel.
auto& ipc_raw_channel = ipc_raw_channels[0];
// (Can also instantly open more channel(s) anytime: `session.open_channel(&channel)`.)

// ipc_raw_channel is a raw (unstructured) channel for blobs (and/or FDs).  We want to speak capnp over it,
// so we upgrade it to a struc::Channel -- note the capnp-generated `perf_demo::schema::Body` class, as
// earlier declared in the .capnp schema.
Session::Structured_channel<perf_demo::schema::Body>
  ipc_channel{ nullptr, std::move(ipc_raw_channel), // "Eat" the raw channel object.
               ipc::transport::struc::Channel_base::S_SERIALIZE_VIA_SESSION_SHM, &session };
// Ready to exchange capnp messages via ipc_channel.

// ...

// Issue request and process response.  TIMING FOR ABOVE GRAPH STARTS HERE -->
auto req_msg = ipc_channel.create_msg();
req_msg.body_root()
  ->initGetCacheReq().setFileName("huge-file.bin"); // Vanilla capnp code: call Cap'n Proto-generated-API: mutators.
