Workflow
C++ Parallel Computing and Asynchronous Networking Framework
Sogou C++ Workflow
As Sogou's C++ server engine, Sogou C++ Workflow supports almost all back-end C++ online services of Sogou, including all search services, cloud input method, online advertisements, etc., handling more than 10 billion requests every day. It is an enterprise-level programming engine with a light and elegant design that can satisfy most C++ back-end development requirements.
You can use it:
- To quickly build an HTTP server:
    #include <stdio.h>
    #include "workflow/WFHttpServer.h"

    int main()
    {
        WFHttpServer server([](WFHttpTask *task) {
            task->get_resp()->append_output_body("<html>Hello World!</html>");
        });

        if (server.start(8888) == 0) { // start server on port 8888
            getchar(); // press "Enter" to end.
            server.stop();
        }

        return 0;
    }
- As a multifunctional asynchronous client; it currently supports the HTTP, Redis, MySQL and Kafka protocols. The MySQL protocol supports MariaDB and TiDB as well.
- To implement client/server on user-defined protocol and build your own RPC system.
- srpc is built on it. srpc is an independent open source project that supports the srpc, brpc, trpc and thrift protocols.
- To build asynchronous workflows; it supports common series and parallel structures, as well as arbitrary DAG structures.
- As a parallel computing tool. In addition to networking tasks, Sogou C++ Workflow also includes the scheduling of computing tasks. All types of tasks can be put into the same flow.
- As an asynchronous file IO tool on Linux systems, with high performance exceeding any system call. Disk file IO is also a task.
- To realize any high-performance and high-concurrency back-end service with a very complex relationship between computing and networking.
- To build a micro service system.
- This project has built-in service governance and load balancing features.
- Wiki link : PaaS Architecture
Compiling and Running Environment
- This project supports Linux, macOS, Windows, Android and other operating systems. The Windows version is currently released as an independent branch, using iocp to implement asynchronous networking. All user interfaces are consistent with the Linux version.
- Supports all CPU platforms, including 32 or 64-bit x86 processors, big-endian or little-endian arm processors, and loongson processors.
- The master branch requires OpenSSL 1.1 or above, and BoringSSL is fully compatible. If you don't like SSL, you may check out the nossl branch.
- Uses the C++11 standard, so it must be compiled with a compiler that supports C++11. Does not rely on boost or asio.
- No other dependencies. However, if you need the Kafka protocol, some compression libraries should be installed, including lz4, zstd and snappy.
Get Started (Linux, macOS):
git clone https://github.com/sogou/workflow
cd workflow
make
cd tutorial
make
With SRPC Tool (NEW!):
https://github.com/sogou/srpc/blob/master/tools/README.md
With apt-get on Debian Linux, Ubuntu:
Sogou C++ Workflow has been packaged for Debian Linux and Ubuntu 22.04.
To install the Workflow library for development purposes:
sudo apt-get install libworkflow-dev
To install the Workflow library for deployment:
sudo apt-get install libworkflow1
With dnf on Fedora Linux:
Sogou C++ Workflow has been packaged for Fedora Linux.
To install the Workflow library for development purposes:
sudo dnf install workflow-devel
To install the Workflow library for deployment:
sudo dnf install workflow
With xmake
If you want to use xmake to build workflow, see the xmake build document.
Tutorials
- Client
- Server
- Parallel tasks and Series
- Important topics
- Computing tasks
- Asynchronous File IO tasks
- User-defined protocol
- Other important tasks/components
- Service governance
- Connection context
- Built-in clients
Programming Paradigm
Program = Protocol + Algorithm + Workflow
- Protocol
- In most cases, users use built-in common network protocols, such as HTTP, Redis or various rpc.
- Users can also easily customize user-defined network protocol. In the customization, they only need to provide serialization and deserialization functions to define their own client/server.
- Algorithm
- In our design, the algorithm is a concept symmetrical to the protocol.
- If a protocol call is an rpc, then an algorithm call is an apc (Async Procedure Call).
- We have provided some general algorithms, such as sort, merge, psort, reduce, which can be used directly.
- Compared with a user-defined protocol, a user-defined algorithm is much more common. Any complicated computation with clear boundaries should be packaged into an algorithm.
- Workflow
- Workflow is the actual business logic, which is to put the protocols and algorithms into the flow graph for use.
- The typical workflow is a closed series-parallel graph. Complex business logic may be a non-closed DAG.
- The workflow graph can be constructed directly or dynamically generated based on the results of each step. All tasks are executed asynchronously.
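To make the "serialization and deserialization functions" of a user-defined protocol concrete, here is a minimal sketch in standard C++. It does not use the Workflow classes. The wire format (a 4-byte big-endian length followed by the body) and the names `encode_message` and `decode_message` are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical wire format for illustration: a 4-byte big-endian body length
// followed by the body bytes. Not taken from the Workflow sources.

// Serialize: prepend the body length as 4 big-endian bytes.
std::string encode_message(const std::string& body)
{
    uint32_t n = static_cast<uint32_t>(body.size());
    std::string out;

    out.reserve(4 + body.size());
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += body;
    return out;
}

// Deserialize incrementally: returns 1 when one full message was decoded into
// `body` (using `consumed` bytes of `buf`), or 0 when more bytes are needed.
int decode_message(const std::string& buf, std::string& body,
                   std::size_t& consumed)
{
    if (buf.size() < 4)
        return 0;

    uint32_t n = 0;
    for (int i = 0; i < 4; i++)
        n = (n << 8) | static_cast<unsigned char>(buf[i]);

    if (buf.size() - 4 < n)
        return 0;

    body.assign(buf, 4, n);
    consumed = 4 + static_cast<std::size_t>(n);
    return 1;
}
```

The decoder is written to accept partial input because network data usually arrives in pieces; a caller would keep appending received bytes to its buffer until `decode_message` returns 1.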
Structured Concurrency and Task Abstraction
- Our system contains five basic tasks: communication, computation, file IO, timer, and counter.
- All tasks are generated by the task factory, and users organize the concurrency structure by calling interfaces, such as series, parallel, DAG, etc.
- In most cases, a task generated by the user through the task factory is a complex task that encapsulates multiple asynchronous processes, but this is transparent to the user.
- For example, an HTTP request may include many asynchronous processes (DNS, redirection), but for the user, it is just a networking task.
- File sorting seems to be an algorithm, but it actually includes many complex interaction processes between file IO and CPU computation.
- If you think of business logic as building circuits with well-designed electronic components, then each electronic component may be a complex circuit.
- The task abstraction mechanism greatly reduces the number of tasks users need to create and the depth of callbacks.
- Any task runs in a SeriesWork, and the tasks in the same SeriesWork share the series context, which simplifies data transfer between asynchronous tasks.
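The idea of tasks sharing a series context, and of the flow being extended based on each step's result, can be modeled in a few lines of standard C++. This is only a toy sketch: `ToySeries`, `ToyContext` and `run_demo_series` are hypothetical names, not the framework's SeriesWork, and the steps run synchronously here, whereas real tasks are asynchronous.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Data shared by every step of one series (hypothetical fields).
struct ToyContext
{
    std::string body;        // filled by the "fetch" step
    std::size_t length = 0;  // filled by the "count" step
    std::string log;         // records the order in which steps ran
};

// Hypothetical stand-in for a series: steps run one after another, and each
// step receives the series so it can use the context or append more steps.
class ToySeries
{
public:
    using Step = std::function<void (ToySeries&)>;

    ToyContext context;

    void push_back(Step step) { steps_.push_back(std::move(step)); }

    void run()
    {
        while (!steps_.empty())
        {
            // Take the step out of the queue before running it, so the step
            // may safely append follow-up steps.
            Step step = std::move(steps_.front());
            steps_.pop_front();
            step(*this);
        }
    }

private:
    std::deque<Step> steps_;
};

ToyContext run_demo_series()
{
    ToySeries series;

    series.push_back([](ToySeries& s) {
        s.context.body = "<html>Hello World!</html>";
        s.context.log += "fetch;";
        // Decide the next step from this step's result.
        if (!s.context.body.empty())
        {
            s.push_back([](ToySeries& next) {
                next.context.length = next.context.body.size();
                next.context.log += "count;";
            });
        }
    });

    series.run();
    return series.context;
}
```

The second step never receives the body as an argument; it reads it from the shared context, which is the data-transfer simplification the list above describes.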
Callback and Memory Reclamation Mechanism
- All calls are executed asynchronously, and there is almost no operation that occupies a thread.
- Explicit callback mechanism. Users are aware that they are writing asynchronous programs.
- A set of object lifecycle mechanisms greatly simplifies memory management for asynchronous programs.
- The lifecycle of any task created by the framework is from creation until the callback function finishes running. There is no risk of leakage.
- If a task is created but the user does not want to run it, the user needs to release it through the dismiss() interface.
- Any data in the task, such as the response of the network request, will also be recycled with the task. At this time, the user can use std::move() to move the required data.
- The project doesn’t use `std::shared_
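The lifecycle rules above (a task lives until its callback returns, an unrun task must be dismissed, and data to be kept is moved out inside the callback) can be illustrated with a toy class in standard C++. `ToyTask` and `run_and_keep_response` are hypothetical stand-ins, not Workflow types. The real framework runs tasks asynchronously, whereas `start()` here is synchronous, and `live_count()` exists only so the sketch can show that nothing leaks.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a framework-created task.
class ToyTask
{
public:
    using Callback = std::function<void (ToyTask *)>;

    // Tasks are only created through a factory function, never on the stack.
    static ToyTask *create(Callback cb) { return new ToyTask(std::move(cb)); }

    std::string *get_resp() { return &resp_; }

    // Run the task, invoke the callback, then destroy the task. After
    // start() returns, the pointer must not be used again.
    void start()
    {
        resp_ = "response payload";
        if (callback_)
            callback_(this);
        delete this;
    }

    // Release a task that was created but will never be started.
    void dismiss() { delete this; }

    // Number of tasks currently alive (illustration only, not thread-safe).
    static int live_count() { return live_; }

private:
    explicit ToyTask(Callback cb) : callback_(std::move(cb)) { ++live_; }
    ~ToyTask() { --live_; }

    Callback callback_;
    std::string resp_;
    static int live_;
};

int ToyTask::live_ = 0;

// Keep the response beyond the task's lifetime by moving it out in the callback.
std::string run_and_keep_response()
{
    std::string kept;
    ToyTask *task = ToyTask::create([&kept](ToyTask *t) {
        kept = std::move(*t->get_resp());
    });

    task->start();  // the task is destroyed before start() returns
    return kept;
}
```

Making the destructor private forces every task to end through either `start()` or `dismiss()`, which is one way to guarantee that a task is released exactly once without reference counting.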
