Scheduling

A simple and fast minimalistic header-only library allowing to run async tasks and execute task graphs.

Generate Convert Improve

Install / Use

/learn @dpuyda/Scheduling

About this skill

Quality Score

0/100

README

Scheduling

Scheduling is a simple, minimalistic and fast header-only library allowing you to run async tasks and task graphs.

Scheduling is developed with simplicity and performance in mind.

Examples
Benchmarks
Implementation details
- Chase-Lev deque
- Tasks
License
- Third-party licenses
References

Below are some GitHub projects related to thread pools and task graphs in which you may be interested:

Taskflow
enkiTS
CGraph
libfork
https://github.com/bshoshany/thread-pool
https://github.com/DeveloperPaul123/thread-pool

Examples

We start with examples demonstrating how to run async tasks and task graphs.

Add Scheduling to your project

To add Scheduling to your project, you can use CMake. To build Scheduling from source, you need C++20 or newer. For example:

set(CMAKE_CXX_STANDARD 20)

When using CMake 3.11 or newer, you can use FetchContent. For example:

include(FetchContent)

FetchContent_Declare(
  scheduling
  GIT_REPOSITORY https://github.com/dpuyda/scheduling
  GIT_TAG main
  GIT_SHALLOW TRUE
)

FetchContent_MakeAvailable(scheduling)

Alternatively, you can add Scheduling as a subdirectory. For example:

add_subdirectory(scheduling)

Then, link Scheduling to your target. For example:

target_link_libraries(${PROJECT_NAME} PRIVATE scheduling)

[!NOTE]
Because of the use of thread-local variables, Scheduling does not support dynamic linking.

Run async tasks

To run async tasks, create a ThreadPool instance. For example:

#include "scheduling/scheduling.hpp"
...
scheduling::ThreadPool thread_pool;

In the constructor, the ThreadPool class creates several worker threads that will be running in the background until the instance is destroyed. As an optional argument, the constructor of the ThreadPool class accepts the number of threads to create. By default, the number of threads is equal to std::thread::hardware_concurrency().

When the ThreadPool instance is created, submit a task. For example:

thread_pool.Submit([] {
  std::this_thread::sleep_for(std::chrono::seconds(1));
  std::cout << "Completed.\n";
});

A task is a function that does not accept arguments and returns void. Use lambda captures to pass input and output arguments to the task. Eventually, the task will be executed on one of the worker threads owned by the ThreadPool instance.

If needed, call Wait() to block the current thread until all submitted tasks are completed. For example:

thread_pool.Wait();

In the destructor, the ThreadPool class blocks the current thread until all submitted tasks are completed.

Build task graphs

A task graph is a collection of tasks and dependencies between them. Dependencies between tasks define the order in which the tasks should be executed.

To define a task graph, create an iterable collection of Task instances. For example:

#include "scheduling/scheduling.hpp"
...
std::vector<scheduling::Task> tasks;

The Task instances should not be destroyed or reallocated until their execution is completed by ThreadPool. When using std::vector, it may be convenient to reserve enough space to prevent storage reallocation. Assume we are going to add seven tasks:

tasks.reserve(7);

Add elements to tasks. For example, add tasks to calculate the value of (a + b) * (c + d) asynchronously. First, add tasks to get the values of a, b, c and d:

int a, b, c, d;

auto& get_a = tasks.emplace_back([&] {
  a = 1;
});

auto& get_b = tasks.emplace_back([&] {
  b = 2;
});

auto& get_c = tasks.emplace_back([&] {
  c = 3;
});

auto& get_d = tasks.emplace_back([&] {
  d = 4;
});

Next, add tasks to calculate a + b and c + d:

int sum_ab, sum_cd;

auto& get_sum_ab = tasks.emplace_back([&] {
  sum_ab = a + b;
});

auto& get_sum_cd = tasks.emplace_back([&] {
  sum_cd = c + d;
});

Finally, add the task to calculate the product (a + b) * (c + d):

int product;

auto& get_product = tasks.emplace_back([&] {
  product = sum_ab * sum_cd;
});

When all tasks are added, define task dependencies. The task get_sum_ab should be executed after get_a and get_b:

get_sum_ab.Succeed(&get_a, &get_b);

Similarly, the task get_sum_cd should be executed after get_c and get_d:

get_sum_cd.Succeed(&get_c, &get_d);

Finally, the task get_product should be executed after get_sum_ab and get_sum_cd:

get_product.Succeed(&get_sum_ab, &get_sum_cd);

When dependencies between tasks are defined, create a ThreadPool instance and submit the task graph for execution:

scheduling::ThreadPool thread_pool;
thread_pool.Submit(tasks);

If needed, call Wait() to block the current thread until all submitted tasks are completed:

thread_pool.Wait();

Wait until a specific task is completed

If needed, use synchronization primitives to wait until a specific task is completed. For example:

scheduling::ThreadPool thread_pool;
std::atomic_flag is_completed;
thread_pool.Submit([&] {
  is_completed.test_and_set();
  is_completed.notify_one();
});
is_completed.wait(false);

Wait until a condition is satisfied

Under certain scenarios, you might need to block the current thread and execute submitted tasks in a loop until a specific condition is met. For example, this can happen when tasks are submitted recursively, and a task cannot complete until other tasks have finished. In such cases, all worker threads may become blocked by tasks waiting for other tasks to complete. To continue executing tasks while waiting for a condition to be satisfied, use the Wait overload that accepts a predicate.

For example, below is a recursive function to calculate Fibonacci numbers without memoization used in the benchmarks:

int Fibonacci(ThreadPool& thread_pool, const int n) {
  if (n < 2) {
    return 1;
  }
  int a, b;
  std::atomic counter{0};
  thread_pool.Submit([&, n] {
    a = Fibonacci(thread_pool, n - 1);
    counter.fetch_add(1);
  });
  thread_pool.Submit([&, n] {
    b = Fibonacci(thread_pool, n - 2);
    counter.fetch_add(1);
  });
  thread_pool.Wait([&] { return counter.load() == 2; });
  return a + b;
}

[!WARNING]
Beware of recursion! If Wait is called recursively, e.g., as shown in the example above, the stack depth can be exceeded.

Define dependencies between task groups

To define dependencies between task groups, use empty tasks.

For example, assume that the tasks a and b should be executed before the tasks c and d. To ensure the order, create an empty task d which will follow a and b and precede c and d:

scheduling::Task d;
d.Succeed(&a, &b);
d.Precede(&c, &d);

Cancel a task

To cancel a task, call Task::Cancel().

Cancelling a task never fails. If Task::Cancel() returns false, this means that the task has been invoked earlier or will be invoked at least once after the cancellation. When a task is cancelled and will not be invoked anymore, its successors also will not be invoked.

For example:

scheduling::Task task;
thread_pool.Submit(&task);
if (task.Cancel()) {
  std::cout << "The task will not be invoked.\n";
}

To undo cancellation, call Task::Reset():

task.Reset();

Resubmit a task graph

You can submit a task graph multiple times, but you need to make sure that the graph is completed before resubmitting it. You cannot submit the same graph simultaneously from different threads.

Benchmarks

We compare Scheduling with Taskflow, a highly optimized library for parallel and heterogeneous programming [1], [2]. We use Google Benchmark for benchmarking. Comparison of Taskflow and popular libraries for async programming can be found in Taskflow documentation.

We measure the total time needed to create and execute tasks. The benchmarks were run on Intel(R) Xeon(R) CPU i7-1165G7 @ 2.80GHz with 32 CPUs, running Debian GNU/Linux 12 (bookworm):

Run on (32 X 2800.27 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 1024 KiB (x16)
  L3 Unified 33792 KiB (x2)
Load Average: 3.83, 2.00, 0.78

The comparison was made between commit ae8a367 of Scheduling and commit 9823d00 of Taskflow.

Fibonacci numbers

To b

Related Skills

node-connect

338.0k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

83.4k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

338.0k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

commit-push-pr

83.4k

Commit, push, and open a PR