SkillAgentSearch skills...

ParallelAlgorithms

Parallel C++ algorithms

Install / Use

/learn @DragonSpit/ParallelAlgorithms
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

ParallelAlgorithms

High Performance Parallel (and Sequential) C++ Algorithms, which accompany "Practical Parallel Algorithms in C++ and C#" book.

Multi-Core Parallel Sorting Algorithms:

Algorithm|Random|Presorted|Constant|Description --- | --- | --- | --- | --- LSD Radix Sort |2865|2907|4769| 48-core AWS C7a.24xlarge (AMD) LSD Radix Sort |2338|2297|2255| 48-core AWS C7i.24xlarge (Intel) LSD Radix Sort | 952| 831| 846| 14-core Intel i7-12700H Merge Radix Sort | 877| 945| 971| 48-core AWS C7a.24xlarge (AMD) Merge Sort |1176|1143|1143| 144-core Azure HBV4 (AMD) Merge Sort | 695| 946|1954| 48-core AWS C7i.24xlarge (Intel) Merge Sort | 174| 275| 617| 14-core Intel i7-12700H Merge Sort (in-place) | 272| 502| 549| 48-core AWS C7i.24xlarge (Intel) Merge Sort (in-place) | 90| 234| 339| 14-core Intel i7-12700H

The above performance is in millions of unsigned 32-bit integers/second when sorting an array of 100 million elements. Benchmarks ran on Linux.

High Performance Single-Core Sequential Algorithms

Algorithm|Random|Presorted|Constant|Description --- | --- | --- | --- | --- LSD Radix Sort (two phase) |153|139|159| 1-core of Intel i7-12700H Merge Sort | 12|136|177| 1-core of Intel i7-12700H Merge Sort (in-place) | 12| 97|296| 1-core of Intel i7-12700H MSD Radix Sort (in-place) | 41| 48| 46| 1-core of Intel i7-12700H

LSD Radix Sort single-core with two additional performance tools:

  • novel two-phase implementation reduces passes over the array to (1 + D), where D is the number of digits
  • de-randomization of writes to bins

Other Algorithms

Sorting algorithms provided in this repository:

  • Single-core LSD Radix Sort: Novel Two Phase
  • Multi-core Parallel LSD Radix Sort : linear time
  • Multi-core Parallel Merge Sort
  • Single-core In-Place Merge Sort
  • Multi-core Parallel In-Place Merge Sort
  • Single-core In-Place MSD Radix Sort: linear time
  • Numerous hyrid sorting algorithms - e.g. Paralle Merge Insertion Sort
  • Merge Radix Sort hybrids: linear time
  • Improved adaptivity to memory resources, even with virtual memory
  • Count Sort
  • Parallel Histogram
  • Block Swap
  • Parallel Merge
  • Radix Sort to support non-integer data types
  • Safer Average calculations
  • Blazing Fast sort of byte array
  • De-Randomization of Radix Sort writes to bins
  • Recursive and non-recursive Parallel Sum
  • Bottom-up Non-Recursive In-Place Merge Sort

Windows support:

  • VisualStudio 2022 Microsoft compiler and Intel's OneAPI compiler. Solution is included.
  • Intel's Threading Building Blocks (TBB) and Microsoft's Parallel Patterns Library (PPL)
  • C++17

Linux support:

  • g++ using Intel's Threading Building Blocks (TBB)
  • C++20

Building on Ubuntu Linux (22.04 LTS)

To install g++ which supports C++17:

sudo apt update
sudo apt upgrade
# reboot the machine
sudo apt install build-essential

To update gcc to support c++17 standard, Parallel STL and Intel's Threading Building Blocks (TBB):

sudo apt install libtbb-dev
git clone https://github.com/DragonSpit/ParallelAlgorithms.git
cd ParallelAlgorithms

To build on WSL Ubuntu, use g++ command and not gcc. The order of the following arguments matters!

g++ ParallelAlgorithms.cpp ParallelStdCppExample.cpp RadixSortLsdBenchmark.cpp MemoryUsage.cpp CountingSortParallelBenchmark.cpp SumBenchmark.cpp RadixSortMsdBenchmark.cpp ParallelMergeSortBenchmark.cpp -ltbb -std=c++20 -O3 -o ParallelAlgorithms

To build on AWS Ubuntu, use g++ command and not gcc. The order of the following arguments matters!

g++ ParallelAlgorithms.cpp ParallelStdCppExample.cpp RadixSortLsdBenchmark.cpp MemoryUsage.cpp CountingSortParallelBenchmark.cpp SumBenchmark.cpp RadixSortMsdBenchmark.cpp ParallelMergeSortBenchmark.cpp -ltbb -std=c++2a -O3 -o ParallelAlgorithms

To run it:

./ParallelAlgorithms

Building on Windows

On Windows, Visual Studio 2022, free and paid versions are supported. To build the project use Visual Studio 2022 to open ParallelAlgorithms.sln file, then select Build/RebuildSolution. Once the project builds, select "Local Windows Debugger" to run it. Or, open the CommandPrompt application, go to the "x64\Release" directory, and run ParallelAlgorithms.exe.

By default, the solution/project uses Microsoft C++ compiler, which avoid requiring installation of Intel's OneAPI. Intel's OneAPI C++ compiler is supported by the ParallelAlgorithms.sln VisualStudio 2022 Solution. Once Intel's OneAPI, which is free, has been installed, select Project/IntelCompiler/UseIntelOneAPICompiler, followed by Build/RebuildSolution. Some of the algorithms are faster when build with Intel's OneAPI compiler.

Other Resources

Benchmarks of C++ Standard Parallel Algorithms (STL) are provided, with benchmark code in ParallelSTL repository, which builds and runs on Linix and Windows.

Blogs:

paypal

View on GitHub
GitHub Stars66
CategoryDevelopment
Updated18d ago
Forks7

Languages

C++

Security Score

95/100

Audited on Mar 9, 2026

No findings