Libsais
The libsais library provides fast linear-time construction of suffix array (SA), generalized suffix array (GSA), longest common prefix (LCP) array, permuted LCP (PLCP) array, Burrows-Wheeler transform (BWT) and inverse BWT based on the induced sorting algorithm with optional OpenMP support for multi-core parallel construction.
Install / Use
/learn @IlyaGrebnov/LibsaisREADME
libsais
The libsais library provides fast (see Benchmarks below) linear-time construction of suffix array (SA), generalized suffix array (GSA), longest common prefix (LCP) array, permuted LCP (PLCP) array, Burrows-Wheeler transform (BWT) and inverse BWT, based on the induced sorting algorithm described in the following papers (with optional OpenMP support for multi-threaded parallel construction):
- Ge Nong, Sen Zhang, Wai Hong Chan Two Efficient Algorithms for Linear Suffix Array Construction, 2009
- Juha Karkkainen, Giovanni Manzini, Simon J. Puglisi Permuted Longest-Common-Prefix Array, 2009
- Nataliya Timoshevskaya, Wu-chun Feng SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction, 2014
- Jing Yi Xie, Ge Nong, Bin Lao, Wentao Xu Scalable Suffix Sorting on a Multicore Machine, 2020
Copyright (c) 2021-2025 Ilya Grebnov ilya.grebnov@gmail.com
The libsais is inspired by libdivsufsort, sais libraries by Yuta Mori and msufsort by Michael Maniscalco.
libcubwt
If you are looking for even faster construction times, you can try libcubwt a library for GPU-based suffix array, inverse suffix array and Burrows-Wheeler transform construction.
Introduction
The libsais provides simple C99 API to construct suffix array and Burrows-Wheeler transformed string from a given string over constant-size alphabet. The algorithm runs in a linear time using typically only ~16KB of extra memory (with 2n bytes as absolute worst-case; where n is the length of the string). OpenMP acceleration uses 200KB of addition memory per thread.
- The libsais works with compilers from GNU, Microsoft and Intel, but I recommend Clang for best performance.
- The libsais is sensitive to fast memory and software prefetching and might not be suitable for some workloads. Please benchmark yourself.
License
The libsais is released under the Apache License Version 2.0
Multi-threading
The libsais is memory-bound, so performance scales primarily with memory bandwidth and concurrency, not raw compute. The optimal number of threads for suffix array construction depends on CPU and memory architecture, as different systems (DDR4 vs DDR5, Intel vs AMD, single- vs multi-CCD) saturate memory at different points with maximum throughput generally following the number of memory channels rather than total core count. The x86-64 dual-channel systems typically saturate near 8 threads, but in practice may show good scaling at 6, 12 or even 16 threads.
Changes
For the full changelog, see CHANGES.
- September 1, 2025 (2.10.4)
- Tuned prefetch distance for improved throughput.
- August 12, 2025 (2.10.3)
- No functional changes, added CMake install and export package rules.
- June 10, 2025 (2.10.2)
- Improved performance of suffix array and burrows wheeler transform construction on degenerate inputs.
- May 11, 2025 (2.10.1)
- No functional changes, slightly improved performance.
- April 12, 2025 (2.10.0)
- Improved performance, with noticeable gains on ARM architecture.
- Fixed compiler warnings and addressed undefined behavior.
- March 19, 2025 (2.9.1)
- No functional changes, resolved compiler warnings & undefined behavior.
- March 16, 2025 (2.9.0)
- Support for generalized suffix array (GSA) construction.
- Support for longest common prefix array (LCP) construction for generalized suffix array (GSA).
Versions of the libsais
- libsais.c (and corresponding libsais.h) is for suffix array, GSA, PLCP, LCP, forward BWT and reverse BWT construction over 8-bit inputs smaller than 2GB (2147483648 bytes).
- libsais64.c (and corresponding libsais64.h) is optional extension of the library for inputs larger or equlas to 2GB (2147483648 bytes).
- This versions of the library could also be used to construct suffix array of an integer array (with a caveat that input array must be mutable).
- libsais16.c + libsais16x64.c (and corresponding libsais16.h + libsais16x64.h) is independent version of the library for 16-bit inputs.
- This version of the library could also be used to construct suffix array and BWT of a set of strings by adding a unique end-of-string symbol to each string and then computing the result for the concatenated string.
Examples of APIs (see libsais.h, libsais16.h, libsais16x64.h and libsais64.h for complete APIs list)
/**
* Constructs the suffix array of a given string.
* @param T [0..n-1] The input string.
* @param SA [0..n-1+fs] The output array of suffixes.
* @param n The length of the given string.
* @param fs The extra space available at the end of SA array (0 should be enough for most cases).
* @param freq [0..255] The output symbol frequency table (can be NULL).
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
int32_t libsais(const uint8_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq);
/**
* Constructs the suffix array of a given integer array.
* Note, during construction input array will be modified, but restored at the end if no errors occurred.
* @param T [0..n-1] The input integer array.
* @param SA [0..n-1+fs] The output array of suffixes.
* @param n The length of the integer array.
* @param k The alphabet size of the input integer array.
* @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance).
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
int32_t libsais_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs);
/**
* Constructs the burrows-wheeler transformed string (BWT) of a given string.
* @param T [0..n-1] The input string.
* @param U [0..n-1] The output string (can be T).
* @param A [0..n-1+fs] The temporary array.
* @param n The length of the given string.
* @param fs The extra space available at the end of A array (0 should be enough for most cases).
* @param freq [0..255] The output symbol frequency table (can be NULL).
* @return The primary index if no error occurred, -1 or -2 otherwise.
*/
int32_t libsais_bwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, int32_t fs, int32_t * freq);
/**
* Constructs the original string from a given burrows-wheeler transformed string (BWT) with primary index.
* @param T [0..n-1] The input string.
* @param U [0..n-1] The output string (can be T).
* @param A [0..n] The temporary array (NOTE, temporary array must be n + 1 size).
* @param n The length of the given string.
* @param freq [0..255] The input symbol frequency table (can be NULL).
* @param i The primary index.
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
int32_t libsais_unbwt(const uint8_t * T, uint8_t * U, int32_t * A, int32_t n, const int32_t * freq, int32_t i);
/**
* Constructs the permuted longest common prefix array (PLCP) of a given string and a suffix array.
* @param T [0..n-1] The input string.
* @param SA [0..n-1] The input suffix array.
* @param PLCP [0..n-1] The output permuted longest common prefix array.
* @param n The length of the string and the suffix array.
* @return 0 if no error occurred, -1 otherwise.
*/
int32_t libsais_plcp(const uint8_t * T, const int32_t * SA, int32_t * PLCP, int32_t n);
Example installation using CPM
CPMAddPackage(
NAME libsais
GITHUB_REPOSITORY IlyaGrebnov/libsais
GIT_TAG v2.10.4
OPTIONS
"LIBSAIS_USE_OPENMP OFF"
"LIBSAIS_BUILD_SHARED_LIB OFF"
)
target_link_libraries(<your target> libsais)
Algorithm description
The libsais uses the SA-IS (Suffix Array Induced Sorting) algorithm to construct both the suffix array and the Burrows-Wheeler transform through recursive decomposition and induced sorting:
- Initially, the algorithm classifies each position in a string as either an S-type or an L-type, based on whether the suffix starting at that position is lexicographically smaller or larger than the suffix at the adjacent right position. Positions identified as S-type, which have an adjacent left L-type position, are further categorized as LMS-type (Leftmost S-type) positions. Next, the algorithm splits the input string into LMS substrings, which start at an LMS-type position and extend up to the next adjacent LMS-type position. These LMS substrings are then lexicographically sorted through induced sorting and subsequently replaced in the input string with their corresponding sorted ranks, thus forming a new, compacted string. This compacted string reduces the problem size, enabling the algorithm to perform a recursive decomposition in which it is reapplied to construct the suffix array for the compacted string. And at the end of the recursive call, the suffix array for the input string is constructed from the suffix array of the compacted string using another round of induced sorting.
- The induced sorting is a core mechanic of the SA-IS algorithm and is employed twice during each recursive call: initially before the recursive call to establish the order of LMS substrings, and subsequently after the recursive call to finalize the order of the suffixes of the string. This process involves two sequential scans: a left-to-right scan that determines the order of L-type positions based on the LMS-type positions, followed by a right-to-left scan that establishes the order of S-type positions base
Related Skills
openhue
337.4kControl Philips Hue lights and scenes via the OpenHue CLI.
sag
337.4kElevenLabs text-to-speech with mac-style say UX.
weather
337.4kGet current weather and forecasts via wttr.in or Open-Meteo
tweakcc
1.4kCustomize Claude Code's system prompts, create custom toolsets, input pattern highlighters, themes/thinking verbs/spinners, customize input box & user message styling, support AGENTS.md, unlock private/unreleased features, and much more. Supports both native/npm installs on all platforms.
