Aircompressor
A port of Snappy, LZO, LZ4, and Zstandard to Java
Compression for Java
This library provides a set of compression algorithms implemented in pure Java, along with
native implementations where available. The Java implementations use sun.misc.Unsafe
for fast access to memory. The native implementations use java.lang.foreign
to interact directly with native libraries without the need for JNI.
Usage
Each algorithm provides a simple block compression API via the io.airlift.compress.v3.Compressor
and io.airlift.compress.v3.Decompressor interfaces. Block compression is the simplest form:
it compresses a small block of data provided as a byte[] or, more generally, a
java.lang.foreign.MemorySegment. Each algorithm may also have one or more streaming formats,
which typically produce a sequence of block-compressed chunks.
byte array API
```java
byte[] data = ...

Compressor compressor = new Lz4JavaCompressor();
byte[] compressed = new byte[compressor.maxCompressedLength(data.length)];
int compressedSize = compressor.compress(data, 0, data.length, compressed, 0, compressed.length);

Decompressor decompressor = new Lz4JavaDecompressor();
byte[] uncompressed = new byte[data.length];
int uncompressedSize = decompressor.decompress(compressed, 0, compressedSize, uncompressed, 0, uncompressed.length);
```
MemorySegment API
```java
Arena arena = ...
MemorySegment data = ...

Compressor compressor = new Lz4JavaCompressor();
MemorySegment compressed = arena.allocate(compressor.maxCompressedLength(toIntExact(data.byteSize())));
int compressedSize = compressor.compress(data, compressed);
compressed = compressed.asSlice(0, compressedSize);

Decompressor decompressor = new Lz4JavaDecompressor();
MemorySegment uncompressed = arena.allocate(data.byteSize());
int uncompressedSize = decompressor.decompress(compressed, uncompressed);
uncompressed = uncompressed.asSlice(0, uncompressedSize);
```
Algorithms
Zstandard (Zstd) (Recommended)
Zstandard is the recommended algorithm for most compression. It provides superior compression and performance at all levels compared to zlib. Zstandard is an excellent choice for most use cases, especially storage and bandwidth constrained network transfer.
The native implementation of Zstandard is provided by the ZstdNativeCompressor and
ZstdNativeDecompressor classes. The Java implementation is provided by the
ZstdJavaCompressor and ZstdJavaDecompressor classes.
The Zstandard streaming format is supported by ZstdInputStream and ZstdOutputStream.
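A minimal round-trip sketch of the streaming API. The package locations and the single-argument stream constructors here are assumptions, following the usual java.io decorator convention:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import io.airlift.compress.v3.zstd.ZstdInputStream;   // assumed package
import io.airlift.compress.v3.zstd.ZstdOutputStream;  // assumed package

public class ZstdStreamExample {
    public static void main(String[] args) throws Exception {
        byte[] original = "zstd streaming example".getBytes(StandardCharsets.UTF_8);

        // Write the Zstandard streaming format into an in-memory buffer
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZstdOutputStream out = new ZstdOutputStream(sink)) {
            out.write(original);
        }

        // Read the stream back and verify the round trip
        byte[] roundTripped;
        try (ZstdInputStream in = new ZstdInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            roundTripped = in.readAllBytes();
        }
        System.out.println(Arrays.equals(original, roundTripped));
    }
}
```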
LZ4
LZ4 is an extremely fast compression algorithm that provides compression ratios comparable to Snappy and LZO. LZ4 is an excellent choice for applications that require high-performance compression and decompression.
The native implementation of LZ4 is provided by Lz4NativeCompressor and Lz4NativeDecompressor.
The Java implementation is provided by Lz4JavaCompressor and Lz4JavaDecompressor.
Snappy
Snappy is not as fast as LZ4, but provides a guarantee on memory usage that makes it a good choice for extremely resource-limited environments (e.g. embedded systems like a network switch). If your application is not highly resource constrained, LZ4 is a better choice.
The native implementation of Snappy is provided by SnappyNativeCompressor and SnappyNativeDecompressor.
The Java implementation is provided by SnappyJavaCompressor and SnappyJavaDecompressor.
The Snappy framed format is supported by SnappyFramedInputStream and SnappyFramedOutputStream.
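The framed streams can be used the same way as any java.io stream decorator; this sketch assumes the package location and single-argument constructors:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import io.airlift.compress.v3.snappy.SnappyFramedInputStream;   // assumed package
import io.airlift.compress.v3.snappy.SnappyFramedOutputStream;  // assumed package

public class SnappyFramedExample {
    public static void main(String[] args) throws Exception {
        byte[] original = "snappy framed example".getBytes(StandardCharsets.UTF_8);

        // Write the Snappy framed format into an in-memory buffer
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (SnappyFramedOutputStream out = new SnappyFramedOutputStream(sink)) {
            out.write(original);
        }

        // Read the frames back and verify the round trip
        byte[] roundTripped;
        try (SnappyFramedInputStream in = new SnappyFramedInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            roundTripped = in.readAllBytes();
        }
        System.out.println(Arrays.equals(original, roundTripped));
    }
}
```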
LZO
LZO is only provided for compatibility with existing systems that use LZO. We recommend rewriting LZO data using Zstandard or LZ4.
The Java implementation of LZO is provided by LzoJavaCompressor and LzoJavaDecompressor.
Due to licensing issues, LZO only has a Java implementation, which is based on the LZ4 implementation.
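Rewriting an LZO block as Zstandard can be sketched with the block API from the Usage section. The package locations are assumptions, and the block API requires the uncompressed length to be known in order to size the output buffer:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import io.airlift.compress.v3.Compressor;
import io.airlift.compress.v3.Decompressor;
import io.airlift.compress.v3.lzo.LzoJavaCompressor;      // assumed package
import io.airlift.compress.v3.lzo.LzoJavaDecompressor;    // assumed package
import io.airlift.compress.v3.zstd.ZstdJavaCompressor;    // assumed package
import io.airlift.compress.v3.zstd.ZstdJavaDecompressor;  // assumed package

public class LzoToZstd {
    // Rewrites one LZO block as a Zstandard block; the uncompressed length must be known
    static byte[] rewrite(byte[] lzoBlock, int lzoSize, int originalLength) {
        Decompressor lzo = new LzoJavaDecompressor();
        byte[] data = new byte[originalLength];
        int size = lzo.decompress(lzoBlock, 0, lzoSize, data, 0, data.length);

        Compressor zstd = new ZstdJavaCompressor();
        byte[] output = new byte[zstd.maxCompressedLength(size)];
        int outputSize = zstd.compress(data, 0, size, output, 0, output.length);
        return Arrays.copyOf(output, outputSize);
    }

    public static void main(String[] args) {
        byte[] original = "legacy LZO payload, legacy LZO payload".getBytes(StandardCharsets.UTF_8);

        // Produce a sample LZO block to rewrite
        Compressor lzo = new LzoJavaCompressor();
        byte[] lzoBlock = new byte[lzo.maxCompressedLength(original.length)];
        int lzoSize = lzo.compress(original, 0, original.length, lzoBlock, 0, lzoBlock.length);

        byte[] zstdBlock = rewrite(lzoBlock, lzoSize, original.length);

        // Verify the rewritten block decompresses to the original data
        Decompressor zstd = new ZstdJavaDecompressor();
        byte[] restored = new byte[original.length];
        int restoredSize = zstd.decompress(zstdBlock, 0, zstdBlock.length, restored, 0, restored.length);
        System.out.println(restoredSize == original.length && Arrays.equals(restored, original));
    }
}
```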
Deflate
Deflate is the block compression algorithm used by the gzip and zlib libraries. Deflate is
provided for compatibility with existing systems that use Deflate. We recommend rewriting
Deflate data using Zstandard which provides superior compression and performance.
The implementation of Deflate is provided by DeflateCompressor and DeflateDecompressor.
This is implemented using the built-in Java libraries, which internally use native code.
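The same round trip can be seen directly with the JDK's java.util.zip classes, the built-in machinery mentioned above (plain stdlib, not this library's API):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateRoundTrip {
    public static void main(String[] args) throws Exception {
        byte[] input = "deflate deflate deflate deflate".getBytes(StandardCharsets.UTF_8);

        // Compress with the JDK's Deflater (zlib under the hood)
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] compressed = new byte[1024];
        int compressedSize = deflater.deflate(compressed);
        deflater.end();

        // Decompress with Inflater and verify the round trip
        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, compressedSize);
        byte[] restored = new byte[input.length];
        int restoredSize = inflater.inflate(restored);
        inflater.end();

        System.out.println(restoredSize == input.length && Arrays.equals(restored, input));
    }
}
```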
Hash Functions
XXHash3 (Recommended)
XXHash3 is the latest generation of the XXHash family, providing faster hashing than XXHash64 at all input sizes. It supports both 64-bit and 128-bit hash outputs.
XXHash3 is only available as a native implementation, via XxHash3Native; there is no Java
implementation. The 128-bit variant has approximately 12 ns of constant overhead due
to Java FFM pulling the 128-bit result back into Java. At small inputs (<512 bytes) this
overhead is noticeable, but at larger sizes (8 KB+) it becomes a rounding error as hash
computation dominates (measured on Apple M4 silicon).
```java
// One-shot hashing (64-bit)
long hash = XxHash3Native.hash(data);

// One-shot hashing (128-bit)
XxHash128 hash = XxHash3Native.hash128(data);

// Streaming hashing (64-bit)
try (XxHash3Hasher hasher = XxHash3Native.newHasher()) {
    hasher.update(chunk1);
    hasher.update(chunk2);
    long hash = hasher.digest();
}

// Streaming hashing (128-bit)
try (XxHash3Hasher128 hasher = XxHash3Native.newHasher128()) {
    hasher.update(chunk1);
    hasher.update(chunk2);
    XxHash128 hash = hasher.digest();
}
```
XXHash64
XXHash64 is an extremely fast non-cryptographic hash function with excellent distribution properties.
The native implementation is provided by XxHash64NativeHasher and the Java implementation
is provided by XxHash64JavaHasher. The XxHash64Hasher interface provides static methods
that automatically select the best available implementation.
```java
// One-shot hashing
long hash = XxHash64Hasher.hash(data);
long hash = XxHash64Hasher.hash(data, seed);

// Streaming hashing
try (XxHash64Hasher hasher = XxHash64Hasher.create()) {
    hasher.update(chunk1);
    hasher.update(chunk2);
    long hash = hasher.digest();
}
```
Hadoop Compression
In addition to the raw block encoders, implementations of the Hadoop streams are provided for the above algorithms. Implementations of gzip and bzip2 are also included, so all standard Hadoop algorithms are available.
The HadoopStreams class provides a factory for creating InputStream and OutputStream
implementations without the need for any Hadoop dependencies. For environments
that have Hadoop dependencies, each algorithm also provides a CompressionCodec class.
Requirements
This library requires a Java 22+ virtual machine containing the sun.misc.Unsafe interface, running on a little-endian platform.
Configuration
The temporary directory used to unpack and load native libraries can be configured with the aircompressor.tmpdir system property;
it defaults to java.io.tmpdir. This is useful when the default temporary directory is mounted noexec.
Loading of native libraries can be disabled entirely by setting the io.airlift.compress.v3.disable-native system property.
Users
This library is used in projects such as Trino (https://trino.io), a distributed SQL engine.